Bug #48975
Status: Closed
after setting certificate only one rgw pod starts
Description
Hi
I am on ceph 15.2.8 with 2 rgw pods defined.
After we switched to SSL and added the certificate and key to the config database with
ceph config-key set rgw/cert/adcubum/enge_u227.crt -i /root/<cert_file>
ceph config-key set rgw/cert/adcubum/enge_u227.key -i /root/<key_file>
ceph config set client.rgw.<rgw_realm>.<rgw_zone> rgw_frontends "beast port=80 ssl_port=443 ssl_certificate=config://rgw/cert/adcubum/enge_u227.crt ssl_private_key=config://rgw/cert/adcubum/enge_u227.key"
only one of the two rgws starts, while the other complains about the certificate:
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 deferred set uid:gid to 167:167 (ceph:ceph)
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 ceph version 15.2.8 (bdf3eebcd22d7d0b3dd4d5501bee5bac354d5b55) octopus (stable), process radosgw, pid 1
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 framework: beast
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 framework conf key: port, val: 80
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 framework conf key: ssl_port, val: 443
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 framework conf key: ssl_certificate, val: config://rgw/cert/adcubum/enge_u227.crt
2021-01-22T15:58:31.940+0000 7f1fc4af2280 0 framework conf key: ssl_private_key, val: config://rgw/cert/adcubum/enge_u227.key
2021-01-22T15:58:31.940+0000 7f1fc4af2280 1 radosgw_Main not setting numa affinity
2021-01-22T15:58:32.144+0000 7f1fc4af2280 0 framework: beast
2021-01-22T15:58:32.144+0000 7f1fc4af2280 0 framework conf key: ssl_certificate, val: config://rgw/cert/$realm/$zone.crt
2021-01-22T15:58:32.144+0000 7f1fc4af2280 0 framework conf key: ssl_private_key, val: config://rgw/cert/$realm/$zone.key
2021-01-22T15:58:32.144+0000 7f1fc4af2280 0 starting handler: beast
2021-01-22T15:58:32.146+0000 7f1fc4af2280 -1 ssl_private_key was not found: rgw/cert/adcubum/enge_u227.key
2021-01-22T15:58:32.147+0000 7f1fc4af2280 -1 ssl_private_key was not found: rgw/cert/adcubum/enge_u227.crt
2021-01-22T15:58:32.147+0000 7f1fc4af2280 -1 no ssl_certificate configured for ssl_port
2021-01-22T15:58:32.147+0000 7f1fc4af2280 -1 ERROR: failed initializing frontend
- ceph config-key get rgw/cert/adcubum/enge_u227.crt
obtained 'rgw/cert/adcubum/enge_u227.crt'
-----BEGIN CERTIFICATE-----
....
- ceph config-key get rgw/cert/adcubum/enge_u227.key
obtained 'rgw/cert/adcubum/enge_u227.key'
-----BEGIN RSA PRIVATE KEY-----
....
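A quick sanity check (not from the ticket) is to confirm that the stored certificate actually matches the private key by comparing their RSA moduli. The self-signed pair below is generated only for illustration; in the reporter's setup one would run the two `openssl ... -noout -modulus` commands against the real cert.pem and cert.key instead:

```shell
# Generate a throwaway self-signed cert/key pair purely to demonstrate the check
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "$tmp/cert.key" -out "$tmp/cert.pem" \
  -days 1 -subj "/CN=example" 2>/dev/null

# A certificate and its private key share the same modulus; mismatch means wrong pair
cert_mod=$(openssl x509 -noout -modulus -in "$tmp/cert.pem")
key_mod=$(openssl rsa -noout -modulus -in "$tmp/cert.key")
[ "$cert_mod" = "$key_mod" ] && echo "cert and key match"

rm -rf "$tmp"
```

If the moduli differ, the wrong file was loaded into one of the two config-key entries.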
Not sure what we are missing here.
Updated by Casey Bodley about 3 years ago
- Status changed from New to Triaged
- Assignee set to Mark Kogan
Updated by Mark Kogan about 3 years ago
Please provide details of the setup so we can work on reproducing ...
(for example, is it a rook or podman environment?)
(a reproducer command flow would accelerate the debugging)
Thanks
Updated by Patrik Fürer about 3 years ago
The cluster is set up with cephadm and podman.
We used this command to deploy rgw:
ceph orch apply rgw adcubum enge_u227 --placement="2 host1 host2"
Both rgw daemons had been running fine (although not used by the users).
Then the request came in to switch to secure communication, so we took a wildcard certificate (and its key) for the domain and copied it to /var/lib/ceph/<fsid>/home/ on the administration node.
We opened a cephadm shell and ran the commands:
ceph config-key set rgw/cert/adcubum/enge_u227.crt -i /root/cert.pem
ceph config-key set rgw/cert/adcubum/enge_u227.key -i /root/cert.key
ceph config set client.rgw.adcubum.enge_u227 rgw_frontends "beast port=80 ssl_port=443 ssl_certificate=config://rgw/cert/adcubum/enge_u227.crt ssl_private_key=config://rgw/cert/adcubum/enge_u227.key"
Then I restarted the rgw daemons with
ceph orch restart rgw
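To see which daemons actually came back after the restart, cephadm's orchestrator can list them per host (the service name is the one used earlier in this ticket):

```shell
# Show the rgw service and its placement as the orchestrator sees it
ceph orch ls rgw

# List each rgw daemon with its host and running/error state
ceph orch ps --daemon-type rgw
```

These commands need a live cluster, so they are shown here only as a sketch of the check.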
Only one of the two daemons starts anymore, giving the error from the ticket description.
The other rgw is running fine.
Updated by Mark Kogan about 3 years ago
Could the issue be that both rgws try to bind to the same port?
Is it possible to change one of the rgws to bind to different ports?
(for example:)
rgw #1:
rgw_frontends "beast port=80 ssl_port=443 ssl_certificate=config://rgw/cert/adcubum/enge_u227.crt ssl_private_key=config://rgw/cert/adcubum/enge_u227.key"
rgw #2:
rgw_frontends "beast port=81 ssl_port=444 ssl_certificate=config://rgw/cert/adcubum/enge_u227.crt ssl_private_key=config://rgw/cert/adcubum/enge_u227.key"
provided that 81 and 444 are unused per
netstat -nap
and if not resolved please attach logs with
debug_rgw=20
Updated by Patrik Fürer about 3 years ago
- File ceph-client.rgw.adcubum.enge_u227.adzh-srlp-cdn02.hfqhki.log added
As the pods are running on different nodes, that should not matter, and the logs do not indicate such a thing.
I have attached the log from the node where the rgw is failing.
Updated by Patrik Fürer about 3 years ago
I tried removing the rgw, configured only port 8080, and did an apply rgw again, but no pods are deployed anymore. So it seems the SSL setting is not the issue.
Probably best to close this case; I will check again and open a new issue if the deployment problem persists.
Updated by Patrik Fürer about 3 years ago
By the way, I found the issue: the health status was HEALTH_WARN rather than HEALTH_OK, and the orchestrator was not willing to deploy the rgw while health is not OK.
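For reference, the health state that blocked the orchestrator can be inspected, and the deployment retried once it is clean, roughly like this (placement reused from earlier in the ticket):

```shell
# Show exactly which warnings put the cluster into HEALTH_WARN
ceph health detail

# Once health is back to HEALTH_OK, re-apply the rgw service
ceph orch apply rgw adcubum enge_u227 --placement="2 host1 host2"
```

These commands operate on a live cluster and are shown only as a sketch of the recovery path.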
Updated by Casey Bodley over 2 years ago
- Status changed from Triaged to Closed
Patrik Fürer wrote:
By the way, I found the issue: the health status was HEALTH_WARN rather than HEALTH_OK, and the orchestrator was not willing to deploy the rgw while health is not OK.
Great, thanks!