monmap drops rebooted mon if deployed via label
In Ceph Pacific 16.2.4, I deployed 5 mons with 'ceph orch apply mon label:mon'; they run normally from docker images on the hosts that have the 'mon' tag.
When I reboot any host, the monmap drops that mon deployed on that host, though it remains in systemd and the dashboard lists it as 'running'. Further 'ceph orch apply..' commands do not re-launch the mon on the host. The 'mon' tag is still listed on that host.
If I remove the tag, wait for the dashboard to remove the mon (listed as 'running', though not in the monmap), then re-add the tag to that host, the mon redeploys, operates normally, and is listed in the monmap.
I've repeated this about 5 times now; it happens regardless of which host is rebooted.
#2 Updated by Sebastian Wagner 11 days ago
- Status changed from New to Need More Info
Can you run https://gist.github.com/sebastian-philipp/8e18f4815e90dc0f51fe3fbff8c8aae5 and attach the result? Having the monmap from before and after the reboot would also be helpful.
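For the before/after monmap comparison, a minimal sketch along these lines may help. It assumes the dumps were saved with 'ceph mon dump' and that each mon appears on a line ending in "mon.<name>"; the file names and the helper functions are hypothetical, not part of Ceph.

```shell
#!/bin/sh
# Sketch: show which mons are in one saved `ceph mon dump` but not in another.
# Assumption: each mon is listed on a line that ends in "mon.<name>".

# Print the mon names found in a saved dump file, one per line, sorted.
mon_names() {
    grep -o 'mon\.[A-Za-z0-9_.-]*$' "$1" | sort -u
}

# Print mons present in the "before" dump ($1) but absent from the "after" dump ($2).
dropped_mons() {
    before_list=$(mktemp)
    after_list=$(mktemp)
    mon_names "$1" > "$before_list"
    mon_names "$2" > "$after_list"
    comm -23 "$before_list" "$after_list"
    rm -f "$before_list" "$after_list"
}

# Usage (file names are hypothetical):
#   ceph mon dump > monmap_before.txt   # before the reboot
#   ceph mon dump > monmap_after.txt    # after the reboot
#   dropped_mons monmap_before.txt monmap_after.txt
```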
#3 Updated by Harry Coin 11 days ago
Yes, and the results are attached. This is a little sandbox system in a workshop, 4 of 5 hosts running osds, 5 of 5 hosts running mons.
This is 100% repeatable and very easy to reproduce on your own: assign the mon hosts the tag 'mon', then do ceph orch apply mon label:mon and wait for it all to sync up. Reboot one of the hosts, then notice that the monmap has dropped the rebooted system and that on the rebooted system the dashboard lists the mon as having 'stopped'.
To recover, delete the mon tag from the host and notice that the mon listed as 'stopped' is then removed from the host (the reduced monmap hasn't changed). Then add the 'mon' tag back to the host and notice the restoration of operations (both monmap and running container) as they were prior to the mon host's reboot.
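The setup half of that reproduction could be scripted roughly as follows. This is only a sketch: the run and deploy_mons_by_label helpers are hypothetical, DRY_RUN defaults to printing the commands instead of executing them, and the reboot itself is still done by hand.

```shell
#!/bin/sh
# Sketch of the setup steps; the reboot is a manual step afterwards.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Label each given host 'mon', then ask the orchestrator to place mons by label.
deploy_mons_by_label() {
    for host in "$@"; do
        run ceph orch host label add "$host" mon
    done
    run ceph orch apply mon label:mon
}

# Usage: print the commands for one host (set DRY_RUN=0 to actually run them):
#   deploy_mons_by_label noc4
# Then wait for the mons to sync, save `ceph mon dump`, reboot one host,
# and dump the monmap again to compare.
```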
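The recovery steps just described might look like this as a script. It is a sketch under assumptions: the helper names are hypothetical, the fixed wait stands in for watching the dashboard until the stopped mon disappears, and DRY_RUN defaults to printing the commands only.

```shell
#!/bin/sh
# Sketch of the workaround: remove the 'mon' label, wait, then add it back.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_mon() {
    host=$1
    run ceph orch host label rm "$host" mon
    # Placeholder wait: in practice, wait until the orchestrator has removed
    # the stopped mon from the host. WAIT_SECONDS is a hypothetical knob.
    run sleep "${WAIT_SECONDS:-60}"
    run ceph orch host label add "$host" mon
}

# Usage: recover_mon noc4            (prints the commands)
#        DRY_RUN=0 recover_mon noc4  (runs them)
```

A fixed sleep is the simplest stand-in here; checking that the daemon is really gone before re-adding the label would be more robust.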
In the case I ran here for you just now: The relevant syslog entries after rebooting a host (it does not matter which host running a mon gets rebooted) are:
Jun 8 10:28:55 noc4 bash10918: Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
Jun 8 10:28:55 noc4 bash10918: * File Read Latency Histogram By Level [default] *
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 mon.noc4 does not exist in monmap, will attempt to join an existing cluster
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 using public_addr v2:[fc00:1002:c7::44]:0/0 -> [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0]
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 starting mon.noc4 rank -1 at public addrs [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0] at bind addrs [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0] mon_data /var/lib/ceph/mon/ceph-noc4 fsid 4067126d-01cb-40af-824a-881c130140f8
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 1 mon.noc4@-1(?) e64 preinit fsid 4067126d-01cb-40af-824a-881c130140f8
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 mon.noc4@-1(?) e64 not in monmap and have been in a quorum before; must have been removed
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 mon.noc4@-1(???) e64 commit suicide!
Jun 8 10:28:55 noc4 bash10918: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 failed to initialize
Jun 8 10:28:55 noc4 dockerd1457: time="2021-06-08T10:28:55.127175846-05:00" level=info msg="ignoring event" container=b1b05c4f42153526d5a924e6870cc8a0a79c1bbfc3eb2d220395de2f38f6ba45 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
But of course, the mon label is still there, it was never removed.
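To pull just these telltale lines out of a busy syslog, something like the following filter could be used. It is a sketch: the function name is hypothetical, the log path in the usage note is an assumption about where syslog lands on the host, and the patterns are taken from the messages quoted above.

```shell
#!/bin/sh
# Sketch: keep only the mon startup lines that explain why the daemon exited.
# Reads the files given as arguments, or standard input if none are given.
mon_removal_lines() {
    grep -E 'does not exist in monmap|not in monmap and have been in a quorum before|failed to initialize' "$@"
}

# Usage (the log path is an assumption):
#   mon_removal_lines /var/log/syslog
```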
2>&1 | tee sphil_$1
To work around the misleading fix suggested by the log entry that complains of cephadm lacking root access on the hosts, I ran:
chown cephadm /etc/ceph/ceph.client.admin.keyring
The proper keys were in the authorized_keys files for cephadm in /root/.ssh/authorized_keys all along.
I uploaded the results from before the reboot and from after rebooting host noc4, which had a running mon docker daemon before the reboot but not after it.