Ceph Orchestrator - Bug #51027: monmap drops rebooted mon if deployed via label
https://tracker.ceph.com/issues/51027

2021-06-04 23:03 - Neha Ojha (nojha@redhat.com)
- Project changed from Ceph to Orchestrator
2021-06-08 10:25 - Sebastian Wagner

- Status changed from New to Need More Info

Can you run https://gist.github.com/sebastian-philipp/8e18f4815e90dc0f51fe3fbff8c8aae5 and attach the result? Also, having the monmap from before and after the reboot would be helpful.
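For reference, the monmap snapshots can be captured with the standard CLI (a minimal sketch; the file names are illustrative):

    # dump the current monmap in plain text
    ceph mon dump > monmap_before.txt
    # ... reboot the mon host, then ...
    ceph mon dump > monmap_after.txt
    # alternatively, grab the binary map and decode it
    ceph mon getmap -o /tmp/monmap.bin
    monmaptool --print /tmp/monmap.bin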
2021-06-08 16:01 - Harry Coin

- File sphil_after added (https://tracker.ceph.com/attachments/download/5555/sphil_after)
- File sphil_before added (https://tracker.ceph.com/attachments/download/5554/sphil_before)

Yes, and the results are attached. This is a little sandbox system in a workshop: 4 of 5 hosts running OSDs, 5 of 5 hosts running mons.
This is 100% repeatable and very easy to reproduce on your own: assign the mon hosts the label 'mon', run ceph orch apply mon label:mon, wait for everything to sync up, then reboot one of the hosts. Notice that the monmap has dropped the rebooted system, and that on the rebooted system the dashboard lists the mon as having 'stopped'.
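A minimal reproduction sketch of the above (the host name noc4 is illustrative):

    # label the mon hosts, then deploy mons by label
    ceph orch host label add noc4 mon     # repeat for each mon host
    ceph orch apply mon label:mon
    # wait for the mons to sync up, then reboot one mon host (e.g. noc4);
    # afterwards the rebooted mon is missing from quorum and the monmap
    ceph mon stat
    ceph mon dump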
To recover: delete the 'mon' label from the host, and notice that the mon listed as 'stopped' is then removed from the host (the reduced monmap is unchanged). Then add the 'mon' label back to the host, and notice that operations are restored (both the monmap and the running container) to their state prior to the mon host's reboot.
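The same recovery expressed as commands (a sketch, with noc4 standing in for the rebooted host):

    # drop the label: cephadm removes the stopped mon daemon from the host
    ceph orch host label rm noc4 mon
    # re-add the label: cephadm redeploys the mon and it rejoins the monmap
    ceph orch host label add noc4 mon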
In the case I ran here for you just now, the relevant syslog entries after rebooting a host (it does not matter which host running a mon gets rebooted) are:

    ...
    Jun 8 10:28:55 noc4 bash[10918]: Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
    Jun 8 10:28:55 noc4 bash[10918]: ** File Read Latency Histogram By Level [default] **
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 mon.noc4 does not exist in monmap, will attempt to join an existing cluster
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 using public_addr v2:[fc00:1002:c7::44]:0/0 -> [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0]
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.048+0000 7f7f9ff4e700 0 starting mon.noc4 rank -1 at public addrs [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0] at bind addrs [v2:[fc00:1002:c7::44]:3300/0,v1:[fc00:1002:c7::44]:6789/0] mon_data /var/lib/ceph/mon/ceph-noc4 fsid 4067126d-01cb-40af-824a-881c130140f8
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 1 mon.noc4@-1(???) e64 preinit fsid 4067126d-01cb-40af-824a-881c130140f8
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 mon.noc4@-1(???) e64 not in monmap and have been in a quorum before; must have been removed
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 mon.noc4@-1(???) e64 commit suicide!
    Jun 8 10:28:55 noc4 bash[10918]: debug 2021-06-08T15:28:55.052+0000 7f7f9ff4e700 -1 failed to initialize
    Jun 8 10:28:55 noc4 dockerd[1457]: time="2021-06-08T10:28:55.127175846-05:00" level=info msg="ignoring event" container=b1b05c4f42153526d5a924e6870cc8a0a79c1bbfc3eb2d220395de2f38f6ba45 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
    ...
But of course, the mon label is still there; it was never removed.
I ran your script with its output captured:

    {
    <your script>
    } 2>&1 | tee sphil_$1
To address the misleading fix suggested by the log entry complaining that cephadm lacked root access on the hosts, I did:

    chown cephadm /etc/ceph/ceph.client.admin.keyring

The proper keys were in the authorized_keys file for cephadm in /root/.ssh/authorized_keys all along.
Uploaded: one capture from before the reboot and one from after rebooting host noc4, which had a running mon container prior to the reboot but not after.
2021-06-08 16:07 - Harry Coin

P.S. It might be a good idea to think of a better debug log message phrase than 'commit suicide'.
2021-06-29 08:11 - Loïc Dachary (loic@dachary.org)

- Target version deleted (v16.2.5)
2021-07-15 13:45 - Harry Coin

I think it's a mistake to put this in the 'Orchestrator' problem list, because the logic that decides whether a mon should remove itself lives in the mon itself, and that logic does not consider whether the mon exists in the monmap because of a label. So the mon improperly removes itself when it finds it's not in the monmap, even though it ought to be in the monmap.
The failure description:

    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.066+0000 7f516385d700 0 mon.noc1 does not exist in monmap, will attempt to join an existing cluster
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.066+0000 7f516385d700 0 using public_addr v2:[fc00:1002:c7::41]:0/0 -> [v2:[fc00:1002:c7::41]:3300/0,v1:[fc00:1002:c7::41]:6789/0]
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.070+0000 7f516385d700 0 starting mon.noc1 rank -1 at public addrs [v2:[fc00:1002:c7::41]:3300/0,v1:[fc00:1002:c7::41]:6789/0] at bind addr>
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.070+0000 7f516385d700 1 mon.noc1@-1(???) e88 preinit fsid 4067126d-01cb-40af-824a-881c130140f8
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.074+0000 7f516385d700 -1 mon.noc1@-1(???) e88 not in monmap and have been in a quorum before; must have been removed
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.074+0000 7f516385d700 -1 mon.noc1@-1(???) e88 commit suicide!
    Jul 15 08:40:25 noc1.1.quietfountain.com bash[193661]: debug 2021-07-15T13:40:25.074+0000 7f516385d700 -1 failed to initialize
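A quick way to see the mismatch the mon is complaining about is to compare the live monmap against the label assignments (a sketch; noc1 is the rebooted host here):

    # the rebooted mon is missing from the monmap...
    ceph mon dump | grep noc1
    # ...even though its host still carries the 'mon' label
    ceph orch host ls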
2021-07-15 13:48 - Harry Coin

Still a problem in Pacific 16.2.5. This pretty much makes 'assignment of mons by label' useless, since the mon is lost upon host reboot.
2021-08-04 13:20 - Sebastian Wagner

- Related to Bug #50272 added: cephadm: after downsizing mon service from 5 to 3 daemons, cephadm reports "stray" daemons (https://tracker.ceph.com/issues/50272)
2021-08-04 14:44 - Sebastian Wagner

- Assignee set to Adam King
- Priority changed from Normal to High
2021-08-05 20:21 - Adam King

- Status changed from Need More Info to In Progress
- Pull request ID set to 42690
2021-08-12 16:56 - David Orman (ormandj@corenode.com)

We can confirm this impacts 16.2.5 clusters. On host failures/reboots, we have to undeploy/redeploy monitors, which is quite dangerous when considering some of the potential failure scenarios.
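Roughly, that undeploy/redeploy workaround looks like this (a sketch; the daemon name and host:IP are illustrative, and forcibly removing a mon should be done with care):

    # remove the dead mon daemon left behind on the rebooted host
    ceph orch daemon rm mon.noc4 --force
    # redeploy a mon on that host (host:IP form; the address is illustrative)
    ceph orch daemon add mon noc4:10.0.0.44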
2021-08-17 15:22 - Stefan Fleischmann

Is there any workaround for this other than redeploying? As David said, this is dangerous. We had quite some trouble recovering after a hardware failure and some unexpected reboots.
2021-08-17 15:26 - Harry Coin

If you want to use the label deployment feature: not that I was able to find. It's a real problem, and it's been allowed to sit out there a long time. This is one of the reasons folks avoid the whole 'container and orchestrator drama'. How was it even possible that the testing routines didn't notice 'hey, you lose a monitor on reboot' before release?
2021-08-18 10:19 - Cory Snyder (csnyder@iland.com)

- Backport set to pacific
2021-08-18 10:20 - Cory Snyder (csnyder@iland.com)

- Status changed from In Progress to Pending Backport
2021-09-07 13:30 - Sebastian Wagner

- Status changed from Pending Backport to Resolved
2021-11-26 12:18 - Sebastian Wagner

- Related to Bug #53033 added: cephadm removes MONs during upgrade 15.2.14 -> 16.2.6 which leads to failed quorum and broken cluster (https://tracker.ceph.com/issues/53033)
2022-05-19 08:19 - Thomas Roth

Interesting that this was changed to Resolved 8 months ago - we have a test cluster installed with 16.2.7 from the start, and this behaviour is still there!
    lxmon1:~# cephadm bootstrap --mon-ip 10.20.2.161

Use user cephadm and distribute the key:
    lxmon1:~# cp /etc/ceph/ceph.pub /var/lib/cephadm/.ssh/authorized_keys
    lxmon1:~# scp /etc/ceph/ceph.pub lxmon2:/var/lib/cephadm/.ssh/authorized_keys
    lxmon1:~# ceph cephadm set-user cephadm
Add a mon:

    lxmon1:~# ceph orch host add lxmon2 10.20.2.162
Add OSDs, pools, radosgw, cephfs...
Reboot lxmon1 - all hosts Offline!
    lxmon1:~# ceph orch host ls
    HOST    ADDR         LABELS  STATUS
    lxmon1  10.20.2.161  _admin  Offline
    lxmon2  10.20.2.162  _admin  Offline
    ...
This is ridiculous.