Ceph - Bug #3495: ceph-mon crash (https://tracker.ceph.com/issues/3495)

Updated by Joao Eduardo Luis on 2012-11-15 07:39
- Category set to Monitor
- Status changed from New to In Progress
- Assignee set to Joao Eduardo Luis
- Priority changed from Normal to High

Thanks for reporting it. Will look into it.

By the way, this is what the crash looks like:
<pre>
-171> 2012-11-15 18:32:33.937581 7f438b594700 -1 mon/PaxosService.cc: In function 'void PaxosService::propose_pending()' thread 7f438b594700 time 2012-11-15 18:32:33.936425
mon/PaxosService.cc: 110: FAILED assert(have_pending)
ceph version 0.54 (commit:60b84b095b1009a305d4d6a5b16f88571cbd3150)
1: (PaxosService::propose_pending()+0x29b) [0x4bad8b]
2: (MDSMonitor::tick()+0xda3) [0x4f08e3]
3: (MDSMonitor::on_active()+0x11) [0x4e8a51]
4: (PaxosService::_active()+0x22a) [0x4bb02a]
5: (Context::complete(int)+0xa) [0x49d46a]
6: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xbc) [0x4a20ac]
7: (Monitor::recovered_leader(int)+0x36d) [0x49135d]
8: (Paxos::handle_accept(MMonPaxos*)+0x511) [0x4b5691]
9: (Paxos::dispatch(PaxosServiceMessage*)+0x20b) [0x4b8d7b]
10: (Monitor::_ms_dispatch(Message*)+0xb7a) [0x49ca3a]
11: (Monitor::ms_dispatch(Message*)+0x32) [0x4ae942]
12: (DispatchQueue::entry()+0x68a) [0x648e1a]
13: (DispatchQueue::DispatchThread::entry()+0xd) [0x5e505d]
14: (()+0x6b50) [0x7f4391974b50]
15: (clone()+0x6d) [0x7f439032570d]
</pre>

Updated by Joao Eduardo Luis on 2012-11-23 10:45
It appears that this crash is somehow related to an election: something caused the 'have_pending' variable to become false after 'PaxosService::_active()' ran.
Given the log level on the monitor, there is no way to tell whether a new proposal was created, but here's the thing: if 'have_pending' was false from the get-go, then a new pending value would have been created in PaxosService::_active(); if it was true, then a new proposal would have had to be made mid-'::_active()', putting us in some sort of weird race -- but there is no evidence that such a thing happened.
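To make the invariant concrete, here is a minimal, self-contained model of what the assertion protects. The names mirror mon/PaxosService.cc, but this is an illustration of the mechanism, not the actual Ceph code:
<pre>
// Minimal model of the 'have_pending' invariant (illustrative, not Ceph code).
#include <cassert>

struct PaxosService {
  bool have_pending = false;

  // Stage a fresh set of uncommitted values.
  void create_pending() { have_pending = true; }

  // Hand the staged values to Paxos; requires something staged.
  void propose_pending() {
    assert(have_pending);   // the assert that fired at mon/PaxosService.cc:110
    have_pending = false;
  }

  // On becoming active (e.g. after an election): stage a pending value if
  // none exists, then run the service's on_active() hook, which may propose.
  void _active() {
    if (!have_pending) create_pending();
    on_active();            // MDSMonitor::on_active() -> tick() -> propose_pending()
  }

  void on_active() { propose_pending(); }
};

int main() {
  PaxosService svc;
  svc._active();            // fine: _active() staged a pending value first
  // svc.propose_pending(); // would trip the assert: nothing is staged anymore
  return 0;
}
</pre>
In this model the assert can only fire if something clears or consumes the pending value between _active() staging it and propose_pending() running -- precisely the kind of race discussed above.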
Looking at the log we notice that, in the minutes before the failed assertion, there were several successive elections:
<pre>
2012-11-15 11:42:09.520194 7f76b45db700 -1 mon.a@1(peon) e1 *** Got Signal Terminated ***
2012-11-15 11:42:09.520217 7f76b45db700 1 mon.a@1(peon) e1 shutdown
[monitor starts]
2012-11-15 11:42:22.200331 7f82b63ee780 1 mon.a@-1(probing) e1 init fsid 24f77b4c-ce12-4071-9746-affad8c0c3ba
2012-11-15 11:42:22.297173 7f82b63ee780 0 mon.a@-1(probing) e1 my rank is now 1 (was -1)
2012-11-15 11:42:22.298234 7f82ae9e2700 1 mon.a@1(probing) e1 discarding message auth(proto 0 30 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
2012-11-15 11:42:22.298269 7f82ae9e2700 1 mon.a@1(probing) e1 discarding message auth(proto 0 30 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
[snip duplicate messages]
2012-11-15 11:42:26.299568 7f82ae9e2700 0 log [INF] : mon.a calling new monitor election
[snip more discarded auth messages]
[election finished, mon.a is a peon]
2012-11-15 11:42:36.879704 7f82ae9e2700 -1 mon.a@1(peon).paxos(logm updating c 2693275..2693776) lease_expire from mon.0 85.25.26.225:6789/0 is 0.765689 seconds in the past; mons are laggy or clocks are too skewed
2012-11-15 11:42:38.163912 7f82ae9e2700 -1 mon.a@1(peon).paxos(auth updating c 5710..5731) lease_expire from mon.0 85.25.26.225:6789/0 is 0.607050 seconds in the past; mons are laggy or clocks are too skewed
2012-11-15 11:42:38.215428 7f82ae9e2700 -1 mon.a@1(peon).paxos(pgmap active c 2639017..2639518) lease_expire from mon.0 85.25.26.225:6789/0 is 0.486926 seconds in the past; mons are laggy or clocks are too skewed
2012-11-15 11:42:38.215598 7f82ae9e2700 -1 mon.a@1(peon).paxos(mdsmap active c 1..5608) lease_expire from mon.0 85.25.26.225:6789/0 is 0.487026 seconds in the past; mons are laggy or clocks are too skewed
2012-11-15 11:42:41.804620 7f82ae9e2700 1 mon.a@1(peon).osd e11052 e11052: 4 osds: 3 up, 4 in
2012-11-15 11:42:45.802772 7f82ae9e2700 1 mon.a@1(peon).osd e11053 e11053: 4 osds: 2 up, 4 in
2012-11-15 11:42:48.219469 7f82ae9e2700 1 mon.a@1(peon).osd e11054 e11054: 4 osds: 4 up, 4 in
2012-11-15 11:42:50.223239 7f82ae9e2700 1 mon.a@1(peon).osd e11055 e11055: 4 osds: 3 up, 4 in
2012-11-15 11:43:10.140877 7f82ae9e2700 0 log [INF] : mon.a calling new monitor election
2012-11-15 11:43:16.255093 7f82afcf5700 0 -- 188.138.84.236:6789/0 >> 85.25.26.225:6789/0 pipe(0x1d3d8c0 sd=24 :6789 pgs=12 cs=1 l=0).fault with nothing to send, going to standby
2012-11-15 11:43:18.312369 7f82ae9e2700 1 mon.a@1(peon).osd e11056 e11056: 4 osds: 4 up, 4 in
2012-11-15 11:43:19.007349 7f82ae9e2700 1 mon.a@1(peon).osd e11057 e11057: 4 osds: 4 up, 4 in
2012-11-15 11:43:26.848048 7f82ae9e2700 0 log [INF] : mon.a calling new monitor election
[new election underway]
2012-11-15 11:43:42.271375 7f82ae9e2700 1 mon.a@1(electing) e1 discarding message mon_subscribe({monmap=2+,osdmap=11058}) v2 and sending client elsewhere; we are not in quorum
2012-11-15 11:43:47.373292 7f82ae1e1700 0 log [INF] : mon.a@1 won leader election with quorum 1,2
</pre>
Soon after, the assertion failed.

Given the code path, there's a chance that a Paxos proposal was underway too. The only potential evidence of this comes in the form of an 'OSDMonitor::prepare_failure()' moments before the crash:
<pre>
2012-11-15 11:43:56.324468 7f82ae9e2700 1 mon.a@1(leader).osd e11058 prepare_failure osd.0 188.138.84.236:6801/12408 from osd.3 85.25.99.55:6800/29613 is reporting failure:1
2012-11-15 11:43:56.324486 7f82ae9e2700 0 log [DBG] : osd.0 188.138.84.236:6801/12408 reported failed by osd.3 85.25.99.55:6800/29613
</pre>
But given that we have no more information than this, the only explanation that makes sense is that there was an ongoing MDSMonitor proposal when we woke everyone up in 'Monitor::recovered_leader()'. I'm working on a patch to make sure that we only wake up a paxos machine's callbacks when that machine is active; this should keep the crash from happening again, but it will be tough to confirm this as the root cause.

Updated by Joao Eduardo Luis on 2012-11-23 11:21
- Status changed from In Progress to Fix Under Review

wip-mon-misc-fixes contains commit 96b82ebf870bda5af33624f698785bac6ae78234, which should fix this bug.
The approach is basically to close any window that could lead to awakening contexts waiting on one of the queues (such as waiting_for_active) while a paxos machine is not in a state that can cope with those contexts being woken up.
This appears to be the cause of this bug. For instance, say we have a 'C_Active' context waiting on MDSMonitor::waiting_for_active, and the MDSMonitor is about to start a proposal. Now imagine that another paxos machine (say, the OSDMonitor) finishes its own proposal, ends up in Monitor::recovered_leader(), and starts waking up the contexts in all the paxos machines. If it wakes that 'C_Active' context on the MDSMonitor at the wrong time, nasty things can happen.
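A rough sketch of that guarded wake-up (assumed names and simplified types; the actual change lives in wip-mon-misc-fixes):
<pre>
// Sketch of a guarded wake-up: contexts parked on waiting_for_active are only
// finished once the machine is actually active (illustrative, not the patch).
#include <functional>
#include <iostream>
#include <list>

struct PaxosMachine {
  bool active = false;
  std::list<std::function<void()>> waiting_for_active;

  // Called from the equivalent of Monitor::recovered_leader().
  void wake_waiters() {
    if (!active) return;  // the window being closed: a machine that is still
                          // proposing must not have its contexts woken up
    std::list<std::function<void()>> ls;
    ls.swap(waiting_for_active);        // roughly what finish_contexts() does
    for (auto &cb : ls) cb();
  }
};

int main() {
  PaxosMachine mds;
  mds.waiting_for_active.push_back([] { std::cout << "C_Active runs\n"; });
  mds.wake_waiters();   // mid-proposal: nothing happens, no premature wake-up
  mds.active = true;
  mds.wake_waiters();   // now the queued context is safe to run
  return 0;
}
</pre>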
This patch may or may not fix this particular issue (we have no way to confirm using the logs), but it certainly fixes some issue that may or may not have happened yet.

Updated by Artem Grinblat on 2012-11-23 11:30
<ul><li><strong>File</strong> <a href="/attachments/download/559/failed_in_logs1.png">failed_in_logs1.png</a> <a class="icon-only icon-magnifier" title="View" href="/attachments/559/failed_in_logs1.png">View</a> added</li><li><strong>File</strong> <a href="/attachments/download/560/failed_in_logs2.png">failed_in_logs2.png</a> <a class="icon-only icon-magnifier" title="View" href="/attachments/560/failed_in_logs2.png">View</a> added</li></ul><p>Joao, what kind of ceph configuration would make the logs more useful in the future?<br />Ceph crashes fairly often here, maybe I would be able to reproduce the issue with a more detailed logs. See screenshots.</p> Ceph - Bug #3495: ceph-mon crashhttps://tracker.ceph.com/issues/3495?journal_id=143712012-11-24T19:05:03ZJoao Eduardo Luis
- Status changed from Fix Under Review to Resolved

Updated by Joao Eduardo Luis on 2012-11-24 19:12
Sorry, I marked this as resolved and only noticed the last comment afterwards (silly me).

For the monitor, a log level of 20 for mon and 10 for ms (the messenger) is probably the most useful in terms of information gathered, but it will also grow your logs considerably. If you cannot cope with that amount of logging, we can still get a fair amount of information from a debug level of 10 on the monitor and 1 on ms.
In terms of configuration, that would be something like:
<pre>
[mon]
    debug mon = 20   ; or 10
    debug ms = 10    ; or 1
</pre>
I'm pretty confident that the patch will fix this issue, but if you happen to stumble across this bug again, please feel free to poke me ;-)

Updated by Artem Grinblat on 2012-12-05 00:56
<ul><li><strong>File</strong> <a href="/attachments/download/573/ceph-mon.3495.7z">ceph-mon.3495.7z</a> added</li></ul><p>"FAILED assert(have_pending)" still happens with Ceph 0.55.<br />I have attached the logs.</p> Ceph - Bug #3495: ceph-mon crashhttps://tracker.ceph.com/issues/3495?journal_id=146262012-12-05T02:59:18ZJoao Eduardo Luis
- Status changed from Resolved to In Progress

Thanks for following up on this. And thanks for cranking up the debug levels! :-)

Updated by Joao Eduardo Luis on 2012-12-05 05:45
Artem,

The good news is that, given that the bug was triggered in the same spot yet again, it would appear to be localized to the MDSMonitor; the not-so-good news is that I'm still looking for the root cause.

Are you doing anything in particular to trigger this bug? I would love to be able to reproduce it. Also, if you could set 'debug paxos = 10' on the monitor, that would be great, as it would allow me to pinpoint what is causing 'have_pending' to turn false -- paxos debugging usually isn't relevant enough to be worth spamming your logs with, which is why I didn't suggest it earlier, but in this case it would be nice to have.
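Concretely, following the same configuration style as the earlier example, that would be one extra line in the monitor section:
<pre>
[mon]
    debug paxos = 10
</pre>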
Updated by Artem Grinblat on 2012-12-05 06:21

I'm not doing anything special, just the usual librados uploads and removes. There is also a recovery running in the background: "recovery 178902/6622808 degraded (2.701%)". I'll upload the 'debug paxos = 10' logs if (when) the mon crashes again.

Updated by Artem Grinblat on 2012-12-05 06:32
<ul></ul><blockquote>
<p>Are you doing anything in particular to trigger this bug?</p>
</blockquote>
With Ceph 0.54 I've noticed that the following loop
<pre>
// Collect the ids of every object in the pool via librados object listing.
shared_ptr<std::forward_list<gstring>> ids (std::make_shared<std::forward_list<gstring>>());
for (auto it = rados.second->objects_begin(), end = rados.second->objects_end(); it != end; ++it) {
  ids->emplace_front (it->first);
  // Throttle: pause briefly every 100 objects to be nice to Ceph.
  if (++*cephTotal % 100 == 0) std::this_thread::sleep_for (std::chrono::milliseconds (valgrind ? 200 : 20));
}
</pre>
always fails sooner or later, even when the loop runs against a thread-local Ceph connection, whereas "rados -p {pool} ls" (which does exactly the same loop) works. So I suspect that Ceph isn't tested that well when the same process uses librados from multiple threads. I don't know whether this is related to this bug or not.

Updated by Artem Grinblat on 2012-12-05 06:33
(As a workaround I'm now running "nice rados -p {pool} ls > {pool}.ids" and reading the ids from the file.)

Updated by Josh Durgin on 2012-12-05 08:39
Could you open a separate bug for the crash you see with the multithreaded object listing? There is some shared state for IoCtxs that shouldn't be relied on from multiple threads, but listing objects shouldn't hit that.

Updated by Artem Grinblat on 2012-12-05 09:36
<ul></ul><blockquote>
<p>Could you open a separate bug with the crash you see with the multithreaded object listing?</p>
</blockquote>
<p>I might be able later.<br />What would be the best logging settings for that report? Thanks.</p> Ceph - Bug #3495: ceph-mon crashhttps://tracker.ceph.com/issues/3495?journal_id=146402012-12-05T09:45:45ZJosh Durgin
<pre>
debug ms = 1
debug objecter = 20
debug rados = 20
</pre>

Updated by Artem Grinblat on 2012-12-05 11:07
<ul><li><strong>File</strong> <a href="/attachments/download/574/ceph.3495.rados20.7z">ceph.3495.rados20.7z</a> added</li></ul><p>Attached is the file with a paxos=10 log.</p> Ceph - Bug #3495: ceph-mon crashhttps://tracker.ceph.com/issues/3495?journal_id=146462012-12-05T12:04:49ZJoao Eduardo Luis
Thanks Artem. Your log confirmed my suspicions about what the bug was.

If you have the chance, could you try commit 46c55a19e72ae1f16037807fda8abe59b671f7d3 on wip-3495? It should fix the issue, but I haven't been able to reproduce the bug, so I can't be 100% sure.

Updated by Artem Grinblat on 2012-12-06 05:27
<ul></ul><blockquote>
<p>If you have the chance, could you try commit 46c55a19e72ae1f16037807fda8abe59b671f7d3 on wip-3495? It should fix the issue, but I haven't been able to reproduce the bug in order to be 100% sure.</p>
</blockquote>
<p>Joao, thanks for the fix!<br />I'd prefer to wait for 0.56.</p> Ceph - Bug #3495: ceph-mon crashhttps://tracker.ceph.com/issues/3495?journal_id=146722012-12-06T06:52:23ZJoao Eduardo Luis
Sure. I'll try to reproduce this bug again before bobtail is released. From the looks of it, messing with the MDS enough while bringing the cluster up ought to trigger it, so I might be able to manage that.

Thanks for reporting it, and for all your help tracking it down!

Updated by Joao Eduardo Luis on 2012-12-07 04:21
Managed to trigger this bug using:
<pre>
$ while [ 1 ]; do ./init-ceph restart mon.a ; sleep 30 ; done
$ while [ 1 ]; do ./init-ceph restart mds.a ; sleep 2 ; done
$ while [ 1 ]; do ./init-ceph restart osd.1 ; sleep 2 ; done
</pre>
The bug was triggered on mon.b, with everything running v0.55.

Updated by Matthew Via <via@matthewvia.info> on 2012-12-07 07:46
I was encountering this particular bug; it caused all my monitors to crash shortly after starting. I patched my 0.55 packages with the patch from wip-3495, and have not encountered a single crash under heavy load for about 15 hours.

Updated by Joao Eduardo Luis on 2012-12-07 08:36
Good to know, Matthew. I haven't been able to reproduce it either since switching to a patched version.

The patch has been merged into next, so I'm marking the bug as resolved.

Updated by Joao Eduardo Luis on 2012-12-07 08:36
- Status changed from In Progress to Resolved

Updated by Joao Eduardo Luis on 2012-12-13 15:18
- Status changed from Resolved to In Progress

This bug occurred again on 0.55.1, as described in bug #3618 (https://tracker.ceph.com/issues/3618). Re-opening, as it appears that there's still something else going on.

Updated by Joao Eduardo Luis on 2012-12-13 15:28
- Status changed from In Progress to Resolved

Sorry, I jumped the gun here. I thought the fix was in 0.55.1, but it is not. It is in next, though.

Updated by Joao Eduardo Luis on 2013-04-10 08:56
- Status changed from Resolved to In Progress

Matthew Roy (imjustmattew on IRC) stumbled upon this bug yet again, this time on 0.60.

This was reintroduced with the monitor rework, which basically undid the patch introduced in e1c27fe178bb76b2f3127876b868968730bdb074 ("mon: Monitor: rework 'paxos' to a list instead of a vector"), and I suspect the solution is to redo the patch.

Updated by Joao Eduardo Luis on 2013-04-10 09:36
Well, my comment above is not entirely true. The patch itself was not undone; rather, the patch's focus was on the multiple Paxos instances that were used back then. Currently we only have one Paxos instance, so we should focus on the services instead.

The obvious workaround is to force a given order in which we poke the services, so that the MDSMonitor cannot invoke the OSDMonitor's propose_pending() before the OSDMonitor has been poked. This, however, seems wrong: we should instead force the MDSMonitor to wait for the OSDMonitor to become active before performing write operations, as sketched below.
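Here is a minimal, self-contained model of that "wait for the other service" idea (assumed names and simplified types; this is not the actual wip-3495 fix): the cross-service write is parked on the other service's wait queue and replayed once that service becomes active.
<pre>
// Illustrative model: defer a cross-service write until the target service is
// active, instead of proposing through a service that is still recovering.
#include <functional>
#include <iostream>
#include <list>

struct Service {
  bool active = false;
  std::list<std::function<void()>> waiters;

  void wait_for_active(std::function<void()> cb) { waiters.push_back(std::move(cb)); }

  void set_active() {                   // e.g. the election finishes
    active = true;
    std::list<std::function<void()>> ls;
    ls.swap(waiters);
    for (auto &cb : ls) cb();           // replay the deferred operations
  }
};

int main() {
  Service osdmon;

  // An MDSMonitor write that needs the OSDMonitor: defer it if not active.
  auto mds_write = [] { std::cout << "osdmon propose_pending()\n"; };
  if (!osdmon.active)
    osdmon.wait_for_active(mds_write);  // parked instead of asserting
  else
    mds_write();

  osdmon.set_active();                  // the parked write runs here, after recovery
  return 0;
}
</pre>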
Updated by Joao Eduardo Luis on 2013-04-16 17:52

- Status changed from In Progress to Fix Under Review

Pushed wip-3495 to github. So far, the fix appears to be holding up without breaking anything else.

I'm leaving the test running overnight and should be more certain by the morning.

In the meantime, reviews are welcome.

Updated by Joao Eduardo Luis on 2013-04-17 04:17
The fix has been holding for the last 12 hours or so.

Updated by Matthew Roy <mroy@sandbox-ed.com> on 2013-04-17 09:14
Is the gitbuilder build from that WIP branch fairly safe to run until 0.61, or will using it horrifically break the upgrade path to 0.61?

Obviously there are no warranties or guarantees, express or implied :)

Updated by Sage Weil <sage@newdream.net> on 2013-04-17 10:14
Should be safe!

Updated by Matthew Roy <mroy@sandbox-ed.com> on 2013-04-17 18:06
After running for a few hours, the monitors are not dying! I think you might have got it. I'll update this after it runs for a day or so.

Updated by Joao Eduardo Luis on 2013-04-18 01:29
Thanks for the update!

Updated by Matthew Roy <mroy@sandbox-ed.com> on 2013-04-19 07:45
This has been stable for me for more than 24 hours. I think you've got it. Thanks for all your help!

Updated by Denis Kaganovich <mahatma@eu.by> on 2013-04-19 09:20
Correct me if I'm using the wrong branch, but:
<pre>
     0> 2013-04-19 19:06:29.120708 7fb556116700 -1 mon/PaxosService.cc: In function 'void PaxosService::propose_pending()' thread 7fb556116700 time 2013-04-19 19:06:29.119272
mon/PaxosService.cc: 127: FAILED assert(have_pending)

 ceph version 0.60-524-g46d8b9f (46d8b9f2c6e36d988e39b116785c7be76af69f10)
 1: (PaxosService::propose_pending()+0x35b) [0x513f4b]
 2: (MDSMonitor::tick()+0x94c) [0x552e0c]
 3: (MDSMonitor::on_active()+0x11) [0x54e9a1]
 4: (PaxosService::_active()+0x250) [0x514230]
 5: (Context::complete(int)+0xa) [0x4dc63a]
 6: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xa6) [0x4e3c06]
 7: (Paxos::handle_last(MMonPaxos*)+0x93e) [0x50d90e]
 8: (Paxos::dispatch(PaxosServiceMessage*)+0x1bb) [0x50ea4b]
 9: (Monitor::_ms_dispatch(Message*)+0xde4) [0x4dbf14]
 10: (Monitor::ms_dispatch(Message*)+0x32) [0x4ff2a2]
 11: (DispatchQueue::entry()+0x33a) [0x70efda]
 12: (DispatchQueue::DispatchThread::entry()+0xd) [0x6892ad]
 13: (()+0x84bd) [0x7fb55be594bd]
 14: (clone()+0x6d) [0x7fb55a0b6f7d]
</pre>

Updated by Joao Eduardo Luis on 2013-04-19 09:34
Denis, you appear to be using master [1]; the fix is only available on wip-3495.

[1]:
<pre>
joao@tardis:~/inktank/src/ceph/src$ git branch -a --contains 46d8b9f2c6e36d988e39b116785c7be76af69f10
remotes/gh/HEAD -> gh/master
remotes/gh/master
</pre>

Updated by Joao Eduardo Luis on 2013-04-19 10:42
- Status changed from Fix Under Review to 7

Updated by Sage Weil <sage@newdream.net> on 2013-04-22 15:20
- Priority changed from High to Urgent

Updated by Greg Farnum <gfarnum@redhat.com> on 2013-04-23 12:44
- Status changed from 7 to Resolved

Merged into next with 426e3be64e851947b288e43bc0ee932ae7f214bb ("Merge pull request #222 from ceph/wip-3495", reviewed by Greg Farnum).