Bug #22041
'ceph osd df tree' crashes new mons
Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Description
Probably the same root cause as #21770 because it appears under the same circumstances.
I'm seeing the following crash when running "ceph osd df tree":
Nov 04 16:46:43 new-croit-host-BEEF03 ceph-mon[11133]: 2017-11-04 17:46:43.948275 7fc18c807700 -1 *** Caught signal (Aborted) **
 in thread 7fc18c807700 thread_name:ms_dispatch

 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
 1: (()+0x930b84) [0x5618c8d9eb84]
 2: (()+0x110c0) [0x7fc195f480c0]
 3: (gsignal()+0xcf) [0x7fc19336efcf]
 4: (abort()+0x16a) [0x7fc1933703fa]
 5: (()+0x40a2a9) [0x5618c88782a9]
 6: (print_osd_utilization(OSDMap const&, PGStatService const*, std::ostream&, ceph::Formatter*, bool)+0x1a9) [0x5618c8b38c49]
 7: (OSDMonitor::preprocess_command(boost::intrusive_ptr<MonOpRequest>)+0xb57) [0x5618c8965337]
 8: (OSDMonitor::preprocess_query(boost::intrusive_ptr<MonOpRequest>)+0x2c0) [0x5618c896f320]
 9: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x7f8) [0x5618c8916468]
 10: (Monitor::handle_command(boost::intrusive_ptr<MonOpRequest>)+0x233b) [0x5618c87d9d1b]
 11: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xa49) [0x5618c87e1139]
 12: (Monitor::_ms_dispatch(Message*)+0x6d3) [0x5618c87e21c3]
 13: (Monitor::ms_dispatch(Message*)+0x23) [0x5618c880f963]
 14: (DispatchQueue::entry()+0xeda) [0x5618c8d459aa]
 15: (DispatchQueue::DispatchThread::entry()+0xd) [0x5618c8af059d]
 16: (()+0x7494) [0x7fc195f3e494]
 17: (clone()+0x3f) [0x7fc193424aff]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
It happens under the exact same circumstances as #21770: it occurs only on new mons in new clusters, and restarting all existing mons and mgrs at the same time fixes it.
Steps to reproduce:
- create a new cluster with one mon and one mgr, do not restart them!
- create a few OSDs and at least one pool (some IO operations on the pool might be necessary; not sure)
- create a new mon/mgr
- run ceph osd df tree a few times
- the new mon crashes, but the initial mon never does
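The steps above can be sketched roughly as the following shell session, assuming a Luminous-era cluster managed with ceph-deploy. Hostnames (mon1, mon2), the device path, and the pool name are placeholders, and whether client IO is actually required is uncertain, per the note above; this requires a live cluster and is a repro sketch, not part of the original report.

```shell
# 1. Bootstrap a cluster with a single mon and mgr -- do NOT restart them.
ceph-deploy new mon1
ceph-deploy mon create-initial
ceph-deploy mgr create mon1

# 2. Create a few OSDs and at least one pool.
ceph-deploy osd create --data /dev/sdb mon1
ceph osd pool create testpool 64
# Some client IO on the pool may be needed (uncertain, per the report):
rados -p testpool bench 10 write --no-cleanup

# 3. Add a second mon and mgr without restarting the originals.
ceph-deploy mon add mon2
ceph-deploy mgr create mon2

# 4. Run the command a few times; on 12.2.1 the new mon aborts in
# print_osd_utilization(), while the initial mon keeps running.
for i in 1 2 3; do ceph osd df tree; done
```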
History
#1 Updated by Paul Emmerich over 6 years ago
This is fixed in 12.2.2, thanks!
#2 Updated by Nathan Cutler over 6 years ago
- Status changed from New to Resolved