Bug #7648
ceph-mon corner case denial of service
Status: closed
% Done: 90%
Description
An OSD that is improperly positioned in the CRUSH map may cause the mon process to stop responding, consume all virtual memory, and be killed by the OOM killer. For example:
$ ceph osd tree
# id    weight  type name       up/down reweight
-1      1.08    root default
-2      1.08            host hosta
1       1.08                    osd.1   up      1
0       0       osd.0   down    0
^^^ osd.0 is not under root default/host (e.g., due to botched ceph-disk-prepare)
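The misplacement above can be checked mechanically. The following is a minimal sketch, assuming the tree has already been parsed into a hypothetical `children` mapping (bucket id to child ids) and a list of known OSD ids. Neither structure is a Ceph API; both are invented here for illustration.

```python
def unplaced_osds(children, osd_ids):
    """Return the OSD ids that are not a child of any bucket."""
    placed = set()
    for kids in children.values():
        placed.update(kids)
    return sorted(i for i in osd_ids if i not in placed)


# Hypothetical encoding of the tree shown above:
# root default (-1) -> host hosta (-2) -> osd.1; osd.0 has no parent.
children = {-1: [-2], -2: [1]}
print(unplaced_osds(children, [0, 1]))  # prints [0]
```

Any id reported by such a check is an OSD that exists but has no place in the hierarchy, which is the condition that triggers the hang described in this report.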
$ ceph -f json-pretty mon_status
{ "name": "2",
"rank": 2,
"state": "peon",
"election_epoch": 48,
"quorum": [
0,
1,
2],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 2,
"fsid": "XXXXXXXXXX",
"modified": "2014-03-06 17:15:40.485887",
"created": "2014-03-06 17:13:05.099978",
"mons": [
{ "rank": 0,
"name": "0",
"addr": "10.0.0.6:6789\/0"},
{ "rank": 1,
"name": "1",
"addr": "10.0.0.10:6789\/0"},
{ "rank": 2,
"name": "2",
"addr": "10.0.0.11:6789\/0"}]}}
^^^ all 3 monitors are up and healthy
$ ceph osd find 0
^^^ hangs. One of the monitors stops responding and runs at 100% CPU until its VIRT consumes all physical memory plus swap; it is then OOM-killed.
$ sudo ceph --admin-daemon=/var/run/ceph/ceph-mon.1.asok mon_status
{ "name": "1",
"rank": 1,
"state": "leader",
"election_epoch": 50,
"quorum": [
1,
2],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 2,
"fsid": "XXXXXXXXXXXX",
"modified": "2014-03-06 17:15:40.485887",
"created": "2014-03-06 17:13:05.099978",
"mons": [
{ "rank": 0,
"name": "0",
"addr": "10.0.0.6:6789\/0"},
{ "rank": 1,
"name": "1",
"addr": "10.0.0.10:6789\/0"},
{ "rank": 2,
"name": "2",
"addr": "10.0.0.11:6789\/0"}]}}
^^^ monitor id==0 is out; its core dump fills the filesystem (image > 200GB)
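The comparison made by eye above (quorum ranks versus the monmap) can be sketched in a few lines. This assumes only the `quorum`, `monmap.mons[].rank` and `monmap.mons[].name` fields visible in the output above; the helper name `mons_outside_quorum` and the trimmed sample are hypothetical.

```python
import json


def mons_outside_quorum(status_json):
    """Return names of monitors listed in the monmap but absent from quorum."""
    status = json.loads(status_json)
    in_quorum = set(status["quorum"])
    return [m["name"] for m in status["monmap"]["mons"]
            if m["rank"] not in in_quorum]


# Trimmed copy of the second mon_status output above (only the fields used).
sample = """
{ "name": "1", "rank": 1, "state": "leader", "quorum": [1, 2],
  "monmap": { "mons": [ {"rank": 0, "name": "0"},
                        {"rank": 1, "name": "1"},
                        {"rank": 2, "name": "2"} ] } }
"""
print(mons_outside_quorum(sample))  # prints ['0']
```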
from mon logs:
0> 2014-03-07 16:19:49.433591 7fb970a19700 -1 *** Caught signal (Aborted) **
in thread 7fb970a19700
ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
1: /usr/bin/ceph-mon() [0x8b0f22]
2: (()+0xf030) [0x7fb97609c030]
3: (gsignal()+0x35) [0x7fb9749bc475]
4: (abort()+0x180) [0x7fb9749bf6f0]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fb97521189d]
6: (()+0x63996) [0x7fb97520f996]
7: (()+0x639c3) [0x7fb97520f9c3]
8: (()+0x63bee) [0x7fb97520fbee]
9: (tc_new()+0x48e) [0x7fb9762e2aee]
10: (std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > >::_M_insert_aux(__gnu_cxx::__normal_iterator<std::pair<std::string, std::string>*, std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > >, std::pair<std::string, std::string> const&)+0x160) [0x7368c0]
11: (CrushWrapper::get_full_location_ordered(int, std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > >&)+0x4ce) [0x73537e]
12: (CrushWrapper::get_full_location(int)+0x5f) [0x7356bf]
13: (OSDMonitor::preprocess_command(MMonCommand*)+0xb80) [0x678f60]
14: (OSDMonitor::preprocess_query(PaxosServiceMessage*)+0x23b) [0x67c15b]
15: (PaxosService::dispatch(PaxosServiceMessage*)+0x5cc) [0x650abc]
16: (Monitor::handle_command(MMonCommand*)+0x9c8) [0x6147a8]
17: (Monitor::dispatch(MonSession*, Message*, bool)+0x3e2) [0x61d6a2]
18: (Monitor::_ms_dispatch(Message*)+0x1c6) [0x61db16]
19: (Monitor::ms_dispatch(Message*)+0x32) [0x63ba82]
20: (DispatchQueue::entry()+0x4eb) [0x88c3db]
21: (DispatchQueue::DispatchThread::entry()+0xd) [0x7c469d]
22: (()+0x6b50) [0x7fb976093b50]
23: (clone()+0x6d) [0x7fb974a64a7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
If one repeats `ceph osd find 0`, the next monitor goes down, and so on ...
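Frames 10 to 12 of the backtrace show an allocation failing while `CrushWrapper::get_full_location_ordered` appends to a vector, which is consistent with a walk up the hierarchy that never terminates for an item with no parent. The following is an illustrative sketch of a guarded walk, not the Ceph implementation and not the fix that was eventually merged; the `parent_of` and `names` mappings are hypothetical.

```python
def full_location(item, parent_of, names):
    """Walk from item towards the root, collecting (type, name) pairs.

    Stops as soon as an item has no parent, and raises if an id repeats,
    so the returned list cannot grow without bound.
    """
    path = []
    seen = {item}
    cur = item
    while cur in parent_of:
        cur = parent_of[cur]
        if cur in seen:
            raise ValueError("cycle in hierarchy at id %d" % cur)
        seen.add(cur)
        path.append(names[cur])
    return path


# Hypothetical encoding of the tree from the description: osd.0 has no parent.
parent_of = {1: -2, -2: -1}
names = {-2: ("host", "hosta"), -1: ("root", "default")}
print(full_location(1, parent_of, names))  # [('host', 'hosta'), ('root', 'default')]
print(full_location(0, parent_of, names))  # []
```

The point of the sketch is the termination condition: an unplaced item yields an empty location instead of an ever-growing one.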
Updated by Sage Weil about 10 years ago
- Status changed from New to Won't Fix
This only reproduces on osd.0; for other OSDs, both dumpling and emperor behave (with bogus output):
flab:src 04:32 PM $ ./ceph osd create
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
1
flab:src 04:32 PM $ ./ceph osd find 1
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
{ "osd": 1, "ip": ":\/0", "crush_location": { "": "", "host": "localhost", "rack": "localrack", "root": "default"}}
flab:src 04:32 PM $
on firefly, the function in question was rewritten and works as intended.
Updated by Sage Weil over 9 years ago
- Status changed from Won't Fix to In Progress
- Assignee set to Sage Weil
It reproduces for any OSD that exists but is not in the CRUSH map, it seems.
Updated by Sage Weil over 9 years ago
- Status changed from In Progress to Fix Under Review
- Assignee changed from Sage Weil to Loïc Dachary
Updated by Loïc Dachary over 9 years ago
- Status changed from Fix Under Review to Pending Backport
- % Done changed from 0 to 90
- Backport set to dumpling emperor
the backport needs to be on emperor also
Updated by Loïc Dachary over 9 years ago
- Status changed from Pending Backport to Fix Under Review
emperor backport https://github.com/ceph/ceph/pull/2585
Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Resolved
- Backport changed from dumpling emperor to dumpling, emperor