Actions
Bug #24162
closedcontrol-c on ceph cli leads to segv
Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
root@mira055:~# ceph -s -m mira055 ^CCluster connection interrupted or timed out /build/ceph-13.1.0-119-g1df1443/src/msg/async/Event.cc: In function 'EventCenter::~EventCenter()' thread 7fb210ff9700 time 2018-05-17 15:44:36.522510 /build/ceph-13.1.0-119-g1df1443/src/msg/async/Event.cc: 174: FAILED assert(time_events.empty()) ceph version 13.1.0-119-g1df1443 (1df1443c1cf67ee58486067bfaf0a5c5cb08d325) mimic (rc) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fb21e187132] 2: (()+0x2e42f7) [0x7fb21e1872f7] 3: (EventCenter::~EventCenter()+0x1de) [0x7fb21e2c0eee] 4: (PosixWorker::~PosixWorker()+0x5d) [0x7fb21e2c6ded] 5: (PosixNetworkStack::~PosixNetworkStack()+0x86) [0x7fb21e2c6f66] 6: (void ceph::_any::op_func<StackSingleton>(ceph::_any::op, void*)+0x81) [0x7fb21e2bf311] 7: (std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::_Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >, CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >*)+0x33) [0x7fb21e387fd3] 8: (CephContext::~CephContext()+0x1d) [0x7fb21e3838cd] 9: (CephContext::put()+0x174) [0x7fb21e383ff4] 10: (librados::RadosClient::~RadosClient()+0x1e4) [0x7fb22727c5e4] 11: (librados::RadosClient::~RadosClient()+0x9) [0x7fb22727c639] 12: (rados_shutdown()+0x2d) [0x7fb227228d2d] 13: (()+0x18894) [0x7fb22753f894] 14: (PyEval_EvalFrameEx()+0x729e) [0x4c30ce] 15: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6] 16: (PyEval_EvalFrameEx()+0x603f) [0x4c1e6f] 17: (PyEval_EvalFrameEx()+0x553f) [0x4c136f] 18: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6] 19: /usr/bin/python2.7() [0x4d54b9] 20: /usr/bin/python2.7() [0x4eebee] 21: (PyObject_Call()+0x3e) [0x4a577e] 22: (PyEval_CallObjectWithKeywords()+0x30) [0x4c5e10] 23: /usr/bin/python2.7() [0x589172] 24: (()+0x76ba) [0x7fb228b906ba] 25: (clone()+0x6d) [0x7fb2288c641d]
Updated by Sage Weil almost 6 years ago
root@reesi006:~# ceph -s -m mira056 --debug-ms 20 2018-05-18 19:18:25.616 7f5fe66bd700 2 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).set_owner idx=0 owner=140049864447744 2018-05-18 19:18:25.616 7f5fe66bd700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).create_file_event create event started fd=5 mask=1 original mask is 0 2018-05-18 19:18:25.616 7f5fe66bd700 20 EpollDriver.add_event add event fd=5 cur_mask=0 add_mask=1 to 4 2018-05-18 19:18:25.616 7f5fe66bd700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).create_file_event create event end fd=5 mask=1 original mask is 1 2018-05-18 19:18:25.616 7f5fe66bd700 10 stack operator() starting 2018-05-18 19:18:25.616 7f5fe56bb700 2 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).set_owner idx=2 owner=140049847662336 2018-05-18 19:18:25.616 7f5fe56bb700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).create_file_event create event started fd=12 mask=1 original mask is 0 2018-05-18 19:18:25.616 7f5fe5ebc700 2 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).set_owner idx=1 owner=140049856055040 2018-05-18 19:18:25.616 7f5fe56bb700 20 EpollDriver.add_event add event fd=12 cur_mask=0 add_mask=1 to 11 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=9 mask=1 original mask is 0 2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=9 cur_mask=0 add_mask=1 to 7 2018-05-18 19:18:25.616 7f5fe56bb700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).create_file_event create event end fd=12 mask=1 original mask is 1 2018-05-18 19:18:25.616 7f5fe56bb700 10 stack operator() starting 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=9 mask=1 original mask is 1 2018-05-18 19:18:25.616 7f5fe5ebc700 10 stack operator() starting 2018-05-18 19:18:25.616 7f5fe8276700 10 -- - ready - 2018-05-18 19:18:25.616 7f5fe8276700 1 Processor -- start 2018-05-18 19:18:25.616 7f5fe8276700 1 -- - start start 2018-05-18 19:18:25.616 7f5fe8276700 10 -- - create_connect 172.21.6.138:6789/0, creating connection and registering 2018-05-18 19:18:25.616 7f5fe8276700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_NONE pgs=0 cs=0 l=0)._connect csq=0 2018-05-18 19:18:25.616 7f5fe8276700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).wakeup 2018-05-18 19:18:25.616 7f5fe8276700 10 -- - get_connection mon.0 172.21.6.138:6789/0 new 0x7f5fe0142d10 2018-05-18 19:18:25.616 7f5fe8276700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).send_keepalive 2018-05-18 19:18:25.616 7f5fe8276700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).wakeup 2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:25.616 7f5fe8276700 1 -- - --> 172.21.6.138:6789/0 -- auth(proto 0 34 bytes epoch 0) v1 -- 0x7f5fe0146ef0 con 0 2018-05-18 19:18:25.616 7f5fe8276700 15 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).send_message inline write is denied, reschedule m=0x7f5fe0146ef0 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=14 mask=1 original mask is 0 2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=14 mask=1 original mask is 1 2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:25.616 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress 2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=14 mask=2 original mask is 1 2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7 2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=14 mask=2 original mask is 3 2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).handle_write 2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).handle_write 2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE 2018-05-18 19:18:25.616 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused 2018-05-18 19:18:25.616 7f5fe5ebc700 1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 2018-05-18 19:18:25.616 7f5fe5ebc700 2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused! 2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7 2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.200000 2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event started fd=14 mask=1 original mask is 0 2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7 2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event end fd=14 mask=1 original mask is 1 2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:25.816 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress 2018-05-18 19:18:25.816 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress 2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event started fd=14 mask=2 original mask is 1 2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7 2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event end fd=14 mask=2 original mask is 3 2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE 2018-05-18 19:18:25.816 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused 2018-05-18 19:18:25.816 7f5fe5ebc700 1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 2018-05-18 19:18:25.816 7f5fe5ebc700 2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused! 2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7 2018-05-18 19:18:25.816 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.400000 2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event started fd=14 mask=1 original mask is 0 2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7 2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event end fd=14 mask=1 original mask is 1 2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING 2018-05-18 19:18:26.216 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress 2018-05-18 19:18:26.216 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress 2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event started fd=14 mask=2 original mask is 1 2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7 2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event end fd=14 mask=2 original mask is 3 2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE 2018-05-18 19:18:26.216 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused 2018-05-18 19:18:26.216 7f5fe5ebc700 1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 2018-05-18 19:18:26.216 7f5fe5ebc700 2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused! 2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7 2018-05-18 19:18:26.216 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.800000 ^CCluster connection interrupted or timed out 2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).wakeup 2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=4).wakeup 2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).wakeup /build/ceph-13.1.0-129-gd3a3371/src/msg/async/Event.cc: In function 'EventCenter::~EventCenter()' thread 7f5fce7fc700 time 2018-05-18 19:18:26.538558 /build/ceph-13.1.0-129-gd3a3371/src/msg/async/Event.cc: 174: FAILED assert(time_events.empty()) ceph version 13.1.0-129-gd3a3371 (d3a3371798df3316c173c967bc5300119f20cd22) mimic (rc) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f5febca1132] 2: (()+0x2e42f7) [0x7f5febca12f7] 3: (EventCenter::~EventCenter()+0x1de) [0x7f5febddaeee] 4: (PosixWorker::~PosixWorker()+0x5d) [0x7f5febde0ded] 5: (PosixNetworkStack::~PosixNetworkStack()+0x86) [0x7f5febde0f66] 6: (void ceph::_any::op_func<StackSingleton>(ceph::_any::op, void*)+0x81) [0x7f5febdd9311] 7: (std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::_Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >, CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >*)+0x33) [0x7f5febea1fd3] 8: (CephContext::~CephContext()+0x1d) [0x7f5febe9d8cd] 9: (CephContext::put()+0x174) [0x7f5febe9dff4] 10: (librados::RadosClient::~RadosClient()+0x1e4) [0x7f5ff4d965e4] 11: (librados::RadosClient::~RadosClient()+0x9) [0x7f5ff4d96639] 12: (rados_shutdown()+0x2d) [0x7f5ff4d42d2d] 13: (()+0x18894) [0x7f5ff5059894] 14: (PyEval_EvalFrameEx()+0x729e) [0x4c30ce] 15: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6] 16: (PyEval_EvalFrameEx()+0x603f) [0x4c1e6f] 17: (PyEval_EvalFrameEx()+0x553f) [0x4c136f] 18: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6] 19: /usr/bin/python2.7() [0x4d54b9] 20: /usr/bin/python2.7() [0x4eebee] 21: (PyObject_Call()+0x3e) [0x4a577e] 22: (PyEval_CallObjectWithKeywords()+0x30) [0x4c5e10] 23: /usr/bin/python2.7() [0x589172] 24: (()+0x76ba) [0x7f5ff66aa6ba] 25: (clone()+0x6d) [0x7f5ff63e041d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Aborted
Updated by Sage Weil almost 6 years ago
- Status changed from 12 to Fix Under Review
hacky workaround: https://github.com/ceph/ceph/pull/22093
Updated by Sage Weil almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to mimic
mimic backport https://github.com/ceph/ceph/pull/22225
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24294: mimic: control-c on ceph cli leads to segv added
Updated by Sage Weil almost 6 years ago
- Status changed from Pending Backport to Resolved
Actions