Project

General

Profile

Actions

Bug #24162

closed

control-c on ceph cli leads to segv

Added by Sage Weil almost 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

root@mira055:~# ceph -s -m mira055
^CCluster connection interrupted or timed out
/build/ceph-13.1.0-119-g1df1443/src/msg/async/Event.cc: In function 'EventCenter::~EventCenter()' thread 7fb210ff9700 time 2018-05-17 15:44:36.522510
/build/ceph-13.1.0-119-g1df1443/src/msg/async/Event.cc: 174: FAILED assert(time_events.empty())
 ceph version 13.1.0-119-g1df1443 (1df1443c1cf67ee58486067bfaf0a5c5cb08d325) mimic (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fb21e187132]
 2: (()+0x2e42f7) [0x7fb21e1872f7]
 3: (EventCenter::~EventCenter()+0x1de) [0x7fb21e2c0eee]
 4: (PosixWorker::~PosixWorker()+0x5d) [0x7fb21e2c6ded]
 5: (PosixNetworkStack::~PosixNetworkStack()+0x86) [0x7fb21e2c6f66]
 6: (void ceph::_any::op_func<StackSingleton>(ceph::_any::op, void*)+0x81) [0x7fb21e2bf311]
 7: (std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::_Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >, CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >*)+0x33) [0x7fb21e387fd3]
 8: (CephContext::~CephContext()+0x1d) [0x7fb21e3838cd]
 9: (CephContext::put()+0x174) [0x7fb21e383ff4]
 10: (librados::RadosClient::~RadosClient()+0x1e4) [0x7fb22727c5e4]
 11: (librados::RadosClient::~RadosClient()+0x9) [0x7fb22727c639]
 12: (rados_shutdown()+0x2d) [0x7fb227228d2d]
 13: (()+0x18894) [0x7fb22753f894]
 14: (PyEval_EvalFrameEx()+0x729e) [0x4c30ce]
 15: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6]
 16: (PyEval_EvalFrameEx()+0x603f) [0x4c1e6f]
 17: (PyEval_EvalFrameEx()+0x553f) [0x4c136f]
 18: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6]
 19: /usr/bin/python2.7() [0x4d54b9]
 20: /usr/bin/python2.7() [0x4eebee]
 21: (PyObject_Call()+0x3e) [0x4a577e]
 22: (PyEval_CallObjectWithKeywords()+0x30) [0x4c5e10]
 23: /usr/bin/python2.7() [0x589172]
 24: (()+0x76ba) [0x7fb228b906ba]
 25: (clone()+0x6d) [0x7fb2288c641d]

Related issues 1 (0 open1 closed)

Copied to RADOS - Backport #24294: mimic: control-c on ceph cli leads to segvResolvedSage WeilActions
Actions #1

Updated by Sage Weil almost 6 years ago

root@reesi006:~# ceph -s -m mira056 --debug-ms 20
2018-05-18 19:18:25.616 7f5fe66bd700  2 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).set_owner idx=0 owner=140049864447744
2018-05-18 19:18:25.616 7f5fe66bd700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).create_file_event create event started fd=5 mask=1 original mask is 0
2018-05-18 19:18:25.616 7f5fe66bd700 20 EpollDriver.add_event add event fd=5 cur_mask=0 add_mask=1 to 4
2018-05-18 19:18:25.616 7f5fe66bd700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).create_file_event create event end fd=5 mask=1 original mask is 1
2018-05-18 19:18:25.616 7f5fe66bd700 10 stack operator() starting
2018-05-18 19:18:25.616 7f5fe56bb700  2 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).set_owner idx=2 owner=140049847662336
2018-05-18 19:18:25.616 7f5fe56bb700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).create_file_event create event started fd=12 mask=1 original mask is 0
2018-05-18 19:18:25.616 7f5fe5ebc700  2 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).set_owner idx=1 owner=140049856055040
2018-05-18 19:18:25.616 7f5fe56bb700 20 EpollDriver.add_event add event fd=12 cur_mask=0 add_mask=1 to 11
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=9 mask=1 original mask is 0
2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=9 cur_mask=0 add_mask=1 to 7
2018-05-18 19:18:25.616 7f5fe56bb700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).create_file_event create event end fd=12 mask=1 original mask is 1
2018-05-18 19:18:25.616 7f5fe56bb700 10 stack operator() starting
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=9 mask=1 original mask is 1
2018-05-18 19:18:25.616 7f5fe5ebc700 10 stack operator() starting
2018-05-18 19:18:25.616 7f5fe8276700 10 -- - ready -
2018-05-18 19:18:25.616 7f5fe8276700  1  Processor -- start
2018-05-18 19:18:25.616 7f5fe8276700  1 -- - start start
2018-05-18 19:18:25.616 7f5fe8276700 10 -- - create_connect 172.21.6.138:6789/0, creating connection and registering
2018-05-18 19:18:25.616 7f5fe8276700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_NONE pgs=0 cs=0 l=0)._connect csq=0
2018-05-18 19:18:25.616 7f5fe8276700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).wakeup
2018-05-18 19:18:25.616 7f5fe8276700 10 -- - get_connection mon.0 172.21.6.138:6789/0 new 0x7f5fe0142d10
2018-05-18 19:18:25.616 7f5fe8276700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).send_keepalive
2018-05-18 19:18:25.616 7f5fe8276700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).wakeup
2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:25.616 7f5fe8276700  1 -- - --> 172.21.6.138:6789/0 -- auth(proto 0 34 bytes epoch 0) v1 -- 0x7f5fe0146ef0 con 0
2018-05-18 19:18:25.616 7f5fe8276700 15 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).send_message inline write is denied, reschedule m=0x7f5fe0146ef0
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=14 mask=1 original mask is 0
2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=14 mask=1 original mask is 1
2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:25.616 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress
2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event started fd=14 mask=2 original mask is 1
2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7
2018-05-18 19:18:25.616 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=1).create_file_event create event end fd=14 mask=2 original mask is 3
2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).handle_write
2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).handle_write
2018-05-18 19:18:25.616 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE
2018-05-18 19:18:25.616 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused
2018-05-18 19:18:25.616 7f5fe5ebc700  1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 
2018-05-18 19:18:25.616 7f5fe5ebc700  2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused!
2018-05-18 19:18:25.616 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7
2018-05-18 19:18:25.616 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.200000
2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event started fd=14 mask=1 original mask is 0
2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7
2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event end fd=14 mask=1 original mask is 1
2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:25.816 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress
2018-05-18 19:18:25.816 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress
2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event started fd=14 mask=2 original mask is 1
2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7
2018-05-18 19:18:25.816 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=2).create_file_event create event end fd=14 mask=2 original mask is 3
2018-05-18 19:18:25.816 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE
2018-05-18 19:18:25.816 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused
2018-05-18 19:18:25.816 7f5fe5ebc700  1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 
2018-05-18 19:18:25.816 7f5fe5ebc700  2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused!
2018-05-18 19:18:25.816 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7
2018-05-18 19:18:25.816 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.400000
2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event started fd=14 mask=1 original mask is 0
2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=0 add_mask=1 to 7
2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event end fd=14 mask=1 original mask is 1
2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING
2018-05-18 19:18:26.216 7f5fe5ebc700 10 NetHandler reconnect reconnect: Operation already in progress
2018-05-18 19:18:26.216 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection nonblock connect inprogress
2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event started fd=14 mask=2 original mask is 1
2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.add_event add event fd=14 cur_mask=1 add_mask=2 to 7
2018-05-18 19:18:26.216 7f5fe5ebc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=3).create_file_event create event end fd=14 mask=2 original mask is 3
2018-05-18 19:18:26.216 7f5fe5ebc700 20 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0).process prev state is STATE_CONNECTING_RE
2018-05-18 19:18:26.216 7f5fe5ebc700 10 NetHandler reconnect reconnect: Connection refused
2018-05-18 19:18:26.216 7f5fe5ebc700  1 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed 
2018-05-18 19:18:26.216 7f5fe5ebc700  2 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused!
2018-05-18 19:18:26.216 7f5fe5ebc700 20 EpollDriver.del_event del event fd=14 cur_mask=3 delmask=3 to 7
2018-05-18 19:18:26.216 7f5fe5ebc700 10 -- - >> 172.21.6.138:6789/0 conn(0x7f5fe0142d10 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).fault waiting 0.800000
^CCluster connection interrupted or timed out
2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe00b5c00 nevent=5000 time_id=1).wakeup
2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe00e2a40 nevent=5000 time_id=4).wakeup
2018-05-18 19:18:26.532 7f5fce7fc700 20 Event(0x7f5fe010f8d0 nevent=5000 time_id=1).wakeup
/build/ceph-13.1.0-129-gd3a3371/src/msg/async/Event.cc: In function 'EventCenter::~EventCenter()' thread 7f5fce7fc700 time 2018-05-18 19:18:26.538558
/build/ceph-13.1.0-129-gd3a3371/src/msg/async/Event.cc: 174: FAILED assert(time_events.empty())
 ceph version 13.1.0-129-gd3a3371 (d3a3371798df3316c173c967bc5300119f20cd22) mimic (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f5febca1132]
 2: (()+0x2e42f7) [0x7f5febca12f7]
 3: (EventCenter::~EventCenter()+0x1de) [0x7f5febddaeee]
 4: (PosixWorker::~PosixWorker()+0x5d) [0x7f5febde0ded]
 5: (PosixNetworkStack::~PosixNetworkStack()+0x86) [0x7f5febde0f66]
 6: (void ceph::_any::op_func<StackSingleton>(ceph::_any::op, void*)+0x81) [0x7f5febdd9311]
 7: (std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::_Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >, CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >*)+0x33) [0x7f5febea1fd3]
 8: (CephContext::~CephContext()+0x1d) [0x7f5febe9d8cd]
 9: (CephContext::put()+0x174) [0x7f5febe9dff4]
 10: (librados::RadosClient::~RadosClient()+0x1e4) [0x7f5ff4d965e4]
 11: (librados::RadosClient::~RadosClient()+0x9) [0x7f5ff4d96639]
 12: (rados_shutdown()+0x2d) [0x7f5ff4d42d2d]
 13: (()+0x18894) [0x7f5ff5059894]
 14: (PyEval_EvalFrameEx()+0x729e) [0x4c30ce]
 15: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6]
 16: (PyEval_EvalFrameEx()+0x603f) [0x4c1e6f]
 17: (PyEval_EvalFrameEx()+0x553f) [0x4c136f]
 18: (PyEval_EvalCodeEx()+0x306) [0x4b9ab6]
 19: /usr/bin/python2.7() [0x4d54b9]
 20: /usr/bin/python2.7() [0x4eebee]
 21: (PyObject_Call()+0x3e) [0x4a577e]
 22: (PyEval_CallObjectWithKeywords()+0x30) [0x4c5e10]
 23: /usr/bin/python2.7() [0x589172]
 24: (()+0x76ba) [0x7f5ff66aa6ba]
 25: (clone()+0x6d) [0x7f5ff63e041d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted
Actions #2

Updated by Sage Weil almost 6 years ago

  • Status changed from 12 to Fix Under Review
Actions #3

Updated by Sage Weil almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to mimic
Actions #4

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24294: mimic: control-c on ceph cli leads to segv added
Actions #5

Updated by Sage Weil almost 6 years ago

  • Status changed from Pending Backport to Resolved
Actions

Also available in: Atom PDF