Bug #55351

ceph-mon crashes in handle_forward when a new message type is added

Added by RenCheng Yang 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The call backtrace is below:

#0 0x00007ff1c10e94ab in raise () from /lib64/libpthread.so.0
#1 0x000056214e54df9a in reraise_fatal (signum=6) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/global/signal_handler.cc:74
#2 handle_fatal_signal (signum=6) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/global/signal_handler.cc:138
#3 <signal handler called>
#4 0x00007ff1bc8391f7 in raise () from /lib64/libc.so.6
#5 0x00007ff1bc83a8e8 in abort () from /lib64/libc.so.6
#6 0x000056214e1d11b3 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x56214e72a6f3 "msg", file=file@entry=0x56214e7284f8 "/root/rpmbuild/BUILD/ceph-12.2.7-5087-ga119cbe/src/messages/MForward.h",
line=line@entry=100, func=func@entry=0x56214e72c520 <MForward::claim_message()::__PRETTY_FUNCTION__> "PaxosServiceMessage* MForward::claim_message()")
at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/common/assert.cc:66
#7 0x000056214defacc7 in claim_message (this=0x5621592ff180) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/messages/MForward.h:100
#8 Monitor::handle_forward (this=this@entry=0x562159223400, op=...) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/mon/Monitor.cc:3931
#9 0x000056214def77e0 in Monitor::dispatch_op (this=this@entry=0x562159223400, op=...) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/mon/Monitor.cc:4562
#10 0x000056214def8833 in Monitor::_ms_dispatch (this=this@entry=0x562159223400, m=m@entry=0x5621592ff180) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/mon/Monitor.cc:4311
#11 0x000056214df25fe3 in Monitor::ms_dispatch (this=0x562159223400, m=0x5621592ff180) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/mon/Monitor.h:908
#12 0x000056214e4deaeb in ms_deliver_dispatch (m=0x5621592ff180, this=0x5621591ae700) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/msg/Messenger.h:668
#13 DispatchQueue::entry (this=0x5621591ae858) at /usr/src/debug/ceph-12.2.7-5087-ga119cbe/src/msg/DispatchQueue.cc:197
#14 0x000056214e28296d in DispatchQueue::DispatchThread::entry (this=<optimize

//-----------error log--------------
2022-04-17 10:36:21.611993 - info ceph-mon can't decode unknown message type 2057 MSG_AUTH=17
2022-04-17 10:36:21.612015 - info ceph-mon can't decode unknown message type 2057 MSG_AUTH=17
2022-04-17 10:36:21.618800 - info ceph-mon /yrc_docker/ceph/src/messages/MForward.h: In function 'PaxosServiceMessage* MForward::claim_message()' thread 7f998116d700 time 2022-04-17 10:36:21.617292
2022-04-17 10:36:21.618977 - info ceph-mon /yrc_docker/ceph/src/messages/MForward.h: 100: FAILED assert(msg)
2022-04-17 10:36:21.619846 - info ceph-mon ceph version (a119cbebbae3cb27992fcb5f1b70e1d8b9f57d77) luminous (stable)
2022-04-17 10:36:21.619902 - info ceph-mon 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x114) [0x56211833a674]
2022-04-17 10:36:21.619934 - info ceph-mon 2: (()+0x430907) [0x562118063907]
2022-04-17 10:36:21.619970 - info ceph-mon 3: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x1230) [0x562118060420]
2022-04-17 10:36:21.620030 - info ceph-mon 4: (Monitor::_ms_dispatch(Message*)+0xb03) [0x562118061473]
2022-04-17 10:36:21.620080 - info ceph-mon 5: (Monitor::ms_dispatch(Message*)+0x23) [0x56211808ec23]
2022-04-17 10:36:21.620135 - info ceph-mon 6: (DispatchQueue::entry()+0x10eb) [0x5621186481bb]
2022-04-17 10:36:21.620189 - info ceph-mon 7: (DispatchQueue::DispatchThread::entry()+0xd) [0x5621183ec01d]
2022-04-17 10:36:21.620260 - info ceph-mon 8: (()+0x7e25) [0x7f998ec52e25]
2022-04-17 10:36:21.620351 - info ceph-mon 9: (clone()+0x6d) [0x7f998a46d34d]
2022-04-17 10:36:21.620414 - info ceph-mon NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2022-04-17 10:36:21.620474 - info ceph-mon 2022-04-17 10:36:21.619673 7f998116d700 -1 assert.cc:58 /ceph/src/messages/MForward.h: In function 'PaxosServiceMessage* MForward::claim_message()' thread 7f998116d700 time 2022-04-17 10:36:21.617292
2022-04-17 10:36:21.620530 - info ceph-mon

//-------------desc-------------
I added a new message type in version v2; the older version is v1. I have three Ceph nodes, A, B, and C, and B is the leader.
I then upgraded node A to v2, and a client sent the new message to A. Only the leader can handle this message, so A forwarded
it to B. B, still on v1, could not decode the new type and crashed. We use Ceph 12.2.7, but I checked the master code and it looks the same.
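
A minimal, self-contained sketch of the failure as I read it from the log and backtrace above: the leader cannot decode the forwarded inner message, so the forward wrapper ends up holding a null pointer, and claim_message() then trips assert(msg). Everything here is a simplified stand-in, not the real Ceph code: Message, ForwardEnvelope, decode_message, handle_forward_guarded, forward_to_leader and ForwardResult are hypothetical names, and the null check before claiming is only one possible way to avoid the abort, not a reviewed fix.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>
#include <utility>

// Hypothetical stand-in for a decoded message; not the real Ceph Message class.
struct Message {
    uint16_t type;
};

// Hypothetical decoder: returns nullptr when this build does not know the
// type, which is what the "can't decode unknown message type" log suggests.
std::unique_ptr<Message> decode_message(uint16_t type,
                                        const std::set<uint16_t>& known_types) {
    if (known_types.count(type) == 0) {
        return nullptr;
    }
    return std::unique_ptr<Message>(new Message{type});
}

// Hypothetical stand-in for the forward wrapper that carries the inner message.
struct ForwardEnvelope {
    std::unique_ptr<Message> msg;  // null if the inner message failed to decode

    bool has_message() const { return msg != nullptr; }

    // Mirrors the assert(msg) reported at MForward.h:100 in the log.
    std::unique_ptr<Message> claim_message() {
        assert(msg);
        return std::move(msg);
    }
};

enum class ForwardResult { Dispatched, DroppedUndecodable };

// Assumed mitigation: check for a null inner message before claiming it,
// and drop the forward instead of aborting the whole daemon.
ForwardResult handle_forward_guarded(ForwardEnvelope& fwd) {
    if (!fwd.has_message()) {
        return ForwardResult::DroppedUndecodable;
    }
    std::unique_ptr<Message> inner = fwd.claim_message();
    (void)inner;  // a real handler would dispatch the inner message here
    return ForwardResult::Dispatched;
}

// Models node A forwarding a message of the given type to leader B, where
// leader_known_types is the set of types B's (older) build can decode.
ForwardResult forward_to_leader(uint16_t type,
                                const std::set<uint16_t>& leader_known_types) {
    ForwardEnvelope fwd{decode_message(type, leader_known_types)};
    return handle_forward_guarded(fwd);
}
```

With a v1 leader that only knows type 17 (MSG_AUTH in the log), forwarding type 2057 takes the DroppedUndecodable path instead of reaching the assert. Whether dropping the message, replying with an error, or gating the new type until all monitors are upgraded is the right behavior is a separate question.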
