Bug #24304

closed

MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade

Added by John Spray almost 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Correctness/Safety
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

May 25 12:21:06 magna044.ceph.redhat.com ceph-mon[30366]: 2018-05-25 12:21:06.540921 7f8a87f49ec0 -1 mon.magna044@-1(probing).mgrstat failed to decode mgrstat state; luminous dev version?
May 25 12:21:06 magna044.ceph.redhat.com ceph-mon[30366]: 2018-05-25 12:21:06.754346 7f8a79d83700 -1 mon.magna044@0(leader).mgrstat failed to decode mgrstat state; luminous dev version?
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: terminate called after throwing an instance of 'ceph::buffer::malformed_input'
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: what():  buffer::malformed_input: void object_stat_sum_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: *** Caught signal (Aborted) **
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: in thread 7f8a7e58c700 thread_name:ms_dispatch
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: ceph version 12.2.5-15.el7cp (8af5074c84901971d2c7807ba8270b44b5fbc09b) luminous (stable)
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 1: (()+0x8f6621) [0x55d7416b2621]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 2: (()+0xf680) [0x7f8a872fa680]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 3: (gsignal()+0x37) [0x7f8a84635207]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 4: (abort()+0x148) [0x7f8a846368f8]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8a84f447d5]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 6: (()+0x5e746) [0x7f8a84f42746]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 7: (()+0x5e773) [0x7f8a84f42773]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 8: (()+0x5e993) [0x7f8a84f42993]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 9: (object_stat_sum_t::decode(ceph::buffer::list::iterator&)+0x587) [0x55d7414c3117]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x67) [0x55d7414df317]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 11: (pool_stat_t::decode(ceph::buffer::list::iterator&)+0x5a) [0x55d7414dfeaa]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 12: (PGMapDigest::decode(ceph::buffer::list::iterator&)+0x1cc) [0x55d7412337ec]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 13: (MgrStatMonitor::prepare_report(boost::intrusive_ptr<MonOpRequest>)+0x72) [0x55d7413699b2]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 14: (MgrStatMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)+0xbf) [0x55d741369def]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 15: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0xaf8) [0x55d74129dc28]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 16: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x51f) [0x55d74117dc7f]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 17: (Monitor::_ms_dispatch(Message*)+0x7eb) [0x55d74117f2fb]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 18: (Monitor::ms_dispatch(Message*)+0x23) [0x55d7411ab463]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 19: (DispatchQueue::entry()+0x792) [0x55d74165d9c2]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 20: (DispatchQueue::DispatchThread::entry()+0xd) [0x55d741455ccd]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 21: (()+0x7dd5) [0x7f8a872f2dd5]
May 25 12:21:08 magna044.ceph.redhat.com ceph-mon[30366]: 22: (clone()+0x6d) [0x7f8a846fdb3d]
Actions #1

Updated by John Spray almost 6 years ago

  • Status changed from New to Closed

This appears to be specific to a downstream build; closing.

Actions #2

Updated by Josh Durgin almost 6 years ago

  • Project changed from mgr to RADOS
  • Category changed from MgrMonitor to Correctness/Safety
  • Status changed from Closed to Fix Under Review
  • Assignee set to Josh Durgin

This is due to the fast-path decoding for object_stat_sum_t not being updated in the backport. Fix: https://github.com/ceph/ceph/pull/22253
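
To illustrate the failure mode, here is a minimal sketch of how a stale fast-path version check can run off the end of an older encoding. It is not the Ceph code: the `Stats` struct, the `encode`/`decode` helpers, the one-byte version header, and the `fast_path_min_version` parameter are all hypothetical, and the layout is host-endian for simplicity. The assumption is that a field was added to the struct while the bulk-copy fast path kept accepting the older, shorter encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stats struct: field `c` was added in encoding version 2.
struct Stats {
  std::uint64_t a = 0;
  std::uint64_t b = 0;
  std::uint64_t c = 0;  // new in v2
};

// Encode as [version byte][payload]; a v1 payload carries only a and b.
std::vector<std::uint8_t> encode(const Stats& s, std::uint8_t version) {
  const std::size_t n = (version >= 2 ? 3 : 2) * sizeof(std::uint64_t);
  std::vector<std::uint8_t> out(1 + n);
  out[0] = version;
  std::memcpy(out.data() + 1, &s, n);  // fields are contiguous, no padding
  return out;
}

// Decode, using a bulk-copy fast path when version >= fast_path_min_version.
// Throws if a read would go past the end of the payload.
Stats decode(const std::vector<std::uint8_t>& in,
             std::uint8_t fast_path_min_version) {
  if (in.empty()) throw std::runtime_error("empty input");
  const std::uint8_t version = in[0];
  const std::uint8_t* p = in.data() + 1;
  const std::size_t len = in.size() - 1;
  const std::size_t w = sizeof(std::uint64_t);

  Stats s;
  if (version >= fast_path_min_version) {
    // Fast path: copies the whole *current* struct in one go. This is only
    // valid if the encoder wrote every field the struct has today.
    if (len < sizeof(Stats))
      throw std::runtime_error("decode past end of struct encoding");
    std::memcpy(&s, p, sizeof(Stats));
    return s;
  }

  // Slow path: read field by field, only what this version actually wrote.
  if (len < 2 * w)
    throw std::runtime_error("decode past end of struct encoding");
  std::memcpy(&s.a, p, w);
  std::memcpy(&s.b, p + w, w);
  if (version >= 2) {
    if (len < 3 * w)
      throw std::runtime_error("decode past end of struct encoding");
    std::memcpy(&s.c, p + 2 * w, w);
  }
  return s;
}
```

In this sketch, calling `decode(blob, 1)` on a v1 blob models the stale check: the fast path tries to copy three fields from a two-field payload and throws. Calling `decode(blob, 2)` models the updated check: the v1 blob falls through to the field-by-field path and `c` keeps its default value.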

Actions #3

Updated by Josh Durgin almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to mimic, luminous
Actions #4

Updated by Josh Durgin almost 6 years ago

  • Status changed from Pending Backport to Fix Under Review
  • Backport deleted (mimic, luminous)

Reverting the previous update; it was applied to the wrong bug.

Actions #5

Updated by Josh Durgin over 5 years ago

  • Status changed from Fix Under Review to Resolved