Bug #42570
closed
mgr: qa: upgrade mimic-master "src/osd/osd_types.h: 2313: FAILED ceph_assert(pos <= end)"
Description
2019-10-30T19:19:31.669 INFO:tasks.ceph.ceph_manager.ceph:need seq 38654705758 got 0 for osd.0
2019-10-30T19:19:32.213 INFO:tasks.ceph.mon.a.smithi201.stderr:2019-10-30T19:19:32.209+0000 7ff3a30a0700 -1 mon.a@0(leader).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.214 INFO:tasks.ceph.mon.c.smithi201.stderr:2019-10-30T19:19:32.209+0000 7fbee3b6f700 -1 mon.c@2(peon).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.215 INFO:tasks.ceph.mon.b.smithi201.stderr:2019-10-30T19:19:32.213+0000 7f8fe3ec1700 -1 mon.b@1(peon).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.225 INFO:tasks.ceph.mon.c.smithi201.stderr:2019-10-30T19:19:32.217+0000 7fbee3b6f700 -1 mon.c@2(peon).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.225 INFO:tasks.ceph.mon.a.smithi201.stderr:2019-10-30T19:19:32.217+0000 7ff3a30a0700 -1 mon.a@0(leader).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.225 INFO:tasks.ceph.mon.b.smithi201.stderr:2019-10-30T19:19:32.221+0000 7f8fe3ec1700 -1 mon.b@1(peon).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
2019-10-30T19:19:32.226 INFO:tasks.ceph.mgr.y.smithi201.stderr:/build/ceph-15.0.0-6608-gfb29d02/src/osd/osd_types.h: In function 'static void store_statfs_t::_denc_finish(ceph::buffer::v14_2_0::ptr::const_iterator&, __u8*, __u8*, char**, uint32_t*)' thread 7f0ebff38700 time 2019-10-30T19:19:32.223103+0000
2019-10-30T19:19:32.226 INFO:tasks.ceph.mgr.y.smithi201.stderr:/build/ceph-15.0.0-6608-gfb29d02/src/osd/osd_types.h: 2313: FAILED ceph_assert(pos <= end)
2019-10-30T19:19:32.234 INFO:tasks.ceph.mgr.y.smithi201.stderr:/build/ceph-15.0.0-6608-gfb29d02/src/osd/osd_types.h: In function 'static void store_statfs_t::_denc_finish(ceph::buffer::v14_2_0::ptr::const_iterator&, __u8*, __u8*, char**, uint32_t*)' thread 7f0ec0739700 time 2019-10-30T19:19:32.226306+0000
2019-10-30T19:19:32.234 INFO:tasks.ceph.mgr.y.smithi201.stderr:/build/ceph-15.0.0-6608-gfb29d02/src/osd/osd_types.h: 2313: FAILED ceph_assert(pos <= end)
2019-10-30T19:19:32.240 INFO:tasks.ceph.mgr.y.smithi201.stderr: ceph version 15.0.0-6608-gfb29d02 (fb29d023301243910681de545c877cc84d2ca724) octopus (dev)
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f0ec5791e02]
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f0ec5791fdd]
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 3: (std::enable_if<denc_traits<store_statfs_t, void>::supported&&denc_traits<store_statfs_t, void>::need_contiguous, void>::type ceph::decode<store_statfs_t, denc_traits<store_statfs_t, void> >(store_statfs_t&, ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x1d1) [0x7f0ec5bd92c1]
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 4: (osd_stat_t::decode(ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x4a6) [0x7f0ec5bbfc16]
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 5: (MPGStats::decode_payload()+0x6b) [0x7f0ec59e8e5b]
2019-10-30T19:19:32.241 INFO:tasks.ceph.mgr.y.smithi201.stderr: 6: (decode_message(CephContext*, int, ceph_msg_header&, ceph_msg_footer&, ceph::buffer::v14_2_0::list&, ceph::buffer::v14_2_0::list&, ceph::buffer::v14_2_0::list&, boost::intrusive_ptr<Connection>)+0x1097) [0x7f0ec59759b7]
2019-10-30T19:19:32.242 INFO:tasks.ceph.mgr.y.smithi201.stderr: 7: (ProtocolV1::handle_message_footer(char*, int)+0x129) [0x7f0ec5a17f39]
2019-10-30T19:19:32.242 INFO:tasks.ceph.mgr.y.smithi201.stderr: 8: (()+0x4eeb0d) [0x7f0ec5a10b0d]
2019-10-30T19:19:32.242 INFO:tasks.ceph.mgr.y.smithi201.stderr: 9: (AsyncConnection::process()+0x5fc) [0x7f0ec59ff85c]
2019-10-30T19:19:32.242 INFO:tasks.ceph.mgr.y.smithi201.stderr: 10: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x7dd) [0x7f0ec5a56b5d]
2019-10-30T19:19:32.242 INFO:tasks.ceph.mgr.y.smithi201.stderr: 11: (()+0x53c968) [0x7f0ec5a5e968]
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: 12: (()+0xbd66f) [0x7f0ec464666f]
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: 13: (()+0x76db) [0x7f0ec4b1d6db]
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: 14: (clone()+0x3f) [0x7f0ec3d0388f]
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: ceph version 15.0.0-6608-gfb29d02 (fb29d023301243910681de545c877cc84d2ca724) octopus (dev)
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f0ec5791e02]
2019-10-30T19:19:32.243 INFO:tasks.ceph.mgr.y.smithi201.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f0ec5791fdd]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 3: (std::enable_if<denc_traits<store_statfs_t, void>::supported&&denc_traits<store_statfs_t, void>::need_contiguous, void>::type ceph::decode<store_statfs_t, denc_traits<store_statfs_t, void> >(store_statfs_t&, ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x1d1) [0x7f0ec5bd92c1]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 4: (osd_stat_t::decode(ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x4a6) [0x7f0ec5bbfc16]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 5: (MPGStats::decode_payload()+0x6b) [0x7f0ec59e8e5b]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 6: (decode_message(CephContext*, int, ceph_msg_header&, ceph_msg_footer&, ceph::buffer::v14_2_0::list&, ceph::buffer::v14_2_0::list&, ceph::buffer::v14_2_0::list&, boost::intrusive_ptr<Connection>)+0x1097) [0x7f0ec59759b7]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 7: (ProtocolV1::handle_message_footer(char*, int)+0x129) [0x7f0ec5a17f39]
2019-10-30T19:19:32.244 INFO:tasks.ceph.mgr.y.smithi201.stderr: 8: (()+0x4eeb0d) [0x7f0ec5a10b0d]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 9: (AsyncConnection::process()+0x5fc) [0x7f0ec59ff85c]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 10: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x7dd) [0x7f0ec5a56b5d]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 11: (()+0x53c968) [0x7f0ec5a5e968]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 12: (()+0xbd66f) [0x7f0ec464666f]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 13: (()+0x76db) [0x7f0ec4b1d6db]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr: 14: (clone()+0x3f) [0x7f0ec3d0388f]
2019-10-30T19:19:32.245 INFO:tasks.ceph.mgr.y.smithi201.stderr:*** Caught signal (Aborted) **
2019-10-30T19:19:32.246 INFO:tasks.ceph.mgr.y.smithi201.stderr: in thread 7f0ebff38700 thread_name:msgr-worker-2
From: /ceph/teuthology-archive/pdonnell-2019-10-30_18:50:00-fs:upgrade-master-distro-basic-smithi/4456235/teuthology.log
Updated by Sage Weil over 4 years ago
db84d9ea8f3d1d46ba4cc3116aea052e8554261d from PR 30951 [1] is the problem. It adds ping times, which are encoded at osd_stat_t struct_v=14 in master but at struct_v=9 in mimic, so the osd_stat_t encoded by mimic looks like gibberish to nautilus and master.
[1] Errata: db84d9ea8f3d1d46ba4cc3116aea052e8554261d is in https://github.com/ceph/ceph/pull/30225
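To illustrate why a reused struct_v makes the payload undecodable, here is a toy sketch in Python. It is not Ceph's actual encoding: the field layout, the `statfs` block at version 9, and all function names are hypothetical. It only assumes a versioned struct with a length header, where the "mainline" decoder expects one set of fields at version 9 and the ping field at version 14, while a "backport" encoder appends the ping field under version 9.

```python
import struct


class EndOfBuffer(Exception):
    """Raised when a decoder would read past the end of the encoded struct."""


HEADER = struct.Struct("<BI")   # struct_v (u8) + body length (u32)
U32 = struct.Struct("<I")
STATFS = struct.Struct("<QQQ")  # hypothetical 3-field statfs block


def encode_mainline(num_pgs, statfs, ping_ms):
    # Hypothetical mainline layout: statfs since v9, ping time since v14.
    body = U32.pack(num_pgs) + STATFS.pack(*statfs) + U32.pack(ping_ms)
    return HEADER.pack(14, len(body)) + body


def encode_backport(num_pgs, ping_ms):
    # Hypothetical backport: appends the ping time but labels it version 9.
    body = U32.pack(num_pgs) + U32.pack(ping_ms)
    return HEADER.pack(9, len(body)) + body


def decode_mainline(buf):
    struct_v, length = HEADER.unpack_from(buf, 0)
    pos, end = HEADER.size, HEADER.size + length

    def read(fmt):
        nonlocal pos
        # Same idea as a "pos <= end" check: never read past the struct.
        if pos + fmt.size > end:
            raise EndOfBuffer(f"need {fmt.size} bytes, {end - pos} left")
        values = fmt.unpack_from(buf, pos)
        pos += fmt.size
        return values

    out = {"num_pgs": read(U32)[0]}
    if struct_v >= 9:
        out["statfs"] = read(STATFS)     # mainline's meaning of v9
    if struct_v >= 14:
        out["ping_ms"] = read(U32)[0]    # mainline's meaning of v14
    return out
```

Decoding `encode_mainline(...)` output round-trips, whereas `decode_mainline(encode_backport(7, 12))` raises `EndOfBuffer`: the decoder sees version 9, tries to read the fields it associates with that version, and runs off the end of the shorter body.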
Updated by Sage Weil over 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 31267
Reverting in mimic for now. The nautilus PR https://github.com/ceph/ceph/pull/30195 is similarly broken but hasn't merged yet.
Updated by Patrick Donnelly over 4 years ago
- Project changed from mgr to RADOS
- Assignee set to Sage Weil
- Source set to Q/A
Updated by Patrick Donnelly over 4 years ago
- Assignee changed from Sage Weil to David Zafman
- Pull request ID changed from 31267 to 31275
Updated by Patrick Donnelly over 4 years ago
- Target version changed from v15.0.0 to v13.2.7
Updated by Yuri Weinstein over 4 years ago
Updated by David Zafman over 4 years ago
- Status changed from Fix Under Review to Resolved
- Backport set to luminous
I put in the backports, but they were already completed without creating backport trackers.
Nautilus: Separate fix not needed because it was rolled into the feature pull request (https://github.com/ceph/ceph/pull/30195 -> bd29e44a653d67f344fe0fd915a207aa6feae6e6)
Mimic: This fix
Luminous: https://github.com/ceph/ceph/pull/31277
Updated by David Zafman over 4 years ago
- Related to Feature #40640: Network ping monitoring added
Updated by David Zafman over 4 years ago
- Related to Backport #41696: mimic: Network ping monitoring added
Updated by David Zafman over 4 years ago
- Related to Backport #41697: luminous: Network ping monitoring added