Bug #24145
osdmap decode error in rados/standalone/* (closed)
Status: Duplicate
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2018-05-15T22:58:39.209 INFO:tasks.workunit.client.0.smithi116.stdout: -15> 2018-05-15 22:50:42.795 7f6d7504f700 10 osd.3 pg_epoch: 45 pg[1.1( empty local-lis/les=37/38 n=0 ec=2/2 lis/c 37/37 les/c/f 38/39/0 37/37/37) [3,0,2] r=0 lpr=39 crt=0'0 mlcod 0'0 unknown mbc={}] handle_advance_map [3,0,2]/[3,0,2] -- 3/3
2018-05-15T22:58:39.209 INFO:tasks.workunit.client.0.smithi116.stdout: -14> 2018-05-15 22:50:42.795 7f6d7504f700 20 osd.3:1.update_pg_epoch 1.1 45 -> 46
2018-05-15T22:58:39.209 INFO:tasks.workunit.client.0.smithi116.stdout: -13> 2018-05-15 22:50:42.795 7f6d7504f700 10 osd.3 pg_epoch: 46 pg[1.1( empty local-lis/les=37/38 n=0 ec=2/2 lis/c 37/37 les/c/f 38/39/0 37/37/37) [3,0,2] r=0 lpr=39 crt=0'0 mlcod 0'0 unknown mbc={}] state<Reset>: Reset advmap
2018-05-15T22:58:39.209 INFO:tasks.workunit.client.0.smithi116.stdout: -12> 2018-05-15 22:50:42.795 7f6d7504f700 10 osd.3 pg_epoch: 46 pg[1.1( empty local-lis/les=37/38 n=0 ec=2/2 lis/c 37/37 les/c/f 38/39/0 37/37/37) [3,0,2] r=0 lpr=39 crt=0'0 mlcod 0'0 unknown mbc={}] check_recovery_sources no source osds () went down
2018-05-15T22:58:39.210 INFO:tasks.workunit.client.0.smithi116.stdout: -11> 2018-05-15 22:50:42.795 7f6d7504f700 20 osd.3 53 get_map 47 - loading and decoding 0x5564e0a8c400
2018-05-15T22:58:39.210 INFO:tasks.workunit.client.0.smithi116.stdout: -10> 2018-05-15 22:50:42.795 7f6d7504f700 10 osd.3 53 add_map_bl 47 2690 bytes
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: 0> 2018-05-15 22:50:42.807 7f6d7504f700 -1 *** Caught signal (Aborted) **
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: in thread 7f6d7504f700 thread_name:tp_osd_tp
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout:
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: ceph version 13.1.0-100-g33c5549 (33c55492ab1ace07e76616c794b63f09b18b52ea) mimic (rc)
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: 1: (()+0x915bc0) [0x5564de4ccbc0]
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: 2: (()+0x11390) [0x7f6d97c92390]
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: 3: (gsignal()+0x38) [0x7f6d973df428]
2018-05-15T22:58:39.211 INFO:tasks.workunit.client.0.smithi116.stdout: 4: (abort()+0x16a) [0x7f6d973e102a]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 5: (__gnu_cxx::__verbose_terminate_handler()+0x135) [0x7f6d996c7de5]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 6: (__cxxabiv1::__terminate(void (*)())+0x6) [0x7f6d9962f5e6]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 7: (()+0x734631) [0x7f6d9962f631]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 8: (()+0x735d24) [0x7f6d99630d24]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x1915) [0x7f6d9935d985]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f6d9935ece1]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 11: (OSDService::try_get_map(unsigned int)+0x508) [0x5564ddf83918]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 12: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*)+0x19d) [0x5564ddf8787d]
2018-05-15T22:58:39.212 INFO:tasks.workunit.client.0.smithi116.stdout: 13: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1a1) [0x5564ddf88041]
2018-05-15T22:58:39.213 INFO:tasks.workunit.client.0.smithi116.stdout: 14: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x50) [0x5564de1f0c70]
2018-05-15T22:58:39.213 INFO:tasks.workunit.client.0.smithi116.stdout: 15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x590) [0x5564ddf97840]
2018-05-15T22:58:39.213 INFO:tasks.workunit.client.0.smithi116.stdout: 16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x46e) [0x7f6d991e406e]
2018-05-15T22:58:39.213 INFO:tasks.workunit.client.0.smithi116.stdout: 17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f6d991e60f0]
/a/sage-2018-05-14_21:04:26-rados-wip-sage2-testing-2018-05-14-1426-distro-basic-smithi/2532970
The core wouldn't let me look at the buffer, but we did appear to crash on the
DECODE_START_LEGACY_COMPAT_LEN(8, 7, 7, bl); // wrapper
line at the top of OSDMap::decode(). hrm.
Updated by Sage Weil almost 6 years ago
- Subject changed from osdmap decode error in rados/standalone/erasure-code.yaml to osdmap decode error in rados/standalone/*
- Priority changed from Normal to High
2018-05-24T11:29:55.027 INFO:tasks.workunit.client.0.smithi016.stderr:terminate called after throwing an instance of 'ceph::buffer::malformed_input'
2018-05-24T11:29:55.027 INFO:tasks.workunit.client.0.smithi016.stderr: what(): buffer::malformed_input: void OSDMap::decode(ceph::buffer::list::iterator&) no longer understand old encoding version 8 < 48
2018-05-24T11:29:55.028 INFO:tasks.workunit.client.0.smithi016.stderr:*** Caught signal (Aborted) **
2018-05-24T11:29:55.028 INFO:tasks.workunit.client.0.smithi016.stderr: in thread 7f5fe9c8c200 thread_name:ceph-osd
2018-05-24T11:29:55.033 INFO:tasks.workunit.client.0.smithi016.stderr: ceph version 13.1.1-65-ge8d43bc (e8d43bc1d0fca786ae27bd1ce8e3dcac6d87c961) mimic (rc)
2018-05-24T11:29:55.033 INFO:tasks.workunit.client.0.smithi016.stderr: 1: (()+0x913840) [0x5597aa33b840]
2018-05-24T11:29:55.033 INFO:tasks.workunit.client.0.smithi016.stderr: 2: (()+0x11390) [0x7f5fdfa81390]
2018-05-24T11:29:55.033 INFO:tasks.workunit.client.0.smithi016.stderr: 3: (gsignal()+0x38) [0x7f5fdf1ce428]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 4: (abort()+0x16a) [0x7f5fdf1d002a]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x135) [0x7f5fe14b64f5]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 6: (__cxxabiv1::__terminate(void (*)())+0x6) [0x7f5fe141dcf6]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 7: (()+0x733d41) [0x7f5fe141dd41]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 8: (()+0x735434) [0x7f5fe141f434]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x1915) [0x7f5fe114a325]
2018-05-24T11:29:55.034 INFO:tasks.workunit.client.0.smithi016.stderr: 10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f5fe114b681]
2018-05-24T11:29:55.035 INFO:tasks.workunit.client.0.smithi016.stderr: 11: (OSDService::try_get_map(unsigned int)+0x508) [0x5597a9defa28]
2018-05-24T11:29:55.035 INFO:tasks.workunit.client.0.smithi016.stderr: 12: (OSD::load_pgs()+0x460) [0x5597a9df0a10]
2018-05-24T11:29:55.035 INFO:tasks.workunit.client.0.smithi016.stderr: 13: (OSD::init()+0xcd3) [0x5597a9dfbc63]
/a/sage-2018-05-23_14:55:44-rados-wip-sage3-testing-2018-05-22-2126-distro-basic-smithi/2577552
rados/standalone/{supported-random-distro$/{ubuntu_16.04.yaml} workloads/scrub.yaml}
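For anyone reading the "old encoding version 8 < 48" message without the wrapper in front of them: it looks like what a decoder at version 8 would report if the compat byte it read from the buffer were 48. Below is a minimal sketch of that kind of check, assuming a one-byte struct_v followed by a one-byte struct_compat. The names check_wrapper_version and the local malformed_input type are hypothetical stand-ins for illustration; this is not Ceph's macro or exception class, and it works on a plain byte vector rather than a bufferlist.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for ceph::buffer::malformed_input.
struct malformed_input : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Simplified model of a versioned-wrapper check (NOT Ceph's actual macro).
// Assumption: byte 0 is the encoder's struct_v; if struct_v >= compatv,
// byte 1 is struct_compat, the oldest decoder version allowed to read it.
// v is the version this decoder implements. Returns struct_v on success.
inline std::uint8_t check_wrapper_version(const std::vector<std::uint8_t>& buf,
                                          std::uint8_t v,
                                          std::uint8_t compatv) {
  if (buf.empty())
    throw malformed_input("buffer too short for struct_v");
  const std::uint8_t struct_v = buf[0];
  if (struct_v >= compatv) {
    if (buf.size() < 2)
      throw malformed_input("buffer too short for struct_compat");
    const std::uint8_t struct_compat = buf[1];
    // The decoder is older than what the encoded data claims to require.
    if (v < struct_compat) {
      throw malformed_input("no longer understand old encoding version " +
                            std::to_string(static_cast<int>(v)) + " < " +
                            std::to_string(static_cast<int>(struct_compat)));
    }
  }
  return struct_v;
}
```

Under this model, with v=8 and compatv=7, a buffer that starts with the bytes {8, 48} throws with the same wording as the log. That would suggest garbage bytes in the stored map rather than a genuinely old encoding, but this is only a reading of the message text; the actual buffer was not inspected.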
Updated by Kefu Chai almost 6 years ago
- Is duplicate of Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh added