Bug #20277
bluestore crashed while performing scrub
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): BlueStore
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf)
 1: (()+0x9bb95a) [0x562f2224595a]
 2: (()+0x110c0) [0x7f0296ebb0c0]
 3: (gsignal()+0xcf) [0x7f0295cf5fcf]
 4: (abort()+0x16a) [0x7f0295cf73fa]
 5: (()+0x2be37) [0x7f0295ceee37]
 6: (()+0x2bee2) [0x7f0295ceeee2]
 7: (()+0x3b4062) [0x562f21c3e062]
 8: (BlueStore::Blob::get_ref(BlueStore::Collection*, unsigned int, unsigned int)+0) [0x562f220e1150]
 9: (BlueStore::Blob::get_ref(BlueStore::Collection*, unsigned int, unsigned int)+0x242) [0x562f220e1392]
 10: (BlueStore::Blob::decode(BlueStore::Collection*, ceph::buffer::ptr::iterator&, unsigned long, unsigned long*, bool)+0x60d) [0x562f220fdf9d]
 11: (BlueStore::ExtentMap::decode_spanning_blobs(ceph::buffer::ptr::iterator&)+0x23f) [0x562f2210ce1f]
 12: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0xf06) [0x562f2212d366]
 13: (BlueStore::stat(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, stat*, bool)+0xcc) [0x562f2212debc]
 14: (PGBackend::be_scan_list(ScrubMap&, std::vector<hobject_t, std::allocator<hobject_t> > const&, bool, unsigned int, ThreadPool::TPHandle&)+0x1ed) [0x562f21f0742d]
 15: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool, unsigned int, ThreadPool::TPHandle&)+0x214) [0x562f21dc3b64]
 16: (PG::replica_scrub(boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x61d) [0x562f21dc448d]
 17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x772) [0x562f21e78172]
 18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x22c) [0x562f21d19e0c]
 19: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x562f21d1a227]
 20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x108c) [0x562f21d45d4c]
 21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x96e) [0x562f2228c0ae]
 22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x562f2228e2b0]
 23: (()+0x7494) [0x7f0296eb1494]
 24: (clone()+0x3f) [0x7f0295dab93f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
I am not sure whether this still occurs in the latest version, though.
History
#1 Updated by Peter Gervai almost 7 years ago
What happened (twice) was:
- the OSD had a PG flagged inconsistent because of a CRC error
- I set debug-bluestore and debug-osd to 20
- the OSD crashed
(so I did not initiate the scrub manually)
After the PG was repaired with 'pg repair', I set debug again and started a scrub and a deep-scrub, and the OSD did not crash again.
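For anyone trying to retrace the second, non-crashing sequence described above, the following is a minimal sketch of the corresponding ceph CLI invocations. It only builds and prints the command lines; nothing is executed. The comment does not say how the debug levels were set, so the use of `ceph tell ... injectargs` is an assumption, and the function name `debug_and_scrub_commands`, the OSD id, and the PG id are hypothetical placeholders.

```python
import shlex


def debug_and_scrub_commands(osd_id, pg_id, level=20):
    """Return ceph CLI invocations as argv lists; nothing is executed.

    Assumption: debug levels are raised at runtime with injectargs.
    The original report does not state which method was used.
    """
    debug_args = "--debug-bluestore {0} --debug-osd {0}".format(level)
    return [
        # Repair the inconsistent PG first.
        ["ceph", "pg", "repair", pg_id],
        # Raise BlueStore and OSD debug logging on the affected OSD.
        ["ceph", "tell", "osd.{}".format(osd_id), "injectargs", debug_args],
        # Then trigger a scrub and a deep-scrub of the same PG.
        ["ceph", "pg", "scrub", pg_id],
        ["ceph", "pg", "deep-scrub", pg_id],
    ]


if __name__ == "__main__":
    # Placeholder ids; substitute the OSD and PG reported by the cluster.
    for argv in debug_and_scrub_commands(osd_id=0, pg_id="1.2f"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the commands rather than running them keeps the sketch safe to try on a machine without a cluster; the output can be reviewed and pasted into a shell on an admin node.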
#2 Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Category deleted (107)
- Component(RADOS) BlueStore added
#3 Updated by Sage Weil almost 7 years ago
- Status changed from New to Need More Info
A bug was just fixed in the spanning blob code, see https://github.com/ceph/ceph/pull/15654. Are you able to reproduce the crash, and/or can you retry with latest master?
#4 Updated by Sage Weil over 6 years ago
- Status changed from Need More Info to Can't reproduce