Bug #22678

closed

block checksum mismatch from rocksdb

Added by Mike O'Connor over 6 years ago. Updated about 6 years ago.

Status: Duplicate
Priority: High
Assignee: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

There seems to be a bug in the Luminous OSD code (12.2.2) that causes OSDs to abort with a RocksDB "block checksum mismatch" corruption error.

Jan 15 15:54:43 pve ceph-osd[29759]: 2018-01-15 15:54:43.557716 7f683157e700 -1 abort: Corruption: block checksum mismatch*** Caught signal (Aborted) **
Jan 15 15:54:43 pve ceph-osd[29759]:  in thread 7f683157e700 thread_name:tp_osd_tp
Jan 15 15:54:43 pve ceph-osd[29759]:  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Jan 15 15:54:43 pve ceph-osd[29759]:  1: (()+0xa16664) [0x5626f6077664]
Jan 15 15:54:43 pve ceph-osd[29759]:  2: (()+0x110c0) [0x7f684996f0c0]
Jan 15 15:54:43 pve ceph-osd[29759]:  3: (gsignal()+0xcf) [0x7f6848936fcf]
Jan 15 15:54:43 pve ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]
Jan 15 15:54:43 pve ceph-osd[29759]:  5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::list*)+0x29f) [0x5626f5fb595f]
Jan 15 15:54:43 pve ceph-osd[29759]:  6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x5626f5f392ae]
Jan 15 15:54:43 pve ceph-osd[29759]:  7: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0xfc) [0x5626f5f64a9c]
Jan 15 15:54:43 pve ceph-osd[29759]:  8: (ECBackend::handle_sub_read(pg_shard_t, ECSubRead const&, ECSubReadReply*, ZTracer::Trace const&)+0x239) [0x5626f5df1209]
Jan 15 15:54:43 pve ceph-osd[29759]:  9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x50d) [0x5626f5df29cd]
Jan 15 15:54:43 pve ceph-osd[29759]:  10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5626f5cd1be0]
Jan 15 15:54:43 pve ceph-osd[29759]:  11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x503) [0x5626f5c37a73]
Jan 15 15:54:43 pve ceph-osd[29759]:  12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x5626f5ab59eb]
Jan 15 15:54:43 pve ceph-osd[29759]:  13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x5626f5d53eba]
Jan 15 15:54:43 pve ceph-osd[29759]:  14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x5626f5adcf4d]
Jan 15 15:54:43 pve ceph-osd[29759]:  15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x5626f60c406f]
Jan 15 15:54:43 pve ceph-osd[29759]:  16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5626f60c7370]
Jan 15 15:54:43 pve ceph-osd[29759]:  17: (()+0x7494) [0x7f6849965494]
Jan 15 15:54:43 pve ceph-osd[29759]:  18: (clone()+0x3f) [0x7f68489ecaff]
Jan 15 15:54:43 pve ceph-osd[29759]: 2018-01-15 15:54:43.562224 7f683157e700 -1 *** Caught signal (Aborted) **
Jan 15 15:54:43 pve ceph-osd[29759]:  in thread 7f683157e700 thread_name:tp_osd_tp
Jan 15 15:54:43 pve ceph-osd[29759]:  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Jan 15 15:54:43 pve ceph-osd[29759]:  1: (()+0xa16664) [0x5626f6077664]
Jan 15 15:54:43 pve ceph-osd[29759]:  2: (()+0x110c0) [0x7f684996f0c0]
Jan 15 15:54:43 pve ceph-osd[29759]:  3: (gsignal()+0xcf) [0x7f6848936fcf]
Jan 15 15:54:43 pve ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]
Jan 15 15:54:43 pve ceph-osd[29759]:  5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::list*)+0x29f) [0x5626f5fb595f]
Jan 15 15:54:43 pve ceph-osd[29759]:  6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x5626f5f392ae]
Jan 15 15:54:43 pve ceph-osd[29759]:  7: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0xfc) [0x5626f5f64a9c]
Jan 15 15:54:43 pve ceph-osd[29759]:  8: (ECBackend::handle_sub_read(pg_shard_t, ECSubRead const&, ECSubReadReply*, ZTracer::Trace const&)+0x239) [0x5626f5df1209]
Jan 15 15:54:43 pve ceph-osd[29759]:  9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x50d) [0x5626f5df29cd]
Jan 15 15:54:43 pve ceph-osd[29759]:  10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5626f5cd1be0]
Jan 15 15:54:43 pve ceph-osd[29759]:  11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x503) [0x5626f5c37a73]
Jan 15 15:54:43 pve ceph-osd[29759]:  12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x5626f5ab59eb]
Jan 15 15:54:43 pve ceph-osd[29759]:  13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x5626f5d53eba]
Jan 15 15:54:43 pve ceph-osd[29759]:  14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x5626f5adcf4d]
Jan 15 15:54:43 pve ceph-osd[29759]:  15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x5626f60c406f]
Jan 15 15:54:43 pve ceph-osd[29759]:  16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5626f60c7370]
Jan 15 15:54:43 pve ceph-osd[29759]:  17: (()+0x7494) [0x7f6849965494]
Jan 15 15:54:43 pve ceph-osd[29759]:  18: (clone()+0x3f) [0x7f68489ecaff]
Jan 15 15:54:43 pve ceph-osd[29759]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Jan 15 15:54:43 pve ceph-osd[29759]:     -1> 2018-01-15 15:54:43.557716 7f683157e700 -1 abort: Corruption: block checksum mismatch
Jan 15 15:54:43 pve ceph-osd[29759]:      0> 2018-01-15 15:54:43.562224 7f683157e700 -1 *** Caught signal (Aborted) **
Jan 15 15:54:43 pve ceph-osd[29759]:  in thread 7f683157e700 thread_name:tp_osd_tp
Jan 15 15:54:43 pve ceph-osd[29759]:  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Jan 15 15:54:43 pve ceph-osd[29759]:  1: (()+0xa16664) [0x5626f6077664]
Jan 15 15:54:43 pve ceph-osd[29759]:  2: (()+0x110c0) [0x7f684996f0c0]
Jan 15 15:54:43 pve ceph-osd[29759]:  3: (gsignal()+0xcf) [0x7f6848936fcf]
Jan 15 15:54:43 pve ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]
Jan 15 15:54:43 pve ceph-osd[29759]:  5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::list*)+0x29f) [0x5626f5fb595f]
Jan 15 15:54:43 pve ceph-osd[29759]:  6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x5626f5f392ae]
Jan 15 15:54:43 pve ceph-osd[29759]:  7: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0xfc) [0x5626f5f64a9c]
Jan 15 15:54:43 pve ceph-osd[29759]:  8: (ECBackend::handle_sub_read(pg_shard_t, ECSubRead const&, ECSubReadReply*, ZTracer::Trace const&)+0x239) [0x5626f5df1209]
Jan 15 15:54:43 pve ceph-osd[29759]:  9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x50d) [0x5626f5df29cd]
Jan 15 15:54:43 pve ceph-osd[29759]:  10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5626f5cd1be0]
Jan 15 15:54:43 pve ceph-osd[29759]:  11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x503) [0x5626f5c37a73]
Jan 15 15:54:43 pve ceph-osd[29759]:  12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x5626f5ab59eb]
Jan 15 15:54:43 pve ceph-osd[29759]:  13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x5626f5d53eba]
Jan 15 15:54:43 pve ceph-osd[29759]:  14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x5626f5adcf4d]
Jan 15 15:54:43 pve ceph-osd[29759]:  15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x5626f60c406f]
Jan 15 15:54:43 pve ceph-osd[29759]:  16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5626f60c7370]
Jan 15 15:54:43 pve ceph-osd[29759]:  17: (()+0x7494) [0x7f6849965494]
Jan 15 15:54:43 pve ceph-osd[29759]:  18: (clone()+0x3f) [0x7f68489ecaff]
Jan 15 15:54:43 pve ceph-osd[29759]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
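
The log above prints the same stack more than once, so it can help to reduce it to its unique frames before comparing it with other reports. The following is a minimal sketch, assuming syslog-prefixed lines in exactly the format pasted here; `parse_frames` and `FRAME_RE` are hypothetical names for illustration and are not part of Ceph.

```python
import re

# Matches a symbolized frame line such as:
#   "... ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]"
# Group 1 is the frame number, group 2 is the symbol without its offset.
FRAME_RE = re.compile(
    r"ceph-osd\[\d+\]:\s+(\d+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$"
)


def parse_frames(lines):
    """Return unique (frame_number, symbol) pairs in first-seen order."""
    seen = set()
    frames = []
    for line in lines:
        match = FRAME_RE.search(line.rstrip())
        if not match:
            continue  # not a stack frame line (header, NOTE, etc.)
        key = (int(match.group(1)), match.group(2))
        if key not in seen:
            seen.add(key)
            frames.append(key)
    return frames


# A few lines copied from the log above; frame 4 appears twice on purpose.
SAMPLE = [
    "Jan 15 15:54:43 pve ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]",
    "Jan 15 15:54:43 pve ceph-osd[29759]:  6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x5626f5f392ae]",
    "Jan 15 15:54:43 pve ceph-osd[29759]:  in thread 7f683157e700 thread_name:tp_osd_tp",
    "Jan 15 15:54:43 pve ceph-osd[29759]:  4: (abort()+0x16a) [0x7f68489383fa]",
]

for number, symbol in parse_frames(SAMPLE):
    print(number, symbol)
```

The regular expression relies on each frame being on a single line, so log lines that were wrapped when pasted need to be rejoined first.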

I can provide a copy of the dump file if needed, but it does not fit within the 1000 KB attachment limit.

Happy to provide whatever I can in the way of other detail.

Thanks
Mike


Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #22102: BlueStore crashed on rocksdb checksum mismatch (Won't Fix, 11/10/2017)
