Bug #51133
OSDs failing to start: rocksdb: submit_common error: Corruption: block checksum mismatch
03648e85bd54b069c13692282b035d0f030786bfcaab3b1221178fd42f1130b8
109ab3ee85a3bc3337746eb0e056f21b397fcbad99e97dfa608e3080667f8744
83841b3bf546fd7e26065342f78c40b0d62a52eee87abbb2d08784c462a16920
89731bf93a69fe6a0abb4311dc16af93b46f1fae4949ca7654c5e30dfccc595b
a3b141a7ff14019694d6551ae1bff756bc7fb55f7dda44d04b022ae42c1be7cb
b10d16e2ecdc42f40eab364c6086bede7a0c39621f1d62c8302a9731ded47727
ee27598d524a96776e424624e7972be2c01b4ee6121ac2eacc836aa63d7cff1b
Description
After a while of high usage on my stack, I'm getting this error:
--- begin dump of recent events ---
debug -83> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command assert hook 0x5624112e6540
debug -82> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command abort hook 0x5624112e6540
debug -81> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command leak_some_memory hook 0x5624112e6540
debug -80> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perfcounters_dump hook 0x5624112e6540
debug -79> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command 1 hook 0x5624112e6540
debug -78> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perf dump hook 0x5624112e6540
debug -77> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perfcounters_schema hook 0x5624112e6540
debug -76> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perf histogram dump hook 0x5624112e6540
debug -75> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command 2 hook 0x5624112e6540
debug -74> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perf schema hook 0x5624112e6540
debug -73> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perf histogram schema hook 0x5624112e6540
debug -72> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command perf reset hook 0x5624112e6540
debug -71> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config show hook 0x5624112e6540
debug -70> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config help hook 0x5624112e6540
debug -69> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config set hook 0x5624112e6540
debug -68> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config unset hook 0x5624112e6540
debug -67> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config get hook 0x5624112e6540
debug -66> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config diff hook 0x5624112e6540
debug -65> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command config diff get hook 0x5624112e6540
debug -64> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command injectargs hook 0x5624112e6540
debug -63> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command log flush hook 0x5624112e6540
debug -62> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command log dump hook 0x5624112e6540
debug -61> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command log reopen hook 0x5624112e6540
debug -60> 2021-06-07T17:20:59.814+0000 7fe0ee9d4080 5 asok(0x56241139c000) register_command dump_mempools hook 0x56241138c328
debug -59> 2021-06-07T17:20:59.887+0000 7fe0ee9d4080 0 set uid:gid to 167:167 (ceph:ceph)
debug -58> 2021-06-07T17:20:59.887+0000 7fe0ee9d4080 0 ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable), process ceph-osd, pid 1
debug -57> 2021-06-07T17:20:59.887+0000 7fe0ee9d4080 0 pidfile_write: ignore empty --pid-file
debug -56> 2021-06-07T17:21:00.418+0000 7fe0ee9d4080 0 starting osd.7 osd_data /var/lib/ceph/osd/ceph-7 /var/lib/ceph/osd/ceph-7/journal
debug -55> 2021-06-07T17:21:00.419+0000 7fe0ee9d4080 -1 Falling back to public interface
debug -54> 2021-06-07T17:21:00.453+0000 7fe0ee9d4080 0 load: jerasure load: lrc load: isa
debug -53> 2021-06-07T17:21:00.726+0000 7fe0ee9d4080 0 osd.7:0.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
debug -52> 2021-06-07T17:21:00.989+0000 7fe0ee9d4080 0 osd.7:1.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
debug -51> 2021-06-07T17:21:01.250+0000 7fe0ee9d4080 0 osd.7:2.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
debug -50> 2021-06-07T17:21:01.510+0000 7fe0ee9d4080 0 osd.7:3.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
debug -49> 2021-06-07T17:21:01.520+0000 7fe0ee9d4080 0 osd.7:4.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
debug -48> 2021-06-07T17:21:01.521+0000 7fe0ee9d4080 0 bluestore(/var/lib/ceph/osd/ceph-7) _open_db_and_around read-only:0 repair:0
debug -47> 2021-06-07T17:21:01.616+0000 7fe0ee9d4080 1 set rocksdb option max_total_wal_size = 1073741824
debug -46> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option compaction_readahead_size = 2097152
debug -45> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option max_write_buffer_number = 4
debug -44> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option max_background_compactions = 2
debug -43> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option compression = kNoCompression
debug -42> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option writable_file_max_buffer_size = 0
debug -41> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option min_write_buffer_number_to_merge = 1
debug -40> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option recycle_log_file_num = 4
debug -39> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option write_buffer_size = 268435456
debug -38> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option max_total_wal_size = 1073741824
debug -37> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option compaction_readahead_size = 2097152
debug -36> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option max_write_buffer_number = 4
debug -35> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option max_background_compactions = 2
debug -34> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option compression = kNoCompression
debug -33> 2021-06-07T17:21:01.617+0000 7fe0ee9d4080 1 set rocksdb option writable_file_max_buffer_size = 0
debug -32> 2021-06-07T17:21:01.618+0000 7fe0ee9d4080 1 set rocksdb option min_write_buffer_number_to_merge = 1
debug -31> 2021-06-07T17:21:01.618+0000 7fe0ee9d4080 1 set rocksdb option recycle_log_file_num = 4
debug -30> 2021-06-07T17:21:01.618+0000 7fe0ee9d4080 1 set rocksdb option write_buffer_size = 268435456
debug -29> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option max_total_wal_size = 1073741824
debug -28> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option compaction_readahead_size = 2097152
debug -27> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option max_write_buffer_number = 4
debug -26> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option max_background_compactions = 2
debug -25> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option compression = kNoCompression
debug -24> 2021-06-07T17:21:02.843+0000 7fe0ee9d4080 1 set rocksdb option writable_file_max_buffer_size = 0
debug -23> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option min_write_buffer_number_to_merge = 1
debug -22> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option recycle_log_file_num = 4
debug -21> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option write_buffer_size = 268435456
debug -20> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option max_total_wal_size = 1073741824
debug -19> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option compaction_readahead_size = 2097152
debug -18> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option max_write_buffer_number = 4
debug -17> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option max_background_compactions = 2
debug -16> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option compression = kNoCompression
debug -15> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option writable_file_max_buffer_size = 0
debug -14> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option min_write_buffer_number_to_merge = 1
debug -13> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option recycle_log_file_num = 4
debug -12> 2021-06-07T17:21:02.844+0000 7fe0ee9d4080 1 set rocksdb option write_buffer_size = 268435456
debug -11> 2021-06-07T17:21:05.107+0000 7fe0ee9d4080 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/cls/cephfs/cls_cephfs.cc:201: loading cephfs
debug -10> 2021-06-07T17:21:05.109+0000 7fe0ee9d4080 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/cls/hello/cls_hello.cc:316: loading cls_hello
debug -9> 2021-06-07T17:21:05.111+0000 7fe0ee9d4080 0 _get_class not permitted to load kvs
debug -8> 2021-06-07T17:21:05.114+0000 7fe0ee9d4080 0 _get_class not permitted to load lua
debug -7> 2021-06-07T17:21:05.128+0000 7fe0ee9d4080 0 _get_class not permitted to load sdk
debug -6> 2021-06-07T17:21:05.131+0000 7fe0ee9d4080 0 osd.7 236728 crush map has features 288514119978713088, adjusting msgr requires for clients
debug -5> 2021-06-07T17:21:05.131+0000 7fe0ee9d4080 0 osd.7 236728 crush map has features 288514119978713088 was 8705, adjusting msgr requires for mons
debug -4> 2021-06-07T17:21:05.131+0000 7fe0ee9d4080 0 osd.7 236728 crush map has features 3314933069571702784, adjusting msgr requires for osds
debug -3> 2021-06-07T17:21:08.664+0000 7fe0ee9d4080 0 osd.7 236728 load_pgs
debug -2> 2021-06-07T17:21:08.908+0000 7fe0d7c9f700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 4021751865, got 3381438153 in db/007491.sst offset 57636 size 741633 code = 2 Rocksdb transaction:
PutCF( prefix = P key = 0x0000000000008284'.can_rollback_to' value size = 12)
PutCF( prefix = P key = 0x0000000000008284'.rollback_info_trimmed_to' value size = 12)
PutCF( prefix = O key = 0x82800000000000001AB0000000'!!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F value size = 30)
PutCF( prefix = S key = 'nid_max' value size = 8)
PutCF( prefix = S key = 'blobid_max' value size = 8)
debug -1> 2021-06-07T17:21:08.919+0000 7fe0d7c9f700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fe0d7c9f700 time 2021-06-07T17:21:08.909473+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/os/bluestore/BlueStore.cc: 11601: FAILED ceph_assert(r == 0)
ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x562406b3c064]
2: ceph-osd(+0x56927e) [0x562406b3c27e]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f) [0x56240716382f]
4: (BlueStore::_kv_sync_thread()+0x16dc) [0x56240719c6fc]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x5624071c4b91]
6: /lib64/libpthread.so.0(+0x814a) [0x7fe0ec73114a]
7: clone()
debug 0> 2021-06-07T17:21:08.927+0000 7fe0d7c9f700 -1 *** Caught signal (Aborted) **
in thread 7fe0d7c9f700 thread_name:bstore_kv_sync
ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fe0ec73bb20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x562406b3c0b5]
5: ceph-osd(+0x56927e) [0x562406b3c27e]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f) [0x56240716382f]
7: (BlueStore::_kv_sync_thread()+0x16dc) [0x56240719c6fc]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x5624071c4b91]
9: /lib64/libpthread.so.0(+0x814a) [0x7fe0ec73114a]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This seems to be the only relevant part of my config:
[osd]
# Needed to bypass default value failure (896M is the lowest)
osd memory base = 100663296
osd memory target = 939524096
osd memory target cgroup limit ratio = 0.0
bluestore cache autotune = false
bluestore cache size hdd = 134217728
bluestore cache size ssd = 67108864
# https://tracker.ceph.com/issues/50656
bluestore allocator = bitmap
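For readability, the byte values in the config above can be converted to binary units. This is a minimal sketch in Python; the helper name `to_mib` is made up for illustration and is not part of Ceph.

```python
# Byte values copied from the [osd] section above.
settings = {
    "osd memory base": 100663296,
    "osd memory target": 939524096,
    "bluestore cache size hdd": 134217728,
    "bluestore cache size ssd": 67108864,
}

MIB = 1024 * 1024


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (hypothetical helper)."""
    return num_bytes / MIB


for name, value in settings.items():
    print(f"{name} = {value} bytes = {to_mib(value):g} MiB")
```

The memory target works out to 896 MiB, which matches the "896M is the lowest" comment in the config.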
There were also a bunch of crashes before this appeared:
{
  "archived": "2021-06-08 12:44:51.333438",
  "assert_condition": "!is_scrubbing()",
  "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/osd/PG.cc",
  "assert_func": "bool PG::sched_scrub()",
  "assert_line": 1339,
  "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/osd/PG.cc: In function 'bool PG::sched_scrub()' thread 7fd905a7e700 time 2021-06-07T16:51:16.856046+0000\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/osd/PG.cc: 1339: FAILED ceph_assert(!is_scrubbing())\n",
  "assert_thread_name": "safe_timer",
  "backtrace": [
    "/lib64/libpthread.so.0(+0x12b20) [0x7fd90eaf4b20]",
    "gsignal()",
    "abort()",
    "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x55c3d221a0b5]",
    "ceph-osd(+0x56927e) [0x55c3d221a27e]",
    "(PG::sched_scrub()+0x561) [0x55c3d23ca7d1]",
    "(OSD::sched_scrub()+0x8e3) [0x55c3d2314633]",
    "(OSD::tick_without_osd_lock()+0x678) [0x55c3d2325fa8]",
    "(Context::complete(int)+0xd) [0x55c3d235974d]",
    "(SafeTimer::timer_thread()+0x1b7) [0x55c3d299fb07]",
    "(SafeTimerThread::entry()+0x11) [0x55c3d29a10e1]",
    "/lib64/libpthread.so.0(+0x814a) [0x7fd90eaea14a]",
    "clone()"
  ],
  "ceph_version": "16.2.4",
  "crash_id": "2021-06-07T16:51:17.910297Z_cc0c9987-61ba-45f2-b9d2-a4c9a180a283",
  "entity_name": "osd.7",
  "os_id": "centos",
  "os_name": "CentOS Linux",
  "os_version": "8",
  "os_version_id": "8",
  "process_name": "ceph-osd",
  "stack_sig": "e5c6203c14b6621da9fda8f2bdd2ee6b8585023de8b70ffeef68b075342749cf",
  "timestamp": "2021-06-07T16:51:17.910297Z",
  "utsname_hostname": "rook-ceph-osd-7-97684b598-d8zkz",
  "utsname_machine": "x86_64",
  "utsname_release": "5.8.12-200.fc32.x86_64",
  "utsname_sysname": "Linux",
  "utsname_version": "#1 SMP Mon Sep 28 12:17:31 UTC 2020"
}
And then a lot of these:
{
  "archived": "2021-06-08 12:44:51.509393",
  "assert_condition": "r == 0",
  "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/os/bluestore/BlueStore.cc",
  "assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)",
  "assert_line": 11601,
  "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7f8513d4b700 time 2021-06-07T16:58:26.313334+0000\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/os/bluestore/BlueStore.cc: 11601: FAILED ceph_assert(r == 0)\n",
  "assert_thread_name": "bstore_kv_sync",
  "backtrace": [
    "/lib64/libpthread.so.0(+0x12b20) [0x7f85287e7b20]",
    "gsignal()",
    "abort()",
    "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x563a580960b5]",
    "ceph-osd(+0x56927e) [0x563a5809627e]",
    "(BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f) [0x563a586bd82f]",
    "(BlueStore::_kv_sync_thread()+0x16dc) [0x563a586f66fc]",
    "(BlueStore::KVSyncThread::entry()+0x11) [0x563a5871eb91]",
    "/lib64/libpthread.so.0(+0x814a) [0x7f85287dd14a]",
    "clone()"
  ],
  "ceph_version": "16.2.4",
  "crash_id": "2021-06-07T16:58:26.505453Z_1a9cbbf3-9906-4110-916f-f6b183900189",
  "entity_name": "osd.7",
  "os_id": "centos",
  "os_name": "CentOS Linux",
  "os_version": "8",
  "os_version_id": "8",
  "process_name": "ceph-osd",
  "stack_sig": "a3b141a7ff14019694d6551ae1bff756bc7fb55f7dda44d04b022ae42c1be7cb",
  "timestamp": "2021-06-07T16:58:26.505453Z",
  "utsname_hostname": "rook-ceph-osd-7-97684b598-d8zkz",
  "utsname_machine": "x86_64",
  "utsname_release": "5.8.12-200.fc32.x86_64",
  "utsname_sysname": "Linux",
  "utsname_version": "#1 SMP Mon Sep 28 12:17:31 UTC 2020"
}
I am deploying Ceph with rook-ceph in Kubernetes. The affected OSD belongs to a 4+1 EC pool.
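To put the pool layout in context, here is a small sketch of what a 4+1 profile implies, assuming "4+1" means k=4 data chunks and m=1 coding chunk. The function name `ec_profile_summary` is hypothetical, not a Ceph API.

```python
def ec_profile_summary(k: int, m: int) -> dict:
    """Basic properties of a k+m erasure-coded pool (illustrative only)."""
    return {
        "chunks_per_object": k + m,          # each object is split across k+m OSDs
        "tolerated_failures": m,             # chunks that can be lost and still rebuilt
        "raw_to_usable_ratio": (k + m) / k,  # raw space consumed per usable byte
    }


summary_4_1 = ec_profile_summary(4, 1)
print(summary_4_1)
```

Under that assumption the pool tolerates the loss of a single chunk, so while one OSD is down the affected placement groups have no remaining redundancy.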
Files
Updated by Neha Ojha almost 3 years ago
- File ceph-osd.1442.log.gz added
similar
2021-07-03T06:51:08.746+0200 7fba950a3080 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/kv/RocksDBStore.cc: In function 'virtual int RocksDBStore::get(const string&, const string&, ceph::bufferlist*)' thread 7fba950a3080 time 2021-07-03T06:51:08.744960+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/kv/RocksDBStore.cc: 1840: ceph_abort_msg("block checksum mismatch: expected 4252592570, got 1819148153 in db/120756.sst offset 1020683 size 3952")
ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xe5) [0x55837d9697a4]
2: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v15_2_0::list*)+0x3ec) [0x55837e4c4a1c]
3: (BlueStore::omap_get_values(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >*)+0x2f1) [0x55837df83641]
4: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*)+0x3d5) [0x55837db0c735]
5: (OSD::load_pgs()+0x90f) [0x55837da721cf]
6: (OSD::init()+0x26f7) [0x55837da9f1b7]
7: main()
8: __libc_start_main()
9: _start()
Updated by Neha Ojha almost 3 years ago
- Related to Bug #51338: osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>()) added
Updated by Telemetry Bot over 2 years ago
- Crash signature (v1) updated
- Crash signature (v2) updated
- Affected Versions v16.2.0, v16.2.1, v16.2.5 added
Assert condition: r == 0
Assert function: void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)
Sanitized backtrace:
/lib64/libpthread.so.0(
ceph-osd(
BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)
BlueStore::_kv_sync_thread()
BlueStore::KVSyncThread::entry()
/lib64/libpthread.so.0(
clone()
Crash dump sample:
{
  "assert_condition": "r == 0",
  "assert_file": "os/bluestore/BlueStore.cc",
  "assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)",
  "assert_line": 11603,
  "assert_msg": "os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7f3d0ef32700 time 2021-08-09T20:33:07.148828+0000\nos/bluestore/BlueStore.cc: 11603: FAILED ceph_assert(r == 0)",
  "assert_thread_name": "bstore_kv_sync",
  "backtrace": [
    "/lib64/libpthread.so.0(+0x12b20) [0x7f3d221d3b20]",
    "gsignal()",
    "abort()",
    "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x560e8d2d3f0b]",
    "ceph-osd(+0x56a0d4) [0x560e8d2d40d4]",
    "(BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f) [0x560e8d8fb31f]",
    "(BlueStore::_kv_sync_thread()+0x16dc) [0x560e8d93450c]",
    "(BlueStore::KVSyncThread::entry()+0x11) [0x560e8d95cdf1]",
    "/lib64/libpthread.so.0(+0x814a) [0x7f3d221c914a]",
    "clone()"
  ],
  "ceph_version": "16.2.5",
  "crash_id": "2021-08-09T20:33:07.162057Z_da07fd6b-ce84-4986-8939-99bc874a2108",
  "entity_name": "osd.72c850397be22043c1b0330cc36a398953933930",
  "os_id": "centos",
  "os_name": "CentOS Linux",
  "os_version": "8",
  "os_version_id": "8",
  "process_name": "ceph-osd",
  "stack_sig": "03648e85bd54b069c13692282b035d0f030786bfcaab3b1221178fd42f1130b8",
  "timestamp": "2021-08-09T20:33:07.162057Z",
  "utsname_machine": "x86_64",
  "utsname_release": "5.4.0-80-generic",
  "utsname_sysname": "Linux",
  "utsname_version": "#90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021"
}
Updated by Telemetry Bot about 2 years ago
- Crash signature (v1) updated
- Affected Versions v16.2.6, v16.2.7 added
Updated by Telemetry Bot about 2 years ago
Updated by jianwei zhang 3 months ago
similar: 16.2.6
/usr/bin/ceph-osd --cluster ceph -f -i 699 --setuser ceph --setgroup ceph
2024-01-31T13:34:39.802+0800 7fdb64dc5140 -1 Falling back to public interface
log_submit_lat=0.000000 last_log_flush_lat=0.000052
2024-01-31T13:34:45.443+0800 7fdb64dc5140 -1 osd.699 0 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
log_submit_lat=0.000001 last_log_flush_lat=0.000004
2024-01-31T13:34:52.900+0800 7fdb4b8a8700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: stored = 3145323779, computed = 3833377101, type = 1 in db/000417.sst offset 11792861 size 3699 code = 2 Rocksdb transaction:
PutCF( prefix = O key = 0x7F8000000000001B9EF0000000'!!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F value size = 30)
PutCF( prefix = S key = 'nid_max' value size = 8)
PutCF( prefix = S key = 'blobid_max' value size = 8)
log_submit_lat=0.000000 last_log_flush_lat=0.000039
/home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fdb4b8a8700 time 2024-01-31T13:34:52.901575+0800
/home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: 11732: FAILED ceph_assert(r == 0)
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55f76ccfd47e]
2: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
4: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
6: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
7: clone()
*** Caught signal (Aborted) **
in thread 7fdb4b8a8700 thread_name:bstore_kv_sync
2024-01-31T13:34:52.903+0800 7fdb4b8a8700 -1 /home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fdb4b8a8700 time 2024-01-31T13:34:52.901575+0800
/home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: 11732: FAILED ceph_assert(r == 0)
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55f76ccfd47e]
2: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
4: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
6: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
7: clone()
log_submit_lat=0.000000 last_log_flush_lat=0.000031
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb628c2b20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x55f76ccfd4cf]
5: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
7: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
9: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
10: clone()
2024-01-31T13:34:52.905+0800 7fdb4b8a8700 -1 *** Caught signal (Aborted) **
in thread 7fdb4b8a8700 thread_name:bstore_kv_sync
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb628c2b20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x55f76ccfd4cf]
5: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
7: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
9: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
log_submit_lat=0.000000 last_log_flush_lat=0.000026
-102> 2024-01-31T13:34:39.802+0800 7fdb64dc5140 -1 Falling back to public interface
log_submit_lat=0.000000 last_log_flush_lat=0.000005
-73> 2024-01-31T13:34:45.443+0800 7fdb64dc5140 -1 osd.699 0 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
log_submit_lat=0.000001 last_log_flush_lat=0.000003
-2> 2024-01-31T13:34:52.900+0800 7fdb4b8a8700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: stored = 3145323779, computed = 3833377101, type = 1 in db/000417.sst offset 11792861 size 3699 code = 2 Rocksdb transaction:
PutCF( prefix = O key = 0x7F8000000000001B9EF0000000'!!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F value size = 30)
PutCF( prefix = S key = 'nid_max' value size = 8)
PutCF( prefix = S key = 'blobid_max' value size = 8)
log_submit_lat=0.000000 last_log_flush_lat=0.000005
-1> 2024-01-31T13:34:52.903+0800 7fdb4b8a8700 -1 /home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fdb4b8a8700 time 2024-01-31T13:34:52.901575+0800
/home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: 11732: FAILED ceph_assert(r == 0)
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55f76ccfd47e]
2: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
4: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
6: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
7: clone()
log_submit_lat=0.000000 last_log_flush_lat=0.000008
0> 2024-01-31T13:34:52.905+0800 7fdb4b8a8700 -1 *** Caught signal (Aborted) **
in thread 7fdb4b8a8700 thread_name:bstore_kv_sync
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb628c2b20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x55f76ccfd4cf]
5: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
7: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
9: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
log_submit_lat=0.000000 last_log_flush_lat=0.000008
-102> 2024-01-31T13:34:39.802+0800 7fdb64dc5140 -1 Falling back to public interface
log_submit_lat=0.000000 last_log_flush_lat=0.000011
-73> 2024-01-31T13:34:45.443+0800 7fdb64dc5140 -1 osd.699 0 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
log_submit_lat=0.000001 last_log_flush_lat=0.000003
-2> 2024-01-31T13:34:52.900+0800 7fdb4b8a8700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: stored = 3145323779, computed = 3833377101, type = 1 in db/000417.sst offset 11792861 size 3699 code = 2 Rocksdb transaction:
PutCF( prefix = O key = 0x7F8000000000001B9EF0000000'!!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F value size = 30)
PutCF( prefix = S key = 'nid_max' value size = 8)
PutCF( prefix = S key = 'blobid_max' value size = 8)
log_submit_lat=0.000000 last_log_flush_lat=0.000003
-1> 2024-01-31T13:34:52.903+0800 7fdb4b8a8700 -1 /home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fdb4b8a8700 time 2024-01-31T13:34:52.901575+0800
/home/gitlab-runner/builds/ppn7NR4N/0/eos/slicer/slicer-src/rpmbuild/BUILD/ceph-16.2.6-31/src/os/bluestore/BlueStore.cc: 11732: FAILED ceph_assert(r == 0)
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55f76ccfd47e]
2: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
4: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
6: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
7: clone()
log_submit_lat=0.000000 last_log_flush_lat=0.000007
0> 2024-01-31T13:34:52.905+0800 7fdb4b8a8700 -1 *** Caught signal (Aborted) **
in thread 7fdb4b8a8700 thread_name:bstore_kv_sync
ceph version 16.2.6-31 (91120fbd2fd68d60ec50f9b33eeccb09ddb6500b) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb628c2b20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x55f76ccfd4cf]
5: /usr/bin/ceph-osd(+0x639698) [0x55f76ccfd698]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x4af) [0x55f76d36cc4f]
7: (BlueStore::_kv_sync_thread()+0x1787) [0x55f76d3a6d17]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x55f76d3d02b1]
9: /lib64/libpthread.so.0(+0x814a) [0x7fdb628b814a]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
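For comparing reports, the SST file, offset and checksums can be pulled out of the "block checksum mismatch" message. The sketch below covers the two wordings that appear in this ticket ("expected ..., got ..." and "stored = ..., computed = ..., type = ..."); the regex and the function name `parse_mismatch` are my own and are not taken from Ceph or RocksDB, so other versions may print something this does not match.

```python
import re

# Matches both message wordings seen in this ticket.
MISMATCH_RE = re.compile(
    r"block checksum mismatch: "
    r"(?:expected (?P<expected>\d+), got (?P<got>\d+)"
    r"|stored = (?P<stored>\d+), computed = (?P<computed>\d+), type = (?P<type>\d+))"
    r"\s+in (?P<file>\S+) offset (?P<offset>\d+) size (?P<size>\d+)"
)


def parse_mismatch(line):
    """Return the fields of a checksum mismatch message, or None if absent."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    # Keep only the groups that matched; everything except the file is numeric.
    return {
        name: (value if name == "file" else int(value))
        for name, value in match.groupdict().items()
        if value is not None
    }


old_style = (
    "rocksdb: submit_common error: Corruption: block checksum mismatch: "
    "expected 4021751865, got 3381438153 in db/007491.sst offset 57636 "
    "size 741633 code = 2"
)
new_style = (
    "rocksdb: submit_common error: Corruption: block checksum mismatch: "
    "stored = 3145323779, computed = 3833377101, type = 1 in db/000417.sst "
    "offset 11792861 size 3699 code = 2"
)

for sample in (old_style, new_style):
    print(parse_mismatch(sample))
```

Comparing the file and offset across several crashes can show whether the same SST block keeps failing or whether different blocks are affected.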