Bug #53907

BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)

Added by Vikhyat Umrao 8 months ago. Updated 10 days ago.

Status: Resolved
Priority: Immediate
Assignee:
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2022-01-18T00:11:09.058+0000 7f904d00d700  4 rocksdb: (Original Log Time 2022/01/18-00:11:09.059551) [db/memtable_list.cc:631] [L] Level-0 commit table #2027: memtable #1 done
2022-01-18T00:11:09.058+0000 7f904d00d700  4 rocksdb: (Original Log Time 2022/01/18-00:11:09.059566) EVENT_LOG_v1 {"time_micros": 1642464669059560, "job": 635, "event": "flush_finished", "output_compression": "NoCompression", "lsm_state": [2, 1, 0, 0, 0, 0, 0], "immutable_memtables": 0}
2022-01-18T00:11:09.058+0000 7f904d00d700  4 rocksdb: (Original Log Time 2022/01/18-00:11:09.059592) [db/db_impl/db_impl_compaction_flush.cc:235] [L] Level summary: files[2 1 0 0 0 0 0] max score 0.50

2022-01-18T00:11:09.058+0000 7f904d00d700  4 rocksdb: [db/db_impl/db_impl_files.cc:420] [JOB 635] Try to delete WAL files size 409400591, prev total WAL file size 414051513, number of live WAL files 3.

2022-01-18T00:11:09.145+0000 7f90485f6700 -1 /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: In function 'virtual void RocksDBBlueFSVolumeSelector::sub_usage(void*, const bluefs_fnode_t&)' thread 7f904d00d700 time 2022-01-18T00:11:09.085371+0000
/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)

 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x561a196bab8e]
 2: /usr/bin/ceph-osd(+0x5d5daf) [0x561a196badaf]
 3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x16a) [0x561a19d7ecca]
 4: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x735) [0x561a19e1cb45]
 5: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa9) [0x561a19e1d009]
 6: (BlueFS::fsync(BlueFS::FileWriter*)+0x18e) [0x561a19e386de]
 7: (BlueRocksWritableFile::Sync()+0x18) [0x561a19e48fb8]
 8: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x561a1a36c74f]
 9: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x662) [0x561a1a49cf22]
 10: (rocksdb::WritableFileWriter::Sync(bool)+0xf8) [0x561a1a49e8e8]
 11: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x341) [0x561a1a383701]
 12: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1c04) [0x561a1a38b454]
 13: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x561a1a38b5a1]
 14: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x561a1a325f84]
 15: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x561a1a32698a]
 16: (BlueStore::_kv_sync_thread()+0x3530) [0x561a19d7d390]
 17: (BlueStore::KVSyncThread::entry()+0x11) [0x561a19dac0b1]
 18: /lib64/libpthread.so.0(+0x817a) [0x7f905f01817a]
 19: clone()

2022-01-18T00:11:09.146+0000 7f904d00d700 -1 /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: In function 'virtual void RocksDBBlueFSVolumeSelector::sub_usage(void*, const bluefs_fnode_t&)' thread 7f904d00d700 time 2022-01-18T00:11:09.085371+0000
/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)

 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x561a196bab8e]
 2: /usr/bin/ceph-osd(+0x5d5daf) [0x561a196badaf]
 3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x16a) [0x561a19d7ecca]
 4: (BlueFS::_drop_link_D(boost::intrusive_ptr<BlueFS::File>)+0x5fd) [0x561a19e1514d]
 5: (BlueFS::unlink(std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >)+0x706) [0x561a19e23ed6]
 6: (BlueRocksEnv::DeleteFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x47) [0x561a19e47167]
 7: (rocksdb::DeleteDBFile(rocksdb::ImmutableDBOptions const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool)+0x9d) [0x561a1a492abd]
 8: (rocksdb::DBImpl::DeleteObsoleteFileImpl(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileType, unsigned long)+0x116) [0x561a1a3a37a6]
 9: (rocksdb::DBImpl::PurgeObsoleteFiles(rocksdb::JobContext&, bool)+0x14c4) [0x561a1a3a6974]
 10: (rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority)+0x16b) [0x561a1a39dd8b]
 11: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x24a) [0x561a1a56243a]
 12: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x5d) [0x561a1a5625dd]
 13: /lib64/libstdc++.so.6(+0xc2ba3) [0x7f905e66bba3]
 14: /lib64/libpthread.so.0(+0x817a) [0x7f905f01817a]
 15: clone()

2022-01-18T00:11:29.158+0000 7f42f7cf7240  0 set uid:gid to 167:167 (ceph:ceph)
2022-01-18T00:11:29.158+0000 7f42f7cf7240  0 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev), process ceph-osd, pid 8
2022-01-18T00:11:29.158+0000 7f42f7cf7240  0 pidfile_write: ignore empty --pid-file
2022-01-18T00:11:29.160+0000 7f42f7cf7240  1 bdev(0x55e9a1703400 /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
2022-01-18T00:11:29.161+0000 7f42f7cf7240  1 bdev(0x55e9a1703400 /var/lib/ceph/osd/ceph-4/block) open size 1999839952896 (0x1d19fc00000, 1.8 TiB) block_size 4096 (4 KiB) rotational discard not supported
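
Note on the assert: the sketch below is a simplified model of the kind of bookkeeping that a check like `cur >= p.length` guards. It is not the Ceph source. `Extent`, `UsageTracker`, and the `bdev` index are hypothetical names, and it is an assumption that `cur` is a per-device byte counter and `p` is one extent of the file whose usage is being subtracted. The model reports the failed condition by returning false instead of aborting.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one extent of a file: the device it lives on
// and the number of bytes it covers.
struct Extent {
  unsigned bdev;
  uint64_t length;
};

// Hypothetical per-device usage tracker (assumption: not the real
// RocksDBBlueFSVolumeSelector). Bytes are added when extents are accounted
// to a file and subtracted when that accounting is released.
class UsageTracker {
public:
  explicit UsageTracker(unsigned devices) : used_(devices, 0) {}

  void add_usage(const std::vector<Extent>& extents) {
    for (const auto& e : extents) {
      used_.at(e.bdev) += e.length;
    }
  }

  // Subtracts each extent from its device counter. Returns false as soon as
  // a counter is smaller than the extent being subtracted, which is the
  // situation where a check of the form `cur >= p.length` would fail.
  // Extents processed before the failing one stay subtracted.
  bool sub_usage(const std::vector<Extent>& extents) {
    for (const auto& e : extents) {
      uint64_t& cur = used_.at(e.bdev);
      if (cur < e.length) {
        return false;
      }
      cur -= e.length;
    }
    return true;
  }

  uint64_t used(unsigned bdev) const { return used_.at(bdev); }

private:
  std::vector<uint64_t> used_;
};
```

In this model, subtracting the same extents twice without an `add_usage` in between makes the second `sub_usage` call fail. The two backtraces above both show `sub_usage` being reached, once from `BlueFS::_flush_range_F` and once from `BlueFS::_drop_link_D`; whether unsynchronized or duplicated accounting between such paths is the actual cause is not stated in this report.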


Related issues

Duplicated by bluestore - Bug #53906: BlueStore.h: 4158: FAILED ceph_assert(cur >= fnode.size) (Duplicate)
Copied to bluestore - Backport #54209: quincy: BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length) (Resolved)

History

#1 Updated by Vikhyat Umrao 8 months ago

- After hitting this crash, systemd restarted the OSD container pod, and the OSD has been running fine since the restart.

#2 Updated by Vikhyat Umrao 8 months ago

OSD.95 in the same cluster also hit the same assert. systemd restarted it, and it has been running fine since.


2022-01-17T22:56:40.077+0000 7f03e4a29700 -1 /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: In function 'virtual void RocksDBBlueFSVolumeSelector::sub_usage(void*, const bluefs_fnode_t&)' thread 7f03e4a29700 time 2022-01-17T22:56:40.066674+0000
/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)

 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x561ec5505b8e]
 2: /usr/bin/ceph-osd(+0x5d5daf) [0x561ec5505daf]
 3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x16a) [0x561ec5bc9cca]
 4: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x735) [0x561ec5c67b45]
 5: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa9) [0x561ec5c68009]
 6: (BlueFS::fsync(BlueFS::FileWriter*)+0x18e) [0x561ec5c836de]
 7: (BlueRocksWritableFile::Sync()+0x18) [0x561ec5c93fb8]
 8: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x561ec61b774f]
 9: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x662) [0x561ec62e7f22]
 10: (rocksdb::WritableFileWriter::Sync(bool)+0xf8) [0x561ec62e98e8]
 11: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x341) [0x561ec61ce701]
 12: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1c04) [0x561ec61d6454]
 13: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x561ec61d65a1]
 14: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x561ec6170f84]
 15: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x561ec617198a]
 16: (BlueStore::_kv_sync_thread()+0x3530) [0x561ec5bc8390]
 17: (BlueStore::KVSyncThread::entry()+0x11) [0x561ec5bf70b1]
 18: /lib64/libpthread.so.0(+0x817a) [0x7f03fb44b17a]
 19: clone()

2022-01-17T22:57:03.972+0000 7f33a0448240  0 set uid:gid to 167:167 (ceph:ceph)
2022-01-17T22:57:03.972+0000 7f33a0448240  0 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev), process ceph-osd, pid 8
2022-01-17T22:57:03.972+0000 7f33a0448240  0 pidfile_write: ignore empty --pid-file
2022-01-17T22:57:03.974+0000 7f33a0448240  1 bdev(0x563d0b97d400 /var/lib/ceph/osd/ceph-95/block) open path /var/lib/ceph/osd/ceph-95/block
2022-01-17T22:57:03.975+0000 7f33a0448240  1 bdev(0x563d0b97d400 /var/lib/ceph/osd/ceph-95/block) open size 1999839952896 (0x1d19fc00000, 1.8 TiB) block_size 4096 (4 KiB) rotational discard not supported

#3 Updated by Vikhyat Umrao 8 months ago

- OSD.0 also had a similar crash.

2022-01-17T23:40:22.136+0000 7f32423b0700 -1 /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: In function 'virtual void RocksDBBlueFSVolumeSelector::sub_usage(void*, const bluefs_fnode_t&)' thread 7f32423b0700 time 2022-01-17T23:40:22.040838+0000
/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10229-g7e035110/rpm/el8/BUILD/ceph-17.0.0-10229-g7e035110/src/os/bluestore/BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)

 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x55ffca66cb8e]
 2: /usr/bin/ceph-osd(+0x5d5daf) [0x55ffca66cdaf]
 3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x16a) [0x55ffcad30cca]
 4: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x735) [0x55ffcadceb45]
 5: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa9) [0x55ffcadcf009]
 6: (BlueFS::fsync(BlueFS::FileWriter*)+0x18e) [0x55ffcadea6de]
 7: (BlueRocksWritableFile::Sync()+0x18) [0x55ffcadfafb8]
 8: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55ffcb31e74f]
 9: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x662) [0x55ffcb44ef22]
 10: (rocksdb::WritableFileWriter::Sync(bool)+0xf8) [0x55ffcb4508e8]
 11: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x341) [0x55ffcb335701]
 12: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1c04) [0x55ffcb33d454]
 13: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x55ffcb33d5a1]
 14: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x55ffcb2d7f84]
 15: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x55ffcb2d898a]
 16: (BlueStore::_kv_sync_thread()+0x3530) [0x55ffcad2f390]
 17: (BlueStore::KVSyncThread::entry()+0x11) [0x55ffcad5e0b1]
 18: /lib64/libpthread.so.0(+0x817a) [0x7f3258dd217a]
 19: clone()

2022-01-17T23:40:22.230+0000 7f32423b0700 -1 *** Caught signal (Aborted) **
 in thread 7f32423b0700 thread_name:bstore_kv_sync

 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
 1: /lib64/libpthread.so.0(+0x12c20) [0x7f3258ddcc20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b0) [0x55ffca66cbec]
 5: /usr/bin/ceph-osd(+0x5d5daf) [0x55ffca66cdaf]
 6: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x16a) [0x55ffcad30cca]
 7: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x735) [0x55ffcadceb45]
 8: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa9) [0x55ffcadcf009]
 9: (BlueFS::fsync(BlueFS::FileWriter*)+0x18e) [0x55ffcadea6de]
 10: (BlueRocksWritableFile::Sync()+0x18) [0x55ffcadfafb8]
 11: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55ffcb31e74f]
 12: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x662) [0x55ffcb44ef22]
 13: (rocksdb::WritableFileWriter::Sync(bool)+0xf8) [0x55ffcb4508e8]
 14: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x341) [0x55ffcb335701]
 15: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1c04) [0x55ffcb33d454]
 16: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x55ffcb33d5a1]
 17: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x55ffcb2d7f84]
 18: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x55ffcb2d898a]
 19: (BlueStore::_kv_sync_thread()+0x3530) [0x55ffcad2f390]
 20: (BlueStore::KVSyncThread::entry()+0x11) [0x55ffcad5e0b1]
 21: /lib64/libpthread.so.0(+0x817a) [0x7f3258dd217a]
 22: clone()

2022-01-17T23:40:42.537+0000 7fa18eaa0240  0 set uid:gid to 167:167 (ceph:ceph)
2022-01-17T23:40:42.537+0000 7fa18eaa0240  0 ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev), process ceph-osd, pid 7
2022-01-17T23:40:42.537+0000 7fa18eaa0240  0 pidfile_write: ignore empty --pid-file
2022-01-17T23:40:42.555+0000 7fa18eaa0240  1 bdev(0x559239489400 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2022-01-17T23:40:42.556+0000 7fa18eaa0240  1 bdev(0x559239489400 /var/lib/ceph/osd/ceph-0/block) open size 1999839952896 (0x1d19fc00000, 1.8 TiB) block_size 4096 (4 KiB) rotational discard not supported
2022-01-17T23:40:42.556+0000 7fa18eaa0240  1 bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes cache_size 1073741824 meta 0.45 kv 0.45 data 0.06
2022-01-17T23:40:42.556+0000 7fa18eaa0240  1 bdev(0x559239488c00 /var/lib/ceph/osd/ceph-0/block.db) open path /var/lib/ceph/osd/ceph-0/block.db

#4 Updated by Neha Ojha 8 months ago

  • Assignee set to Adam Kupczyk

#5 Updated by Neha Ojha 8 months ago

  • Duplicated by Bug #53906: BlueStore.h: 4158: FAILED ceph_assert(cur >= fnode.size) added

#6 Updated by Neha Ojha 8 months ago

  • Priority changed from Normal to Immediate

#7 Updated by Adam Kupczyk 8 months ago

  • Pull request ID set to 44713

#8 Updated by Igor Fedotov 8 months ago

  • Status changed from New to Fix Under Review

#9 Updated by Neha Ojha 8 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to quincy

#10 Updated by Backport Bot 8 months ago

  • Copied to Backport #54209: quincy: BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length) added

#11 Updated by Backport Bot about 2 months ago

  • Tags set to backport_processed

#12 Updated by Igor Fedotov 10 days ago

  • Status changed from Pending Backport to Resolved
