Bug #51217
BlueFS::_flush_range assert(h->file->fnode.ino != 1)
Status:
In Progress
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
[Version]
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
[Operation]
1. Set the following args:
bluefs_alloc_size = 8192
bluefs_max_prefetch = 8192
bluefs_min_log_runway = 8192
bluefs_max_log_runway = 16384
bluefs_log_compact_min_size = 21474836480
bluestore_min_alloc_size = 8192
2. Create an OSD and a pool, then write data.
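The settings above can be applied through the OSD section of ceph.conf before the OSD is created; a sketch (values copied from the report, placement in `[osd]` assumed):

```ini
[osd]
bluefs_alloc_size = 8192
bluefs_max_prefetch = 8192
bluefs_min_log_runway = 8192
bluefs_max_log_runway = 16384
bluefs_log_compact_min_size = 21474836480
bluestore_min_alloc_size = 8192
```

The small log-runway values combined with a very large compaction threshold keep the BlueFS log on a tight allocation budget, which is presumably what makes the issue easier to hit.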
[Appearance]
1. This is the first backtrace, which looks similar to https://tracker.ceph.com/issues/45519:
2021-06-15 03:17:11.951251 7f1d6d49c700 -1 /work/Product/rpmbuild/BUILD/ceph-12.2.12/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f1d6d49c700 time 2021-06-15 03:17:11.944069
/work/Product/rpmbuild/BUILD/ceph-12.2.12/src/os/bluestore/BlueFS.cc: 1548: FAILED assert(h->file->fnode.ino != 1)
 ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55754b705a50]
 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1d89) [0x55754b683d79]
 3: (BlueFS::_flush(BlueFS::FileWriter*, bool)+0x188) [0x55754b684138]
 4: (BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, unsigned long, unsigned long)+0x796) [0x55754b684d46]
 5: (BlueFS::_fsync(BlueFS::FileWriter*, std::unique_lock<std::mutex>&)+0x2a1) [0x55754b6867f1]
 6: (BlueRocksWritableFile::Sync()+0x63) [0x55754b69fdc3]
 7: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x149) [0x55754ba88e69]
 8: (rocksdb::WritableFileWriter::Sync(bool)+0xe8) [0x55754ba89b38]
 9: (rocksdb::DBImpl::WriteToWAL(rocksdb::autovector<rocksdb::WriteThread::Writer*, 8ul> const&, rocksdb::log::Writer*, bool, bool, unsigned long)+0x41a) [0x55754bad569a]
 10: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool)+0x94b) [0x55754bad627b]
 11: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x27) [0x55754bad7247]
 12: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0xcf) [0x55754b617d0f]
 13: (BlueStore::_kv_sync_thread()+0x1c6f) [0x55754b5ac93f]
 14: (BlueStore::KVSyncThread::entry()+0xd) [0x55754b5f540d]
 15: (()+0x7e25) [0x7f1d7e995e25]
 16: (clone()+0x6d) [0x7f1d7da8934d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2. After restarting the OSD, this is the second backtrace, which seems related to https://github.com/ceph/ceph/pull/35776 and https://github.com/ceph/ceph/pull/35473:
2021-06-15 03:17:32.631234 7f5c09e52d00  1 bluefs mount
2021-06-15 03:17:33.381762 7f5c09e52d00 -1 *** Caught signal (Segmentation fault) **
 in thread 7f5c09e52d00 thread_name:ceph-osd
 ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
 1: (()+0xa64d91) [0x557527796d91]
 2: (()+0xf5e0) [0x7f5c074995e0]
 3: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::list*, char*)+0x4e2) [0x55752774b752]
 4: (BlueFS::_replay(bool)+0x48d) [0x55752775edfd]
 5: (BlueFS::mount()+0x1d4) [0x5575277629e4]
 6: (BlueStore::_open_db(bool)+0x1847) [0x557527673b67]
 7: (BlueStore::_mount(bool)+0x40e) [0x5575276a89be]
 8: (OSD::init()+0x3bd) [0x55752724f9dd]
 9: (main()+0x2d07) [0x557527152dd7]
 10: (__libc_start_main()+0xf5) [0x7f5c064aec05]
 11: (()+0x4c0ca3) [0x5575271f2ca3]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
My goal is to reproduce problem 2 and then analyze its root cause, but I keep running into problem 1 first.