Project

General

Profile

Actions

Bug #52939

closed

lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()

Added by Casey Bodley over 2 years ago. Updated over 2 years ago.

Status:
Rejected
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
Yes
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

reproduces reliably in vstart just after starting s3-tests against rgw on master, at commit c19250ce3cb30a5a0409e3ebd4142dad6191642e

   -43> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs _clear_dirty_set_stable_D seq_stable 4097
   -42> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_init
   -41> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 38 size 0xae30 mtime 2021-10-14T13:52:36.809287-0400 allocated 100000 extents [1:0xc00000~100000])
   -40> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 37 size 0x1894e2a mtime 2021-10-14T13:53:48.401110-0400 allocated 1900000 extents [0:0x500000~1000000,0:0x1900000~300000,0:0x2000000~300000,0:0x2700000~200000,0:0x2d00000~100000])
   -39> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 36 size 0x10 mtime 2021-10-14T13:52:36.805830-0400 allocated 100000 extents [1:0xb00000~100000])
   -38> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 35 size 0x475 mtime 2021-10-14T13:52:36.805243-0400 allocated 100000 extents [1:0xa00000~100000])
   -37> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 33 size 0x4a6 mtime 2021-10-14T13:52:36.803329-0400 allocated 100000 extents [1:0x700000~100000])
   -36> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 28 size 0xae30 mtime 2021-10-14T13:52:32.561061-0400 allocated 100000 extents [1:0x800000~100000])
   -35> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 29 size 0x0 mtime 2021-10-14T13:52:36.815800-0400 allocated 100000 extents [1:0x900000~100000])
   -34> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 21 size 0x3a mtime 2021-10-14T13:52:29.076036-0400 allocated 100000 extents [1:0x1200000~100000])
   -33> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 24 size 0x599 mtime 2021-10-14T13:52:32.556185-0400 allocated 100000 extents [1:0x500000~100000])
   -32> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 2 size 0x0 mtime 2021-10-14T13:52:29.002203-0400 allocated 0 extents [])
   -31> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 3 size 0x25 mtime 2021-10-14T13:52:29.002664-0400 allocated 100000 extents [1:0x100000~100000])
   -30> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_file_update file(ino 32 size 0x4e2 mtime 2021-10-14T13:52:36.802331-0400 allocated 100000 extents [1:0x400000~100000])
   -29> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_create ALLOCATOR_NCB_DIR
   -28> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link ALLOCATOR_NCB_DIR/ALLOCATOR_NCB_FILE to 29
   -27> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_create db
   -26> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/000029.sst to 24
   -25> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/000035.sst to 32
   -24> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/000036.sst to 33
   -23> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/CURRENT to 36
   -22> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/IDENTITY to 3
   -21> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/LOCK to 2
   -20> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/MANIFEST-000038 to 35
   -19> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/OPTIONS-000033 to 28
   -18> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db/OPTIONS-000041 to 38
   -17> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_create db.slow
   -16> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_create db.wal
   -15> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link db.wal/000039.log to 37
   -14> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_create sharding
   -13> 2021-10-14T13:53:48.401-0400 7f3934723640 20 bluefs compact_log_async_dump_metadata_NF op_dir_link sharding/def to 21
   -12> 2021-10-14T13:53:48.401-0400 7f3934723640 10 bluefs _allocate len 0x100000 from 1
   -11> 2021-10-14T13:53:48.401-0400 7f3934723640 10 HybridAllocator allocate want 0x100000 unit 0x100000 max_alloc_size 0x100000 hint 0x0
   -10> 2021-10-14T13:53:48.401-0400 7f3934723640 20 AvlAllocator _allocate first fit=13631488 size=1048576
    -9> 2021-10-14T13:53:48.401-0400 7f3934723640 10 bluefs _pad_bl padding with 0xc38 zeros
    -8> 2021-10-14T13:53:48.401-0400 7f3934723640 10 bluefs _compact_log_async_LD_NF_D new_log_jump_to 0x100000
    -7> 2021-10-14T13:53:48.401-0400 7f3934723640  0
------------------------------------
existing dependency BlueFS::FileWriter::lock (168) -> BlueFS::File::lock (167) at:

    -6> 2021-10-14T13:53:48.409-0400 7f3934723640  0 new dependency BlueFS::File::lock (167) -> BlueFS::FileWriter::lock (168) creates a cycle at
 ceph version Development (no_version) quincy (dev)
 1: (ceph::mutex_debug_detail::mutex_debug_impl<false>::lock(bool)+0x2e) [0x5586b4f4d860]
 2: (BlueFS::_compact_log_async_LD_NF_D()+0x9ec) [0x5586b56c2b9a]
 3: (BlueFS::_maybe_compact_log_LN_NF_LD_D()+0x38) [0x5586b56c9808]
 4: (BlueFS::fsync(BlueFS::FileWriter*)+0x4f3) [0x5586b56ca0c5]
 5: (BlueRocksWritableFile::Sync()+0x11) [0x5586b56d9921]
 6: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x3f) [0x5586b5c14459]
 7: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x257) [0x5586b5e1bfab]
 8: (rocksdb::WritableFileWriter::Sync(bool)+0x192) [0x5586b5e1b96e]
 9: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x389) [0x5586b5c6862b]
 10: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x12de) [0x5586b5c642a8]
 11: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x4a) [0x5586b5c62f66]
 12: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x7c) [0x5586b5bb41c0]
 13: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0xaf) [0x5586b5bb5545]
 14: (BlueStore::_kv_sync_thread()+0x2ebd) [0x5586b55f96cd]
 15: (BlueStore::KVSyncThread::entry()+0xd) [0x5586b56331b3]
 16: (Thread::entry_wrapper()+0x3f) [0x5586b5755543]
 17: (Thread::_entry_func(void*)+0x9) [0x5586b575555b]
 18: /lib64/libpthread.so.0(+0x93f9) [0x7f394902d3f9]
 19: clone()
    -5> 2021-10-14T13:53:48.410-0400 7f3934723640  0 btw, i am holding these locks:
    -4> 2021-10-14T13:53:48.410-0400 7f3934723640  0   BlueFS::File::lock (167)
    -3> 2021-10-14T13:53:48.410-0400 7f3934723640  0   BlueFS::FileWriter::lock (168)
    -2> 2021-10-14T13:53:48.410-0400 7f3934723640  0
    -1> 2021-10-14T13:53:48.418-0400 7f3934723640 -1 ../src/common/lockdep.cc: In function 'int lockdep_will_lock(const char*, int, bool, bool)' thread 7f3934723640 time 2021-10-14T13:53:48.410701-0400
../src/common/lockdep.cc: 337: ceph_abort_msg("abort() called")
Actions

Also available in: Atom PDF