Bug #43353

BlueFS files read and written at the same time

Added by Adam Kupczyk 2 months ago. Updated about 2 months ago.

Status:
New
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

A heavily loaded BlueStore with RocksDB fails to compact.
It fails the check that guards against flushing (writing) a file while the same file is open for reading.

Assert:
/work/adam/ceph-4/src/os/bluestore/BlueFS.cc: 2532: FAILED ceph_assert(h->file->num_readers.load() == 0)

Callstack:
#6 0x000055bff8ea90fb in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>,
func=<optimized out>) at /work/adam/ceph-4/src/common/assert.cc:73
#7 0x000055bff8ea927a in ceph::__ceph_assert_fail (ctx=...) at /work/adam/ceph-4/src/common/assert.cc:78
#8 0x000055bff94d18e1 in BlueFS::_flush_range (this=this@entry=0x55c004590800, h=h@entry=0x55c0049fadc0,
offset=offset@entry=741911, length=length@entry=192) at /work/adam/ceph-4/src/os/bluestore/BlueFS.cc:2532
#9 0x000055bff94d1a4b in BlueFS::_flush (this=this@entry=0x55c004590800, h=h@entry=0x55c0049fadc0, force=force@entry=true)
at /work/adam/ceph-4/src/os/bluestore/BlueFS.cc:2782
#10 0x000055bff94d4c07 in BlueFS::_fsync (this=this@entry=0x55c004590800, h=h@entry=0x55c0049fadc0, l=...)
at /work/adam/ceph-4/src/os/bluestore/BlueFS.cc:2831
#11 0x000055bff94f49c3 in fsync (h=0x55c0049fadc0, this=0x55c004590800) at /work/adam/ceph-4/src/os/bluestore/BlueFS.h:557
#12 BlueRocksWritableFile::Sync (this=<optimized out>) at /work/adam/ceph-4/src/os/bluestore/BlueRocksEnv.cc:220
#13 0x000055bff9ad4151 in rocksdb::WritableFileWriter::SyncInternal (this=this@entry=0x55c004d8eaa0,
use_fsync=use_fsync@entry=false) at /work/adam/ceph-4/src/rocksdb/util/file_reader_writer.cc:426
#14 0x000055bff9ad56f8 in rocksdb::WritableFileWriter::Sync (this=this@entry=0x55c004d8eaa0, use_fsync=<optimized out>)
at /work/adam/ceph-4/src/rocksdb/util/file_reader_writer.cc:395
#15 0x000055bff9adb0d7 in rocksdb::SyncManifest (env=<optimized out>, db_options=0x55c0039b6a08, file=0x55c004d8eaa0)
at /work/adam/ceph-4/src/rocksdb/util/filename.cc:407
#16 0x000055bff9a244ad in rocksdb::VersionSet::ProcessManifestWrites (this=this@entry=0x55c003989900, writers=...,
mu=0x55c0039b6c60, db_directory=db_directory@entry=0x55c005fc3060, new_descriptor_log=<optimized out>,
new_descriptor_log@entry=false, new_cf_options=0x0) at /work/adam/ceph-4/src/rocksdb/db/version_set.cc:3090
#17 0x000055bff9a25538 in rocksdb::VersionSet::LogAndApply (this=0x55c003989900, column_family_datas=...,
mutable_cf_options_list=..., edit_lists=..., mu=<optimized out>, db_directory=0x55c005fc3060, new_descriptor_log=false,
new_cf_options=0x0) at /work/adam/ceph-4/src/rocksdb/db/version_set.cc:3310
#18 0x000055bff99618ec in rocksdb::VersionSet::LogAndApply (this=0x55c003989900, column_family_data=<optimized out>,
mutable_cf_options=..., edit=<optimized out>, mu=0x55c0039b6c60, db_directory=0x55c005fc3060, new_descriptor_log=false,
column_family_options=0x0) at /work/adam/ceph-4/src/rocksdb/db/version_set.h:774
#19 0x000055bff9979da0 in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0x55c0039b6800,
made_progress=made_progress@entry=0x7f2a74e35e86, job_context=job_context@entry=0x7f2a74e35ea0,
log_buffer=log_buffer@entry=0x7f2a74e36070, prepicked_compaction=prepicked_compaction@entry=0x0,
thread_pri=<optimized out>) at /work/adam/ceph-4/src/rocksdb/db/db_impl_compaction_flush.cc:2542
#20 0x000055bff997fc06 in rocksdb::DBImpl::BackgroundCallCompaction (this=this@entry=0x55c0039b6800,
prepicked_compaction=prepicked_compaction@entry=0x0, bg_thread_pri=bg_thread_pri@entry=rocksdb::Env::LOW)
at /work/adam/ceph-4/src/rocksdb/db/db_impl_compaction_flush.cc:2192
#21 0x000055bff99800ba in rocksdb::DBImpl::BGWorkCompaction (arg=<optimized out>)
at /work/adam/ceph-4/src/rocksdb/db/db_impl_compaction_flush.cc:1972
#22 0x000055bff9b5d9b4 in operator() (this=0x7f2a74e369f0)
at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_function.h:706
#23 rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x55c0045bad20, thread_id=thread_id@entry=0)
at /work/adam/ceph-4/src/rocksdb/util/threadpool_imp.cc:265
#24 0x000055bff9b5db4d in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x55c004d11330)
at /work/adam/ceph-4/src/rocksdb/util/threadpool_imp.cc:306
#25 0x000055bff9c15f9f in execute_native_thread_routine ()
#26 0x00007f2a83e5ce25 in start_thread (arg=0x7f2a74e39700) at pthread_create.c:308
#27 0x00007f2a82ec434d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
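
For readers unfamiliar with the check, the condition in the failed assert can be modelled roughly as follows. This is a minimal sketch, not the BlueFS code: `FileModel`, `WriterModel` and `flush_allowed` are hypothetical names introduced only for illustration, and only the `num_readers` counter from the assert message is taken from the report.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a BlueFS file; only the reader counter
// named in the assert message is modelled here.
struct FileModel {
  std::atomic<int> num_readers{0};
};

// Hypothetical stand-in for a writer handle that points at a file.
struct WriterModel {
  FileModel* file;
};

// Mirrors the asserted condition: a flush is considered safe only
// when nobody currently holds the file open for reading.
bool flush_allowed(const WriterModel& h) {
  return h.file->num_readers.load() == 0;
}
```

In this model, a flush requested while any reader is registered on the same file is exactly the situation that the assert in `_flush_range` rejects.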

History

#1 Updated by Adam Kupczyk about 2 months ago

When you start operating on the DB by invoking ListColumnFamilies, a background compaction may start before ListColumnFamilies has finished.
This compaction creates a new MANIFEST file from the existing one and keeps appending to it.
In the meantime, ListColumnFamilies attempts to read the newest MANIFEST, which triggers the assert.

I assume this is unintended behavior on RocksDB's part, and that it went unnoticed only because reading and writing the same file at the same time causes no problem on typical filesystems.
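
The interleaving described in this comment can be replayed with a small event trace. This is only a sketch under stated assumptions: the `Event` enum and `first_violation` are hypothetical helpers, and the trace abstracts the reader (ListColumnFamilies reading MANIFEST) and the writer (compaction syncing MANIFEST) into three event kinds on a single file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event kinds for a single file (here: the MANIFEST).
enum class Event { OpenRead, CloseRead, SyncWrite };

// Returns the index of the first SyncWrite that happens while at least
// one reader is open, or -1 if the trace never violates the rule.
int first_violation(const std::vector<Event>& trace) {
  int readers = 0;
  for (std::size_t i = 0; i < trace.size(); ++i) {
    switch (trace[i]) {
      case Event::OpenRead:
        ++readers;
        break;
      case Event::CloseRead:
        assert(readers > 0);  // a close must match an earlier open
        --readers;
        break;
      case Event::SyncWrite:
        if (readers != 0) return static_cast<int>(i);
        break;
    }
  }
  return -1;
}
```

A trace of `OpenRead, SyncWrite, CloseRead` corresponds to the reported case, where the sync arrives while the reader is still open. A trace in which the reader closes before the sync passes the check.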