Bug #47530

closed

RocksDB compaction from L2 to L3 causes OSD to crash

Added by jiaxu li over 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: OSD
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When data is written to the cluster continuously through RGW, RocksDB triggers compaction. During an L2->L3 compaction, there is some probability that the OSD goes down.

The logs before the osd crash are as follows:

2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: [db/compaction_job.cc:1645] [default] [JOB 3] Compacting 1@2 + 2@3 files to L3, score 1.03
2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: [db/compaction_job.cc:1649] [default] Compaction start summary: Base version 2 Base level 2, inputs: [2732(64MB)], [1654(64MB) 1655(64MB)]
2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1600054705811182, "job": 3, "event": "compaction_started", "compaction_reason": "LevelMaxLevelSize", "files_L2": [2732], "files_L3": [1654, 1655], "score": 1.03341, "input_data_size": 203983498}
......
2020-09-14T11:38:26.211+0800 7f17ca173700 -1 bdev(0x55a3b5b3b500 /var/lib/ceph/osd/ceph-118/block.db) read_random direct_aligned_read 0x172000000~ error: (61) No data available
2020-09-14T11:38:26.220+0800 7f17ca173700 -1 /kvm3/LIUHONG/onest-8-7/onest/rpmbuild/BUILD/ceph-16.0.0-13/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_read_random(BlueFS::FileReader*, uint64_t, uint64_t, char*)' thread 7f17ca173700 time 2020-09-14T11:38:26.212628+0800
/kvm3/LIUHONG/onest-8-7/onest/rpmbuild/BUILD/ceph-16.0.0-13/src/os/bluestore/BlueFS.cc: 1899: FAILED ceph_assert(r == 0)
ceph version 16.0.0-13 (460bafde3447e1f141e1165e0f44040dd835de53) octopus (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14c) [0x55a3a91fd117]
2: (()+0x4d32df) [0x55a3a91fd2df]
3: (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, char*)+0xc03) [0x55a3a981f973]
4: (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const+0x20) [0x55a3a984c880]
5: (()+0x10b356c) [0x55a3a9ddd56c]
6: (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const+0x826) [0x55a3a9dde236]
7: (rocksdb::BlockFetcher::ReadBlockContents()+0x484) [0x55a3a9da8bf4]
8: (()+0x106b920) [0x55a3a9d95920]
9: (rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x5b4) [0x55a3a9da5644]
10: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::InitDataBlock()+0xb6) [0x55a3a9da6696]
11: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::FindKeyForward()+0x1b0) [0x55a3a9da6a70]
12: (()+0x1007ff9) [0x55a3a9d31ff9]
13: (rocksdb::MergingIterator::Next()+0x33) [0x55a3a9db78b3]
14: (rocksdb::CompactionIterator::Next()+0x132) [0x55a3a9e1c1f2]
15: (rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*)+0x78c) [0x55a3a9e2229c]
16: (rocksdb::CompactionJob::Run()+0x264) [0x55a3a9e23264]
17: (rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xce9) [0x55a3a9c9bc69]
18: (rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xc6) [0x55a3a9ca1ff6]
19: (rocksdb::DBImpl::BGWorkCompaction(void*)+0x3a) [0x55a3a9ca24da]
20: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x24a) [0x55a3a9e5fe5a]
21: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x5d) [0x55a3a9e5ffdd]
22: (()+0x11fc66f) [0x55a3a9f2666f]
23: (()+0x7dd5) [0x7f17d9344dd5]
24: (clone()+0x6d) [0x7f17d820aead]
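
For triage, the two informative lines in the log above (the compaction_started event and the bdev read error that precedes the assert) can be pulled apart with a short script. This is only a sketch: the function names and the regular expression are hypothetical helpers written for this report, not part of Ceph or RocksDB, and the errno name lookup assumes Linux errno numbering.

```python
import errno
import json
import re

# Both strings are copied verbatim from the OSD log in this report.
EVENT_JSON = (
    '{"time_micros": 1600054705811182, "job": 3, "event": "compaction_started", '
    '"compaction_reason": "LevelMaxLevelSize", "files_L2": [2732], '
    '"files_L3": [1654, 1655], "score": 1.03341, "input_data_size": 203983498}'
)
BDEV_LINE = (
    "2020-09-14T11:38:26.211+0800 7f17ca173700 -1 "
    "bdev(0x55a3b5b3b500 /var/lib/ceph/osd/ceph-118/block.db) "
    "read_random direct_aligned_read 0x172000000~ error: (61) No data available"
)

# Hypothetical pattern, fitted only to the single bdev line shown above.
BDEV_RE = re.compile(
    r"bdev\(\S+ (?P<path>\S+)\) read_random \S+ "
    r"(?P<offset>0x[0-9a-f]+)~\S* error: \((?P<errno>\d+)\) (?P<msg>.*)"
)


def summarize_compaction(event_json):
    """Return job id, reason, input files per level, and input size in MiB."""
    event = json.loads(event_json)
    inputs = {k: v for k, v in event.items() if k.startswith("files_L")}
    return {
        "job": event["job"],
        "reason": event["compaction_reason"],
        "inputs": inputs,
        "input_mib": event["input_data_size"] / 2**20,
    }


def parse_bdev_error(line):
    """Extract device path, byte offset, and errno from a bdev read error line."""
    match = BDEV_RE.search(line)
    if match is None:
        return None
    code = int(match.group("errno"))
    return {
        "path": match.group("path"),
        "offset": int(match.group("offset"), 16),
        # errno.errorcode is platform dependent; the name is meaningful on Linux.
        "errno": code,
        "errno_name": errno.errorcode.get(code, "unknown"),
        "message": match.group("msg"),
    }


if __name__ == "__main__":
    print(summarize_compaction(EVENT_JSON))
    print(parse_bdev_error(BDEV_LINE))
```

Read this way, job 3 takes one L2 file and two L3 files (about 194.5 MiB of input), and the failing read is at offset 0x172000000 (5920 MiB) of block.db. The error code 61 is what BlueFS::_read_random received before ceph_assert(r == 0) fired.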
