
Bug #47530

RocksDB compression at L2 to L3 causes OSD to crash

Added by jiaxu li over 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: OSD
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Continuously writing data to the cluster through RGW keeps triggering RocksDB compactions. When a compaction runs from L2 to L3, there is some probability that the OSD goes down.
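
For reference, a minimal way to reproduce the preconditions could be to keep sustained write load on the cluster and then force a full RocksDB compaction on the OSD. A hedged sketch (the pool name "testpool" is hypothetical; osd id 118 is taken from the log below; the compact command is run on the OSD's host):

    # sustained object writes for 10 minutes, keeping the objects so levels fill up
    # ("testpool" is a hypothetical pool name)
    rados bench -p testpool 600 write --no-cleanup

    # force a full RocksDB compaction via the OSD admin socket,
    # which exercises the same compaction read path as the crash
    ceph daemon osd.118 compact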

The logs before the OSD crash are as follows:

2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: [db/compaction_job.cc:1645] [default] [JOB 3] Compacting 1@2 + 2@3 files to L3, score 1.03
2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: [db/compaction_job.cc:1649] [default] Compaction start summary: Base version 2 Base level 2, inputs: [2732(64MB)], [1654(64MB) 1655(64MB)]
2020-09-14T11:38:25.810+0800 7f17ca173700 4 rocksdb: EVENT_LOG_v1

{"time_micros": 1600054705811182, "job": 3, "event": "compaction_started", "compaction_reason": "LevelMaxLevelSize", "files_L2": [2732], "files_L3": [1654, 1655], "score": 1.03341, "input_data_size": 203983498}
......
2020-09-14T11:38:26.211+0800 7f17ca173700 -1 bdev(0x55a3b5b3b500 /var/lib/ceph/osd/ceph-118/block.db) read_random direct_aligned_read 0x172000000~ error: (61) No data available
2020-09-14T11:38:26.220+0800 7f17ca173700 -1 /kvm3/LIUHONG/onest-8-7/onest/rpmbuild/BUILD/ceph-16.0.0-13/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_read_random(BlueFS::FileReader*, uint64_t, uint64_t, char*)' thread 7f17ca173700 time 2020-09-14T11:38:26.212628+0800
/kvm3/LIUHONG/onest-8-7/onest/rpmbuild/BUILD/ceph-16.0.0-13/src/os/bluestore/BlueFS.cc: 1899: FAILED ceph_assert(r == 0)
ceph version 16.0.0-13 (460bafde3447e1f141e1165e0f44040dd835de53) octopus (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14c) [0x55a3a91fd117]
2: (()+0x4d32df) [0x55a3a91fd2df]
3: (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, char*)+0xc03) [0x55a3a981f973]
4: (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const+0x20) [0x55a3a984c880]
5: (()+0x10b356c) [0x55a3a9ddd56c]
6: (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const+0x826) [0x55a3a9dde236]
7: (rocksdb::BlockFetcher::ReadBlockContents()+0x484) [0x55a3a9da8bf4]
8: (()+0x106b920) [0x55a3a9d95920]
9: (rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x5b4) [0x55a3a9da5644]
10: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::InitDataBlock()+0xb6) [0x55a3a9da6696]
11: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::FindKeyForward()+0x1b0) [0x55a3a9da6a70]
12: (()+0x1007ff9) [0x55a3a9d31ff9]
13: (rocksdb::MergingIterator::Next()+0x33) [0x55a3a9db78b3]
14: (rocksdb::CompactionIterator::Next()+0x132) [0x55a3a9e1c1f2]
15: (rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*)+0x78c) [0x55a3a9e2229c]
16: (rocksdb::CompactionJob::Run()+0x264) [0x55a3a9e23264]
17: (rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xce9) [0x55a3a9c9bc69]
18: (rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xc6) [0x55a3a9ca1ff6]
19: (rocksdb::DBImpl::BGWorkCompaction(void*)+0x3a) [0x55a3a9ca24da]
20: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x24a) [0x55a3a9e5fe5a]
21: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x5d) [0x55a3a9e5ffdd]
22: (()+0x11fc66f) [0x55a3a9f2666f]
23: (()+0x7dd5) [0x7f17d9344dd5]
24: (clone()+0x6d) [0x7f17d820aead]
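
The abort itself is the ceph_assert(r == 0) at BlueFS.cc:1899 in the trace above: the direct read from block.db returned an error, so BlueFS could not hand RocksDB the block it asked for. Once the OSD is stopped, the store can be checked offline; a hedged sketch (unit name and path assume a standard deployment, with the OSD id taken from the log above):

    # stop the OSD, then run a read-only consistency check of its BlueStore/BlueFS state
    systemctl stop ceph-osd@118
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-118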

History

#1 Updated by jiaxu li over 3 years ago

The block device is deployed on a bare HDD, and the WAL and DB devices are deployed on SSD partitions.
The block device is 10 TB; the WAL and DB partitions are each 220 GB.
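
A quick way to confirm which physical devices back those paths could be the following (osd id 118 is taken from the crash log; block.wal exists only when a separate WAL device is configured, and the exact metadata fields vary by release):

    # resolve the data dir symlinks to the underlying devices
    ls -l /var/lib/ceph/osd/ceph-118/block /var/lib/ceph/osd/ceph-118/block.db /var/lib/ceph/osd/ceph-118/block.wal

    # or ask the cluster for the OSD's recorded device metadata
    ceph osd metadata 118 | grep -i -E 'devices|bluefs'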

#2 Updated by Igor Fedotov over 3 years ago

@jiaxu li - do you mean compaction, not compression, in the caption?

Also, I'm wondering whether your node reports any hardware errors via dmesg. The following line indicates that the root cause is more likely at the H/W level:

2020-09-14T11:38:26.211+0800 7f17ca173700 -1 bdev(0x55a3b5b3b500 /var/lib/ceph/osd/ceph-118/block.db) read_random direct_aligned_read 0x172000000~ error: (61) No data available

#3 Updated by jiaxu li over 3 years ago

@Igor Fedotov yes, the title should say compaction.
There are some hardware errors, and they are on the disk where the WAL/DB is located:
  dmesg | grep -i error
    [ 6.208199] ERST: Error Record Serialization Table (ERST) support is initialized.
    [ 6.320882] RAS: Correctable Errors collector initialized.
    [ 7.779689] GPT: Use GNU Parted to correct GPT errors.
    [ 12.290177] GPT: Use GNU Parted to correct GPT errors.
    [ 12.628098] GPT: Use GNU Parted to correct GPT errors.
    [489518.305862] GPT: Use GNU Parted to correct GPT errors.
    [2102038.658163] sd 16:0:11:0: [sdl] tag#0 Sense Key : Medium Error [current]
    [2102038.658166] sd 16:0:11:0: [sdl] tag#0 Add. Sense: Unrecovered read error
    [2102038.658174] print_req_error: critical medium error, dev sdl, sector 2875184
    [2103720.214626] sd 16:0:11:0: [sdl] tag#0 Sense Key : Medium Error [current]
    [2103720.214629] sd 16:0:11:0: [sdl] tag#0 Add. Sense: Unrecovered read error
    [2103720.214635] print_req_error: critical medium error, dev sdl, sector 2875184
    ......
    [5794566.897877] sd 16:0:11:0: [sdl] tag#0 Sense Key : Medium Error [current]
    [5794566.897881] sd 16:0:11:0: [sdl] tag#0 Add. Sense: Unrecovered read error
    [5794566.897887] print_req_error: critical medium error, dev sdl, sector 1405096512
    [5794567.803600] sd 16:0:11:0: [sdl] tag#0 Sense Key : Medium Error [current]
    [5794567.803603] sd 16:0:11:0: [sdl] tag#0 Add. Sense: Unrecovered read error
    [5794567.803609] print_req_error: critical medium error, dev sdl, sector 671093224
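
Given the repeated medium errors on sdl, the drive's own SMART counters are worth checking too; a hedged sketch using smartmontools (/dev/sdl is taken from the dmesg output above):

    # dump SMART health, error log, and media error counters for the suspect disk
    smartctl -a /dev/sdl

    # narrow the output to the error-related counters
    smartctl -a /dev/sdl | grep -i -E 'error|pending|reallocat'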

#4 Updated by Greg Farnum almost 3 years ago

  • Status changed from New to Resolved
