I've finally reproduced this behavior (I hope).
Our staging cluster:
3 nodes with 22 OSDs on SSDs,
Kernel 4.15.0-32-generic #35~16.04.1-Ubuntu
The cluster has old-client support enabled:
require_min_compat_client hammer
min_compat_client hammer
require_osd_release luminous
and CRUSH tunables are set to firefly.
~# ceph osd crush show-tunables
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"chooseleaf_stable": 0,
"straw_calc_version": 1,
"allowed_bucket_algs": 22,
"profile": "firefly",
"optimal_tunables": 0,
"legacy_tunables": 0,
"minimum_required_version": "hammer",
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"has_v2_rules": 0,
"require_feature_tunables3": 1,
"has_v3_rules": 0,
"has_v4_buckets": 1,
"require_feature_tunables5": 0,
"has_v5_rules": 0
}
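(For reference, a compat/tunables state like this is normally configured with the standard commands below; I'm not claiming these were the exact commands run on this cluster:)
~# ceph osd set-require-min-compat-client hammer
~# ceph osd crush tunables firefly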
Ceph version (clients, OSDs, and monitors):
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
I run fio from 22 virtual machines with this command line:
fio --name=test --ioengine=libaio --iodepth=1 --direct=1 --filename=/root/fiofile --filesize=5G --runtime=120 --blocksize=4k --rw=randwrite
When it works normally, fio shows 4K IOPS from every virtual machine, which is the QEMU limit we've set.
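(The 4K IOPS cap is enforced on the hypervisor side; for illustration, a per-disk limit like that can be applied with something like the following, where the domain and device names are only examples:)
~# virsh blkdeviotune <domain> vda --total-iops-sec 4000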
Roughly half the time it gets stuck. "sd 2:0:0:0: [sda] tag#1 abort" messages can be found in the VM's dmesg, and iostat -xtd 1 shows:
12/12/2018 12:47:23 PM
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 6.06 0.00 0.00 0.00 0.00 101.01
When this happens, I mark all OSDs on one host down; for some reason, the OSDs that get stuck are always on another host. After marking all OSDs on the first host down, I wait a while and check ceph health detail:
~# ceph health detail
HEALTH_WARN noout,nodeep-scrub flag(s) set; Reduced data availability: 35 pgs inactive, 35 pgs peering; 40482 slow requests are blocked > 32 sec. Implicated osds 32,34,35,38,39,40,41,42,43,44,46,48,52,74,76,77,78,79,80,81,89,92
OSDMAP_FLAGS noout,nodeep-scrub flag(s) set
PG_AVAILABILITY Reduced data availability: 35 pgs inactive, 35 pgs peering
pg 50.15 is stuck peering for 252.432610, current state peering, last acting [89,33,52]
pg 50.3c is stuck peering for 256.026446, current state peering, last acting [81,32,52]
pg 50.4d is stuck peering for 273.949284, current state peering, last acting [74,52,30]
pg 50.92 is stuck peering for 243.807450, current state peering, last acting [78,36,52]
pg 50.bb is stuck peering for 254.226990, current state peering, last acting [32,96,52]
pg 50.c1 is stuck peering for 224.870375, current state peering, last acting [41,52,93]
pg 50.d3 is stuck peering for 250.713368, current state peering, last acting [79,34,52]
pg 50.f0 is stuck peering for 226.680818, current state peering, last acting [41,52,80]
pg 50.106 is stuck peering for 253.985154, current state peering, last acting [92,52,33]
pg 50.130 is stuck peering for 203.517863, current state peering, last acting [48,52,75]
pg 50.137 is stuck peering for 206.796044, current state peering, last acting [46,52,92]
pg 50.149 is stuck peering for 226.680859, current state peering, last acting [41,52,74]
pg 50.161 is stuck peering for 224.242458, current state peering, last acting [42,52,90]
pg 50.169 is stuck peering for 235.408025, current state peering, last acting [38,79,52]
pg 50.183 is stuck peering for 234.849956, current state peering, last acting [38,52,76]
pg 50.1ee is stuck peering for 261.064377, current state peering, last acting [80,31,52]
pg 50.20a is stuck peering for 250.713153, current state peering, last acting [78,34,52]
pg 50.225 is stuck peering for 248.652645, current state peering, last acting [34,91,52]
pg 50.246 is stuck peering for 217.827022, current state peering, last acting [44,95,52]
pg 50.28b is stuck peering for 219.954617, current state peering, last acting [43,91,52]
pg 50.294 is stuck peering for 232.881190, current state peering, last acting [39,52,78]
pg 50.2ae is stuck peering for 230.378480, current state peering, last acting [40,75,52]
pg 50.2cd is stuck peering for 243.839058, current state peering, last acting [35,52,74]
pg 50.2d1 is stuck peering for 248.652355, current state peering, last acting [34,75,52]
pg 50.2f3 is stuck peering for 226.680902, current state peering, last acting [41,83,52]
pg 50.302 is stuck peering for 243.838994, current state peering, last acting [35,52,94]
pg 50.309 is stuck peering for 226.681108, current state peering, last acting [41,84,52]
pg 50.311 is stuck peering for 256.257842, current state peering, last acting [76,52,32]
pg 50.313 is stuck peering for 203.517920, current state peering, last acting [48,52,80]
pg 50.366 is stuck peering for 224.242326, current state peering, last acting [42,86,52]
pg 50.377 is stuck peering for 248.652813, current state peering, last acting [34,52,94]
pg 50.39d is stuck peering for 261.048597, current state peering, last acting [77,31,52]
pg 50.3b4 is stuck peering for 254.089555, current state peering, last acting [80,33,52]
pg 50.3cc is stuck peering for 203.518124, current state peering, last acting [48,95,52]
pg 50.3e4 is stuck peering for 243.839263, current state peering, last acting [35,83,52]
REQUEST_SLOW 40482 slow requests are blocked > 32 sec. Implicated osds 32,34,35,38,39,40,41,42,43,44,46,48,52,74,76,77,78,79,80,81,89,92
4265 ops are blocked > 524.288 sec
32214 ops are blocked > 262.144 sec
3922 ops are blocked > 131.072 sec
54 ops are blocked > 65.536 sec
27 ops are blocked > 32.768 sec
osds 32,34,35,38,39,40,41,42,43,44,46,48,52 have blocked requests > 262.144 sec
osds 74,76,77,78,79,80,81,89,92 have blocked requests > 524.288 sec
It's easy to see that the PGs blocked in peering all have osd.52 in their acting set.
If I mark this OSD (osd.52 in this example) down or restart its service, the problem disappears and the pool operates normally.
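(For completeness, finding and kicking the culprit is a couple of commands; osd.52 is just this run's example, and the restart must be run on the host carrying that OSD:)
~# ceph health detail | grep 'stuck peering'   # osd.52 shows up in every acting set
~# ceph osd down 52
~# systemctl restart ceph-osd@52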
If I just wait, the OSD may start working normally again, but sometimes the OSD dies with:
1. A lot of "no reply" messages in the logs:
...
-153> 2018-12-12 14:43:39.233249 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6844 osd.31 since back 2018-12-12 14:41:30.170055 front 2018-12-12 14:41:30.170055 (cutoff 2018-12-12 14:43:19.233213)
-152> 2018-12-12 14:43:39.233258 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6803 osd.33 since back 2018-12-12 14:41:41.685147 front 2018-12-12 14:41:41.685147 (cutoff 2018-12-12 14:43:19.233213)
-151> 2018-12-12 14:43:39.233284 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6805 osd.34 since back 2018-12-12 14:41:41.685147 front 2018-12-12 14:41:41.685147 (cutoff 2018-12-12 14:43:19.233213)
-150> 2018-12-12 14:43:39.233291 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6807 osd.35 since back 2018-12-12 14:41:46.388834 front 2018-12-12 14:41:46.388834 (cutoff 2018-12-12 14:43:19.233213)
-149> 2018-12-12 14:43:39.233301 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6809 osd.36 since back 2018-12-12 14:41:50.396253 front 2018-12-12 14:41:50.396253 (cutoff 2018-12-12 14:43:19.233213)
-148> 2018-12-12 14:43:39.233309 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6815 osd.37 since back 2018-12-12 14:41:50.396253 front 2018-12-12 14:41:50.396253 (cutoff 2018-12-12 14:43:19.233213)
-147> 2018-12-12 14:43:39.233318 7f382fc37700 -1 osd.52 204128 heartbeat_check: no reply from 10.10.20.161:6811 osd.38 since back 2018-12-12 14:41:53.900087 front 2018-12-12 14:41:53.900087 (cutoff 2018-12-12 14:43:19.233213)
...
2. And a call trace after that:
0> 2018-12-11 19:32:09.119042 7f5b31b04700 -1 *** Caught signal (Aborted) **
in thread 7f5b31b04700 thread_name:tp_osd_tp
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
1: (()+0xa87324) [0x55b4cca24324]
2: (()+0x11390) [0x7f5b544a4390]
3: (pthread_cond_wait()+0xc0) [0x7f5b544a0360]
4: (Throttle::_wait(long)+0x34d) [0x55b4cca5c97d]
5: (Throttle::get(long, long)+0x2e5) [0x55b4cca5d7a5]
6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0xb55) [0x55b4cc91e6c5]
7: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x66) [0x55b4cc63fb26]
8: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xcb8) [0x55b4cc776278]
9: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x9fa) [0x55b4cc5dbc2a]
10: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x139a) [0x55b4cc6244ca]
11: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2ff4) [0x55b4cc627e24]
12: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xea9) [0x55b4cc5e1f69]
13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x55b4cc4555f9]
14: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55b4cc703327]
15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x1047) [0x55b4cc483a17]
16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x55b4cca6c204]
17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b4cca6f240]
18: (()+0x76ba) [0x7f5b5449a6ba]
19: (clone()+0x6d) [0x7f5b5351141d]
There are no error messages in dmesg on the host and there is no heavy load; it looks like the host is doing nothing at that moment.
I've attached a gdb backtrace.
The operations dump is too big to attach (1.6M tar), so it is available here: https://166924.selcdn.ru/links/osd.52.operations.dump.tar.gz
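(Dumps like these can typically be collected as follows; the PID placeholder and output file names are just examples:)
~# gdb -p <ceph-osd.52 pid> -batch -ex 'thread apply all bt' > osd.52.gdb.txt
~# ceph daemon osd.52 dump_ops_in_flight > osd.52.ops_in_flight.json
~# ceph daemon osd.52 dump_historic_ops > osd.52.historic_ops.json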
In our production cluster, where we see the problem often, the clients run:
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
And the OSD version is:
ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
~# ceph osd crush show-tunables
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"chooseleaf_stable": 0,
"straw_calc_version": 2,
"allowed_bucket_algs": 54,
"profile": "hammer",
"optimal_tunables": 0,
"legacy_tunables": 0,
"minimum_required_version": "hammer",
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"has_v2_rules": 0,
"require_feature_tunables3": 1,
"has_v3_rules": 0,
"has_v4_buckets": 1,
"require_feature_tunables5": 0,
"has_v5_rules": 0
}
We can't see any blocked ops when the problem appears in production. Latency metrics are OK too, but some virtual machines get stuck on I/O operations in the same way as described above. We've never seen that call trace in production; it probably depends on the load.
If any additional info is needed, please ask.