Bug #15600

rocksdb stops working correctly after 6 hours bluestore writes

Added by Jianjian Huo almost 8 years ago. Updated almost 7 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Found this bug on jewel release 10.2.0.
With fio_rbd 100% 4KB writes, after 6 hours, rocksdb reported the error message below:

...
2016-04-25 13:41:09.230511 7f1e66e18700 4 rocksdb: (Original Log Time 2016/04/25-13:41:09.230485) EVENT_LOG_v1 {"time_micros": 1461616869230478, "job": 206964, "event": "compaction_finished", "compaction_time_micros": 49240, "output_level": 2, "num_output_files": 3, "total_output_size": 4770145, "num_input_records": 2524, "num_output_records": 2344, "num_subcompactions": 1, "lsm_state": [1, 10, 60, 143, 0, 0, 0]}
2016-04-25 13:41:09.230917 7f1e66e18700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1461616869230914, "job": 206964, "event": "table_file_deletion", "file_number": 1187734}
2016-04-25 13:41:09.231082 7f1e66e18700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1461616869231080, "job": 206964, "event": "table_file_deletion", "file_number": 1187521}
2016-04-25 13:41:09.231327 7f1e66e18700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1461616869231326, "job": 206964, "event": "table_file_deletion", "file_number": 1187520}
2016-04-25 13:41:09.231340 7f1e66e18700 2 rocksdb: Waiting after background compaction error: IO error: /home/ceph_user/my_cluster/ceph-deploy/osd/myosddata/db/1187753.log: No such file or directory, Accumulated background error counts: 1
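For reference, the workload described above can be approximated with a fio job file like the following sketch. The pool name, image name, and queue depth are assumptions for illustration; only the 100% 4KB write pattern and the ~6 hour runtime come from the report.

```ini
; Hypothetical fio job approximating the reported workload:
; 100% 4KB writes against an RBD image via fio's rbd ioengine.
[global]
ioengine=rbd
clientname=admin
pool=rbd            ; assumed pool name
rbdname=testimg     ; assumed image name
rw=randwrite
bs=4k
iodepth=32          ; assumed queue depth
direct=1
time_based=1
runtime=21600       ; 6 hours, matching the reported time to failure

[write-4k]
```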

History

#1 Updated by Jianjian Huo almost 8 years ago

This website editor changed my original text...

The original error msg reported by rocksdb is below:
2016-04-25 13:41:09.231340 7f1e66e18700 2 rocksdb: Waiting after background compaction error: IO error: /home/ceph_user/my_cluster/ceph-deploy/osd/myosddata/db/1187753.log: No such file or directory, Accumulated background error counts: 1

After this message, rocksdb stopped generating any new sst files whatsoever, but Ceph/bluestore was still working, and bluestore 4KB write IOPS jumped roughly 6X higher (not a real improvement).

I guess the current rocksdb code is a little old (Dec. 2015); do we need to pull in the latest rocksdb?

#2 Updated by Jianjian Huo almost 8 years ago

Had two other ceph/bluestore fio_rbd runs, and both of them reported the same rocksdb error after about 6 hours, so this issue is very easy to reproduce.
I am using almost all default bluestore options; one notable exception is "bluestore_bluefs = false".
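A minimal ceph.conf sketch of the setup described in this comment, assuming otherwise-default settings. With BlueFS disabled, rocksdb writes its sst and log files to a plain filesystem path (the illustrative path below matches the directory seen in the error message, not a verified config value):

```ini
# Sketch: jewel-era bluestore OSD with BlueFS disabled, so rocksdb
# lives on a regular filesystem rather than inside BlueFS.
[osd]
osd objectstore = bluestore
bluestore_bluefs = false
```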

#3 Updated by Sage Weil almost 7 years ago

  • Status changed from New to Can't reproduce
