Bug #36705

Update rocksdb to 5.15.X to get iterator auto-readahead

Added by Mark Nelson over 5 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://github.com/facebook/rocksdb/wiki/Iterator

Iterators only get auto-readahead as of 5.12 for buffered IO and 5.15 for direct IO. This is important because we don't really want users trying to tune readahead themselves. On IRC today, dproyer showed me behavior from a 12.2.9 luminous cluster where an OSD was spending all of its time in tp_osd_disk threads spinning in rocksdb, trying to read SST files into the cache 8KB at a time (the blktrace wasn't totally sequential, but it's possible that there were multiple interleaved sequential read streams). The wallclock trace looked surprisingly similar to what we saw when compaction readahead wasn't enabled: crazy amounts of time spent in pread64 doing those 8KB reads.
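To give a rough sense of why this matters, here is a back-of-the-envelope sketch of how many read calls a sequential scan needs with fixed 8KB reads versus a doubling readahead window. The schedule used (readahead kicks in after more than 2 reads of the same file, starts at 8KB, and doubles up to 256KB) is my reading of the rocksdb Iterator wiki linked above and should be treated as an assumption. The 64MiB file size and the function name are purely illustrative, and this is not a model of the actual rocksdb implementation.

```python
KIB = 1024
MIB = 1024 * KIB


def count_reads(total_bytes, auto_readahead, initial=8 * KIB,
                max_readahead=256 * KIB, trigger_after=2):
    """Count read calls needed to scan total_bytes sequentially.

    Assumed schedule (not taken from the rocksdb source): every read is
    `initial` bytes until more than `trigger_after` reads have been issued,
    after which the read size doubles on each read, capped at `max_readahead`.
    """
    reads = 0
    offset = 0
    size = initial
    while offset < total_bytes:
        offset += size
        reads += 1
        if auto_readahead and reads > trigger_after:
            size = min(size * 2, max_readahead)
    return reads


if __name__ == "__main__":
    sst_size = 64 * MIB  # hypothetical SST file size, for illustration only
    print("fixed 8KB reads:  ", count_reads(sst_size, auto_readahead=False))
    print("doubling readahead:", count_reads(sst_size, auto_readahead=True))
```

Under these assumptions the number of pread64 calls for one sequential pass drops by well over an order of magnitude, which is consistent with the per-call overhead dominating the wallclock trace.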

Ultimately it appears we fixed dpryors' immediate issue by increasing the block cache size to avoid the reads entirely. We should still consider upgrading rocksdb to 5.15.X and backporting that to luminous to get iterator readahead and other recent improvements, since rocksdb will now be used more often in both bluestore- and filestore-backed clusters.

History

#1 Updated by Sage Weil almost 3 years ago

  • Status changed from New to Closed