Feature #15835

closed

filestore: randomize split threshold

Added by Josh Durgin almost 8 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Performance/Resource Usage
Target version: -
% Done: 0%
Source: Support
Tags:
Backport: jewel
Reviewed:
Affected Versions:
Component(RADOS):
Pull request ID:

Description

If files are distributed roughly evenly, many OSDs will reach the split threshold at the same time, so they all incur high latency while splitting directories at once.

A simple change that may mitigate this is to randomize the split threshold, similar to the randomized scrub threshold, so that different OSDs split directories over a longer period of time.
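To illustrate the idea, here is a minimal sketch in Python rather than the actual Ceph change. It assumes each OSD draws one random offset at startup and adds it to a shared base threshold. The names `randomized_split_threshold`, `thresholds_for_osds`, `base_threshold`, and `max_jitter` are hypothetical and do not correspond to real Ceph options or functions.

```python
import random


def randomized_split_threshold(base_threshold, max_jitter, rng):
    """Return base_threshold plus a random offset in [0, max_jitter].

    All names here are hypothetical, chosen for illustration only.
    """
    if base_threshold <= 0:
        raise ValueError("base_threshold must be positive")
    if max_jitter < 0:
        raise ValueError("max_jitter must be non-negative")
    # randint is inclusive on both ends.
    return base_threshold + rng.randint(0, max_jitter)


def thresholds_for_osds(osd_ids, base_threshold, max_jitter, seed=None):
    """Draw one threshold per OSD; each OSD would keep its value for its lifetime."""
    rng = random.Random(seed)
    return {
        osd_id: randomized_split_threshold(base_threshold, max_jitter, rng)
        for osd_id in osd_ids
    }


if __name__ == "__main__":
    # Without jitter every OSD would split at exactly 320 files per directory.
    for osd_id, threshold in thresholds_for_osds(range(5), 320, 320, seed=1).items():
        print(f"osd.{osd_id}: split at {threshold} files per directory")
```

With evenly distributed files, directories on different OSDs would then cross their thresholds at different fill levels, which spreads the splitting work over time instead of concentrating it. The offset is drawn once rather than on every check, so a given OSD behaves consistently.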


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #22658: filestore: randomize split threshold (Resolved, Josh Durgin)
Actions #1

Updated by Vikhyat Umrao over 7 years ago

  • Source changed from other to Support
Actions #2

Updated by Peng Chen about 7 years ago

Hi! I am an undergrad student wishing to contribute to Ceph, and I would like to work on this issue. Please let me know.

Thanks,

Peng Chen

Actions #3

Updated by Josh Durgin about 7 years ago

This one is mostly about performance testing, and at this point I think that effort is better spent on bluestore than on filestore. Bluestore does not have internal splitting like this at all.

Perhaps you'd like to try http://tracker.ceph.com/issues/18629 ?

Actions #4

Updated by Josh Durgin almost 7 years ago

  • Backport set to jewel, kraken
Actions #6

Updated by Nathan Cutler almost 7 years ago

  • Status changed from New to Fix Under Review
Actions #7

Updated by Josh Durgin almost 7 years ago

  • Status changed from Fix Under Review to Resolved
  • Backport deleted (jewel, kraken)

Perf testing is not indicating much benefit, so I'd hold off on backporting this.

Actions #8

Updated by Josh Durgin over 6 years ago

  • Backport set to jewel

I spoke too soon: there is significantly improved latency and throughput in longer-running tests on several OSDs.

Actions #9

Updated by Josh Durgin over 6 years ago

  • Category deleted (OSD)
  • Status changed from Resolved to Pending Backport
Actions #10

Updated by Josh Durgin over 6 years ago

  • Project changed from Ceph to RADOS
  • Category set to Performance/Resource Usage
  • Assignee set to Josh Durgin
Actions #11

Updated by Josh Durgin over 6 years ago

Actions #12

Updated by Nathan Cutler about 6 years ago

  • Status changed from Pending Backport to Resolved