Improve tail latency

Summary

Tail latency (e.g. at the 99.99th percentile) matters for some online serving scenarios. This blueprint summarizes tail latency issues we have observed on our production cluster.
  • OSD ungraceful shutdown. An OSD might crash due to a broken disk, a software bug, etc. Currently the crash/down of an OSD is detected by its peers, and it can take tens of seconds (20 seconds by default) to trigger an osdmap change, which in turn leads clients to retry the in-flight requests associated with that OSD. The heartbeat settings governing this delay are sketched after this list.
    • Is it possible to preemptively tell the MON that the OSD is going down when there is an assertion failure, the way graceful shutdown already does?
  • Peering. Thanks to Sage and Sam for their work on peering improvements, which have proved to impact tail latency.
  • Slow OSDs. An OSD can become slow for various reasons, and currently client latency is determined by the slowest OSD in the PG serving the request.
    • For EC pools, we tested a patch that reads all k + m chunks and serves the client with the first k returned; it improved latency significantly (~30%), especially at the tail (see the sketch after this list). However, a few problems remain: (1) if the primary is stuck, the patch does not help; (2) it does not benefit WRITE (if anything it hurts, since it adds load); (3) it does not benefit replicated pools.
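
  For reference, the detection delay above is governed by the OSD heartbeat options. Below is a minimal ceph.conf sketch of the relevant settings (the defaults shown match recent releases; verify the option names and values against your Ceph version):

    [osd]
    # how often an OSD pings its heartbeat peers (seconds)
    osd heartbeat interval = 6
    # how long a peer waits without a reply before reporting the OSD down (seconds)
    osd heartbeat grace = 20

    [mon]
    # how many distinct reporters are needed before the MON marks an OSD down
    mon osd min down reporters = 2

  Lowering the grace period shortens the detection window at the cost of more false positives under load. As an interim workaround, an external supervisor that notices the process exit can mark the OSD down immediately with "ceph osd down <id>".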

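To make the speculative-read idea concrete, here is a minimal standalone C++ sketch (illustrative only, not Ceph code; shard latencies are simulated with random sleeps): reads are issued for all k + m shards and decoding starts as soon as any k have completed, so a single slow shard no longer determines request latency.

    // Sketch: issue k+m shard reads, proceed once the first k complete.
    #include <chrono>
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <random>
    #include <thread>
    #include <vector>

    struct FirstK {
        std::mutex m;
        std::condition_variable cv;
        std::vector<int> done;              // shard ids that have returned
        size_t k;
        explicit FirstK(size_t k) : k(k) {}

        void complete(int shard) {          // called by each shard read
            std::lock_guard<std::mutex> lg(m);
            done.push_back(shard);
            if (done.size() >= k) cv.notify_one();
        }
        std::vector<int> wait_first_k() {   // block until any k shards are in
            std::unique_lock<std::mutex> ul(m);
            cv.wait(ul, [&] { return done.size() >= k; });
            return std::vector<int>(done.begin(), done.begin() + k);
        }
    };

    int main() {
        const size_t k = 8, m = 3;          // e.g. an 8+3 EC profile
        FirstK fk(k);
        std::mt19937 rng{std::random_device{}()};
        std::uniform_int_distribution<int> lat_ms(5, 200);

        std::vector<std::thread> readers;
        for (int shard = 0; shard < int(k + m); ++shard) {
            int ms = lat_ms(rng);           // simulated per-shard read latency
            readers.emplace_back([&fk, shard, ms] {
                std::this_thread::sleep_for(std::chrono::milliseconds(ms));
                fk.complete(shard);
            });
        }
        auto first = fk.wait_first_k();     // decode could start here
        std::cout << "decoding from shards:";
        for (int s : first) std::cout << ' ' << s;
        std::cout << '\n';
        for (auto& t : readers) t.join();   // late replies are simply ignored
        return 0;
    }

The trade-off is the extra m speculative reads per request, which add disk and network load; that extra load is consistent with the observation above that the approach can hurt under WRITE-heavy traffic.
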
Owners

  • Guang Yang (Yahoo!)
  • Name (Affiliation)
  • Name

Interested Parties

  • Name (Affiliation)
  • Name (Affiliation)
  • Name

Current Status

Detailed Description

Work items

Coding tasks

  1. Task 1
  2. Task 2
  3. Task 3

Build / release tasks

  1. Task 1
  2. Task 2
  3. Task 3

Documentation tasks

  1. Task 1
  2. Task 2
  3. Task 3

Deprecation tasks

  1. Task 1
  2. Task 2
  3. Task 3