
Samuel Just, 06/12/2015 07:53 PM

Improve tail latency
Summary
Tail latency (e.g., the 99.99th percentile) matters for some online serving scenarios. This blueprint summarizes some of the tail latency issues we have observed on our production cluster.
OSD ungraceful shutdown. An OSD might crash due to a broken disk, a software bug, etc. Currently the crash/down of an OSD is detected by its peers, and it can take tens of seconds (20 seconds by default) to trigger an osdmap change, which in turn causes clients to retry the in-flight requests associated with that OSD.
Is it possible to preemptively tell the MON that the OSD is going down (crashing) when an assertion failure occurs, in the same way a graceful shutdown does?
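Below is a minimal, self-contained C++ sketch of that idea (not actual Ceph code): a process-wide fatal-error handler that sends a best-effort "mark me down" notification to the monitor before the OSD dies, so peers do not have to wait out the heartbeat grace period. notify_mon_mark_me_down() is a hypothetical stand-in for sending the real message over the OSD's existing monitor session.

```cpp
#include <cstdlib>
#include <exception>
#include <iostream>
#include <stdexcept>

// Hypothetical: in a real OSD this would send a mark-me-down request to the
// monitor over the existing monitor session, with a short timeout so a
// wedged network cannot delay the crash further.
static void notify_mon_mark_me_down() {
    std::cerr << "notifying mon: this OSD is going down\n";
}

static void fatal_handler() {
    notify_mon_mark_me_down();   // best effort; ignore failures
    std::abort();                // then die as before
}

int main() {
    // Installed once at start-up; std::terminate is reached on unhandled
    // exceptions, and the same hook could be called from the
    // assertion-failure path.
    std::set_terminate(fatal_handler);

    throw std::runtime_error("simulated assertion failure");  // demo only
}
```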
Peering. Thanks to Sage and Sam for their work on peering improvements, which proved to have a real impact on tail latency.
Slow OSDs. An OSD can become slow for various reasons, and currently client latency is determined by the slowest OSD in the PG serving the request.
For EC pools, we tested a patch that reads all k + m chunks and uses the first k chunks returned to serve the client; it turned out to significantly (about 30%) improve latency, especially at the tail. However, there are still a couple of problems: 1) if the primary is stuck, the patch does not help; 2) the patch brings no benefit for WRITE (if anything it hurts, since it adds more load); 3) it does not benefit replicated pools.
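As a rough illustration only (a simulation, not the actual patch), the sketch below models a k+m EC read where each shard read has a random latency and the client is unblocked as soon as any k chunks have arrived, so the slowest m shards no longer set the request latency. The 4+2 profile and the latency range are arbitrary assumptions.

```cpp
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

int main() {
    const int k = 4, m = 2;            // assumed EC profile, e.g. 4+2

    std::mutex mu;
    std::condition_variable cv;
    int arrived = 0;

    std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> latency_ms(5, 200);

    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> reads;
    for (int shard = 0; shard < k + m; ++shard) {
        int delay = latency_ms(rng);
        reads.emplace_back([&, delay] {
            // Simulated shard read; a slow or stuck OSD just means a
            // large delay here.
            std::this_thread::sleep_for(std::chrono::milliseconds(delay));
            std::lock_guard<std::mutex> lk(mu);
            ++arrived;
            cv.notify_one();
        });
    }

    {
        std::unique_lock<std::mutex> lk(mu);
        cv.wait(lk, [&] { return arrived >= k; });  // first k are enough
    }
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    std::cout << "served after " << elapsed.count() << " ms ("
              << k << " of " << k + m << " chunks)\n";

    for (auto& t : reads) t.join();    // lagging reads finish in background
    return 0;
}
```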