Feature #3764

open

osd: async replicas

Added by Samuel Just over 11 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Component(RADOS):
OSD
Pull request ID:

Description

The following is more a topic for conversation than a feature:

Currently, the latency of any operation is determined by the slowest replica in the pg. It might be worth exploring a scheme in which the primary waits for only M of the N acks or commits before responding to the client, where M is a per-pool configurable and N is the pool size (replication level). Objects that are accessed infrequently but are highly latency-sensitive might benefit significantly from such a scheme.
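To make the latency argument concrete, here is a minimal sketch of the difference between waiting for every copy and waiting for M of N. The function name `ack_latency` and the latency numbers are made up for illustration, and the model simply treats each of the N copies (the primary's own commit included) as having an independent commit time. It is not how the OSD is implemented.

```python
def ack_latency(commit_latencies_ms, m):
    """Time at which the m-th fastest of the N copies has committed.

    commit_latencies_ms: one hypothetical commit latency per copy (N entries).
    m: number of acks/commits the primary waits for before replying.
    """
    n = len(commit_latencies_ms)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= N")
    return sorted(commit_latencies_ms)[m - 1]


# Hypothetical commit latencies (ms) for one write in a size-4 pool.
latencies = [5, 20, 7, 100]
wait_for_all = ack_latency(latencies, 4)      # current behaviour: M = N
wait_for_three = ack_latency(latencies, 3)    # proposed: M = 3 of N = 4
print(wait_for_all, wait_for_three)
```

With M = N the reply is gated on the 100 ms straggler; with M = 3 it is gated on the third-fastest copy instead.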

The obvious disadvantage is that, to maintain the same guarantees we currently provide, we would need to contact at least (N-M+1) replicas from each interval in which the pg might have gone active, since up to (N-M) replicas might be behind what the client considers completed.
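The (N-M+1) figure is a pigeonhole argument: a completed write reached at least M of the N replicas, so any set of N-M+1 replicas must overlap that set, because M + (N-M+1) = N+1 > N. The sketch below brute-forces this for small N. The helper names `min_peers_to_contact` and `always_intersects` are hypothetical, and the check only covers the counting argument, not the rest of what recovery would have to do.

```python
from itertools import combinations


def min_peers_to_contact(n, m):
    """Smallest number of replicas guaranteed to include one that saw the write."""
    return n - m + 1


def always_intersects(n, m, k):
    """True if every k-subset of n replicas overlaps every m-subset."""
    replicas = range(n)
    return all(
        set(acked) & set(contacted)
        for acked in combinations(replicas, m)
        for contacted in combinations(replicas, k)
    )


n, m = 4, 3
k = min_peers_to_contact(n, m)
print(k, always_intersects(n, m, k), always_intersects(n, m, k - 1))
```

For N=4 and M=3 this gives 2: contacting any two replicas from an interval is enough to find the write, while contacting only one is not.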

We would also need to allow a configurable bound on how far behind a replica is allowed to be.
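One possible shape for such a bound, purely as a sketch: the primary tracks the last version each replica has acked and refuses to issue a new write if doing so would leave the slowest replica more than `max_lag` versions behind. Measuring lag in versions (rather than bytes or time) is an assumption, and `ReplicaLagTracker` and all of its methods are hypothetical names, not existing OSD code.

```python
class ReplicaLagTracker:
    """Toy model of a primary enforcing a per-replica lag bound (in versions)."""

    def __init__(self, replicas, max_lag):
        self.max_lag = max_lag
        self.acked = {r: 0 for r in replicas}  # last version acked per replica
        self.head = 0                          # latest version issued

    def can_issue_write(self):
        # Issuing version head+1 must not leave any replica > max_lag behind.
        return (self.head + 1) - min(self.acked.values()) <= self.max_lag

    def issue_write(self):
        if not self.can_issue_write():
            raise RuntimeError("slowest replica would exceed the lag bound")
        self.head += 1
        return self.head

    def record_ack(self, replica, version):
        self.acked[replica] = max(self.acked[replica], version)


tracker = ReplicaLagTracker(["osd.a", "osd.b", "osd.c"], max_lag=2)
for _ in range(2):
    v = tracker.issue_write()
    tracker.record_ack("osd.a", v)
    tracker.record_ack("osd.b", v)
print(tracker.can_issue_write())   # osd.c has acked nothing yet
tracker.record_ack("osd.c", 1)
print(tracker.can_issue_write())
```

In this model a lagging replica eventually stalls new writes, so the bound trades some of the latency win back for a limit on how much catch-up work a replica can accumulate.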

It seems to me that rgw bucket indices could benefit from this scheme. Any given bucket index is relatively small, accessed relatively infrequently, and the accesses are relatively cheap, but the index operation latency limits the rgw op latency. The bucket indices could therefore be put in a separate pool with perhaps N=4 and M=3 in order to decrease overall rgw op latency.

A first step might be instrumentation to evaluate how slow the slowest replica tends to be.
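As a rough idea of what that instrumentation could report, the sketch below takes per-op commit latencies for each copy and computes how much longer the slowest copy took than the M-th fastest. The function name `slowest_replica_penalty` and the sample data are invented for illustration; real numbers would have to come from OSD-side timing that this ticket does not specify.

```python
from statistics import median


def slowest_replica_penalty(samples, m):
    """Per-op extra latency from waiting for all copies instead of m of them.

    samples: list of ops, each a list of per-copy commit latencies (ms).
    """
    penalties = []
    for latencies in samples:
        ordered = sorted(latencies)
        penalties.append(ordered[-1] - ordered[m - 1])
    return penalties


# Hypothetical latencies (ms) for three writes in a size-4 pool.
samples = [
    [5, 20, 7, 100],
    [6, 8, 7, 9],
    [4, 5, 50, 6],
]
penalties = slowest_replica_penalty(samples, 3)
print(penalties, median(penalties), max(penalties))
```

If the penalty is near zero for most ops, the scheme buys little; a heavy tail would suggest the M-of-N approach is worth pursuing.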

Actions #1

Updated by Patrick Donnelly about 6 years ago

  • Project changed from Ceph to RADOS
  • Status changed from New to 4
  • Component(RADOS) OSD added
Actions #2

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 4 to New