Feature #64845
Support read_from_replica everywhere
Status: open, 0% done
Description
RADOS supports reads from replicas now, and has done so for a while. It is not on by default and requires setting a flag in the Objecter/librados, because reading from random OSDs is less cache-efficient. But it is desirable in stretch cluster scenarios, which Ceph is seeing more of as time goes by.
Right now, only RBD supports read-from-replica, because the initial implementation did not properly order reads with in-progress writes on the OSD side, so it was only viable for snapshots. Happily, that changed several releases ago, so we should extend this option to every component! This ticket will serve as an umbrella, and perhaps hold the actual work.
I see two obvious approaches to resolving this:
1) Every component adds a config analogous to rbd_read_from_replica_policy. (I see CephFS has a libcephfs::ceph_localize_reads() function, but no obvious way to set it — weird?)
2) We make a RADOS config option that can be set by any random Ceph client.
The second is obviously broadly applicable and needs less configuration, but it leaves no room for components to apply appropriate system-specific checks or tweaks. The first is a little more work?
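To make the trade-off concrete, here is a rough Python sketch of how the two approaches could coexist: a global RADOS-level policy that any client picks up, with an optional per-component override. Everything in it is hypothetical (the ReadFlags enum, its bit values, and resolve_read_flags() are illustration only, not Objecter or librados code); the only thing borrowed from Ceph is the set of policy names, which I believe matches what rbd_read_from_replica_policy accepts.

```python
from enum import Flag
from typing import Optional


class ReadFlags(Flag):
    """Stand-ins for the Objecter read flags; the values are placeholders."""
    NONE = 0
    BALANCE_READS = 1   # spread reads across replicas
    LOCALIZE_READS = 2  # prefer the closest replica


# Policy names assumed to mirror rbd_read_from_replica_policy.
POLICY_FLAGS = {
    "default": ReadFlags.NONE,
    "balance": ReadFlags.BALANCE_READS,
    "localize": ReadFlags.LOCALIZE_READS,
}


def resolve_read_flags(global_policy: str = "default",
                       component_policy: Optional[str] = None) -> ReadFlags:
    """Return the flags a client would attach to its reads.

    component_policy (approach 1), when set, overrides
    global_policy (approach 2).
    """
    policy = component_policy if component_policy is not None else global_policy
    try:
        return POLICY_FLAGS[policy]
    except KeyError:
        raise ValueError(f"unknown read policy: {policy!r}") from None
```

With only approach (2), every caller would pass just global_policy. The override parameter is where a component could hook in its own checks before opting in.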
Updated by Greg Farnum about 1 month ago
Here's a proposal for option (2) from Yehuda: https://github.com/ceph/ceph/pull/56180