Libradosobjecter - smarter localized reads » History » Version 1
Jessica Mack, 06/21/2015 06:16 AM
1 | 1 | Jessica Mack | h1. Libradosobjecter - smarter localized reads |
---|---|---|---|
2 | |||
3 | h3. Summary |
||
4 | |||
5 | Allow reads from object replicas on the same host, same rack, same data center, or other distance metric. |
||
6 | |||
7 | h3. Owners |
||
8 | |||
9 | * Name (Affiliation) |
||
10 | |||
11 | h3. Interested Parties |
||
12 | |||
13 | * Sage Weil (Inktank) |
||
14 | * Loic Dachary <loic@dachary.org> |
||
15 | * Sam Just (Inktank) |
||
16 | |||
17 | h3. Current Status |
||
18 | |||
19 | There is currently a librados/libcephfs/Objecter flag that tells the client to read from a non-primary replica if it happens to be on the same host (as indicated by a matching IP address). This works in some situations but is otherwise pretty limited. |
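To make the limitation concrete, here is a minimal sketch of the IP-matching behaviour described above. It is not the Objecter code: the function name, the @acting@ list (primary first), and the @osd_addrs@ map are all assumptions made for illustration.

```python
from typing import Dict, List


def pick_replica_by_ip(client_ip: str, acting: List[int],
                       osd_addrs: Dict[int, str]) -> int:
    """Return the OSD to read from under a same-IP rule.

    acting    -- hypothetical list of OSD ids for the object, primary first
    osd_addrs -- hypothetical map of OSD id -> IP address
    """
    primary = acting[0]
    for osd in acting:
        # Only an exact IP match counts as "local" in this sketch.
        if osd_addrs.get(osd) == client_ip:
            return osd
    # No replica shares the client's address: fall back to the primary.
    return primary
```

A client in the same rack as a replica, but on a different host, gets no benefit from a rule like this: it always falls back to the primary.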
||
20 | |||
21 | h3. Detailed Description |
||
22 | |||
23 | Clusters typically have some sort of hierarchical structure, and we conveniently have that handy in the CRUSH map. We want to use that information to choose a replica that is closest to us. |
||
24 | The key missing piece of information is the client's location, expressed in a form that can be reconciled with the CRUSH map. If the client has a set of key/value pairs describing its location in the same terms that CRUSH uses for an OSD's location, we can use them to choose the replica that is closest to us. |
||
25 | The 'crush location' config option would be a simple list of key/value pairs, e.g. |
||
26 | |||
27 | p((. crush location = host=foo rack=bar room=baz |
||
28 | |||
29 | We might allow multiple locations to be listed: |
||
30 | |||
31 | p((. crush location = host=foo host=foo2 rack=bar |
||
32 | |||
33 | which would be useful when there are parallel hierarchies and we want to indicate locality for both. |
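A minimal sketch of how such a value might be parsed, assuming the input is only the right-hand side of the option. The function name @parse_crush_location@ and the choice to keep several names per key are illustrative assumptions, not an existing Ceph API.

```python
from collections import defaultdict
from typing import Dict, List


def parse_crush_location(value: str) -> Dict[str, List[str]]:
    """Parse 'host=foo host=foo2 rack=bar' into {'host': [...], 'rack': [...]}.

    A key may repeat, so each key maps to a list of names in input order.
    """
    loc: Dict[str, List[str]] = defaultdict(list)
    for token in value.split():
        key, sep, name = token.partition("=")
        if not sep or not key or not name:
            raise ValueError(f"malformed crush location entry: {token!r}")
        if name not in loc[key]:  # drop exact duplicates
            loc[key].append(name)
    return dict(loc)
```

For example, @parse_crush_location("host=foo host=foo2 rack=bar")@ yields two names under @host@ and one under @rack@, which covers the parallel-hierarchy case.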
||
34 | Objecter would compare the client's location against each replica's position in the CRUSH hierarchy and pick the replica whose match has the lowest-valued CRUSH type (i.e., a matching host is closer than a matching rack). |
||
35 | This would be triggered by the existing LOCALIZED_READS flag implemented in the Objecter and exposed to varying degrees by librados and libcephfs. |
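The selection rule described in this section could be sketched as follows. Everything here is hypothetical: the function name, the shape of @osd_locs@, and the type ids in @CRUSH_TYPE_IDS@ are illustrative values only. In a real cluster the type ids would come from the CRUSH map itself.

```python
from typing import Dict, List, Optional

# Illustrative type ids only; a lower id means a closer match.
CRUSH_TYPE_IDS: Dict[str, int] = {"host": 1, "rack": 3, "room": 7}


def closest_replica(client_loc: Dict[str, List[str]],
                    acting: List[int],
                    osd_locs: Dict[int, Dict[str, str]],
                    type_ids: Dict[str, int] = CRUSH_TYPE_IDS) -> int:
    """Pick the replica whose location matches the client's at the
    lowest-valued CRUSH type; fall back to the primary (acting[0])."""
    best_osd = acting[0]
    best_type: Optional[int] = None
    for osd in acting:
        osd_loc = osd_locs.get(osd, {})
        for key, names in client_loc.items():
            if key not in type_ids:
                continue  # unknown type: cannot be ranked
            if osd_loc.get(key) in names:
                t = type_ids[key]
                # Strict '<' keeps the earlier OSD (primary first) on ties.
                if best_type is None or t < best_type:
                    best_type, best_osd = t, osd
    return best_osd
```

Because ties keep the earlier entry in @acting@, the primary is preferred whenever it is as close as any other replica, and a client with no matching location behaves exactly as if the flag were unset.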
||
36 | |||
37 | h3. Work items |
||
38 | |||
39 | h3. Coding tasks |
||
40 | |||
41 | # Objecter: add 'crush location' config option and parse on init |
||
42 | # Objecter: add additional API calls to adjust this location setting at runtime |
||
43 | # Objecter: choose the closest replica based on location information, when it is specified. This can either supplement or (more likely) replace the current check for a matching IP address. |
||
44 | # librados, libcephfs: expose an explicit API to set the location. This would supplement simply setting the 'crush location = ...' config option. |
||
45 | # [maybe] Update hadoop bindings to use the new API |
||
46 | # [maybe] librbd: set localized reads flag on clone parents? |
||
47 | |||
48 | h3. Build / release tasks |
||
49 | |||
50 | # Build/expand localized-reads test to verify the correct replica is chosen |
||
51 | |||
52 | h3. Documentation tasks |
||
53 | |||
54 | # Document API changes |
||
55 | |||
56 | h3. Tracker Links |
||
57 | |||
58 | http://tracker.ceph.com/issues/5035 |