Osd - Faster Peering » History » Version 2
Jessica Mack, 07/10/2015 09:58 PM
1 | 1 | Jessica Mack | h1. Osd - Faster Peering |
---|---|---|---|
2 | 1 | Jessica Mack | |
3 | 1 | Jessica Mack | h3. Summary |
4 | 1 | Jessica Mack | |
5 | 1 | Jessica Mack | For correctness reasons, peering requires a series of serial message transmissions and filestore syncs before it can complete. This sets a lower bound on the latency that client IO suffers whenever the cluster changes. |
6 | 1 | Jessica Mack | |
7 | 1 | Jessica Mack | h3. Owners |
8 | 1 | Jessica Mack | |
9 | 1 | Jessica Mack | * Sam Just (Red Hat) |
12 | 1 | Jessica Mack | |
13 | 1 | Jessica Mack | h3. Interested Parties |
14 | 1 | Jessica Mack | |
15 | 1 | Jessica Mack | * Guang Yang (Yahoo!) |
18 | 1 | Jessica Mack | |
19 | 1 | Jessica Mack | h3. Current Status |
20 | 1 | Jessica Mack | |
21 | 1 | Jessica Mack | h3. Detailed Description |
22 | 1 | Jessica Mack | |
23 | 2 | Jessica Mack | !{width:40%}graph.png! |
24 | 1 | Jessica Mack | |
25 | 1 | Jessica Mack | The peering state chart above is generated from the source. GetInfo->GetLog->GetMissing requires three round trips to replicas. First, we get PG infos from every OSD in the prior set, acting set, and up set in order to choose an authoritative log. Second, we fetch the authoritative log. Last, we fetch missing sets from each acting set replica for use during recovery. |
26 | 1 | Jessica Mack | 1) Can we preemptively request the log+missing from OSDs in the most recent prior set interval, in the hope of skipping the GetLog step? |
27 | 1 | Jessica Mack | 2) Can we preemptively request the log+missing from acting and up OSDs in the GetInfo set, in the hope of skipping the GetMissing step? |
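As a rough illustration of the two questions above, the following Python sketch counts the serial round trips the primary would need under a toy model. The names @round_trips@ and @prefetched@ are hypothetical and do not correspond to anything in the OSD code. The model assumes one round trip per state, that the authoritative log lives on a remote OSD, and that a log+missing reply piggybacked on the GetInfo reply is still usable once the authoritative log has been chosen.

```python
def round_trips(auth_log_osd, acting_replicas, prefetched=frozenset()):
    """Count serial round trips to replicas in a toy peering model.

    auth_log_osd    -- id of the OSD that turns out to hold the authoritative log
    acting_replicas -- ids of the acting set replicas (primary excluded)
    prefetched      -- ids of OSDs that were asked for log+missing together
                       with their info during GetInfo (hypothetical optimisation)
    """
    trips = 1  # GetInfo: one round trip to collect PG infos

    # GetLog is only needed if the authoritative log was not already
    # returned alongside the info.
    if auth_log_osd not in prefetched:
        trips += 1

    # GetMissing is only needed if at least one acting replica has not
    # already supplied its missing set.
    if not set(acting_replicas) <= set(prefetched):
        trips += 1

    return trips


if __name__ == "__main__":
    print(round_trips(3, [1, 2]))                # nothing prefetched
    print(round_trips(3, [1, 2], {3}))           # question 1 pays off
    print(round_trips(3, [1, 2], {1, 2, 3}))     # questions 1 and 2 pay off
```

In this model the baseline costs three round trips, prefetching from the OSD that holds the authoritative log costs two, and prefetching from that OSD plus every acting replica costs one. It says nothing about the extra bytes sent when a prefetch guess turns out to be wrong.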
28 | 1 | Jessica Mack | |
29 | 1 | Jessica Mack | Another wrinkle is that, until the previous acting interval has been flushed, replicas do not send the info requested in GetInfo and the primary cannot start peering. |
30 | 1 | Jessica Mack | 1) We might be able to relax this to waiting for a commit (journal only) if we track unstable objects across intervals. We need to track unstable objects for replicas going forward anyway to get replica reads right, so this might not be so bad. |
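One possible shape for such tracking is sketched below in Python. It rests on an assumption that the text does not confirm: that an "unstable" object is one with a write committed to the journal but not yet applied to the filestore. The class @UnstableTracker@ and all of its method names are hypothetical.

```python
class UnstableTracker:
    """Toy bookkeeping for objects with journaled but unapplied writes."""

    def __init__(self):
        # object name -> (interval, version) of its newest journaled write
        self._pending = {}

    def on_commit(self, obj, interval, version):
        # The write is durable in the journal but not yet applied.
        self._pending[obj] = (interval, version)

    def on_apply(self, obj, interval, version):
        # Only the newest write clears the entry; an older apply leaves
        # the object unstable because a later write is still outstanding.
        if self._pending.get(obj) == (interval, version):
            del self._pending[obj]

    def is_unstable(self, obj):
        return obj in self._pending

    def unstable_from_before(self, interval):
        # Objects a new interval would have to treat with care instead of
        # waiting for a full flush of the previous interval.
        return sorted(o for o, (i, _) in self._pending.items() if i < interval)
```

With a structure like this, a new interval could in principle proceed once writes are committed, and consult @unstable_from_before@ to decide which objects still need to wait. The same lookup is what a replica read path would need in order to avoid serving an unstable object.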
31 | 1 | Jessica Mack | |
32 | 1 | Jessica Mack | h3. Work items |
33 | 1 | Jessica Mack | |
34 | 1 | Jessica Mack | h4. Coding tasks |
35 | 1 | Jessica Mack | |
39 | 1 | Jessica Mack | |
40 | 1 | Jessica Mack | h4. Build / release tasks |
41 | 1 | Jessica Mack | |
45 | 1 | Jessica Mack | |
46 | 1 | Jessica Mack | h4. Documentation tasks |
47 | 1 | Jessica Mack | |
51 | 1 | Jessica Mack | |
52 | 1 | Jessica Mack | h4. Deprecation tasks |
53 | 1 | Jessica Mack | |