Jessica Mack, 07/03/2015 08:43 PM
1 | 1 | Jessica Mack | h1. Osd - prepopulate pg temp |
---|---|---|---|
2 | |||
3 | h3. Summary |
||
4 | |||
5 | Pre-populate the pg_temp mapping in the OSDMap when there are large changes in the CRUSH map. |
||
6 | |||
7 | h3. Owners |
||
8 | |||
9 | * Sage Weil (Inktank) |
||
10 | |||
11 | h3. Interested Parties |
||
12 | |||
13 | * Guang Yang (Yahoo!) |
||
16 | |||
17 | h3. Current Status |
||
18 | |||
19 | Normally, when there is a major change (such as a CRUSH rule change or the reweighting of an entire rack), many PG primaries are remapped to devices that do not hold the data. Each new primary then sends a request to the monitor to add a pg_temp exception that maps the PG back to its previous location. This delays availability, especially when many PGs are affected and the monitors must process a large number of messages to add the remappings. |
||
20 | |||
21 | h3. Detailed Description |
||
22 | |||
23 | Instead of waiting for the OSDs to request exceptions, the monitor could (optionally) prepopulate pg_temp after a CRUSH map change. This minimizes (or eliminates) the lapse in availability (no I/O stalls), at the cost of monitor CPU time spent calculating the mappings. |
||
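The idea above can be illustrated with a toy model. This is only a sketch, not the monitor's implementation: the mappings are plain Python dicts, the function name @prime_pg_temp@ is hypothetical, and the rule "prime a PG when its new primary did not previously hold it" is a simplifying assumption drawn from the Current Status description.

```python
from typing import Dict, List

# Toy stand-in for an OSDMap mapping: pgid -> ordered OSD ids, primary first.
Mapping = Dict[str, List[int]]


def prime_pg_temp(old_up: Mapping, new_up: Mapping) -> Mapping:
    """Return pg_temp entries that pin remapped PGs to their previous OSDs.

    Assumption for this sketch: a PG needs an entry only when its new
    primary was not part of the old set, i.e. it does not have the content.
    """
    pg_temp: Mapping = {}
    for pgid, new_osds in new_up.items():
        old_osds = old_up.get(pgid)
        if not old_osds or not new_osds:
            continue  # no previous or no new location: nothing to pin
        if new_osds[0] not in old_osds:
            pg_temp[pgid] = list(old_osds)
    return pg_temp


old_up = {"1.0": [0, 1, 2], "1.1": [3, 4, 5], "1.2": [1, 2, 3], "1.3": [2, 3, 4]}
new_up = {"1.0": [0, 1, 2], "1.1": [6, 4, 5], "1.2": [7, 8, 9], "1.3": [2, 3, 9]}
primed = prime_pg_temp(old_up, new_up)
```

In this model only the PGs whose primary moved to an OSD without the data ("1.1" and "1.2") receive a pg_temp entry; "1.3" keeps its primary and is left alone.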
24 | Key considerations: |
||
25 | * What triggers the mon to calculate PG mappings? A pg_pool_t property change? A CRUSH map change? |
||
26 | * How do we prevent that calculation from disrupting ongoing mon work? |
||
27 | ** Use an async worker thread, which may or may not return useful results before the Paxos round is proposed? |
||
28 | * Ensure that thrashosds.py makes changes that trigger this remapping. |
||
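The async-worker question above could look roughly like the following sketch. It assumes a fixed time budget before the proposal goes out; @compute_mappings@, @propose_with_optional_priming@ and the dict-shaped "proposal" are hypothetical stand-ins, and the sleep merely simulates the cost of the CRUSH calculation.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def compute_mappings(num_pgs, per_pg_cost, stop):
    """Stand-in for the expensive mapping calculation (hypothetical)."""
    result = {}
    for pg in range(num_pgs):
        if stop.is_set():
            break  # the proposal already went out; stop wasting CPU
        time.sleep(per_pg_cost)  # simulated per-PG calculation cost
        result[pg] = [pg % 3]
    return result


def propose_with_optional_priming(executor, num_pgs, per_pg_cost, deadline):
    """Wait up to `deadline` seconds for the worker, then propose regardless."""
    stop = threading.Event()
    future = executor.submit(compute_mappings, num_pgs, per_pg_cost, stop)
    try:
        pg_temp = future.result(timeout=deadline)
    except FutureTimeout:
        stop.set()
        pg_temp = {}  # propose unprimed; OSDs fall back to requesting pg_temp
    return {"pg_temp": pg_temp}


with ThreadPoolExecutor(max_workers=1) as executor:
    fast = propose_with_optional_priming(executor, 5, 0.0, 5.0)
    slow = propose_with_optional_priming(executor, 50, 0.02, 0.05)
```

The design choice illustrated here is that priming is strictly best effort: if the worker misses the budget, the proposal proceeds without it and behaviour degrades to the existing OSD-driven path.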
29 | |||
30 | h3. Work items |
||
31 | |||
32 | h4. Coding tasks |
||
33 | |||
34 | # mon: build predicate to determine when to calculate mappings |
||
35 | ## add config options controlling this as appropriate |
||
36 | # mon: calculate mappings and pre-populate pg_temp |
||
37 | # mon: push calculation onto an async worker thread that can run in parallel with real work |
||
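As a starting point for the first coding task, the predicate and its config options might be sketched as below. The option names (@enabled@, @on_crush_change@, @on_pool_change@) are invented for illustration and are not existing Ceph options; the two triggers are the candidates raised under Key considerations.

```python
from dataclasses import dataclass


@dataclass
class PrimeConfig:
    """Hypothetical config options controlling pg_temp priming."""
    enabled: bool = True
    on_crush_change: bool = True
    on_pool_change: bool = False


def should_prime(cfg: PrimeConfig, crush_changed: bool, pool_changed: bool) -> bool:
    """Decide whether the pending map change warrants calculating mappings."""
    if not cfg.enabled:
        return False
    if crush_changed and cfg.on_crush_change:
        return True
    if pool_changed and cfg.on_pool_change:
        return True
    return False


default_cfg = PrimeConfig()
crush_only = should_prime(default_cfg, crush_changed=True, pool_changed=False)
pool_only = should_prime(default_cfg, crush_changed=False, pool_changed=True)
disabled = should_prime(PrimeConfig(enabled=False), True, True)
```

Keeping the predicate separate from the calculation makes it cheap to evaluate on every pending change and easy to extend with further triggers.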
38 | |||
39 | h4. Build / release tasks |
||
40 | |||
41 | # teuthology: ensure thrashosds exercises new feature |