Osd - Scrub and Repair » History » Version 2

Jessica Mack, 08/12/2015 05:17 AM

h1. Osd - Scrub and Repair

h3. Summary

The current scrub and repair machinery is fairly primitive. Several improvements are needed:

1) There needs to be a way to query the results of the most recent scrub on a pg.
2) The user should be able to query the contents of the replica objects in the event of an inconsistency (including data payload, xattrs, and omap). This can probably co-opt the existing replica read machinery.
3) The user should be able to specify which replica to use for repair, based on the above information.

h3. Owners

* Samuel Just (Red Hat)
* Name (Affiliation)
* Name

h3. Interested Parties

* Guang Yang (Yahoo!)
* Loic Dachary (Red Hat)
* Danny Al-Gaaf (Deutsche Telekom)
* Name

h3. Current Status

Scrub and repair mechanisms already exist; this blueprint aims to expand and improve them.

h3. Detailed Description

On the OSD side, the first change is that the primary needs to track inconsistency information as scrub progresses. Because this might involve a large number of objects (though probably not), we do not want to keep it in memory. I suggest storing the information in a per-pg scratch object which is cleared during peering reset. The machinery used for the SnapMapper object can be re-used to maintain a cache of unstable keys.

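As a rough illustration of the scratch-object idea, the sketch below models the per-pg store as an in-memory map keyed by (locator, object) that is emptied on peering reset. The names @ScrubScratch@, @record@, @on_peering_reset@ and the single-field @inconsistent_info_t@ are hypothetical stand-ins, and a real implementation would persist the entries in the object store instead of a @std::map@.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real types would come from the OSD code.
typedef std::pair<std::string, std::string> object_key_t;  // (locator, object)
struct inconsistent_info_t {
  std::string description;  // placeholder for the per-object details
};

// In-memory model of the per-pg scratch object (assumption: the real one
// would live on disk so that a large scrub does not consume memory).
class ScrubScratch {
 public:
  // Record (or overwrite) the inconsistency found for one object.
  void record(const object_key_t &key, const inconsistent_info_t &info) {
    entries_[key] = info;
  }

  // Drop all recorded results; intended to run on peering reset.
  void on_peering_reset() { entries_.clear(); }

  // Number of objects currently recorded as inconsistent.
  std::size_t size() const { return entries_.size(); }

 private:
  std::map<object_key_t, inconsistent_info_t> entries_;
};
```

Keeping the keys ordered is a deliberate choice in this sketch, since the listing interface proposed next resumes from a "last seen" key.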
Next, I suggest adding a librados interface to ferry that information out to the user:
<pre>
/// get currently inconsistent pgs
void get_inconsistent_pgs(
  pg_t last,        ///< [in] list pgs > last
  list<pg_t> *out   ///< [out] listed pgs
  );

/// get information about inconsistent objects in a pg
bool query_inconsistent_pg(
  pg_t to_query,              ///< [in] pg to query
  pair<string, string> last,  ///< [in] begin listing objects > last, (locator, object)
  epoch_t *activation_epoch,  ///< [out] activation epoch for the interval in which this query was serviced
  list<pair<pair<string, string>, inconsistent_info_t> > *out ///< [out] listed inconsistency information
  ); ///< @return true iff primary has a populated scrub info structure

// inconsistent_info_t will include all relevant information about each inconsistent object.

/// allows directed repair of an object
void repair_inconsistent_object(
  pg_t pg_to_repair,               ///< [in] pg in which we want to perform a repair
  pair<string, string> to_repair,  ///< [in] object to repair
  replica_t replica_to_use,        ///< [in] replica to use for repair
  epoch_t activation_epoch,        ///< [in] from query_inconsistent_pg
  ...                              ///< stuff to support async?
  );
</pre>
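To show how the @last@ cursor might be used from the client side, here is a minimal sketch that pages through a mock primary until no entries remain. Everything in it is an assumption made for illustration: @MockPrimary@, its @page_size@ cap, the @list_all@ helper, and the simplified types are not part of the proposed interface, and the mock treats "populated" as "non-empty".

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real Ceph types.
typedef std::pair<std::string, std::string> object_key_t;  // (locator, object)
typedef unsigned epoch_t;
struct inconsistent_info_t { std::string description; };
typedef std::list<std::pair<object_key_t, inconsistent_info_t> > result_list_t;

// Mock of the primary's scrub results for one pg.
struct MockPrimary {
  epoch_t activation_epoch;
  std::size_t page_size;  // hypothetical cap on entries returned per call
  std::map<object_key_t, inconsistent_info_t> scrub_info;

  // Same shape as query_inconsistent_pg, minus the pg_t argument:
  // returns entries with key strictly greater than `last`.
  bool query(const object_key_t &last, epoch_t *epoch, result_list_t *out) const {
    *epoch = activation_epoch;
    std::map<object_key_t, inconsistent_info_t>::const_iterator it =
        scrub_info.upper_bound(last);
    for (std::size_t n = 0; it != scrub_info.end() && n < page_size; ++it, ++n)
      out->push_back(*it);
    return !scrub_info.empty();  // simplification of "populated"
  }
};

// Collect every entry by repeatedly resuming after the last key seen.
result_list_t list_all(const MockPrimary &primary, epoch_t *epoch) {
  result_list_t all;
  object_key_t last;  // ("", "") sorts before every non-empty key
  for (;;) {
    result_list_t page;
    if (!primary.query(last, epoch, &page) || page.empty())
      break;
    last = page.back().first;     // cursor for the next call
    all.splice(all.end(), page);  // append this page to the result
  }
  return all;
}
```

A real client would presumably also compare the epoch returned by each call and restart the listing if it changes, since results from different intervals should not be mixed.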

repair_inconsistent_object will return -EAGAIN (or some other error) if the supplied activation_epoch is older than the activation epoch of the primary's current interval. This ensures that the replica_to_use value is not out of date.

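The staleness check described above could look roughly like the following. The helper name @check_repair_epoch@ and the @unsigned@ stand-in for @epoch_t@ are assumptions for illustration; the blueprint only specifies that a request carrying an older activation epoch is rejected, so the sketch accepts anything else.

```cpp
#include <cassert>
#include <cerrno>

typedef unsigned epoch_t;  // hypothetical stand-in for the real epoch type

// Hypothetical helper: reject a repair request whose activation epoch
// predates the primary's current interval, so that a replica choice made
// against an older interval is never acted on.
int check_repair_epoch(epoch_t request_epoch, epoch_t current_activation_epoch) {
  if (request_epoch < current_activation_epoch)
    return -EAGAIN;  // caller should re-run query_inconsistent_pg and retry
  return 0;
}
```

On -EAGAIN the client would be expected to query the pg again, pick a replica based on the fresh results, and resubmit with the new epoch.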
We'll also need interfaces to allow reading based on replica_t, but we don't want to duplicate all of the current read interfaces. Possibly an ioctx method to set replica_to_use, pg_to_repair, and activation_epoch?

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3