Scrub repair » History » Version 1

Samuel Just, 06/12/2015 07:51 PM

osd: Scrub and Repair
Summary
The current scrub and repair mechanisms are fairly primitive.  Several improvements are needed:
1) There needs to be a way to query the results of the most recent scrub on a pg.
2) The user should be able to query the contents of the replica objects in the event of an inconsistency (including data payload, xattrs, and omap).  This probably co-opts the existing replica read machinery.
3) The user should be able to specify which replica to use for repair using the above information.

Owners
Samuel Just (Red Hat)

Interested Parties
Guang Yang (Yahoo!)
Loic Dachary (Red Hat)
Danny Al-Gaaf (Deutsche Telekom)

Current Status

Scrub and repair mechanisms already exist; this blueprint aims to expand and improve them.
Detailed Description
On the osd side, the first change is that the primary needs to track the inconsistency information as scrub progresses.  As this might involve a large number of objects (though probably not), we do not want to keep this in memory.  I suggest storing the information in a per-pg scratch object which is cleared during peering reset.  The machinery used for the SnapMapper object can be re-used to handle maintaining a cache of unstable keys.
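As a rough illustration of the lifecycle described above, here is a minimal sketch of a per-pg scratch store that accumulates inconsistency records during scrub and is wiped on peering reset.  Every name in it (ScrubScratch, inconsistency_record_t, note_inconsistent, on_peering_reset) is hypothetical and not taken from the Ceph tree, and the std::map only stands in for the keys of the on-disk scratch object; a real implementation would persist the records rather than hold them in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these names come from the Ceph tree.
typedef std::pair<std::string, std::string> object_key_t;  // (locator, object)

struct inconsistency_record_t {
  std::string description;  // free-form note on what disagreed
};

// Models the per-pg scratch object as an ordered key/value map.
// Assumption: the map stands in for on-disk keys; nothing here is persisted.
class ScrubScratch {
 public:
  // Record (or overwrite) the inconsistency found for one object during scrub.
  void note_inconsistent(const object_key_t &obj, const std::string &what) {
    inconsistency_record_t rec;
    rec.description = what;
    records_[obj] = rec;
  }

  // Drop everything; meant to be called when the pg goes through a peering reset.
  void on_peering_reset() { records_.clear(); }

  bool empty() const { return records_.empty(); }
  std::size_t size() const { return records_.size(); }

 private:
  std::map<object_key_t, inconsistency_record_t> records_;
};
```

Keeping the records ordered by (locator, object) is a design choice of this sketch; it makes a "list objects > last" style query straightforward.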
 
Next, I suggest adding a librados interface to ferry that information out to the user:
 
/// get currently inconsistent pgs
void get_inconsistent_pgs(
  pg_t last,        ///< [in] list pgs > last
  list<pg_t> *out   ///< [out] listed pgs
  );
 
/// get information about inconsistent objects in a pg
bool query_inconsistent_pg(
  pg_t to_query,              ///< [in] pg to query
  pair<string, string> last,  ///< [in] begin listing objects > last, (locator, object)
  epoch_t *activation_epoch,  ///< [out] activation epoch for the interval in which this query was serviced
  list<pair<pair<string, string>, inconsistent_info_t> > *out ///< [out] listed inconsistency information
  ); ///< @return true iff primary has a populated scrub info structure
 
inconsistent_info_t will include all relevant information about each inconsistent object.
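To show how a client might page through the results using the `last` cursor, here is a minimal sketch against an in-memory mock.  MockPrimary, object_key_t, info_list_t, page_size, and list_all_inconsistent are hypothetical names introduced only for illustration; the mock drops the pg_t argument, and the single summary field in inconsistent_info_t is a placeholder, since its real contents are not defined here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the types named in the proposed interface.
typedef unsigned epoch_t;
typedef std::pair<std::string, std::string> object_key_t;  // (locator, object)
struct inconsistent_info_t { std::string summary; };       // placeholder contents
typedef std::list<std::pair<object_key_t, inconsistent_info_t> > info_list_t;

// Mock of the primary's scrub info for one pg (the pg_t argument is omitted).
struct MockPrimary {
  bool populated;             // does a scrub info structure exist?
  epoch_t activation_epoch;   // activation epoch of the current interval
  std::map<object_key_t, inconsistent_info_t> scrub_info;
  std::size_t page_size;      // assumed cap on entries per reply

  // Same shape as the proposed query_inconsistent_pg: list objects > last.
  bool query_inconsistent_pg(const object_key_t &last, epoch_t *epoch,
                             info_list_t *out) const {
    if (!populated) return false;
    *epoch = activation_epoch;
    std::map<object_key_t, inconsistent_info_t>::const_iterator it =
        scrub_info.upper_bound(last);
    for (std::size_t n = 0; it != scrub_info.end() && n < page_size; ++it, ++n)
      out->push_back(*it);
    return true;
  }
};

// Fetch every page, advancing the cursor to the last key returned each time.
// Returns false if there is no scrub info or the epoch changed mid-listing.
bool list_all_inconsistent(const MockPrimary &p, epoch_t *epoch,
                           info_list_t *all) {
  object_key_t last;  // assumption: the empty pair sorts before every real key
  bool first = true;
  for (;;) {
    info_list_t page;
    epoch_t e = 0;
    if (!p.query_inconsistent_pg(last, &e, &page)) return false;
    if (first) { *epoch = e; first = false; }
    else if (e != *epoch) return false;  // interval changed; caller restarts
    if (page.empty()) return true;       // nothing past the cursor: done
    last = page.back().first;
    all->splice(all->end(), page);
  }
}
```

Comparing the epoch across pages is this sketch's way of noticing that the listing spans more than one interval; the returned epoch is the value a caller would later pass to repair_inconsistent_object.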
 
/// allows directed repair of an object
void repair_inconsistent_object(
  pg_t pg_to_repair,              ///< [in] pg in which we want to perform a repair
  pair<string, string> to_repair, ///< [in] object to repair
  replica_t replica_to_use,       ///< [in] replica to use for repair
  epoch_t activation_epoch,       ///< [in] from query_inconsistent_pg
  ...                             ///< stuff to support async?
  );
 
repair_inconsistent_object will return -EAGAIN (or some other error) if activation_epoch is prior to the activation epoch of the primary's current interval.  This ensures that the replica_to_use value is not out of date.
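The staleness check could look roughly like the following sketch.  PrimaryState and check_repair_epoch are hypothetical names, and -EAGAIN is only one of the possible error codes; the sketch rejects only requests whose epoch is older than the primary's, which is the one case described above.

```cpp
#include <cassert>
#include <cerrno>

typedef unsigned epoch_t;  // stand-in for the real epoch type

// Hypothetical primary-side state: when the current interval was activated.
struct PrimaryState {
  epoch_t interval_activation_epoch;
};

// Returns 0 if the repair request may proceed, or -EAGAIN if the caller's
// activation_epoch predates the primary's current interval.
int check_repair_epoch(const PrimaryState &primary,
                       epoch_t request_activation_epoch) {
  if (request_activation_epoch < primary.interval_activation_epoch)
    return -EAGAIN;  // caller should re-run query_inconsistent_pg and retry
  return 0;
}
```

On -EAGAIN the client would presumably re-query the pg, pick a replica again from the fresh information, and resubmit with the new epoch.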
 
We'll also need interfaces to allow reading based on replica_t, but we don't want to duplicate all of the current read interfaces.  Possibly an ioctx method to set replica_to_use, pg_to_repair, and activation_epoch?
Work items
Coding tasks
Task 1
Task 2
Task 3
Build / release tasks
Task 1
Task 2
Task 3
Documentation tasks
Task 1
Task 2
Task 3
Deprecation tasks
Task 1
Task 2
Task 3