Scrub repair » History » Version 3

Li Wang, 06/20/2015 01:42 AM

osd: Scrub and Repair

Summary
Current scrub and repair is fairly primitive. Several improvements need to be made:

1) There needs to be a way to query the results of the most recent scrub on a pg.
2) The user should be able to query the contents of the replica objects in the event of an inconsistency (including data payload, xattrs, and omap). This can probably co-opt the existing replica read machinery.
3) Using the above information, the user should be able to specify which replica to use for repair.

Owners
Samuel Just (Red Hat)

Interested Parties
Guang Yang (Yahoo!)
Loic Dachary (Red Hat)
Danny Al-Gaaf (Deutsche Telekom)
Min Chen (UbuntuKylin)

Current Status
Scrub and repair mechanisms already exist; this blueprint aims to expand and improve them.

Detailed Description
On the osd side, the first change is that the primary needs to track the inconsistency information as scrub progresses. Since this might involve a large number of objects (though it probably will not), we do not want to keep it in memory. I suggest storing the information in a per-pg scratch object which is cleared during peering reset. The machinery used for the SnapMapper object can be reused to maintain a cache of unstable keys.
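
To make the bookkeeping concrete, here is a minimal sketch of what the scratch object needs to support: record an entry as scrub finds it, list entries after a cursor, and drop everything on peering reset. The names ScrubScratch, record, list_after, and clear_on_peering_reset are hypothetical, and the std::map is only an in-memory stand-in for the on-disk scratch object so that the example is self-contained; it is not how the OSD would store this.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// (locator, object), the same pair shape used by the proposed librados calls.
typedef std::pair<std::string, std::string> object_key_t;
// One listed entry: key plus a free-form description. The string is a
// placeholder for whatever inconsistent_info_t ends up containing.
typedef std::pair<object_key_t, std::string> scratch_entry_t;

// Hypothetical stand-in for the per-pg scratch object. A real OSD would keep
// these keys in an on-disk object; std::map is used only for illustration.
class ScrubScratch {
 public:
  // Called by the primary each time scrub finds an inconsistent object.
  void record(const object_key_t &key, const std::string &info) {
    entries_[key] = info;
  }

  // Called on peering reset: results from the old interval are discarded.
  void clear_on_peering_reset() { entries_.clear(); }

  // List up to max entries whose key sorts strictly after last.
  std::list<scratch_entry_t> list_after(const object_key_t &last,
                                        std::size_t max) const {
    assert(max > 0);  // a zero-sized page would never make progress
    std::list<scratch_entry_t> out;
    std::map<object_key_t, std::string>::const_iterator it =
        entries_.upper_bound(last);
    for (; it != entries_.end() && out.size() < max; ++it) {
      out.push_back(scratch_entry_t(it->first, it->second));
    }
    return out;
  }

  bool empty() const { return entries_.empty(); }

 private:
  std::map<object_key_t, std::string> entries_;  // sorted by (locator, object)
};
```

Keeping the entries sorted by (locator, object) is what lets a listing resume from the last key a caller has already seen.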
 
Next, I suggest adding a librados interface to ferry that information out to the user:
 
/// get currently inconsistent pgs
void get_inconsistent_pgs(
  pg_t last,        ///< [in] list pgs > last
  list<pg_t> *out   ///< [out] listed pgs
  );
 
/// get information about inconsistent objects in a pg
bool query_inconsistent_pg(
  pg_t to_query,              ///< [in] pg to query
  pair<string, string> last,  ///< [in] begin listing objects > last, (locator, object)
  epoch_t *activation_epoch,  ///< [out] activation epoch for the interval in which this query was serviced
  list<pair<pair<string, string>, inconsistent_info_t> > *out ///< [out] listed inconsistency information
  ); ///< @return true iff primary has a populated scrub info structure
 
inconsistent_info_t will include all relevant information about each inconsistent object.
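
As a rough illustration of how a client might use the "last" cursor, the sketch below pages through every inconsistent object in one pg. Everything here is an assumption made for the example: the placeholder type definitions, the query_fn_t callable standing in for query_inconsistent_pg on a fixed pg, the idea that an empty key sorts before every object, and the convention that an empty page marks the end of the listing. None of these are settled by this blueprint.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Placeholder types; the real definitions are not given in this blueprint.
using epoch_t = unsigned;
using object_key_t = std::pair<std::string, std::string>;  // (locator, object)
struct inconsistent_info_t { std::string summary; };
using entry_t = std::pair<object_key_t, inconsistent_info_t>;

// Hypothetical callable with the shape of query_inconsistent_pg for one
// fixed pg: (last, &activation_epoch, &out) -> primary has scrub info?
using query_fn_t =
    std::function<bool(const object_key_t &, epoch_t *, std::list<entry_t> *)>;

// Page through every inconsistent object using the "last" cursor.
// Returns false if the primary has no populated scrub info, or if the
// activation epoch changes between pages (the caller should start over).
bool list_all_inconsistent(const query_fn_t &query,
                           epoch_t *activation_epoch,
                           std::vector<entry_t> *all) {
  assert(activation_epoch && all);
  object_key_t last;  // assumption: an empty key sorts before every object
  bool first_page = true;
  for (;;) {
    std::list<entry_t> page;
    epoch_t page_epoch = 0;
    if (!query(last, &page_epoch, &page))
      return false;
    if (!first_page && page_epoch != *activation_epoch)
      return false;  // interval changed mid-listing; results may be stale
    *activation_epoch = page_epoch;
    first_page = false;
    if (page.empty())
      return true;  // assumption: an empty page marks the end of the listing
    last = page.back().first;
    all->insert(all->end(), page.begin(), page.end());
  }
}
```

The epoch collected here is the value a caller would later hand to repair_inconsistent_object.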
 
/// allows directed repair of an object
void repair_inconsistent_object(
  pg_t pg_to_repair,               ///< [in] pg in which we want to perform a repair
  pair<string, string> to_repair,  ///< [in] object to repair
  replica_t replica_to_use,        ///< [in] replica to use for repair
  epoch_t activation_epoch,        ///< [in] from query_inconsistent_pg
  ...                              ///< stuff to support async?
  );
 
The return machinery for repair_inconsistent_object will report -EAGAIN (or some other error) if activation_epoch is prior to the activation epoch of the primary's current interval. This ensures that the replica_to_use value is not out of date.
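
The staleness check itself is small. A minimal sketch, assuming epoch_t is an unsigned integer and using a hypothetical helper name check_repair_epoch; only the "prior to the current interval" case from the text above is handled:

```cpp
#include <cassert>
#include <cerrno>

using epoch_t = unsigned;  // placeholder; the real type is defined elsewhere

// Hypothetical primary-side check run before acting on a repair request.
// request_epoch is the activation_epoch the client got from
// query_inconsistent_pg; current_epoch is the activation epoch of the
// primary's current interval.
int check_repair_epoch(epoch_t request_epoch, epoch_t current_epoch) {
  if (request_epoch < current_epoch)
    return -EAGAIN;  // the client's view predates this interval: query again
  return 0;          // replica_to_use was chosen in the current interval
}
```

On -EAGAIN the client would be expected to rerun query_inconsistent_pg and choose a replica again before retrying.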
 
We'll also need interfaces to allow reading based on replica_t, but we don't want to duplicate all of the current read interfaces. Possibly an ioctx method to set replica_to_use, pg_to_repair, and activation_epoch?
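
One possible shape for that idea, purely as a sketch: the context carries an optional read target, and the existing read calls would consult it instead of each growing a replica_t parameter. ReplicaReadContext, replica_read_target_t, and the method names are hypothetical, as are the placeholder definitions of pg_t, replica_t, and epoch_t; this is not an existing librados interface.

```cpp
#include <cassert>

// Placeholder types; the real definitions are not given in this blueprint.
using epoch_t = unsigned;
using replica_t = int;
struct pg_t { unsigned long pool; unsigned seed; };

// The three values the text proposes setting on the ioctx.
struct replica_read_target_t {
  pg_t pg;                   // pg_to_repair
  replica_t replica;         // replica_to_use
  epoch_t activation_epoch;  // from query_inconsistent_pg
};

// Hypothetical per-context state. While a target is set, reads issued
// through the context would be directed at that replica.
class ReplicaReadContext {
 public:
  ReplicaReadContext() : directed_(false), target_() {}

  void set_replica_read_target(const replica_read_target_t &t) {
    target_ = t;
    directed_ = true;
  }
  void clear_replica_read_target() { directed_ = false; }

  bool is_directed() const { return directed_; }
  const replica_read_target_t &target() const {
    assert(directed_);  // only meaningful while a target is set
    return target_;
  }

 private:
  bool directed_;
  replica_read_target_t target_;
};
```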