Diagnosability » History » Version 1

Jessica Mack, 06/23/2015 02:58 AM

h1. Diagnosability

h3. Summary
 
Make it easier to diagnose problems such as inconsistent placement groups (PGs) in a cluster. Ideas include, but are not limited to:

* Store the state of scrub errors for later querying via the CLI or HTTP
* Explain what pg repair and osd repair actually do
* Possibly parameterize pg repair and osd repair, so that resolution choices that are currently arbitrary can be controlled
* Add a central audit log containing "admin operations that change the cluster" (much smaller than ceph.log, intended for manual audit)
 
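As a rough illustration of the first idea in the list above, the sketch below keeps scrub errors in a small in-memory store that can be queried per PG. Everything in it is an assumption made for illustration: the names ScrubError and ScrubErrorStore, the field names, the error labels, and the example PG and object IDs are hypothetical and do not reflect any existing Ceph interface.

```python
from collections import defaultdict
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ScrubError:
    """One scrub finding. All fields are illustrative, not a Ceph schema."""
    pgid: str         # placement group ID, e.g. "2.1f"
    object_name: str  # object the scrub flagged
    osd: int          # OSD holding the suspect copy
    kind: str         # free-form label, e.g. "digest_mismatch"


class ScrubErrorStore:
    """Hypothetical in-memory store, indexed by PG for later querying."""

    def __init__(self):
        self._by_pg = defaultdict(list)

    def record(self, error):
        self._by_pg[error.pgid].append(error)

    def query(self, pgid):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._by_pg.get(pgid, []))

    def inconsistent_pgs(self):
        return sorted(pg for pg, errs in self._by_pg.items() if errs)

    def to_json(self, pgid):
        # Shape a response that a CLI or HTTP handler could return as-is.
        return json.dumps([asdict(e) for e in self.query(pgid)], sort_keys=True)


store = ScrubErrorStore()
store.record(ScrubError("2.1f", "rbd_data.1234", 3, "digest_mismatch"))
store.record(ScrubError("2.1f", "rbd_data.5678", 5, "missing"))
store.record(ScrubError("0.a", "obj1", 1, "size_mismatch"))
print(store.inconsistent_pgs())
print(store.to_json("0.a"))
```

Indexing by PG keeps the common question ("what is wrong with this PG?") cheap to answer. A real implementation would also need persistence and expiry of stale entries, which this sketch leaves out.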
 
h3. Owners

* Dan Mick (Inktank)

h3. Interested Parties

* Dan Mick (Inktank)
* Danny Al-Gaaf (Deutsche Telekom AG)
* Loic Dachary <loic@dachary.org>
* Name

h3. Current Status

Today, details of scrub errors and the affected object names are available only in the central log; a query interface would be a good first step. In addition, it is unclear what pg repair and osd repair actually do, and their behavior may be suboptimal. As a start, they should be clearly documented; they could then be improved or parameterized, since automatic repair ideally needs administrator policy input.

h3. Detailed Description
 
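One way to picture the "administrator policy input" mentioned under Current Status is an explicit policy value that decides which replica a repair treats as authoritative. The sketch below is only that: the RepairPolicy names, the choose_authoritative function, and the digest map are hypothetical, and nothing here describes what pg repair or osd repair currently do.

```python
from collections import Counter
from enum import Enum


class RepairPolicy(Enum):
    # Hypothetical policy names; these are not existing Ceph options.
    PRIMARY_WINS = "primary_wins"
    MAJORITY = "majority"
    MANUAL = "manual"


def choose_authoritative(digests, primary, policy):
    """Pick the OSD whose copy should be treated as authoritative.

    digests maps OSD id -> checksum of that OSD's copy of the object.
    Returns an OSD id, or None when the policy defers to the administrator.
    """
    if not digests:
        return None
    if policy is RepairPolicy.PRIMARY_WINS:
        return primary
    if policy is RepairPolicy.MAJORITY:
        digest, votes = Counter(digests.values()).most_common(1)[0]
        if votes * 2 <= len(digests):
            return None  # no strict majority, so do not guess
        # Lowest OSD id among the majority keeps the result deterministic.
        return min(osd for osd, d in digests.items() if d == digest)
    return None  # MANUAL: never repair without an explicit decision


copies = {0: "abc", 1: "abc", 2: "xyz"}
print(choose_authoritative(copies, primary=2, policy=RepairPolicy.MAJORITY))
```

Making the policy an explicit argument means the choice is visible and auditable rather than implicit, and returning None leaves room for cases where no automatic resolution is safe.
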
h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3