Bug #14521
Failure on restart after repairing corrupted PG
Status: closed
% Done: 0%
Source: other
Tags: repair, meta, corruption, file exists
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Test case:
1. Start 2 OSDs with 8 PGs and pool size = 2.
2. Run a write workload for some time.
3. Stop the workload.
4. rm -rf dev/osd0/current/0.2_head/*
5. ceph osd scrub 0
6. ceph pg repair 0.2
7. Restart the OSD.
8. Observe "error (17) File exists not handled on operation".
The root cause is that the "head" meta file is not restored by pg repair, so
all omap_get/setkeys operations fail for that PG.
On restart, load_pgs skips that PG because it cannot read its metadata. Later, when
the OSD tries to recreate the PG, it hits the error from the test case, because
all the data files are already in place, restored by repair.
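
The sequence described above can be illustrated with a small toy model. This is only a sketch using plain directories and files, not Ceph code: the file name `head_meta`, the object names, and the functions `load_pgs` and `simulate_restart_after_repair` are hypothetical stand-ins chosen for illustration. It assumes that "repair" restores data files but not the meta file, and that PG creation fails when the PG directory already exists.

```python
import errno
import os
import tempfile

# Hypothetical stand-in for the PG's "head" meta file (not the real on-disk name).
META = "head_meta"
DATA = ("obj1", "obj2")  # hypothetical data objects


def write_file(path):
    with open(path, "w") as f:
        f.write("x")


def load_pgs(root):
    """Toy load_pgs: return PGs whose meta file is readable, skip the rest."""
    return [pg for pg in sorted(os.listdir(root))
            if os.path.isfile(os.path.join(root, pg, META))]


def simulate_restart_after_repair():
    with tempfile.TemporaryDirectory() as root:
        pg_dir = os.path.join(root, "0.2_head")
        os.mkdir(pg_dir)
        for name in (META,) + DATA:
            write_file(os.path.join(pg_dir, name))

        # Corruption: remove the directory contents (like rm -rf 0.2_head/*).
        for name in os.listdir(pg_dir):
            os.remove(os.path.join(pg_dir, name))

        # Assumed repair behaviour: data files come back, the meta file does not.
        for name in DATA:
            write_file(os.path.join(pg_dir, name))

        # Restart: the PG is skipped because its metadata is unreadable...
        loaded = load_pgs(root)

        # ...and recreating it fails because the directory is still on disk.
        try:
            os.mkdir(pg_dir)
            err = None
        except FileExistsError as e:
            err = e.errno
        return loaded, err


if __name__ == "__main__":
    loaded, err = simulate_restart_after_repair()
    print("loaded PGs:", loaded)
    print("create error:", err, errno.errorcode.get(err))
```

In this model the PG is both missing from the loaded set and impossible to create, which matches the combination of conditions described in the report.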