Bug #17444
closed
Data object missing xattr attributes on primary OSD will not be discovered by scrub/deep-scrub
Added by Cheng Li Yi over 7 years ago.
Updated almost 7 years ago.
Description
I ran a test on my Ceph test environment (Ceph 0.94.9): I removed the xattr attributes of a data object, then ran scrub, deep-scrub, and repair on the PG. Nothing happened and everything appeared to be OK: no error was reported and the xattr attributes were not repaired.
root@node-1:~# ceph osd map images rbd_data.124d434b25e0.0000000000000001
osdmap e147 pool 'images' (3) object 'rbd_data.124d434b25e0.0000000000000001' -> pg 3.611702fb (3.fb) -> up ([1,0], p1) acting ([1,0], p1)
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head# mv rbd\\udata.124d434b25e0.0000000000000001__head_611702FB__3 /tmp/
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head# cp /tmp/rbd\\udata.124d434b25e0.0000000000000001__head_611702FB__3 ./
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head# xattr -l rbd\\udata.124d434b25e0.0000000000000001__head_611702FB__3
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head# xattr -l /tmp/rbd\\udata.124d434b25e0.0000000000000001__head_611702FB__3
user.cephos.spill_out:
0000 30 00
...
2016-09-30 04:03:48.798440 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb scrub starts
2016-09-30 04:03:48.800012 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb scrub ok
2016-09-30 04:04:01.801601 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb deep-scrub starts
2016-09-30 04:04:01.806269 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb deep-scrub ok
2016-09-30 04:04:04.802212 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb repair starts
2016-09-30 04:04:04.807045 7f5f28dea700 0 log_channel(cluster) log [INF] : 3.fb repair ok, 0 fixed
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head# xattr -l rbd\\udata.124d434b25e0.0000000000000001__head_611702FB__3
root@node-8:/var/lib/ceph/osd/ceph-1/current/3.fb_head#
root@node-1:~# ceph health detail
HEALTH_OK
- Project changed from Stable releases to Ceph
- Tracker changed from Tasks to Bug
- Assignee set to David Zafman
David, don't we checksum across all the metadata? Or is that a post-hammer bug?
- Status changed from New to Rejected
We've seen this before. If the OSD still has the original (now unlinked) file open, it keeps reading through that file descriptor and won't see the new copy that lacks the xattrs.
If the user restarts the OSD that holds the corrupted object, the scrub will see the corruption.
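The effect described above can be reproduced outside Ceph. The following is a minimal sketch, not OSD code: a process opens a file, the file is then moved aside and copied back (as in the reproduction steps), and the already-open descriptor still refers to the original inode rather than the new copy. The function name `demo_stale_fd` and the file names are hypothetical, chosen only for illustration.

```python
import os
import shutil
import tempfile


def demo_stale_fd():
    """Return (inode seen through the old fd, inode seen through the path)."""
    with tempfile.TemporaryDirectory() as workdir:
        obj_path = os.path.join(workdir, "object")
        moved_path = os.path.join(workdir, "object.moved")
        with open(obj_path, "wb") as f:
            f.write(b"payload")

        # A long-running process (standing in for the OSD) keeps the file open.
        fd = os.open(obj_path, os.O_RDONLY)
        try:
            # The operator moves the file away and copies it back.
            os.rename(obj_path, moved_path)
            # copyfile copies data only, so the new file has no xattrs.
            shutil.copyfile(moved_path, obj_path)

            ino_via_fd = os.fstat(fd).st_ino        # original inode
            ino_via_path = os.stat(obj_path).st_ino  # freshly created copy
        finally:
            os.close(fd)
        return ino_via_fd, ino_via_path


if __name__ == "__main__":
    old_ino, new_ino = demo_stale_fd()
    print("fd still points at the original file:", old_ino != new_ino)
```

Anything read through the old descriptor, xattrs included, comes from the original file, which is consistent with scrub reporting no error until the OSD is restarted and reopens the object by path.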