Bug #19267
rados list-inconsistent-obj sometimes doesn't flag that all 3 copies are bad
Description
I tested Ceph 10.2.3 on a cluster of 3 OSD nodes.
I uploaded a text file to the cluster, then manually changed its content on each of the OSD nodes.
After a deep-scrub, Ceph reports an inconsistency error. But rados list-inconsistent-obj <pg> doesn't say that all copies are bad; it flags only two of them.
steps:
1. upload text file to ceph
rados -p rbd_ssd put testfile testfile
2. get the file location on osds
root@cheng-ceph1:~# ceph osd map rbd_ssd testfile
osdmap e97 pool 'rbd_ssd' (1) object 'testfile' -> pg 1.551a2b36 (1.36) -> up ([1,0,2], p1) acting ([1,0,2], p1)
3. make all 3 copies bad by updating /var/lib/ceph/osd/ceph-x/current/1.36_head/testfile__head_551A2B36__1
4. trigger deep-scrub
5. now ceph reports an inconsistency error
root@cheng-ceph3:~# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 3 scrub errors
pg 1.36 is active+clean+inconsistent, acting [1,0,2]
3 scrub errors
6. but list-inconsistent-obj flags only two copies as bad. In fact, all 3 copies are bad and each has a different size from the original text file, which has only 18 chars.
root@cheng-ceph3:~# rados -p rbd_ssd list-inconsistent-obj 1.36 | python -m json.tool
{
    "epoch": 95,
    "inconsistents": [
        {
            "errors": [
                "size_mismatch"
            ],
            "object": {
                "locator": "",
                "name": "testfile",
                "nspace": "",
                "snap": "head"
            },
            "shards": [
                {
                    "data_digest": "0xa3ba020a",
                    "errors": [
                        "size_mismatch"
                    ],
                    "omap_digest": "0xffffffff",
                    "osd": 0,
                    "size": 21
                },
                {
                    "data_digest": "0xa3ba020a",
                    "errors": [
                        "size_mismatch"
                    ],
                    "omap_digest": "0xffffffff",
                    "osd": 1,
                    "size": 22
                },
                {
                    "data_digest": "0xa3ba020a",
                    "errors": [],
                    "omap_digest": "0xffffffff",
                    "osd": 2,
                    "size": 23
                }
            ]
        }
    ]
}
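To make the mismatch easier to spot, here is a rough Python sketch that scans a list-inconsistent-obj report for shards that carry no per-shard error even though no majority of shards agrees with their size. The function name suspicious_shards and the majority rule are my own assumptions for illustration; this is not how Ceph itself selects the authoritative copy. The embedded JSON is the report from step 6.

```python
import json
from collections import Counter

# Report from step 6, reduced to one line per shard.
REPORT_JSON = """
{"epoch": 95, "inconsistents": [{
  "errors": ["size_mismatch"],
  "object": {"locator": "", "name": "testfile", "nspace": "", "snap": "head"},
  "shards": [
    {"data_digest": "0xa3ba020a", "errors": ["size_mismatch"],
     "omap_digest": "0xffffffff", "osd": 0, "size": 21},
    {"data_digest": "0xa3ba020a", "errors": ["size_mismatch"],
     "omap_digest": "0xffffffff", "osd": 1, "size": 22},
    {"data_digest": "0xa3ba020a", "errors": [],
     "omap_digest": "0xffffffff", "osd": 2, "size": 23}
  ]}]}
"""


def suspicious_shards(report):
    """Return (object name, osd id, size) for every shard that has an empty
    "errors" list although its size is not shared by a strict majority of
    the shards of that object (hypothetical heuristic, not Ceph's logic)."""
    found = []
    for item in report["inconsistents"]:
        shards = item["shards"]
        size_counts = Counter(shard["size"] for shard in shards)
        for shard in shards:
            has_majority = size_counts[shard["size"]] * 2 > len(shards)
            if not shard["errors"] and not has_majority:
                found.append((item["object"]["name"], shard["osd"], shard["size"]))
    return found


if __name__ == "__main__":
    for name, osd, size in suspicious_shards(json.loads(REPORT_JSON)):
        print(f"{name}: osd.{osd} has size {size} and no error, "
              f"but no majority of shards shares that size")
```

On the report above this flags the copy on osd.2: its size (23) matches neither of the other two copies, yet its "errors" list is empty.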
Another thing I don't understand is why Ceph doesn't block a user from putting the object even when all 3 copies are bad.