Support #23971
omap_digest_mismatch_oi
Description
Following a strange RGW GC backlog issue reported in https://tracker.ceph.com/issues/23839, I have some PGs in the buckets.index pool that are inconsistent. The 3 replicas hold the same data in the omap listing when dumped using `ceph-objectstore-tool`, and the omap_digest for each replica also looks to be the same, but the PG still reports an omap_digest_mismatch_oi error, and I am not sure how to troubleshoot this further.
The details of the error are below:
- rados list-inconsistent-obj 15.13a --format=json-pretty
{
    "epoch": 1232493,
    "inconsistents": [
        {
            "object": {
                "name": ".dir.default.142205237.23.4",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 3693649
            },
            "errors": [],
            "union_shard_errors": [
                "omap_digest_mismatch_oi"
            ],
            "selected_object_info": "15:5cb389b5:::.dir.default.142205237.23.4:head(1223775'3712706 osd.644.0:7107793 dirty|omap|data_digest|omap_digest s 0 uv 3693649 dd ffffffff od 1a82fea2 alloc_hint [0 0])",
            "shards": [
                {
                    "osd": 304,
                    "errors": [
                        "omap_digest_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x4efed260",
                    "data_digest": "0xffffffff"
                },
                {
                    "osd": 591,
                    "errors": [
                        "omap_digest_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x4efed260",
                    "data_digest": "0xffffffff"
                },
                {
                    "osd": 644,
                    "errors": [
                        "omap_digest_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x4efed260",
                    "data_digest": "0xffffffff"
                }
            ]
        }
    ]
}
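To make the mismatch in this output easier to see, here is a minimal Python sketch that compares each shard's computed omap_digest against the `od` value recorded in `selected_object_info`. The function name `compare_omap_digests` and the `SAMPLE` string are hypothetical, and the sketch assumes `selected_object_info` is a plain string containing `od <hex>`, as it is in the output above (other Ceph releases may format it differently).

```python
import json
import re

# Trimmed copy of the report above; only the fields the sketch reads are kept.
SAMPLE = """
{
  "inconsistents": [{
    "object": {"name": ".dir.default.142205237.23.4"},
    "selected_object_info": "15:5cb389b5:::.dir.default.142205237.23.4:head(1223775'3712706 osd.644.0:7107793 dirty|omap|data_digest|omap_digest s 0 uv 3693649 dd ffffffff od 1a82fea2 alloc_hint [0 0])",
    "shards": [
      {"osd": 304, "omap_digest": "0x4efed260"},
      {"osd": 591, "omap_digest": "0x4efed260"},
      {"osd": 644, "omap_digest": "0x4efed260"}
    ]
  }]
}
"""


def compare_omap_digests(report):
    """Return one summary dict per inconsistent object (hypothetical helper)."""
    results = []
    for item in report.get("inconsistents", []):
        # Assumption: the recorded digest appears as "od <hex>" in the string.
        match = re.search(r"\bod ([0-9a-f]+)\b", item.get("selected_object_info", ""))
        recorded = int(match.group(1), 16) if match else None
        shard_digests = {
            shard["osd"]: int(shard["omap_digest"], 16)
            for shard in item.get("shards", [])
            if "omap_digest" in shard
        }
        distinct = set(shard_digests.values())
        results.append({
            "name": item["object"]["name"],
            "recorded": recorded,
            "shard_digests": shard_digests,
            "replicas_agree": len(distinct) == 1,
            "matches_recorded": distinct == {recorded},
        })
    return results


if __name__ == "__main__":
    for row in compare_omap_digests(json.loads(SAMPLE)):
        print(row["name"], "replicas agree:", row["replicas_agree"],
              "match recorded digest:", row["matches_recorded"])
```

For the report above, all three shards agree with each other (0x4efed260) while none matches the digest recorded in the object info (1a82fea2), which is consistent with every shard carrying omap_digest_mismatch_oi.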
The log from a `ceph pg repair` attempt is below:
2018-05-02 19:01:21.098300 7fddbda87700 -1 log_channel(cluster) log [ERR] : 15.13a shard 304: soid 15:5cb389b5:::.dir.default.142205237.23.4:head omap_digest 0x4efed260 != omap_digest 0x1a82fea2 from auth oi 15:5cb389b5:::.dir.default.142205237.23.4:head(1223775'3712706 osd.644.0:7107793 dirty|omap|data_digest|omap_digest s 0 uv 3693649 dd ffffffff od 1a82fea2 alloc_hint [0 0])
2018-05-02 19:01:21.098452 7fddbda87700 -1 log_channel(cluster) log [ERR] : 15.13a shard 591: soid 15:5cb389b5:::.dir.default.142205237.23.4:head omap_digest 0x4efed260 != omap_digest 0x1a82fea2 from auth oi 15:5cb389b5:::.dir.default.142205237.23.4:head(1223775'3712706 osd.644.0:7107793 dirty|omap|data_digest|omap_digest s 0 uv 3693649 dd ffffffff od 1a82fea2 alloc_hint [0 0])
2018-05-02 19:01:21.098457 7fddbda87700 -1 log_channel(cluster) log [ERR] : 15.13a shard 644: soid 15:5cb389b5:::.dir.default.142205237.23.4:head omap_digest 0x4efed260 != omap_digest 0x1a82fea2 from auth oi 15:5cb389b5:::.dir.default.142205237.23.4:head(1223775'3712706 osd.644.0:7107793 dirty|omap|data_digest|omap_digest s 0 uv 3693649 dd ffffffff od 1a82fea2 alloc_hint [0 0])
2018-05-02 19:01:21.098459 7fddbda87700 -1 log_channel(cluster) log [ERR] : 15.13a soid 15:5cb389b5:::.dir.default.142205237.23.4:head: failed to pick suitable auth object
2018-05-02 19:01:21.098524 7fddbda87700 -1 log_channel(cluster) log [ERR] : 15.13a repair 3 errors, 0 fixed
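The repair log lines can also be summarized programmatically. The following is a rough Python sketch, assuming the `[ERR]` line layout shown in this log (it may differ between Ceph releases); the pattern `LINE_RE` and the function `parse_mismatches` are hypothetical names introduced here for illustration.

```python
import re

# Assumption: matches the "shard N: soid ... omap_digest X != omap_digest Y
# from auth oi" layout seen in the repair log of this ticket.
LINE_RE = re.compile(
    r"shard (?P<osd>\d+): soid (?P<soid>\S+) "
    r"omap_digest (?P<observed>0x[0-9a-f]+) != "
    r"omap_digest (?P<recorded>0x[0-9a-f]+) from auth oi"
)


def parse_mismatches(lines):
    """Return (osd, soid, observed_digest, recorded_digest) per matching line."""
    found = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            found.append((
                int(m.group("osd")),
                m.group("soid"),
                int(m.group("observed"), 16),
                int(m.group("recorded"), 16),
            ))
    return found


if __name__ == "__main__":
    # Abbreviated sample lines; the trailing object-info text is omitted.
    sample = [
        "log [ERR] : 15.13a shard 304: soid 15:5cb389b5:::.dir.default.142205237.23.4:head "
        "omap_digest 0x4efed260 != omap_digest 0x1a82fea2 from auth oi",
        "log [ERR] : 15.13a repair 3 errors, 0 fixed",
    ]
    for osd, soid, observed, recorded in parse_mismatches(sample):
        print(f"osd.{osd} {soid}: observed {observed:#x}, recorded {recorded:#x}")
```

Grouping the parsed tuples by observed digest shows whether the replicas disagree with each other or only with the recorded object info; in this log they only disagree with the object info.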
Any advice on how to dig into this further would be very welcome.
History
#1 Updated by sean redmond almost 6 years ago
Maybe this is the same as http://tracker.ceph.com/issues/21388