Bug #20243

Updated by David Zafman over 5 years ago


Something similar to this was seen on a production system. If all of the object_info_t values matched, there would be no errors from list-inconsistent-obj.

<pre>
shard disk size oi size
0 1588 1588 0
1 1588 1588
2 1588 0 1588

{
"epoch": 17,
"inconsistents": [
{
"object": {
"name": "foo",
"nspace": "",
"locator": "",
"snap": "head",
"version": 1
},
"errors": [
"object_info_inconsistency",
"attr_value_mismatch"
],
"union_shard_errors": [],
"selected_object_info": "0:602f83fe:::foo:head(12'1 client.4111.0:1 dirty|data_digest|omap_digest s 1588 uv 1 dd a9a36536 od ffffffff alloc_hint [0 0 0])",
"shards": [
{
"osd": 0,
"errors": [],
"size": 1588,
"omap_digest": "0xffffffff",
"data_digest": "0xa9a36536",
"object_info": "0:602f83fe:::foo:head(12'1 client.4111.0:1 dirty|data_digest|omap_digest s 1588 uv 1 dd a9a36536 od ffffffff alloc_hint [0 0 0])"
},
{
"osd": 1,
"errors": [],
"size": 1588,
"omap_digest": "0xffffffff",
"data_digest": "0xa9a36536",
"object_info": "0:602f83fe:::foo:head(12'1 client.4111.0:1 dirty|data_digest|omap_digest s 1588 uv 1 dd a9a36536 od ffffffff alloc_hint [0 0 0])"
},
{
"osd": 2,
"errors": [],
"size": 1588,
"omap_digest": "0xffffffff",
"data_digest": "0xa9a36536",
"object_info": "0:602f83fe:::foo:head(12'1 client.4111.0:1 dirty|data_digest|omap_digest s 0 uv 1 dd a9a36536 od ffffffff alloc_hint [0 0 0])"
}
]
}
]
}
</pre>

Currently all we see is object_info_inconsistency and attr_value_mismatch, with no shard errors. Without snapshots there is no information from list-inconsistent-snapset, which includes some additional size checking.

In be_select_auth_object we should check a shard's disk size against its oi_size. A mismatch should be a new disk_size_shard error, which would make that shard less likely to be selected as the authoritative one.
We should ignore system xattrs when checking for attr_value_mismatch. We will ignore strange xattr keys and never report an attr_name_mismatch.
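
The proposed per-shard check can be illustrated outside the OSD against the JSON that list-inconsistent-obj already prints. This is only a rough sketch, not the be_select_auth_object change itself: the helper names are hypothetical, and it assumes the size appears in the object_info string as "s <bytes> uv", as it does in the output above.

```python
import re

# Assumption: the object_info string carries the size as "s <bytes> uv",
# matching the sample output in this report.
OI_SIZE_RE = re.compile(r"\bs (\d+) uv\b")


def oi_size(object_info):
    """Extract the size recorded in an object_info string, or None if absent."""
    match = OI_SIZE_RE.search(object_info)
    return int(match.group(1)) if match else None


def shards_with_own_oi_size_mismatch(inconsistent):
    """Return the OSD ids whose disk size differs from their own object_info size."""
    bad = []
    for shard in inconsistent["shards"]:
        size = oi_size(shard.get("object_info", ""))
        if size is not None and size != shard["size"]:
            bad.append(shard["osd"])
    return bad


# Trimmed copy of the "shards" array from the report above.
OI_PREFIX = "0:602f83fe:::foo:head(12'1 client.4111.0:1 dirty|data_digest|omap_digest"
OI_SUFFIX = "uv 1 dd a9a36536 od ffffffff alloc_hint [0 0 0])"
example = {
    "shards": [
        {"osd": 0, "size": 1588, "object_info": f"{OI_PREFIX} s 1588 {OI_SUFFIX}"},
        {"osd": 1, "size": 1588, "object_info": f"{OI_PREFIX} s 1588 {OI_SUFFIX}"},
        {"osd": 2, "size": 1588, "object_info": f"{OI_PREFIX} s 0 {OI_SUFFIX}"},
    ]
}

print(shards_with_own_oi_size_mismatch(example))  # [2]
```

With the data from this report only osd 2 is flagged, which is the shard a disk_size_shard error would mark.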

Already present in the code:
We have the object error size_mismatch when different shards don't have the same disk size (maybe rename it to disk_size_mismatch too?).
We have the shard error size_mismatch_oi, which like the other _oi errors means the disk size doesn't match the authoritative size.
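
To show why neither existing check fires for the output above, here is a small sketch of the two conditions as described. The function names are hypothetical and this is not the scrub code; it only restates the two rules using the sizes from the report.

```python
def object_size_mismatch(disk_sizes):
    """Object-level condition: the shards do not all have the same disk size."""
    return len(set(disk_sizes.values())) > 1


def shards_size_mismatch_oi(disk_sizes, auth_size):
    """Shard-level condition: OSD ids whose disk size differs from the authoritative size."""
    return sorted(osd for osd, size in disk_sizes.items() if size != auth_size)


# Disk sizes per OSD and the size from selected_object_info in the report.
disk_sizes = {0: 1588, 1: 1588, 2: 1588}
auth_size = 1588

print(object_size_mismatch(disk_sizes))                # False
print(shards_size_mismatch_oi(disk_sizes, auth_size))  # []
```

All three shards have 1588 bytes on disk and the selected object_info also says 1588, so both checks pass even though osd 2's own object_info records size 0.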
