Bug #23428 (open)
Snapset inconsistency is hard to diagnose because authoritative copy used by list-inconsistent-snapset not shown
Description
$ sudo rados list-inconsistent-snapset 3.7f
{"epoch":79,"inconsistents":[]}

$ sudo rados list-inconsistent-obj 3.7f --format=json-pretty
{
    "epoch": 79,
    "inconsistents": [
        {
            "object": {
                "name": "obj1",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 13
            },
            "errors": [
                "snapset_inconsistency"
            ],
            "union_shard_errors": [],
            "selected_object_info": "3:ff7b1f36:::obj1:head(73'13 client.4471.0:1 dirty|data_digest|omap_digest s 1682 uv 13 dd 735b0743 od ffffffff alloc_hint [0 0 0])",
            "shards": [
                {
                    "osd": 1,
                    "primary": false,
                    "errors": [],
                    "size": 1682,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x735b0743",
                    "snapset": "0=[]:[]+stray_clone_snaps={1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"
                },
                {
                    "osd": 6,
                    "primary": true,
                    "errors": [],
                    "size": 1682,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x735b0743",
                    "snapset": "6=[6,5,4,3,2,1]:{1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"
                },
                {
                    "osd": 8,
                    "primary": false,
                    "errors": [],
                    "size": 1682,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x735b0743",
                    "snapset": "6=[6,5,4,3,2,1]:{1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"
                }
            ]
        }
    ]
}
For now, the user has to increase the debug_osd log level and examine the OSD logs to find the authoritative copy selected for a specific object. With 2 or more different snapsets, we could show the snapshot results computed from each snapset for comparison, though that adds complexity; an easier option would be to indicate which shard is the authoritative copy. The existing code in PG::scrub_compare_maps() doesn't pass enough information to PrimaryLogPG::scrub_snapshot_metadata() for it to see both snapset variants or to know which shard it is using.
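Until the output names the authoritative copy, one workaround is to group the shards of each affected object by their reported snapset, so the diverging shard at least stands out. The following is only a rough sketch: it assumes the list-inconsistent-obj JSON has the shape shown in the description, the helper name group_shards_by_snapset is hypothetical, and the sample is a trimmed copy of that report. It does not tell you which snapset scrub selected as authoritative.

```python
import json
from collections import defaultdict


def group_shards_by_snapset(report):
    """Return {"name:snap": {snapset string: [osd ids]}} for every object
    flagged with snapset_inconsistency. Hypothetical helper, not part of Ceph."""
    result = {}
    for entry in report.get("inconsistents", []):
        if "snapset_inconsistency" not in entry.get("errors", []):
            continue
        groups = defaultdict(list)
        for shard in entry.get("shards", []):
            # Shards without a snapset field are grouped under a placeholder.
            groups[shard.get("snapset", "<missing>")].append(shard["osd"])
        obj = entry["object"]
        result[f"{obj['name']}:{obj['snap']}"] = dict(groups)
    return result


# Trimmed copy of the list-inconsistent-obj report from the description;
# only the fields used above are kept.
SAMPLE = json.loads("""
{
  "epoch": 79,
  "inconsistents": [
    {
      "object": {"name": "obj1", "nspace": "", "locator": "", "snap": "head", "version": 13},
      "errors": ["snapset_inconsistency"],
      "shards": [
        {"osd": 1, "primary": false,
         "snapset": "0=[]:[]+stray_clone_snaps={1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"},
        {"osd": 6, "primary": true,
         "snapset": "6=[6,5,4,3,2,1]:{1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"},
        {"osd": 8, "primary": false,
         "snapset": "6=[6,5,4,3,2,1]:{1=[1],2=[2],3=[3],4=[4],5=[5],6=[6]}"}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for key, groups in group_shards_by_snapset(SAMPLE).items():
        print(key)
        for snapset, osds in groups.items():
            print(f"  osds {osds}: {snapset}")
```

For the report above, this puts osd 1 in one group and osds 6 and 8 in another, which matches what can be read off the shards by eye.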
Updated by David Zafman about 6 years ago
- Subject changed from Snapset inconsistency is hard to diagnose because authoritative copy used by list-inconsistent-snapset to Snapset inconsistency is hard to diagnose because authoritative copy used by list-inconsistent-snapset not shown
Updated by David Zafman about 6 years ago
- Related to Feature #23364: Special scrub handling of hinfo_key errors added
Updated by David Zafman about 6 years ago
Pull request https://github.com/ceph/ceph/pull/20947 contains a change that partially addresses this issue. Unfortunately, in the scenario shown in this tracker's description, no particular shard is in error, so list-inconsistent-snapset will still report an empty inconsistents list.
Here is an example of what is improved:
{
    "name": "obj14",
    "nspace": "",
    "locator": "",
    "snap": "head",
    "snapset": {
        "snap_context": {
            "seq": 1,
            "snaps": [
                1
            ]
        },
        "clones": [
            {
                "snap": 1,
                "size": 1033,
                "overlap": "[]",
                "snaps": [
                    1
                ]
            }
        ]
    },
    "errors": []
},
{
    "errors": [
        "size_mismatch"
    ],
    "snap": 1,
    "locator": "",
    "nspace": "",
    "name": "obj14"
}
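With output like this, the entries that actually carry errors can be picked out mechanically. This is a minimal sketch, assuming the two entries above sit inside the "inconsistents" array of a list-inconsistent-snapset report (the wrapper and the helper name entries_with_errors are assumptions; the epoch field is omitted because the example does not show it).

```python
import json


def entries_with_errors(report):
    """Return (name, snap, errors) for each entry whose errors list is
    non-empty. Hypothetical helper, not part of Ceph."""
    return [
        (entry["name"], entry["snap"], entry["errors"])
        for entry in report.get("inconsistents", [])
        if entry.get("errors")
    ]


# The two entries from the example above, wrapped in an assumed
# "inconsistents" array.
SAMPLE = json.loads("""
{
  "inconsistents": [
    {
      "name": "obj14", "nspace": "", "locator": "", "snap": "head",
      "snapset": {
        "snap_context": {"seq": 1, "snaps": [1]},
        "clones": [{"snap": 1, "size": 1033, "overlap": "[]", "snaps": [1]}]
      },
      "errors": []
    },
    {"errors": ["size_mismatch"], "snap": 1, "locator": "", "nspace": "", "name": "obj14"}
  ]
}
""")

if __name__ == "__main__":
    for name, snap, errors in entries_with_errors(SAMPLE):
        print(f"{name} snap {snap}: {', '.join(errors)}")
```

Here only the clone entry (snap 1) is returned; the head entry has an empty errors list and is included in the report only to show the snapset used for comparison.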