Bug #54966

open

osd/ec: some pg status stuck at active+recovery_unfound+degraded+remapped

Added by jianwei zhang about 2 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
v15.2.13
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

reproduce:
1. 5 hosts
2. cephfs (replica pool + ec pool (allow_ec_overwrites))
3. On 3 hosts, create an ec(2+1) data pool and a 3-replica meta pool as the base cluster
4. Run the vdbench file test case
5. Expand the data pool and meta pool onto the remaining 2 hosts. Tip: the osds join the cluster (up && in) one by one
6. All osds are up && in
7. Backfill and recovery finish

result:
ceph -s
11 active+recovery_unfound+degraded+remapped

analyse:
I found a likely cause and would like everyone to discuss it.

According to the ceph logs and the pg and object information on disk, all 3 ec shards of the ec(2+1) object at the needed eversion are stored on disk and are not lost.
The problem is how the missing object is handled after peering completes. While collecting the source osds needed for recovery, an osd is skipped whenever the object sorts at or after that osd's last_backfill (here MIN), so it is not added to the source osd set. Fewer than the 2 source osds that ec(2+1) needs for recovery remain, which results in recovery_unfound.
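
To make the "not enough source osds for ec(2)" point concrete, here is a minimal sketch of a 2+1 code in which any two of the three shards rebuild the data but a single shard cannot. It uses plain XOR parity as a stand-in and is not Ceph's erasure-code plugin; the function names `encode` and `decode` are hypothetical.

```python
from __future__ import annotations

# Simplified stand-in for an EC(2+1) code: two data shards plus one XOR
# parity shard. This only illustrates the k-of-n requirement.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(d0: bytes, d1: bytes) -> list[bytes]:
    """Return [shard0, shard1, parity] for two equal-length data chunks."""
    return [d0, d1, xor(d0, d1)]

def decode(shards: dict[int, bytes]) -> bytes:
    """Rebuild the original data from any two of the three shards."""
    if len(shards) < 2:
        raise ValueError(f"need 2 shards to decode, only have {sorted(shards)}")
    d0 = shards[0] if 0 in shards else xor(shards[1], shards[2])
    d1 = shards[1] if 1 in shards else xor(shards[0], shards[2])
    return d0 + d1

shards = encode(b"abcd", b"wxyz")
print(decode({0: shards[0], 2: shards[2]}))  # shards 0 and 2 are enough

try:
    decode({0: shards[0]})  # a single source, like osd.36(0) alone
except ValueError as err:
    print("unfound:", err)
```

With only one accepted source the decode step cannot run, even though the other two shards exist on disk.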

problem code:

MissingLoc::add_source_info {
  for (...) {
    ...
    if (p->first >= oinfo.last_backfill) {
      // FIXME: this is probably true, although it could conceivably
      // be in the undefined region! Hmm!
      ldout(cct, 10) << "search_for_missing " << soid << " " << need
                     << " also missing on osd." << fromosd
                     << " (past last_backfill " << oinfo.last_backfill << ")"
                     << dendl;
      continue;
    }
    ...
  }
}
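
The effect of that check can be sketched with a small Python model. This is a simplified illustration, not the Ceph implementation: `Peer`, `rejection_reason`, `MIN_OBJ` and `MAX_OBJ` are hypothetical names, objects are ordered by plain string comparison instead of the real hobject_t ordering, and the last_backfill values for osd.36 and osd.44 are assumed to be MAX because the report does not show them. The last_update values are taken from the pg log ranges in this report.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for hobject_t MIN / MAX: every real object name
# sorts at or after MIN_OBJ and before MAX_OBJ.
MIN_OBJ = ""
MAX_OBJ = "\U0010ffff"

@dataclass
class Peer:
    name: str
    last_update: tuple      # (epoch, version), e.g. (785, 2826)
    last_backfill: str      # MIN_OBJ: nothing backfilled yet, MAX_OBJ: complete
    missing: bool = False   # object is listed in this peer's missing set

def rejection_reason(peer, oid, need):
    """Return None if the peer may serve as a source, else why it is skipped."""
    if peer.last_update < need:
        return "last_update < needed"
    if oid >= peer.last_backfill:
        # With last_backfill == MIN_OBJ this is true for every object.
        return "past last_backfill"
    if peer.missing:
        return "in missing set"
    return None

oid, need, k = "101000000000133.0000001d", (374, 872), 2
peers = [
    Peer("osd.36(0)", (375, 884), MAX_OBJ),   # last_backfill assumed
    Peer("osd.41(2)", (785, 2826), MIN_OBJ),
    Peer("osd.59(1)", (785, 2826), MIN_OBJ),
    Peer("osd.44(1)", (346, 424), MAX_OBJ),   # last_backfill assumed
]
sources = [p.name for p in peers if rejection_reason(p, oid, need) is None]
print(sources, "recoverable" if len(sources) >= k else "unfound")
```

In this model osd.41(2) and osd.59(1) are rejected purely because their last_backfill is MIN, regardless of what is on their disks, which leaves one source where two are needed.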

commit b99e135848ca5666308344cf5ecc9c7f95f30137
Author: Sage Weil <>
Date: Mon Dec 5 17:25:21 2011 -0800

osd: make backfill (basically) work again
Still need to handle concurrent updates, log recovery vs backfill, etc.
Signed-off-by: Sage Weil <>

evidence:
(Taking one recovery_unfound object as an example): 101000000000133.0000001d

  1. ceph pg 3.2f2 list_unfound {
    "oid": {
    "oid": "101000000000133.0000001d",
    "key": "",
    "snapid": -2,
    "hash": 1021279986,
    "max": 0,
    "pool": 3,
    "namespace": ""
    },
    "need": "374'872",
    "have": "0'0",
    "flags": "none",
    "clean_regions": "clean_offsets: [], clean_omap: 0, new_object: 1",
    "locations": [
    "36(0)"
    ]
    }
  2. Set debug_osd = 30/30 in /etc/ceph/ceph.conf
  3. Filter the primary's log for the source osd search for the missing object after peering
    [root@node124 ceph]# grep -e Primary -e 101000000000133.0000001d ceph-osd.57.log|grep -E "3\.2f2.*transitioning to Primary|search_for_missing.*101000000000133.0000001d"
    2022-03-19T14:04:42.223+0800 7f82f2028700 1 osd.57 pg_epoch: 5030 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5003/5004 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 mbc={}] state<Start>: transitioning to Primary
    2022-03-19T14:04:43.303+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.57(0)
    2022-03-19T14:04:43.303+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.37(2)
    2022-03-19T14:04:43.304+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.41(1)
    2022-03-19T14:04:43.305+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.41(2) (past last_backfill MIN) // on-disk object_info: 3.2f2s2.101000000000133.0000001d.osd41.object_info_t // The object sorts at or after the last_backfill (MIN) of osd.41(2), so this osd was skipped and not added to the source osds

2022-03-19T14:04:43.306+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.59(1) (past last_backfill MIN) // on-disk object_info: 3.2f2s1.101000000000133.0000001d.osd59.object_info_t // The object sorts at or after the last_backfill (MIN) of osd.59(1), so this osd was skipped and not added to the source osds

2022-03-19T14:04:43.307+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.76(0) (past last_backfill MIN)
2022-03-19T14:04:43.344+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.44(1) (last_update 346'424 < needed 374'872)
2022-03-19T14:04:43.348+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=17},1={(0+0)=17},2={(0+0)=17}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 is on osd.36(0) // on-disk object_info: 3.2f2s0.101000000000133.0000001d.osd36.object_info_t // Only osd.36(0) passes all the checks and is added to the source osds

2022-03-19T14:04:43.362+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.57(1) (last_update 362'664 < needed 374'872)
2022-03-19T14:04:43.365+0800 7f82ef823700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.75(0) (past last_backfill MIN)
2022-03-19T14:04:43.378+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.39(2)
2022-03-19T14:04:43.384+0800 7f82f482d700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.48(1) (last_update 351'468 < needed 374'872)
2022-03-19T14:04:43.387+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.73(0) (past last_backfill MIN)
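
The outcomes in the log above can be tallied with a short script. This is a sketch that assumes the message format shown in these lines ("also missing on osd.N(S)" with an optional parenthesised reason, or "is on osd.N(S)"); the sample lines are abbreviated copies of the log entries, and `classify` is a hypothetical helper name.

```python
import re
from collections import defaultdict

PATTERN = re.compile(
    r"search_for_missing \S+ (?P<need>\d+'\d+) "
    r"(?P<verdict>also missing on|is on) (?P<osd>osd\.\d+\(\d+\))"
    r"(?: \((?P<reason>[^)]*)\))?"
)

def classify(lines):
    """Group osd shards by the outcome of the source search."""
    groups = defaultdict(list)
    for line in lines:
        m = PATTERN.search(line)
        if not m:
            continue
        if m["verdict"] == "is on":
            key = "source"
        elif m["reason"] is None:
            key = "in missing set"
        elif m["reason"].startswith("past last_backfill"):
            key = "past last_backfill"
        else:
            key = "last_update < needed"
        groups[key].append(m["osd"])
    return dict(groups)

# Abbreviated copies of the log lines above (prefix up to "mbc={...}]" dropped).
prefix = "search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 "
sample = [
    prefix + "also missing on osd.57(0)",
    prefix + "also missing on osd.41(2) (past last_backfill MIN)",
    prefix + "also missing on osd.59(1) (past last_backfill MIN)",
    prefix + "also missing on osd.44(1) (last_update 346'424 < needed 374'872)",
    prefix + "is on osd.36(0)",
]
for outcome, osds in classify(sample).items():
    print(f"{outcome}: {osds}")
```

Run against the full filtered log, the "source" group is the set that ends up in "locations" of list_unfound.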

ceph-objectstore-tool: (object_info_t and check bluestore) {
"oid": {
"oid": "101000000000133.0000001d",
"key": "",
"snapid": -2,
"hash": 1021279986,
"max": 0,
"pool": 3,
"namespace": ""
},
"version": "374'872",
"prior_version": "374'871",
"last_reqid": "client.133562.0:359556",
"user_version": 872,
"size": 4194304,
"mtime": "2022-03-18T11:41:39.433611+0800",
"local_mtime": "2022-03-18T11:41:39.436761+0800",
"lost": 0,
"flags": [
"dirty"
],
"truncate_seq": 0,
"truncate_size": 0,
"data_digest": "0xffffffff",
"omap_digest": "0xffffffff",
"expected_object_size": 0,
"expected_write_size": 0,
"alloc_hint_flags": 0,
"manifest": {
"type": 0
},
"watchers": {}
}

3.2f2]# for i in $(ls -tlhr|grep 101000000000133.0000001d|grep -v -e '0 Mar 19'|grep -e osd|awk '{print $NF}'); do echo $i; ceph-dencoder import $i type object_info_t decode dump_json;done
3.2f2s0.101000000000133.0000001d.osd36.object_info_t

3.2f2s2.101000000000133.0000001d.osd41.object_info_t {
"oid": {
"oid": "101000000000133.0000001d",
"key": "",
"snapid": -2,
"hash": 1021279986,
"max": 0,
"pool": 3,
"namespace": ""
},
"version": "374'872",
"prior_version": "374'871",
"last_reqid": "client.133562.0:359556",
"user_version": 872,
"size": 4194304,
"mtime": "2022-03-18T11:41:39.433611+0800",
"local_mtime": "2022-03-18T11:41:39.436761+0800",
"lost": 0,
"flags": [
"dirty"
],
"truncate_seq": 0,
"truncate_size": 0,
"data_digest": "0xffffffff",
"omap_digest": "0xffffffff",
"expected_object_size": 0,
"expected_write_size": 0,
"alloc_hint_flags": 0,
"manifest": {
"type": 0
},
"watchers": {}
}

3.2f2s1.101000000000133.0000001d.osd59.object_info_t {
"oid": {
"oid": "101000000000133.0000001d",
"key": "",
"snapid": -2,
"hash": 1021279986,
"max": 0,
"pool": 3,
"namespace": ""
},
"version": "374'872",
"prior_version": "374'871",
"last_reqid": "client.133562.0:359556",
"user_version": 872,
"size": 4194304,
"mtime": "2022-03-18T11:41:39.433611+0800",
"local_mtime": "2022-03-18T11:41:39.436761+0800",
"lost": 0,
"flags": [
"dirty"
],
"truncate_seq": 0,
"truncate_size": 0,
"data_digest": "0xffffffff",
"omap_digest": "0xffffffff",
"expected_object_size": 0,
"expected_write_size": 0,
"alloc_hint_flags": 0,
"manifest": {
"type": 0
},
"watchers": {}
}

node121 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-36/ --mountpoint /mnt/fuse-osd36 &
node122 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-41/ --mountpoint /mnt/fuse-osd41 &
node124 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-59/ --mountpoint /mnt/fuse-osd59 &

[root@node121 0#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd36/3.2f2s0_head/all/0#3:4f7efb3c:::101000000000133.0000001d:head#/data
ea003c1e3b13d63d6ed67d0595e8ff1b /mnt/fuse-osd36/3.2f2s0_head/all/0#3:4f7efb3c:::101000000000133.0000001d:head#/data

[root@node122 2#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd41/3.2f2s2_head/all/2#3:4f7efb3c:::101000000000133.0000001d:head#/data
52ed9a463dd46982113bf748e7f0c4fe /mnt/fuse-osd41/3.2f2s2_head/all/2#3:4f7efb3c:::101000000000133.0000001d:head#/data

[root@node124 1#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd59/3.2f2s1_head/all/1#3:4f7efb3c:::101000000000133.0000001d:head#/data
f339660aa3a182d89be8110d67024a04 /mnt/fuse-osd59/3.2f2s1_head/all/1#3:4f7efb3c:::101000000000133.0000001d:head#/data

ceph-objectstore-tool: dump pg info and pg log, then parse

[root@node121 3.2f2]# python3 parse_data.py 101000000000133.0000001d *.log
['3.2f2s0.osd36.log', '3.2f2s0.osd57.log', '3.2f2s0.osd73.log', '3.2f2s0.osd75.log', '3.2f2s0.osd76.log', '3.2f2s1.osd41.log', '3.2f2s1.osd44.log', '3.2f2s1.osd48.log', '3.2f2s1.osd57.log', '3.2f2s1.osd59.log', '3.2f2s2.osd37.log', '3.2f2s2.osd39.log', '3.2f2s2.osd41.log']
========================missing object===========================
3.2f2s0.osd36.log
3.2f2s0.osd57.log {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s0.osd73.log
3.2f2s0.osd75.log
3.2f2s0.osd76.log
3.2f2s1.osd41.log {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s1.osd44.log
3.2f2s1.osd48.log
3.2f2s1.osd57.log
3.2f2s1.osd59.log
3.2f2s2.osd37.log {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s2.osd39.log {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s2.osd41.log
========================pg log object==========================
3.2f2s0.osd36.log [0'0, 375'884] {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'833", 'prior_version': "0'0", 'reqid': 'client.133721.0:366904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.843215+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'834", 'prior_version': "374'833", 'reqid': 'client.133721.0:366972', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.967153+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'835", 'prior_version': "374'834", 'reqid': 'client.133721.0:367041', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.090498+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'836", 'prior_version': "374'835", 'reqid': 'client.133721.0:367114', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.210199+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': 
{'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'853", 'prior_version': "374'836", 'reqid': 'client.114246.0:360904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.208243+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 853, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'854", 'prior_version': "374'853", 'reqid': 'client.114246.0:360926', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.256403+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 854, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'855", 'prior_version': "374'854", 'reqid': 'client.114246.0:360958', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.295928+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 
855, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'856", 'prior_version': "374'855", 'reqid': 'client.114246.0:360973', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.323967+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 856, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'869", 'prior_version': "374'856", 'reqid': 'client.133562.0:359426', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.277007+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 869, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'870", 'prior_version': "374'869", 'reqid': 'client.133562.0:359461', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.334504+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 870, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': 
'[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'871", 'prior_version': "374'870", 'reqid': 'client.133562.0:359504', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.385649+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 871, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'872", 'prior_version': "374'871", 'reqid': 'client.133562.0:359556', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.433611+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 872, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.2f2s0.osd57.log [503'2242, 785'2826]
3.2f2s0.osd73.log [466'1540, 488'2076]
3.2f2s0.osd75.log [466'1540, 491'2080]
3.2f2s0.osd76.log [503'2242, 785'2826]
3.2f2s1.osd41.log [503'2242, 785'2826]
3.2f2s1.osd44.log [0'0, 346'424]
3.2f2s1.osd48.log [0'0, 351'468]
3.2f2s1.osd57.log [0'0, 362'664]
3.2f2s1.osd59.log [503'2242, 785'2826]
3.2f2s2.osd37.log [503'2242, 785'2826]
3.2f2s2.osd39.log [291'200, 386'1120] {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'833", 'prior_version': "0'0", 'reqid': 'client.133721.0:366904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.843215+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'834", 'prior_version': "374'833", 'reqid': 'client.133721.0:366972', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.967153+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'835", 'prior_version': "374'834", 'reqid': 'client.133721.0:367041', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.090498+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'836", 'prior_version': "374'835", 'reqid': 'client.133721.0:367114', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.210199+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': 
{'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'853", 'prior_version': "374'836", 'reqid': 'client.114246.0:360904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.208243+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 853, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'854", 'prior_version': "374'853", 'reqid': 'client.114246.0:360926', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.256403+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 854, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'855", 'prior_version': "374'854", 'reqid': 'client.114246.0:360958', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.295928+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 
855, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'856", 'prior_version': "374'855", 'reqid': 'client.114246.0:360973', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.323967+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 856, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'869", 'prior_version': "374'856", 'reqid': 'client.133562.0:359426', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.277007+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 869, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'870", 'prior_version': "374'869", 'reqid': 'client.133562.0:359461', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.334504+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 870, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': 
'[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'871", 'prior_version': "374'870", 'reqid': 'client.133562.0:359504', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.385649+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 871, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}} {'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'872", 'prior_version': "374'871", 'reqid': 'client.133562.0:359556', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.433611+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 872, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.2f2s2.osd41.log [503'2242, 785'2826]
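
Why only the osd36 and osd39 dumps show log entries for this object follows from the log ranges listed above: a pg log holds the entries in (tail, head], as in the "(503'2242,785'2826]" notation of the osd log. The check can be sketched as follows; `ev` and `log_covers` are hypothetical helper names and only a subset of the dumps is listed.

```python
def ev(text):
    """Parse an eversion string such as "374'872" into (epoch, version)."""
    epoch, version = text.split("'")
    return int(epoch), int(version)

def log_covers(tail, head, need):
    """True if version `need` falls inside the half-open range (tail, head]."""
    return ev(tail) < ev(need) <= ev(head)

# Ranges copied from the parse_data.py output above.
ranges = {
    "3.2f2s0.osd36": ("0'0", "375'884"),
    "3.2f2s0.osd57": ("503'2242", "785'2826"),
    "3.2f2s0.osd73": ("466'1540", "488'2076"),
    "3.2f2s1.osd44": ("0'0", "346'424"),
    "3.2f2s1.osd59": ("503'2242", "785'2826"),
    "3.2f2s2.osd39": ("291'200", "386'1120"),
    "3.2f2s2.osd41": ("503'2242", "785'2826"),
}
need = "374'872"
covering = [name for name, (tail, head) in ranges.items()
            if log_covers(tail, head, need)]
print(covering)
```

The logs whose tail is already at 503'2242 have trimmed past 374'872, so they carry no entry for this object even when the shard itself is on disk.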


Files

txt-type-description.txt (29.9 KB) jianwei zhang, 03/19/2022 12:58 PM
Actions #1

Updated by jianwei zhang about 2 years ago

If the text layout above is hard to read, see the attached txt-type-description.txt.

Actions #2

Updated by jianwei zhang about 2 years ago

[root@node122 ~]# ceph -s
  cluster:
    id:     524c307a-a663-11ec-b495-f875a4d79c88
    health: HEALTH_ERR
            286/250830 objects unfound (0.114%)
            noout,noscrub,nodeep-scrub flag(s) set
            Possible data damage: 11 pgs recovery_unfound
            Degraded data redundancy: 633/752490 objects degraded (0.084%), 11 pgs degraded

  services:
    mon: 3 daemons, quorum a,b,c (age 2d)
    mgr: a(active, since 2d), standbys: b, c
    tfsmds: tfs
      5 daemons: 5 up
        d = up:running
        b = up:running
        e = up:running
        a = up:running
        c = up:running
      10 ranks
        10 up:active
    osd: 80 osds: 80 up (since 63m), 80 in (since 2d); 11 remapped pgs
         flags noout,noscrub,nodeep-scrub

  data:
    pools:   9 pools, 3329 pgs
    objects: 250.83k objects, 978 GiB
    usage:   3.7 TiB used, 293 TiB / 297 TiB avail
    pgs:     633/752490 objects degraded (0.084%)
             6234/752490 objects misplaced (0.828%)
             286/250830 objects unfound (0.114%)
             3318 active+clean
             11   active+recovery_unfound+degraded+remapped

  io:
    client:   493 B/s rd, 11 KiB/s wr, 0 op/s rd, 4 op/s wr
Actions #3

Updated by jianwei zhang about 2 years ago

[root@node122 ~]# ceph -s
  cluster:
    id:     524c307a-a663-11ec-b495-f875a4d79c88
    health: HEALTH_ERR
            286/250830 objects unfound (0.114%)
            noout,noscrub,nodeep-scrub flag(s) set
            Possible data damage: 11 pgs recovery_unfound
            Degraded data redundancy: 633/752490 objects degraded (0.084%), 11 pgs degraded

  services:
    mon: 3 daemons, quorum a,b,c (age 2d)
    mgr: a(active, since 2d), standbys: b, c
    tfsmds: tfs
      5 daemons: 5 up
        d = up:running
        b = up:running
        e = up:running
        a = up:running
        c = up:running
      10 ranks
        10 up:active
    osd: 80 osds: 80 up (since 63m), 80 in (since 2d); 11 remapped pgs
         flags noout,noscrub,nodeep-scrub

  data:
    pools:   9 pools, 3329 pgs
    objects: 250.83k objects, 978 GiB
    usage:   3.7 TiB used, 293 TiB / 297 TiB avail
    pgs:     633/752490 objects degraded (0.084%)
             6234/752490 objects misplaced (0.828%)
             286/250830 objects unfound (0.114%)
             3318 active+clean
             11   active+recovery_unfound+degraded+remapped

  io:
    client:   493 B/s rd, 11 KiB/s wr, 0 op/s rd, 4 op/s wr

Actions #4

Updated by jianwei zhang about 2 years ago

[root@node122 ~]# ceph -s
  cluster:
    id:     524c307a-a663-11ec-b495-f875a4d79c88
    health: HEALTH_ERR
            286/250830 objects unfound (0.114%)
            noout,noscrub,nodeep-scrub flag(s) set
            Possible data damage: 11 pgs recovery_unfound
            Degraded data redundancy: 633/752490 objects degraded (0.084%), 11 pgs degraded

  services:
    mon: 3 daemons, quorum a,b,c (age 2d)
    mgr: a(active, since 2d), standbys: b, c
    tfsmds: tfs
      5 daemons: 5 up
        d = up:running
        b = up:running
        e = up:running
        a = up:running
        c = up:running
      10 ranks
        10 up:active
    osd: 80 osds: 80 up (since 66s), 80 in (since 2d); 11 remapped pgs
         flags noout,noscrub,nodeep-scrub

  data:
    pools:   9 pools, 3329 pgs
    objects: 250.83k objects, 978 GiB
    usage:   3.7 TiB used, 293 TiB / 297 TiB avail
    pgs:     633/752490 objects degraded (0.084%)
             6234/752490 objects misplaced (0.828%)
             286/250830 objects unfound (0.114%)
             3318 active+clean
             11   active+recovery_unfound+degraded+remapped

  io:
    client:   994 B/s rd, 12 KiB/s wr, 1 op/s rd, 4 op/s wr

  progress:
    Rebalancing after osd.49 marked in (2d)
      [=========================...] (remaining: 4h)
    Rebalancing after osd.48 marked in (2d)
      [===========================.] (remaining: 15m)

[root@node122 ~]# ceph pg map 3.212
osdmap e7003 pg 3.212 (3.212) -> up [57,40,44] acting [57,45,44]
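
The up and acting sets above differ in exactly one position. A minimal Python sketch of reading that mapping follows. It assumes that, for an EC pool, the position in each set is the shard id, which is consistent with the osd.40(1) and osd.45(1) notation in the OSD log further down. The helper name `diff_up_acting` is hypothetical and not part of any Ceph tool.

```python
def diff_up_acting(up, acting):
    """Return (shard, acting_osd, up_osd) for every position where the two sets differ."""
    return [
        (shard, acting_osd, up_osd)
        for shard, (up_osd, acting_osd) in enumerate(zip(up, acting))
        if up_osd != acting_osd
    ]


# Values copied from the "ceph pg map 3.212" output above.
up = [57, 40, 44]
acting = [57, 45, 44]

for shard, acting_osd, up_osd in diff_up_acting(up, acting):
    # Assumption: an OSD that is in "up" but not in "acting" is the one being backfilled.
    print(f"shard {shard}: acting osd.{acting_osd}, up (backfill target) osd.{up_osd}")
```

For pg 3.212 this reports shard 1, with osd.45 acting and osd.40 as the target, which agrees with `backfill=[40(1)]` in the osd.57 log lines.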

ceph pg 3.212 list_unfound
        {
            "oid": {
                "oid": "5050000000001a5.0000000f",
                "key": "",
                "snapid": -2,
                "hash": 438253074,
                "max": 0,
                "pool": 3,
                "namespace": "" 
            },
            "need": "351'588",
            "have": "0'0",
            "flags": "none",
            "clean_regions": "clean_offsets: [], clean_omap: 0, new_object: 1",
            "locations": [
                "44(2)" 
            ]
        }

[root@node121 3.212]# cat 3.212.pg.might_have_unfound
37 1
39 0
40 1
43 2
44 2
45 0
45 1
47 1
48 0
57 0
73 1

[root@node121 3.212]# ls -lthr *.object_info_t
-rw-r--r-- 1 root root 279 Mar 20 13:55 3.212s1.5050000000001a5.0000000f.osd40.object_info_t
-rw-r--r-- 1 root root 279 Mar 20 13:55 3.212s2.5050000000001a5.0000000f.osd44.object_info_t
-rw-r--r-- 1 root root 279 Mar 20 13:56 3.212s0.5050000000001a5.0000000f.osd45.object_info_t

node123 mkdir -p /mnt/fuse-osd40; ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-40/ --mountpoint /mnt/fuse-osd40 &
node122 mkdir -p /mnt/fuse-osd44; ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-44/ --mountpoint /mnt/fuse-osd44    &
node121 mkdir -p /mnt/fuse-osd45; ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-45/ --mountpoint /mnt/fuse-osd45    &

[root@node121 ~]# ls -l /mnt/fuse-osd45/3.212s0_head/all/0#3:486cf858:::5050000000001a5.0000000f:head#/data
-rwx------ 1 root root 2097152 Jan  1  1970 /mnt/fuse-osd45/3.212s0_head/all/0#3:486cf858:::5050000000001a5.0000000f:head#/data

[root@node122 ~]# ls -l /mnt/fuse-osd44/3.212s2_head/all/2#3:486cf858:::5050000000001a5.0000000f:head#/data
-rwx------ 1 root root 2097152 Jan  1  1970 /mnt/fuse-osd44/3.212s2_head/all/2#3:486cf858:::5050000000001a5.0000000f:head#/data

[root@node123 ~]# ll /mnt/fuse-osd40/3.212s1_head/all/1#3:486cf858:::5050000000001a5.0000000f:head#/data
-rwx------ 1 root root 2097152 Jan  1  1970 /mnt/fuse-osd40/3.212s1_head/all/1#3:486cf858:::5050000000001a5.0000000f:head#/data

[root@node121 3.212]# for i in $(ls -tlhr|grep object_info_t|grep -v -e '0 Mar 20'|grep -e osd|awk '{print $NF}'); do echo $i; ceph-dencoder import $i type object_info_t decode dump_json;done
3.212s1.5050000000001a5.0000000f.osd40.object_info_t
{
    "oid": {
        "oid": "5050000000001a5.0000000f",
        "key": "",
        "snapid": -2,
        "hash": 438253074,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "351'588",
    "prior_version": "351'587",
    "last_reqid": "client.133721.0:206388",
    "user_version": 588,
    "size": 4194304,
    "mtime": "2022-03-18T11:33:16.284910+0800",
    "local_mtime": "2022-03-18T11:33:16.287527+0800",
    "lost": 0,
    "flags": [
        "dirty" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0xffffffff",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}

3.212s2.5050000000001a5.0000000f.osd44.object_info_t
{
    "oid": {
        "oid": "5050000000001a5.0000000f",
        "key": "",
        "snapid": -2,
        "hash": 438253074,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "351'588",
    "prior_version": "351'587",
    "last_reqid": "client.133721.0:206388",
    "user_version": 588,
    "size": 4194304,
    "mtime": "2022-03-18T11:33:16.284910+0800",
    "local_mtime": "2022-03-18T11:33:16.287527+0800",
    "lost": 0,
    "flags": [
        "dirty" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0xffffffff",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}

3.212s0.5050000000001a5.0000000f.osd45.object_info_t
{
    "oid": {
        "oid": "5050000000001a5.0000000f",
        "key": "",
        "snapid": -2,
        "hash": 438253074,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "346'560",
    "prior_version": "346'559",
    "last_reqid": "client.133562.0:206272",
    "user_version": 560,
    "size": 4194304,
    "mtime": "2022-03-18T11:32:51.189754+0800",
    "local_mtime": "2022-03-18T11:32:51.192649+0800",
    "lost": 0,
    "flags": [
        "dirty",
        "data_digest" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0x4d26eda1",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}
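
The three dumps above can be compared against the `need` value from `list_unfound`. The sketch below is only an illustration: it assumes an eversion string has the form epoch'version and orders by epoch first, then version. The helper names are hypothetical.

```python
def parse_eversion(text):
    """Split an "epoch'version" string such as "351'588" into an (epoch, version) tuple."""
    epoch, version = text.split("'")
    return int(epoch), int(version)


def compare_to_need(on_disk, need):
    """Classify an on-disk object version relative to the needed one."""
    have, want = parse_eversion(on_disk), parse_eversion(need)
    if have == want:
        return "matches need"
    return "older than need" if have < want else "newer than need"


NEED = "351'588"  # "need" from "ceph pg 3.212 list_unfound"

# "version" field of each object_info_t dump above, keyed by (osd, shard).
ON_DISK = {
    (40, 1): "351'588",
    (44, 2): "351'588",
    (45, 0): "346'560",
}

for (osd, shard), version in sorted(ON_DISK.items()):
    print(f"osd.{osd}({shard}) {version}: {compare_to_need(version, NEED)}")
```

By this comparison the dumps from osd.40 (shard 1) and osd.44 (shard 2) carry the needed version, while the dump from osd.45 (shard 0) is at 346'560.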

[root@node121 3.212]# python3 parse_data.py 5050000000001a5.0000000f *.log
5050000000001a5.0000000f
['3.212s0.osd39.log', '3.212s0.osd45.log', '3.212s0.osd48.log', '3.212s0.osd57.log', '3.212s1.osd37.log', '3.212s1.osd40.log', '3.212s1.osd45.log', '3.212s1.osd47.log', '3.212s1.osd73.log', '3.212s2.osd43.log', '3.212s2.osd44.log']
========================missing object===========================
3.212s0.osd39.log
3.212s0.osd45.log
3.212s0.osd48.log
{'object': '3:486cf858:::5050000000001a5.0000000f:head', 'need': "351'588", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.212s0.osd57.log
{'object': '3:486cf858:::5050000000001a5.0000000f:head', 'need': "351'588", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.212s1.osd37.log
3.212s1.osd40.log
3.212s1.osd45.log
{'object': '3:486cf858:::5050000000001a5.0000000f:head', 'need': "351'588", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.212s1.osd47.log
3.212s1.osd73.log
3.212s2.osd43.log
3.212s2.osd44.log
========================pg log object==========================
3.212s0.osd39.log [0'0, 0'0]
3.212s0.osd45.log [0'0, 346'564]
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'557", 'prior_version': "0'0", 'reqid': 'client.133562.0:206204', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.042270+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'558", 'prior_version': "346'557", 'reqid': 'client.133562.0:206224', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.084657+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'559", 'prior_version': "346'558", 'reqid': 'client.133562.0:206251', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.135893+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'560", 'prior_version': "346'559", 'reqid': 'client.133562.0:206272', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.189754+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.212s0.osd48.log [0'0, 351'656]
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'557", 'prior_version': "0'0", 'reqid': 'client.133562.0:206204', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.042270+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'558", 'prior_version': "346'557", 'reqid': 'client.133562.0:206224', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.084657+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'559", 'prior_version': "346'558", 'reqid': 'client.133562.0:206251', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.135893+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "346'560", 'prior_version': "346'559", 'reqid': 'client.133562.0:206272', 'extra_reqids': [], 'mtime': '2022-03-18T11:32:51.189754+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'577", 'prior_version': "346'560", 'reqid': 'client.114246.0:202100', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.549527+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 577, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'578", 'prior_version': "351'577", 'reqid': 'client.114246.0:202107', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.572572+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 578, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'579", 'prior_version': "351'578", 'reqid': 'client.114246.0:202115', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.594172+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 579, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'580", 'prior_version': "351'579", 'reqid': 'client.114246.0:202128', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.615673+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 580, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'585", 'prior_version': "351'580", 'reqid': 'client.133721.0:206341', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.115356+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 585, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'586", 'prior_version': "351'585", 'reqid': 'client.133721.0:206378', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.215681+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 586, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'587", 'prior_version': "351'586", 'reqid': 'client.133721.0:206386', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.264633+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 587, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'588", 'prior_version': "351'587", 'reqid': 'client.133721.0:206388', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.284910+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 588, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.212s0.osd57.log [495'2362, 780'2956]
3.212s1.osd37.log [346'562, 380'1168]
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'577", 'prior_version': "346'560", 'reqid': 'client.114246.0:202100', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.549527+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 577, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'578", 'prior_version': "351'577", 'reqid': 'client.114246.0:202107', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.572572+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 578, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'579", 'prior_version': "351'578", 'reqid': 'client.114246.0:202115', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.594172+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 579, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'580", 'prior_version': "351'579", 'reqid': 'client.114246.0:202128', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:04.615673+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 580, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'585", 'prior_version': "351'580", 'reqid': 'client.133721.0:206341', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.115356+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 585, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'586", 'prior_version': "351'585", 'reqid': 'client.133721.0:206378', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.215681+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 586, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'587", 'prior_version': "351'586", 'reqid': 'client.133721.0:206386', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.264633+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 587, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:486cf858:::5050000000001a5.0000000f:head', 'version': "351'588", 'prior_version': "351'587", 'reqid': 'client.133721.0:206388', 'extra_reqids': [], 'mtime': '2022-03-18T11:33:16.284910+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 588, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.212s1.osd40.log [495'2362, 780'2956]
3.212s1.osd45.log [495'2362, 780'2956]
3.212s1.osd47.log [362'762, 387'1328]
3.212s1.osd73.log [481'2062, 513'2576]
3.212s2.osd43.log [362'762, 387'1328]
3.212s2.osd44.log [495'2362, 780'2956]

[root@node124 ceph]# grep -e Primary -e AllReplicasActive -e 5050000000001a5.0000000f ceph-osd.57.log|grep -E "3\.212.*transitioning to Primary|search_for_missing.*5050000000001a5.0000000f" 
2022-03-20T14:18:46.617+0800 7f69ef32f700  1 osd.57 pg_epoch: 6989 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=5151/5152 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 remapped m=3 mbc={}] state<Start>: transitioning to Primary
2022-03-20T14:18:47.743+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 remapped m=3 u=3 mbc={}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.57(0)
2022-03-20T14:18:47.743+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 remapped m=3 u=3 mbc={}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.40(1) (past last_backfill MIN)                                                                3.212s1.5050000000001a5.0000000f.osd40.object_info_t
2022-03-20T14:18:47.743+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 remapped m=3 u=3 mbc={0={(0+0)=2},1={(0+0)=2},2={(1+0)=2}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 is on osd.44(2)                                                                3.212s2.5050000000001a5.0000000f.osd44.object_info_t
2022-03-20T14:18:47.744+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.45(1)
2022-03-20T14:18:47.782+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.47(1) (past last_backfill MIN)
2022-03-20T14:18:47.786+0800 7f69ef32f700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.37(1) (past last_backfill MIN)
2022-03-20T14:18:47.797+0800 7f69ecb2a700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.48(0)
2022-03-20T14:18:47.799+0800 7f69ef32f700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.43(2) (past last_backfill MIN)
2022-03-20T14:18:47.812+0800 7f69ecb2a700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.73(1) (past last_backfill MIN)
2022-03-20T14:18:47.817+0800 7f69f1b34700 10 osd.57 pg_epoch: 6990 pg[3.212s0( v 780'2956 lc 0'0 (495'2362,780'2956] local-lis/les=6989/6990 n=240 ec=198/198 lis/c=5151/269 les/c/f=5152/270/0 sis=6989) [57,40,44]/[57,45,44]p57(0) backfill=[40(1)] r=0 lpr=6989 pi=[269,6989)/5 crt=780'2956 mlcod 0'0 activating+degraded+remapped m=3 u=3 mbc={0={(0+0)=3},1={(0+0)=3},2={(1+0)=3}}] search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.45(0) (last_update 346'564 < needed 351'588)                3.212s0.5050000000001a5.0000000f.osd45.object_info_t

Files referenced by the annotations at the end of the osd.40(1), osd.44(2), and osd.45(0) lines above:
-rw-r--r-- 1 root root 279 Mar 20 13:55 3.212s1.5050000000001a5.0000000f.osd40.object_info_t
-rw-r--r-- 1 root root 279 Mar 20 13:55 3.212s2.5050000000001a5.0000000f.osd44.object_info_t
-rw-r--r-- 1 root root 279 Mar 20 13:56 3.212s0.5050000000001a5.0000000f.osd45.object_info_t
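
The `search_for_missing` lines above are easier to compare once they are reduced to one verdict per OSD. The following Python sketch does that with a regular expression written against the line format shown in this log (v15.2.13, debug_osd = 30/30). The category names and helper functions are my own labels, not Ceph terminology, and the sample holds only four of the lines from above.

```python
import re
from collections import Counter

# Matches the tail of a search_for_missing log line as it appears in the log above.
PATTERN = re.compile(
    r"search_for_missing (?P<oid>\S+) (?P<need>\S+) "
    r"(?P<verdict>is on|also missing on) osd\.(?P<osd>\d+)\((?P<shard>\d+)\)"
    r"(?: \((?P<reason>[^)]*)\))?"
)


def parse_line(line):
    """Return a dict describing one search_for_missing line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["osd"] = int(record["osd"])
    record["shard"] = int(record["shard"])
    return record


def classify(record):
    """Map a parsed record to a coarse category (labels are illustrative only)."""
    if record["verdict"] == "is on":
        return "source"
    reason = record["reason"] or ""
    if reason.startswith("past last_backfill"):
        return "past last_backfill"
    if reason.startswith("last_update"):
        return "last_update < needed"
    return "missing"


def summarize(lines):
    """Count categories over an iterable of log lines, ignoring unrelated lines."""
    records = (parse_line(line) for line in lines)
    return Counter(classify(r) for r in records if r is not None)


# Tails of four of the osd.57 log lines quoted above.
SAMPLE = [
    "search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.57(0)",
    "search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.40(1) (past last_backfill MIN)",
    "search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 is on osd.44(2)",
    "search_for_missing 3:486cf858:::5050000000001a5.0000000f:head 351'588 also missing on osd.45(0) (last_update 346'564 < needed 351'588)",
]

for category, count in sorted(summarize(SAMPLE).items()):
    print(f"{category}: {count}")
```

To run it over a whole log, pass an open file instead of `SAMPLE`, for example `summarize(open("ceph-osd.57.log"))`; filtering by PG or object name beforehand, as the `grep` command above does, keeps the counts specific to one object.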
Actions #5

Updated by jianwei zhang about 2 years ago

reproduce:
1. 5 hosts
2. CephFS (replicated pool + EC pool with allow_ec_overwrites)
3. On 3 hosts, create an EC (2+1) data pool and a 3-replica metadata pool as the base cluster
4. Run a vdbench file test case
5. Expand the cluster with the remaining 2 hosts (both the data pool and the metadata pool). Note: the OSDs join the cluster (up & in) one by one
6. All OSDs are up && in
7. Backfill and recovery finish

result:
ceph -s
11 active+recovery_unfound+degraded+remapped

analysis:
I have found a possible cause and would like everyone to discuss it.

According to the Ceph logs and the PG and object information on disk,
the 3 EC shards of the EC (2+1) object at the needed eversion are stored on the disks and are not lost.
The problem arises when sources for a missing object are collected after peering completes. While the source OSDs needed for recovery are being added, an OSD is skipped whenever the object sorts at or past that OSD's last_backfill (MIN in this case), so it is never added to the source OSD set. The remaining sources are then fewer than the 2 shards that EC (2+1) needs for recovery, which results in recovery_unfound.

problem code:

MissingLoc::add_source_info
{
    for ()
    ...
        if (p->first >= oinfo.last_backfill) {
            // FIXME: this is _probably_ true, although it could conceivably
            // be in the undefined region!  Hmm!
            ldout(cct, 10) << "search_for_missing " << soid << " " << need << " also missing on osd." << fromosd
                           << " (past last_backfill " << oinfo.last_backfill << ")" << dendl;
            continue;
        }
     ...
}

commit b99e135848ca5666308344cf5ecc9c7f95f30137
Author: Sage Weil <sage@inktank.com>
Date:   Mon Dec 5 17:25:21 2011 -0800

    osd: make backfill (basically) work again

    Still need to handle concurrent updates, log recovery vs backfill, etc.

    Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
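
The effect of that branch can be shown with a toy model. The Python sketch below is not Ceph code: `Peer`, `collect_sources`, and `is_unfound` are hypothetical names, `has_needed_version` is an assumed flag standing in for "the peer's missing set does not contain the object", and object positions are plain numbers rather than hobject_t values. The peer list is loosely based on the pg 3.212 log in comment #4, and k = 2 comes from the EC (2+1) profile.

```python
import math
from dataclasses import dataclass

# Stand-ins for the MIN / MAX values of last_backfill.
MIN, MAX = -math.inf, math.inf


@dataclass
class Peer:
    osd: int
    shard: int
    last_backfill: float       # objects sorting below this are already backfilled
    has_needed_version: bool   # assumed flag: the object is present at the needed version


def collect_sources(obj_pos, peers, honor_last_backfill=True):
    """Return [(osd, shard)] of peers accepted as recovery sources for one object."""
    sources = []
    for peer in peers:
        if honor_last_backfill and obj_pos >= peer.last_backfill:
            continue  # mirrors the "past last_backfill" branch quoted above
        if not peer.has_needed_version:
            continue  # peer is itself missing the object
        sources.append((peer.osd, peer.shard))
    return sources


def is_unfound(sources, k):
    """Assumption: recovery needs at least k distinct shards (any k for this 2+1 profile)."""
    return len({shard for _, shard in sources}) < k


peers = [
    Peer(osd=57, shard=0, last_backfill=MAX, has_needed_version=False),
    Peer(osd=40, shard=1, last_backfill=MIN, has_needed_version=True),
    Peer(osd=44, shard=2, last_backfill=MAX, has_needed_version=True),
    Peer(osd=45, shard=1, last_backfill=MAX, has_needed_version=False),
]

OBJ_POS = 0  # any finite position sorts at or past MIN

found = collect_sources(OBJ_POS, peers)
print(found, "unfound" if is_unfound(found, k=2) else "recoverable")

# What-if from the report's hypothesis only; whether such a shard may safely be used
# is the open question of this ticket, not something this sketch answers.
what_if = collect_sources(OBJ_POS, peers, honor_last_backfill=False)
print(what_if, "unfound" if is_unfound(what_if, k=2) else "recoverable")
```

In this model only osd.44(2) is accepted when the check is honored, which is below k = 2. With the check disabled, osd.40(1) would also count, matching the claim that the shards exist on disk but are not considered.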

Actions #6

Updated by jianwei zhang about 2 years ago

evidence:
(taking one recovery_unfound object as an example): 101000000000133.0000001d

# ceph pg 3.2f2 list_unfound
        {
            "oid": {
                "oid": "101000000000133.0000001d",
                "key": "",
                "snapid": -2,
                "hash": 1021279986,
                "max": 0,
                "pool": 3,
                "namespace": "" 
            },
            "need": "374'872",
            "have": "0'0",
            "flags": "none",
            "clean_regions": "clean_offsets: [], clean_omap: 0, new_object: 1",
            "locations": [
                "36(0)" 
            ]
        }

# /etc/ceph/ceph.conf: debug_osd = 30/30
# Filter the log for the source OSDs considered for the missing object after peering
[root@node124 ceph]# grep -e Primary -e 101000000000133.0000001d ceph-osd.57.log|grep -E "3\.2f2.*transitioning to Primary|search_for_missing.*101000000000133.0000001d" 
2022-03-19T14:04:42.223+0800 7f82f2028700  1 osd.57 pg_epoch: 5030 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5003/5004 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 mbc={}] state<Start>: transitioning to Primary
2022-03-19T14:04:43.303+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.57(0)
2022-03-19T14:04:43.303+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.37(2)
2022-03-19T14:04:43.304+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.41(1)                                                                                                        
2022-03-19T14:04:43.305+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.41(2) (past last_backfill MIN)
    /// 3.2f2s2.101000000000133.0000001d.osd41.object_info_t (object_info exists on disk). However, because the object sorts at or past the last_backfill of osd.41(2), this OSD was skipped and not added as a source OSD.

2022-03-19T14:04:43.306+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.59(1) (past last_backfill MIN)
    /// 3.2f2s1.101000000000133.0000001d.osd59.object_info_t (object_info exists on disk). However, because the object sorts at or past the last_backfill of osd.59(1), this OSD was skipped and not added as a source OSD.

2022-03-19T14:04:43.307+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.76(0) (past last_backfill MIN)
2022-03-19T14:04:43.344+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.44(1) (last_update 346'424 < needed 374'872)
2022-03-19T14:04:43.348+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=17},1={(0+0)=17},2={(0+0)=17}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 is on osd.36(0)
    /// 3.2f2s0.101000000000133.0000001d.osd36.object_info_t (object_info exists on disk). Only osd.36(0) meets all the conditions and is added as a source OSD.

2022-03-19T14:04:43.362+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.57(1) (last_update 362'664 < needed 374'872)
2022-03-19T14:04:43.365+0800 7f82ef823700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.75(0) (past last_backfill MIN)
2022-03-19T14:04:43.378+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.39(2)
2022-03-19T14:04:43.384+0800 7f82f482d700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.48(1) (last_update 351'468 < needed 374'872)
2022-03-19T14:04:43.387+0800 7f82f2028700 10 osd.57 pg_epoch: 5031 pg[3.2f2s0( v 785'2826 lc 0'0 (503'2242,785'2826] local-lis/les=5030/5031 n=230 ec=198/198 lis/c=4985/282 les/c/f=4986/283/0 sis=5030) [76,59,41]/[57,41,37]p57(0) backfill=[41(2),59(1),76(0)] r=0 lpr=5030 pi=[282,5030)/9 crt=785'2826 mlcod 0'0 activating+degraded+remapped m=18 u=18 mbc={0={(0+1)=18},1={(0+0)=18},2={(0+0)=18}}] search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.73(0) (past last_backfill MIN)
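
The log lines above can be tallied mechanically. The following Python sketch is an illustration, not part of any Ceph tooling: `PATTERN` and `classify` are hypothetical names, and the sample lines are the tails of the log lines above with the timestamp and PG prefix trimmed. It extracts the OSD, the shard and the reason from each search_for_missing message.

```python
import re

# Matches the tail of a search_for_missing debug line as it appears in this thread.
PATTERN = re.compile(
    r"search_for_missing (?P<soid>\S+) (?P<need>\d+'\d+) "
    r"(?P<verdict>also missing on|is on) osd\.(?P<osd>\d+)\((?P<shard>\d+)\)"
    r"(?: \((?P<reason>[^)]*)\))?"
)


def classify(line):
    """Return (osd(shard), status) for one log line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    if m["verdict"] == "is on":
        status = "source"
    elif m["reason"] is None:
        status = "missing"
    elif m["reason"].startswith("past last_backfill"):
        status = "skipped_past_last_backfill"
    elif m["reason"].startswith("last_update"):
        status = "log_too_old"
    else:
        status = "other"
    return f"{m['osd']}({m['shard']})", status


OBJ = "3:4f7efb3c:::101000000000133.0000001d:head 374'872"
sample = [
    f"search_for_missing {OBJ} also missing on osd.57(0)",
    f"search_for_missing {OBJ} also missing on osd.41(2) (past last_backfill MIN)",
    f"search_for_missing {OBJ} also missing on osd.59(1) (past last_backfill MIN)",
    f"search_for_missing {OBJ} also missing on osd.44(1) (last_update 346'424 < needed 374'872)",
    f"search_for_missing {OBJ} is on osd.36(0)",
]

results = dict(classify(line) for line in sample)
for osd, status in results.items():
    print(osd, status)
```

Grouping by status makes it easy to see that a single shard is accepted as a source while two shards are rejected only because of last_backfill.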

ceph-objectstore-tool: (object_info_t and BlueStore check)

 3.2f2]# for i in $(ls -tlhr|grep 101000000000133.0000001d|grep -v -e '0 Mar 19'|grep -e osd|awk '{print $NF}'); do echo $i; ceph-dencoder import $i type object_info_t decode dump_json;done
3.2f2s0.101000000000133.0000001d.osd36.object_info_t
{
    "oid": {
        "oid": "101000000000133.0000001d",
        "key": "",
        "snapid": -2,
        "hash": 1021279986,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "374'872",
    "prior_version": "374'871",
    "last_reqid": "client.133562.0:359556",
    "user_version": 872,
    "size": 4194304,
    "mtime": "2022-03-18T11:41:39.433611+0800",
    "local_mtime": "2022-03-18T11:41:39.436761+0800",
    "lost": 0,
    "flags": [
        "dirty" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0xffffffff",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}

3.2f2s2.101000000000133.0000001d.osd41.object_info_t
{
    "oid": {
        "oid": "101000000000133.0000001d",
        "key": "",
        "snapid": -2,
        "hash": 1021279986,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "374'872",
    "prior_version": "374'871",
    "last_reqid": "client.133562.0:359556",
    "user_version": 872,
    "size": 4194304,
    "mtime": "2022-03-18T11:41:39.433611+0800",
    "local_mtime": "2022-03-18T11:41:39.436761+0800",
    "lost": 0,
    "flags": [
        "dirty" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0xffffffff",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}

3.2f2s1.101000000000133.0000001d.osd59.object_info_t
{
    "oid": {
        "oid": "101000000000133.0000001d",
        "key": "",
        "snapid": -2,
        "hash": 1021279986,
        "max": 0,
        "pool": 3,
        "namespace": "" 
    },
    "version": "374'872",
    "prior_version": "374'871",
    "last_reqid": "client.133562.0:359556",
    "user_version": 872,
    "size": 4194304,
    "mtime": "2022-03-18T11:41:39.433611+0800",
    "local_mtime": "2022-03-18T11:41:39.436761+0800",
    "lost": 0,
    "flags": [
        "dirty" 
    ],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": "0xffffffff",
    "omap_digest": "0xffffffff",
    "expected_object_size": 0,
    "expected_write_size": 0,
    "alloc_hint_flags": 0,
    "manifest": {
        "type": 0
    },
    "watchers": {}
}

node121 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-36/ --mountpoint /mnt/fuse-osd36 &
node122 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-41/ --mountpoint /mnt/fuse-osd41    &
node124 ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-59/ --mountpoint /mnt/fuse-osd59    &

[root@node121 0#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd36/3.2f2s0_head/all/0#3:4f7efb3c:::101000000000133.0000001d:head#/data
ea003c1e3b13d63d6ed67d0595e8ff1b  /mnt/fuse-osd36/3.2f2s0_head/all/0#3:4f7efb3c:::101000000000133.0000001d:head#/data

[root@node122 2#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd41/3.2f2s2_head/all/2#3:4f7efb3c:::101000000000133.0000001d:head#/data
52ed9a463dd46982113bf748e7f0c4fe  /mnt/fuse-osd41/3.2f2s2_head/all/2#3:4f7efb3c:::101000000000133.0000001d:head#/data

[root@node124 1#3:4f7efb3c:::101000000000133.0000001d:head#]# md5sum /mnt/fuse-osd59/3.2f2s1_head/all/1#3:4f7efb3c:::101000000000133.0000001d:head#/data
f339660aa3a182d89be8110d67024a04  /mnt/fuse-osd59/3.2f2s1_head/all/1#3:4f7efb3c:::101000000000133.0000001d:head#/data

ceph-objectstore-tool: dump the PG info and PG log, then parse them

[root@node121 3.2f2]# python3 parse_data.py 101000000000133.0000001d *.log
['3.2f2s0.osd36.log', '3.2f2s0.osd57.log', '3.2f2s0.osd73.log', '3.2f2s0.osd75.log', '3.2f2s0.osd76.log', '3.2f2s1.osd41.log', '3.2f2s1.osd44.log', '3.2f2s1.osd48.log', '3.2f2s1.osd57.log', '3.2f2s1.osd59.log', '3.2f2s2.osd37.log', '3.2f2s2.osd39.log', '3.2f2s2.osd41.log']
========================missing object===========================
3.2f2s0.osd36.log
3.2f2s0.osd57.log
{'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s0.osd73.log
3.2f2s0.osd75.log
3.2f2s0.osd76.log
3.2f2s1.osd41.log
{'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s1.osd44.log
3.2f2s1.osd48.log
3.2f2s1.osd57.log
3.2f2s1.osd59.log
3.2f2s2.osd37.log
{'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s2.osd39.log
{'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
3.2f2s2.osd41.log
========================pg log object==========================
3.2f2s0.osd36.log [0'0, 375'884]
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'833", 'prior_version': "0'0", 'reqid': 'client.133721.0:366904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.843215+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'834", 'prior_version': "374'833", 'reqid': 'client.133721.0:366972', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.967153+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'835", 'prior_version': "374'834", 'reqid': 'client.133721.0:367041', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.090498+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'836", 'prior_version': "374'835", 'reqid': 'client.133721.0:367114', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.210199+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'853", 'prior_version': "374'836", 'reqid': 'client.114246.0:360904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.208243+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 853, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'854", 'prior_version': "374'853", 'reqid': 'client.114246.0:360926', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.256403+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 854, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'855", 'prior_version': "374'854", 'reqid': 'client.114246.0:360958', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.295928+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 855, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'856", 'prior_version': "374'855", 'reqid': 'client.114246.0:360973', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.323967+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 856, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'869", 'prior_version': "374'856", 'reqid': 'client.133562.0:359426', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.277007+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 869, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'870", 'prior_version': "374'869", 'reqid': 'client.133562.0:359461', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.334504+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 870, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'871", 'prior_version': "374'870", 'reqid': 'client.133562.0:359504', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.385649+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 871, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'872", 'prior_version': "374'871", 'reqid': 'client.133562.0:359556', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.433611+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 872, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.2f2s0.osd57.log [503'2242, 785'2826]
3.2f2s0.osd73.log [466'1540, 488'2076]
3.2f2s0.osd75.log [466'1540, 491'2080]
3.2f2s0.osd76.log [503'2242, 785'2826]
3.2f2s1.osd41.log [503'2242, 785'2826]
3.2f2s1.osd44.log [0'0, 346'424]
3.2f2s1.osd48.log [0'0, 351'468]
3.2f2s1.osd57.log [0'0, 362'664]
3.2f2s1.osd59.log [503'2242, 785'2826]
3.2f2s2.osd37.log [503'2242, 785'2826]
3.2f2s2.osd39.log [291'200, 386'1120]
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'833", 'prior_version': "0'0", 'reqid': 'client.133721.0:366904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.843215+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': True, 'ops': [{'code': 'CREATE'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'834", 'prior_version': "374'833", 'reqid': 'client.133721.0:366972', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:03.967153+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 1048576}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'835", 'prior_version': "374'834", 'reqid': 'client.133721.0:367041', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.090498+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 2097152}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'836", 'prior_version': "374'835", 'reqid': 'client.133721.0:367114', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:04.210199+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'APPEND', 'old_size': 3145728}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'853", 'prior_version': "374'836", 'reqid': 'client.114246.0:360904', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.208243+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 853, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'854", 'prior_version': "374'853", 'reqid': 'client.114246.0:360926', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.256403+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 854, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'855", 'prior_version': "374'854", 'reqid': 'client.114246.0:360958', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.295928+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 855, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'856", 'prior_version': "374'855", 'reqid': 'client.114246.0:360973', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:25.323967+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 856, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'869", 'prior_version': "374'856", 'reqid': 'client.133562.0:359426', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.277007+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 869, 'snaps': '[0,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[1048576~18446744073708503039]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'870", 'prior_version': "374'869", 'reqid': 'client.133562.0:359461', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.334504+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 870, 'snaps': '[524288,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~1048576,2097152~18446744073707454463]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'871", 'prior_version': "374'870", 'reqid': 'client.133562.0:359504', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.385649+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 871, 'snaps': '[1048576,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~2097152,3145728~18446744073706405887]', 'clean_omap': True, 'new_object': False}}}
{'op': 'modify', 'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'version': "374'872", 'prior_version': "374'871", 'reqid': 'client.133562.0:359556', 'extra_reqids': [], 'mtime': '2022-03-18T11:41:39.433611+0800', 'return_code': 0, 'mod_desc': {'object_mod_desc': {'can_local_rollback': True, 'rollback_info_completed': False, 'ops': [{'code': 'SETATTRS', 'attrs': ['_', 'hinfo_key', 'snapset']}, {'code': 'ROLLBACK_EXTENTS', 'gen': 872, 'snaps': '[1572864,524288]'}]}}, 'clean_regions': {'object_clean_regions': {'clean_offsets': '[0~3145728,4194304~18446744073705357311]', 'clean_omap': True, 'new_object': False}}}
3.2f2s2.osd41.log [503'2242, 785'2826]
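
The bracketed ranges in this output show which dumped PG logs can still contain the entry for 374'872. As a small worked example (a sketch only; it is not the parse_data.py script used above, and `parse_eversion` and `log_covers` are hypothetical helpers), an eversion of the form epoch'version can be compared as an integer tuple and tested against each log's (tail, head] range:

```python
def parse_eversion(text):
    """Turn "374'872" into (374, 872) so that comparison is numeric, not textual."""
    epoch, version = text.split("'")
    return int(epoch), int(version)


def log_covers(tail, head, need):
    """True if need falls inside the half-open log range (tail, head]."""
    return parse_eversion(tail) < parse_eversion(need) <= parse_eversion(head)


# A subset of the ranges printed above, keyed by shard and OSD.
LOG_RANGES = {
    "3.2f2s0.osd36": ("0'0", "375'884"),
    "3.2f2s0.osd57": ("503'2242", "785'2826"),
    "3.2f2s0.osd73": ("466'1540", "488'2076"),
    "3.2f2s1.osd44": ("0'0", "346'424"),
    "3.2f2s1.osd48": ("0'0", "351'468"),
    "3.2f2s1.osd57": ("0'0", "362'664"),
    "3.2f2s2.osd39": ("291'200", "386'1120"),
    "3.2f2s2.osd41": ("503'2242", "785'2826"),
}

NEED = "374'872"
covering = [
    name for name, (tail, head) in LOG_RANGES.items()
    if log_covers(tail, head, NEED)
]
print(covering)
```

Only the osd36 and osd39 ranges include 374'872, which matches the listing above: those are the only two logs for which entries of this object are printed.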
Actions #7

Updated by jianwei zhang about 2 years ago

Taking PG 3.2f2 as an example:

[root@node121 objects-unfound]# grep 'start_peering_interval up' ceph-osd.36.log-3.2f2-node121 |awk -Fstart_peering_interval '{print $2}'
2022-03-18T10:45:13.683    epoch 199 up [36,44,28] acting [36,44,28]
2022-03-18T11:00:41.974 epoch 233 up [36,44,28] -> [38,34,27], acting [36,44,28] -> [38,34,27], acting_primary 36(0) -> 38, up_primary 36(0) -> 38, role 0 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
Expansion begins
2022-03-18T11:32:54.836 epoch 348 up [36,44,37] -> [36,41,37], acting [36,44,37] -> [36,41,37], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:32:55.833 epoch 349 up [36,41,37] -> [36,48,41], acting [36,41,37] -> [36,-1,37], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:32:56.881 epoch 350 up [36,48,41] -> [36,48,41], acting [36,-1,37] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:34:36.477 epoch 354 up [36,48,41] -> [36,57,41], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:38:09.782 epoch 366 up [36,57,41] -> [36,59,41], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:41:49.350 epoch 376 up [36,59,41] -> [57,41,39], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 57, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:41:49.896 epoch 377 up [57,41,39] -> [57,41,39], acting [36,-1,41] -> [-1,41,39], acting_primary 36(0) -> 41, up_primary 57(0) -> 57, role 0 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:45:33.471 epoch 390 up [57,41,39] -> [57,41,37], acting [-1,41,39] -> [-1,41,39], acting_primary 41(1) -> 41, up_primary 57(0) -> 57, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:45:34.461 epoch 391 up [57,41,37] -> [57,41,37], acting [-1,41,39] -> [57,41,-1], acting_primary 41(1) -> 57, up_primary 57(0) -> 57, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:55:29.770 epoch 483 up [57,41,37] -> [73,59,41], acting [57,41,-1] -> [57,41,-1], acting_primary 57(0) -> 57, up_primary 57(0) -> 73, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:55:30.707 epoch 484 up [73,59,41] -> [73,59,41], acting [57,41,-1] -> [57,41,37], acting_primary 57(0) -> 57, up_primary 73(0) -> 73, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:57:14.897 epoch 490 up [73,59,41] -> [75,59,41], acting [57,41,37] -> [57,41,37], acting_primary 57(0) -> 57, up_primary 73(0) -> 75, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
2022-03-18T11:57:17.609 epoch 493 up [75,59,41] -> [76,59,41], acting [57,41,37] -> [57,41,37], acting_primary 57(0) -> 57, up_primary 75(0) -> 76, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367
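Expansion finished.

Problem walkthrough: why does the PG enter recovery_unfound?

1. The up/acting sets after the expansion finished and the last peering completed. At this point osd.37 is in the acting set.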
    2022-03-18T11:57:17.609 epoch 493 up [75,59,41] -> [76,59,41], acting [57,41,37] -> [57,41,37], acting_primary 57(0) -> 57, up_primary 75(0) -> 76, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367

2. Check the missing objects on osd 57/41/37
    acting=[57,41,37]
        3.2f2s0.osd57 {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
        3.2f2s1.osd41 {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
        3.2f2s2.osd37 {'object': '3:4f7efb3c:::101000000000133.0000001d:head', 'need': "374'872", 'have': "0'0", 'flags': 'none', 'clean_regions': 'clean_offsets: [], clean_omap: 0, new_object: 1'}
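        This shows that all 3 EC shards in acting=[57,41,37] are missing object 101000000000133.0000001d.
        Question: under what conditions did osd57(0)/osd41(1)/osd37(2) add 101000000000133.0000001d to their missing lists?
            osd.57(0) overlapped in past intervals with osd.36(0) and osd.41(2)
            osd.41(1) overlapped in past intervals with osd.36(0) and osd.41(2)
            osd.37(2) overlapped in past intervals with osd.36(0)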

    up=[76,59,41]
        3.2f2s0.osd76 empty
        3.2f2s1.osd59 empty
        3.2f2s2.osd41 empty

3. Within a single PG, Ceph must first run recovery for the missing objects; only then can it backfill.

4. OSDs that hold the wanted object version on disk
    search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.41(2) (past last_backfill MIN) /// 3.2f2s2.101000000000133.0000001d.osd41.object_info_t (present on disk), not selected
    search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 also missing on osd.59(1) (past last_backfill MIN) /// 3.2f2s1.101000000000133.0000001d.osd59.object_info_t (present on disk), not selected
                                                                                                                             /// presumably backfilled from osd.41(2)+osd.36(0) during epochs 366-376
    search_for_missing 3:4f7efb3c:::101000000000133.0000001d:head 374'872 is on osd.36(0)                                    /// 3.2f2s0.101000000000133.0000001d.osd36.object_info_t (present on disk), selected

5. Historical up/acting sets involving osd.41(2) and osd.59(1)
    2022-03-18T11:34:36.477 epoch 354 up [36,48,41] -> [36,57,41], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
    2022-03-18T11:38:09.782 epoch 366 up [36,57,41] -> [36,59,41], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 36, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367
    2022-03-18T11:41:49.350 epoch 376 up [36,59,41] -> [57,41,39], acting [36,-1,41] -> [36,-1,41], acting_primary 36(0) -> 36, up_primary 36(0) -> 57, role 0 -> 0, features acting 4540138292840890367 upacting 4540138292840890367

6. After the last peering completed (epoch=493), the up set contains osd.59(1)/osd.41(2). These two OSDs held the same shards, osd.59(1)/osd.41(2), back at epoch=366.
    2022-03-18T11:57:17.609 epoch 493 up [75,59,41] -> [76,59,41], acting [57,41,37] -> [57,41,37], acting_primary 57(0) -> 57, up_primary 75(0) -> 76, role -1 -> -1, features acting 4540138292840890367 upacting 4540138292840890367

7. Why was last_backfill on osd.59(1) and osd.41(2) set to MIN?
    [root@jianwei ceph]# grep 'starting backfill to osd.' ceph.log |grep 3.2f2
    2022-03-18T11:55:31.749062+0800 osd.57 (osd.57) 340 : cluster [DBG] 3.2f2s0 starting backfill to osd.41(2) from (0'0,375'884] MAX to 482'1907
    2022-03-18T11:55:31.758747+0800 osd.57 (osd.57) 341 : cluster [DBG] 3.2f2s0 starting backfill to osd.59(1) from (0'0,375'884] MAX to 482'1907
    2022-03-18T11:57:18.399993+0800 osd.57 (osd.57) 394 : cluster [DBG] 3.2f2s0 starting backfill to osd.76(0) from (0'0,0'0] MAX to 491'2080

    acting=[57,41,37]
    3.2f2s0.osd57.log [503'2242, 785'2826]
    3.2f2s1.osd41.log [503'2242, 785'2826]
    3.2f2s2.osd37.log [503'2242, 785'2826]

    up=[76,59,41]
    3.2f2s0.osd76.log [503'2242, 785'2826]
    3.2f2s1.osd59.log [503'2242, 785'2826]
    3.2f2s2.osd41.log [503'2242, 785'2826]

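    Because osd.59(1) and osd.41(2) were already shard 1 and shard 2 back at epoch=366, their log ranges at that time were:
        osd.59(1) (0'0,375'884]
        osd.41(2) (0'0,375'884]
        This further indicates that, within up=[76,59,41], only osd.76 is actually missing object 101000000000133.0000001d.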

    find_best_info jianwei.zhang best is osd.37(2)          [503'2242, 785'2826]

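    Because the authoritative log range is [503'2242, 785'2826]:
        osd.59(1) (0'0,375'884]     its head (375'884) is far behind the authoritative log tail (503'2242), so the logs do not overlap and it can only be recovered by backfill
        osd.41(2) (0'0,375'884]     its head (375'884) is far behind the authoritative log tail (503'2242), so the logs do not overlap and it can only be recovered by backfill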
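
The log-overlap comparison above can be sketched as follows. This is only an illustration under stated assumptions, not Ceph code: EVersion, parse_eversion and logs_overlap are hypothetical names, and the comparison mirrors just the first condition quoted from PeeringState::activate() (pg_log.get_tail() > pi.last_update).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical stand-in for Ceph's eversion_t: an (epoch, version) pair
// that the logs print as "epoch'version".
struct EVersion {
    uint64_t epoch;
    uint64_t version;
};

inline bool operator>(const EVersion& a, const EVersion& b) {
    return std::tie(a.epoch, a.version) > std::tie(b.epoch, b.version);
}

// Parse a string such as "375'884". The input is assumed to be well formed.
inline EVersion parse_eversion(const std::string& s) {
    const std::size_t quote = s.find('\'');
    assert(quote != std::string::npos);
    return EVersion{std::stoull(s.substr(0, quote)),
                    std::stoull(s.substr(quote + 1))};
}

// A peer can be caught up from the log only if the authoritative log tail
// is not newer than the peer's last_update; otherwise it needs backfill.
inline bool logs_overlap(const EVersion& auth_log_tail,
                         const EVersion& peer_last_update) {
    return !(auth_log_tail > peer_last_update);
}
```

With the values from this PG, logs_overlap(parse_eversion("503'2242"), parse_eversion("375'884")) is false for both osd.59(1) and osd.41(2), which is why only backfill remains for them.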

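    Finally, PeeringState::activate() is the function that activates the replica OSDs once peering has finished.

    When the PG activates each replica OSD after peering completes, it resets last_backfill to MIN for every OSD that has to be backfilled from scratch.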

    void PeeringState::activate(ObjectStore::Transaction &t, epoch_t activation_epoch, PeeringCtxWrapper &ctx)
    {
            ...
            else if (pg_log.get_tail() > pi.last_update ||                         ///① the tail of the primary OSD's log is newer than the head of the replica OSD's log, so the logs do not overlap and the data can only be recovered by backfill
                    pi.last_backfill == hobject_t() ||                            ///② the replica OSD's previous last_backfill is MIN: after the previous peering it was reset to MIN, went down before any backfill ran, and then came back up
                   (backfill_targets.count(*i) && pi.last_backfill.is_max())) {    ///③ the replica OSD is in backfill_targets and its last_backfill is MAX: it belongs to the up set and needs backfill
                /* ^ This last case covers a situation where a replica is not contiguous
                 * with the auth_log, but is contiguous with this replica.  Reshuffling
                 * the active set to handle this would be tricky, so instead we just go
                 * ahead and backfill it anyway.  This is probably preferrable in any
                 * case since the replica in question would have to be significantly
                 * behind.
                 */
                // backfill
                pl->get_clog_debug() << info.pgid << " starting backfill to osd." << peer << " from (" << pi.log_tail << "," << pi.last_update << "] " << pi.last_backfill << " to " << info.last_update;

                pi.last_update = info.last_update;
                pi.last_complete = info.last_update;
                pi.set_last_backfill(hobject_t());                                 ///reset last_backfill to MIN
                pi.last_epoch_started = info.last_epoch_started;
                pi.last_interval_started = info.last_interval_started;
                pi.history = info.history;
                pi.hit_set = info.hit_set;
                // Save num_bytes for reservation request, can't be negative
                peer_bytes[peer] = std::max<int64_t>(0, pi.stats.stats.sum.num_bytes);
                pi.stats.stats.clear();
                pi.stats.stats.sum.num_bytes = peer_bytes[peer];

                // initialize peer with our purged_snaps.
                pi.purged_snaps = info.purged_snaps;

                m = new MOSDPGLog(i->shard, pg_whoami.shard, get_osdmap_epoch(), pi,
                                  last_peering_reset /* epoch to create pg at */);

                // send some recent log, so that op dup detection works well.
                m->log.copy_up_to(cct, pg_log.get_log(), cct->_conf->osd_max_pg_log_entries);
                m->info.log_tail = m->log.tail;
                pi.log_tail = m->log.tail;  // sigh...

                pm.clear();
            }
            ...
    }
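
The effect of the branch quoted above can be reduced to a small model. The sketch below is hypothetical and heavily simplified: Cursor, PeerInfo and maybe_restart_backfill are illustrative names, and the two booleans stand in for expressions that the real function evaluates on pg_log and backfill_targets. It only shows that a peer arriving with last_backfill at MAX leaves activation with last_backfill at MIN.

```cpp
#include <cassert>

// Hypothetical three-state model of a peer's last_backfill cursor.
enum class Cursor { Min, Partial, Max };

// Simplified, hypothetical view of the per-peer state that activate() inspects.
struct PeerInfo {
    bool auth_tail_past_last_update;  // models pg_log.get_tail() > pi.last_update
    bool is_backfill_target;          // models backfill_targets.count(peer) != 0
    Cursor last_backfill;
};

// Returns true and resets the cursor to Min when any of the three quoted
// conditions holds; otherwise leaves the peer untouched.
inline bool maybe_restart_backfill(PeerInfo& pi) {
    const bool restart =
        pi.auth_tail_past_last_update ||
        pi.last_backfill == Cursor::Min ||
        (pi.is_backfill_target && pi.last_backfill == Cursor::Max);
    if (restart) {
        // Same effect as pi.set_last_backfill(hobject_t()) in the excerpt.
        pi.last_backfill = Cursor::Min;
    }
    return restart;
}
```

This matches the cluster log lines "starting backfill to osd.41(2) from (0'0,375'884] MAX", where the peer's cursor is printed as MAX just before it is reset.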

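Problem logic:
1. The acting set has missing objects that need recovery,
2. but the source OSDs for those missing objects are OSDs in the up set that are about to be backfilled,
3. and because those up-set OSDs are going to be backfilled, their last_backfill is set to MIN,
4. so MissingLoc::add_source_info filters out the source OSDs that actually hold the missing object, because their last_backfill is MIN.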
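
The four steps can be illustrated with a small model of the source filter. This is a sketch under assumptions, not the MissingLoc implementation: BackfillCursor, Peer, past_last_backfill, count_sources and is_unfound are hypothetical names, objects are ordered by plain string comparison, and every peer is assumed to hold the wanted version on disk (as found for this object).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a peer's last_backfill: Min and Max are sentinels,
// At carries the name of the last object that was backfilled.
struct BackfillCursor {
    enum Kind { Min, At, Max } kind;
    std::string key;  // only meaningful when kind == At
};

struct Peer {
    int osd;
    int shard;
    BackfillCursor last_backfill;
};

// Models the quoted check `p->first >= oinfo.last_backfill`.
inline bool past_last_backfill(const std::string& soid, const BackfillCursor& c) {
    switch (c.kind) {
        case BackfillCursor::Min: return true;   // every object sorts at or after MIN
        case BackfillCursor::Max: return false;  // no object sorts at or after MAX
        case BackfillCursor::At:  return soid >= c.key;
    }
    return true;  // unreachable; keeps compilers quiet
}

// Counts the peers that would be accepted as sources for soid.
inline std::size_t count_sources(const std::string& soid,
                                 const std::vector<Peer>& peers) {
    std::size_t n = 0;
    for (const Peer& p : peers) {
        if (past_last_backfill(soid, p.last_backfill)) continue;  // "also missing"
        ++n;
    }
    return n;
}

// With EC k+m, at least k shards are needed to reconstruct an object.
inline bool is_unfound(std::size_t sources, std::size_t k) {
    return sources < k;
}
```

Feeding it osd.36(0) with a MAX cursor and osd.41(2), osd.59(1) with MIN cursors gives one usable source, which is below k=2 for the ec(2+1) pool, so the object is reported unfound even though all three shards are on disk.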

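This reproduces so often because of our expansion procedure: between the start of the expansion at 11:32 and the completion of both nodes at 11:57, the PG goes through 13 up/acting changes.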

Actions #8

Updated by jianwei zhang about 2 years ago

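One idea for a fix, modeled on the ceph pg 2.5 mark_unfound_lost revert|delete command:

For EC pools, when a PG is found in this situation, provide a similar ceph command that adds source OSDs for the recovery_unfound missing objects.
Concretely: have each OSD in the might_have_unfound list scan its disk. If it finds a copy whose version, prior_version, last_reqid, mtime, etc. all match, treat it as the result of the same op, add that OSD to the source OSDs, and trigger recovery.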
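
A rough sketch of the matching step in this proposal is given below. Everything in it is hypothetical: ObjectStamp, Candidate, same_op and extra_sources are illustrative names, the fields are kept as opaque strings, and the reference stamp is assumed to come from a source that was already accepted (for example osd.36(0)). No such command exists in Ceph as far as this thread shows.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical subset of the on-disk object_info_t fields named in the proposal.
struct ObjectStamp {
    std::string version;
    std::string prior_version;
    std::string last_reqid;
    std::string mtime;
};

// Two copies are treated as written by the same op when all four fields match.
inline bool same_op(const ObjectStamp& a, const ObjectStamp& b) {
    return std::tie(a.version, a.prior_version, a.last_reqid, a.mtime) ==
           std::tie(b.version, b.prior_version, b.last_reqid, b.mtime);
}

// Result of scanning one OSD from the might_have_unfound list.
struct Candidate {
    int osd;
    int shard;
    bool has_object;      // false if the disk scan found no copy
    ObjectStamp on_disk;  // ignored when has_object is false
};

// Returns the ids of the OSDs whose on-disk copy matches the reference stamp
// and could therefore be added as recovery sources.
inline std::vector<int> extra_sources(const ObjectStamp& reference,
                                      const std::vector<Candidate>& probed) {
    std::vector<int> out;
    for (const Candidate& c : probed) {
        if (c.has_object && same_op(reference, c.on_disk)) {
            out.push_back(c.osd);
        }
    }
    return out;
}
```

Matching on all four fields rather than on version alone is meant to reduce the chance of accepting a stale or divergent shard; whether that is sufficient for EC overwrites is an open question for the discussion.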

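Problem logic:
1. The acting set has missing objects that need recovery,
2. but the source OSDs for those missing objects are OSDs in the up set that are about to be backfilled,
3. and because those up-set OSDs are going to be backfilled, their last_backfill is set to MIN,
4. so MissingLoc::add_source_info filters out the source OSDs that actually hold the missing object, because their last_backfill is MIN.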

            "might_have_unfound": [
                {
                    "osd": "36(0)",
                    "status": "already probed" 
                },
                {
                    "osd": "37(2)",
                    "status": "already probed" 
                },
                {
                    "osd": "39(2)",
                    "status": "already probed" 
                },
                {
                    "osd": "41(1)",
                    "status": "already probed" 
                },
                {
                    "osd": "41(2)",
                    "status": "already probed" 
                },
                {
                    "osd": "44(1)",
                    "status": "already probed" 
                },
                {
                    "osd": "48(1)",
                    "status": "already probed" 
                },
                {
                    "osd": "57(1)",
                    "status": "already probed" 
                },
                {
                    "osd": "59(1)",
                    "status": "already probed" 
                },
                {
                    "osd": "73(0)",
                    "status": "already probed" 
                },
                {
                    "osd": "75(0)",
                    "status": "already probed" 
                },
                {
                    "osd": "76(0)",
                    "status": "already probed" 
                }
            ],

Actions #10

Updated by Ilya Dryomov almost 2 years ago

  • Target version deleted (v15.2.16)