Bug #52509
closedPG merge: PG stuck in premerge+peered state
0%
Description
Hi, we get a couple of outages with two stuck PG it premerge+peered state:
PG_AVAILABILITY Reduced data availability: 2 pgs inactive pg 4.1d9 is stuck inactive for 1000.400328, current state remapped+premerge+backfilling+peered, last acting [328,315,352] pg 4.9d9 is stuck inactive for 1000.400333, current state remapped+premerge+backfill_wait+peered, last acting [328,315,352]
When I was restart osd.352 - issue was resolved immediately. I was query this PG before perform restart (json's attached). After restart PG 4.9d9 was deleted
backfill target is osd.362 - this OSD was added at 2021-09-03, and also added some upmaps:
ceph-mon.mon1.log.2.gz:2021-09-03 08:41:05.988 7fa960ed2700 0 mon.mon1@0(leader) e6 handle_command mon_command({"prefix": "osd pg-upmap-items", "pgid": "4.9d9", "id": [362, 352]} v 0) v1 ceph-mon.mon1.log.2.gz:2021-09-03 08:41:05.988 7fa960ed2700 0 log_channel(audit) log [INF] : from='client.? ' entity='client.admin' cmd=[{"prefix": "osd pg-upmap-items", "pgid": "4.9d9", "id": [362, 352]}]: dispatch ceph-mon.mon1.log.2.gz:2021-09-03 08:41:06.824 7fa95f6cf700 0 log_channel(audit) log [INF] : from='client.? ' entity='client.admin' cmd='[{"prefix": "osd pg-upmap-items", "pgid": "4.9d9", "id": [362, 352]}]': finished
Before this PG merge was runs (decreasing PG for pool_id 4)
Logs for this time for all OSD's:
osd.328.log
2021-09-05 11:50:49.561 7ff9340e4700 1 osd.328 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=0 lpr=494692 pi=[493206,494692)/1 luod=0'0 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], acting_primary 3 28 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.561 7ff9330e2700 1 osd.328 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=0 lpr=494692 pi=[493179,494692)/1 luod=0'0 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], acting_primary 3 28 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.561 7ff9340e4700 1 osd.328 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=0 lpr=494692 pi=[493206,494692)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 unknown mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:49.561 7ff9330e2700 1 osd.328 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=0 lpr=494692 pi=[493179,494692)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 unknown mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:50.585 7ff9340e4700 1 osd.328 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=0 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_pr imary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.585 7ff9358e7700 1 osd.328 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=0 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_pr imary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.585 7ff9340e4700 1 osd.328 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=0 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:50.585 7ff9358e7700 1 osd.328 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=0 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:51.617 7ff9340e4700 1 osd.328 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] backfill=[362] r=0 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315, 352], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.617 7ff9340e4700 1 osd.328 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] r=0 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:51.617 7ff9358e7700 1 osd.328 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] backfill=[362] r=0 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315, 352], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.617 7ff9358e7700 1 osd.328 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] r=0 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 11:50:52.657 7ff9340e4700 0 log_channel(cluster) log [DBG] : 4.9d9 starting backfill to osd.362 from (0'0,0'0] MAX to 494690'26564689 2021-09-05 11:50:52.657 7ff9330e2700 0 log_channel(cluster) log [DBG] : 4.1d9 starting backfill to osd.362 from (0'0,0'0] MAX to 494691'30825984 2021-09-05 12:10:03.354 7ff9358e7700 1 osd.328 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494694/494695 n=5765 ec=2302/2302 lis/c 494694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315] /[328,315] backfill=[362] r=0 lpr=494696 pi=[493179,494696)/1 luod=0'0 rops=1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped+premerge+peered mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], actin g [328,315,352] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:03.354 7ff9368e9700 1 osd.328 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494694/494695 n=5699 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315] /[328,315] backfill=[362] r=0 lpr=494696 pi=[493206,494696)/1 luod=0'0 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped+premerge+peered mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328, 315,352] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:03.354 7ff9368e9700 1 osd.328 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494694/494695 n=5699 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315] /[328,315] r=0 lpr=494696 pi=[493206,494696)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 12:10:03.354 7ff9358e7700 1 osd.328 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494694/494695 n=5765 ec=2302/2302 lis/c 494694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315] /[328,315] r=0 lpr=494696 pi=[493179,494696)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 12:10:03.630 7ff9368e9700 0 log_channel(cluster) log [INF] : 4.9d9 continuing backfill to osd.362 from (494679'26561689,494690'26564689] MIN to 494690'26564689 2021-09-05 12:10:03.630 7ff9330e2700 0 log_channel(cluster) log [INF] : 4.1d9 continuing backfill to osd.362 from (494673'30822984,494691'30825984] 4:9b8e3888:::rbd_data.2c48cf82c3faa3.0000000000000dcd:head to 494691'30825984 2021-09-05 12:10:04.738 7ff9330e2700 1 osd.328 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494696/494697 n=5765 ec=2302/2302 lis/c 494696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315] /[328,315] backfill=[362] r=0 lpr=494698 pi=[493179,494698)/1 luod=0'0 rops=1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped+premerge+peered mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], actin g [328,315] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.738 7ff9330e2700 1 osd.328 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494696/494697 n=5765 ec=2302/2302 lis/c 494696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315] /[328,315] r=0 lpr=494698 pi=[493179,494698)/1 crt=494691'30825984 lcod 494687'30825982 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 12:10:04.738 7ff9340e4700 1 osd.328 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494696/494697 n=5699 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315] /[328,315] backfill=[362] r=0 lpr=494698 pi=[493206,494698)/1 luod=0'0 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped+premerge+peered mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328, 315] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.738 7ff9340e4700 1 osd.328 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494696/494697 n=5699 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315] /[328,315] r=0 lpr=494698 pi=[493206,494698)/1 crt=494690'26564689 lcod 494690'26564688 mlcod 0'0 remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary 2021-09-05 12:10:05.726 7ff9368e9700 0 log_channel(cluster) log [INF] : 4.9d9 continuing backfill to osd.362 from (494679'26561689,494690'26564689] MIN to 494690'26564689 2021-09-05 12:10:05.726 7ff9330e2700 0 log_channel(cluster) log [INF] : 4.1d9 continuing backfill to osd.362 from (494673'30822984,494691'30825984] 4:9b8e3d6a:::rbd_data.c1ed8e3723dd3d.000000000001f461:head to 494691'30825984 2021-09-05 12:11:40.543 7ff9320e0700 0 osd.328 494705 do_command r=0 2021-09-05 12:11:40.551 7ff9320e0700 0 osd.328 494705 do_command r=0 2021-09-05 12:13:51.909 7ff9358e7700 1 osd.328 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494698/494699 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=0 lpr=494716 pi=[493179,494716)/1 luod=0'0 crt=494713'30826259 lcod 494713'30826258 mlcod 0'0 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,362,315], acting_primary 328 - > 328, up_primary 328 -> 328, role 0 -> 0, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:13:51.909 7ff9358e7700 1 osd.328 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494698/494699 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=0 lpr=494716 pi=[493179,494716)/1 crt=494713'30826259 lcod 494713'30826258 mlcod 0'0 unknown mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Primary
osd.315.log
2021-09-05 11:50:49.525 7f506df40700 1 osd.315 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=2 lpr=494692 pi=[493179,494692)/1 luod=0'0 lua=494547'30756656 crt=494691'30825984 lcod 494687'30825982 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], acting _primary 328 -> 328, up_primary 328 -> 328, role 2 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.525 7f506ef42700 1 osd.315 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=2 lpr=494692 pi=[493206,494692)/1 luod=0'0 lua=494547'26450040 crt=494690'26564689 lcod 494690'26564688 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], acting _primary 328 -> 328, up_primary 328 -> 328, role 2 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.525 7f506df40700 1 osd.315 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=2 lpr=494692 pi=[493179,494692)/1 crt=494691'30825984 lcod 494687'30825982 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:49.525 7f506ef42700 1 osd.315 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=2 lpr=494692 pi=[493206,494692)/1 crt=494690'26564689 lcod 494690'26564688 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:50.581 7f506c73d700 1 osd.315 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=1 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 2 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.581 7f506c73d700 1 osd.315 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=1 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:50.581 7f506df40700 1 osd.315 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=1 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 2 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.581 7f506df40700 1 osd.315 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=1 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:51.645 7f506ef42700 1 osd.315 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] r=1 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.645 7f506ef42700 1 osd.315 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] r=1 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:51.645 7f506b73b700 1 osd.315 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] r=1 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.645 7f506b73b700 1 osd.315 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] r=1 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:02.597 7f506df40700 1 osd.315 pg_epoch: 494696 pg[4.7a6( v 494695'31404168 (494685'31401125,494695'31404168] local-lis/les=367719/367720 n=11285 ec=2305/2302 lis/c 367719/367719 les/c/f 367720/367720/0 494696/494696/494696) [333,315] r= 1 lpr=494696 pi=[367719,494696)/1 luod=0'0 lua=494547'31097439 crt=494695'31404168 lcod 494695'31404167 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [352,333,315] -> [333,315], acting [352,333,315] -> [333,315], acting_primary 35 2 -> 333, up_primary 352 -> 333, role 2 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:02.597 7f506ef42700 1 osd.315 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494694/494695 n=5699 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315] /[328,315] r=1 lpr=494696 pi=[493206,494696)/1 luod=0'0 crt=494690'26564689 lcod 494690'26564688 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315], acting_prim ary 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:02.597 7f506df40700 1 osd.315 pg_epoch: 494696 pg[4.7a6( v 494695'31404168 (494685'31401125,494695'31404168] local-lis/les=367719/367720 n=11285 ec=2305/2302 lis/c 367719/367719 les/c/f 367720/367720/0 494696/494696/494696) [333,315] r= 1 lpr=494696 pi=[367719,494696)/1 crt=494695'31404168 lcod 494695'31404167 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:02.597 7f506df40700 1 osd.315 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494694/494695 n=5765 ec=2302/2302 lis/c 494694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315] /[328,315] r=1 lpr=494696 pi=[493179,494696)/1 luod=0'0 crt=494691'30825984 lcod 494687'30825982 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315], acting_prim ary 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:02.597 7f506df40700 1 osd.315 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494694/494695 n=5765 ec=2302/2302 lis/c 494694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315] /[328,315] r=1 lpr=494696 pi=[493179,494696)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:02.597 7f506ef42700 1 osd.315 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494694/494695 n=5699 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315] /[328,315] r=1 lpr=494696 pi=[493206,494696)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:04.765 7f506b73b700 1 osd.315 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494696/494697 n=5765 ec=2302/2302 lis/c 494696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315] /[328,315] r=1 lpr=494698 pi=[493179,494698)/1 luod=0'0 crt=494691'30825984 lcod 494687'30825982 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.765 7f506c73d700 1 osd.315 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494696/494697 n=5699 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315] /[328,315] r=1 lpr=494698 pi=[493206,494698)/1 luod=0'0 crt=494690'26564689 lcod 494690'26564688 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.765 7f506c73d700 1 osd.315 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=494696/494697 n=5699 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315] /[328,315] r=1 lpr=494698 pi=[493206,494698)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:04.765 7f506b73b700 1 osd.315 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=494696/494697 n=5765 ec=2302/2302 lis/c 494696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315] /[328,315] r=1 lpr=494698 pi=[493179,494698)/1 crt=494691'30825984 lcod 494687'30825982 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:19.449 7f506af3a700 1 osd.315 pg_epoch: 494700 pg[4.5c2( v 494699'46116981 (494683'46113941,494699'46116981] local-lis/les=493362/493363 n=11376 ec=2305/2302 lis/c 493362/493362 les/c/f 493363/494699/0 493361/494700/494700) [376,315,356 ] r=1 lpr=494700 pi=[493362,494700)/1 luod=0'0 lua=494547'45939362 crt=494699'46116981 lcod 494699'46116979 active mbc={} ps=[23f~1]] start_peering_interval up [376,315,356] -> [376,315,356], acting [113,315,356] -> [376,315,356], acting_primary 113 -> 3 76, up_primary 376 -> 376, role 1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:19.453 7f506af3a700 1 osd.315 pg_epoch: 494700 pg[4.5c2( v 494699'46116981 (494683'46113941,494699'46116981] local-lis/les=493362/493363 n=11376 ec=2305/2302 lis/c 493362/493362 les/c/f 493363/494699/0 493361/494700/494700) [376,315,356 ] r=1 lpr=494700 pi=[493362,494700)/1 crt=494699'46116981 lcod 494699'46116979 unknown NOTIFY mbc={} ps=[23f~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:57.345 7f506b73b700 1 osd.315 pg_epoch: 494702 pg[4.7a6( v 494701'31404229 (494685'31401225,494701'31404229] local-lis/les=494696/494697 n=11285 ec=2305/2302 lis/c 494696/367719 les/c/f 494697/367720/0 494702/494702/494702) [352,333,315 ] r=2 lpr=494702 pi=[367719,494702)/1 luod=0'0 lua=494697'31404169 crt=494701'31404229 lcod 494701'31404228 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [333,315] -> [352,333,315], acting [333,315] -> [352,333,315], acting_primar y 333 -> 352, up_primary 333 -> 352, role 1 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:57.345 7f506b73b700 1 osd.315 pg_epoch: 494702 pg[4.7a6( v 494701'31404229 (494685'31401225,494701'31404229] local-lis/les=494696/494697 n=11285 ec=2305/2302 lis/c 494696/367719 les/c/f 494697/367720/0 494702/494702/494702) [352,333,315 ] r=2 lpr=494702 pi=[367719,494702)/1 crt=494701'31404229 lcod 494701'31404228 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:13:51.939 7f506df40700 1 osd.315 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494698/494699 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=2 lpr=494716 pi=[493179,494716)/1 luod=0'0 lua=494699'30826084 crt=494713'30826259 lcod 494713'30826258 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,362,315], acting_pri mary 328 -> 328, up_primary 328 -> 328, role 1 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:13:51.939 7f506df40700 1 osd.315 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494698/494699 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=2 lpr=494716 pi=[493179,494716)/1 crt=494713'30826259 lcod 494713'30826258 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray
osd.352.log
2021-09-05 11:50:49.540 7f60f5ce4700 1 osd.352 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=-1 lpr=494692 pi=[493206,494692)/1 luod=0'0 lua=494547'26450040 crt=494690'26564689 lcod 494690'26564688 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], actin g_primary 328 -> 328, up_primary 328 -> 328, role 1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.540 7f60f5ce4700 1 osd.352 pg_epoch: 494692 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494692/485476) [328,362,315] r=-1 lpr=494692 pi=[493206,494692)/1 crt=494690'26564689 lcod 494690'26564688 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:49.540 7f60f24dd700 1 osd.352 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=-1 lpr=494692 pi=[493179,494692)/1 luod=0'0 lua=494547'30756656 crt=494691'30825984 lcod 494687'30825983 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,352,315] -> [328,362,315], acting [328,352,315] -> [328,362,315], actin g_primary 328 -> 328, up_primary 328 -> 328, role 1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:49.540 7f60f24dd700 1 osd.352 pg_epoch: 494692 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494692/328335) [328,362,315] r=-1 lpr=494692 pi=[493179,494692)/1 crt=494691'30825984 lcod 494687'30825983 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:50.589 7f60f4ce2700 1 osd.352 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=2 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825983 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role -1 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.589 7f60f4ce2700 1 osd.352 pg_epoch: 494693 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494693/328335) [328,362,315] /[328,315,352] r=2 lpr=494693 pi=[493179,494693)/1 crt=494691'30825984 lcod 494687'30825983 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:50.593 7f60f5ce4700 1 osd.352 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=2 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,362,315] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role -1 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:50.593 7f60f5ce4700 1 osd.352 pg_epoch: 494693 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494693/485476) [328,362,315] /[328,315,352] r=2 lpr=494693 pi=[493206,494693)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:51.629 7f60f5ce4700 1 osd.352 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] r=2 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 2 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.629 7f60f4ce2700 1 osd.352 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] r=2 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825983 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315,352], acting_prima ry 328 -> 328, up_primary 328 -> 328, role 2 -> 2, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 11:50:51.629 7f60f5ce4700 1 osd.352 pg_epoch: 494694 pg[4.9d9( v 494690'26564689 (494679'26561633,494690'26564689] local-lis/les=493206/493207 n=5699 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315] /[328,315,352] r=2 lpr=494694 pi=[493206,494694)/1 crt=494690'26564689 lcod 494690'26564688 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 11:50:51.629 7f60f4ce2700 1 osd.352 pg_epoch: 494694 pg[4.1d9( v 494691'30825984 (494673'30822955,494691'30825984] local-lis/les=493179/493180 n=5765 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315] /[328,315,352] r=2 lpr=494694 pi=[493179,494694)/1 crt=494691'30825984 lcod 494687'30825983 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray
osd.362.log
2021-09-05 11:50:52.685 7fb43cd78700 1 osd.362 pg_epoch: 494694 pg[4.9d9( empty local-lis/les=0/0 n=0 ec=2307/2302 lis/c 493206/493206 les/c/f 493207/493207/0 494692/494694/485476) [328,362,315]/[328,315,352] r=-1 lpr=494694 pi=[493206,494694)/1 crt=0'0 unknown mbc={}] state<Start>: transitioning to Stray 2021-09-05 11:50:52.685 7fb43e57b700 1 osd.362 pg_epoch: 494694 pg[4.1d9( empty local-lis/les=0/0 n=0 ec=2302/2302 lis/c 493179/493179 les/c/f 493180/493180/0 494692/494694/328335) [328,362,315]/[328,315,352] r=-1 lpr=494694 pi=[493179,494694)/1 crt=0'0 unknown mbc={}] state<Start>: transitioning to Stray
Then restart osd.352
2021-09-05 12:10:02.628 7fb43cd78700 1 osd.362 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561689,494690'26564689] lb MIN (bitwise) local-lis/les=494694/494695 n=0 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315]/[328,315] r=-1 lpr=494696 pi=[493206,494696)/1 luod=0'0 crt=494690'26564689 lcod 0'0 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315,352] -> [328,315], acting_p rimary 328 -> 328, up_primary 328 -> 328, role -1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:02.628 7fb43cd78700 1 osd.362 pg_epoch: 494696 pg[4.9d9( v 494690'26564689 (494679'26561689,494690'26564689] lb MIN (bitwise) local-lis/les=494694/494695 n=0 ec=2307/2302 lis/c 494694/493206 les/c/f 494695/493207/0 494692/494696/485476) [328,362,315]/[328,315] r=-1 lpr=494696 pi=[493206,494696)/1 crt=494690'26564689 lcod 0'0 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:02.628 7fb43e57b700 1 osd.362 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822984,494691'30825984] lb 4:9b8e3888:::rbd_data.2c48cf82c3faa3.0000000000000dcd:head (bitwise) local-lis/les=494694/494695 n=5101 ec=2302/2302 lis/c 4 94694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315]/[328,315] r=-1 lpr=494696 pi=[493179,494696)/1 luod=0'0 crt=494691'30825984 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362, 315], acting [328,315,352] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role -1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:02.628 7fb43e57b700 1 osd.362 pg_epoch: 494696 pg[4.1d9( v 494691'30825984 (494673'30822984,494691'30825984] lb 4:9b8e3888:::rbd_data.2c48cf82c3faa3.0000000000000dcd:head (bitwise) local-lis/les=494694/494695 n=5101 ec=2302/2302 lis/c 4 94694/493179 les/c/f 494695/493180/0 494692/494696/328335) [328,362,315]/[328,315] r=-1 lpr=494696 pi=[493179,494696)/1 crt=494691'30825984 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:04.724 7fb43cd78700 1 osd.362 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561689,494690'26564689] lb MIN (bitwise) local-lis/les=494694/494695 n=0 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315]/[328,315] r=-1 lpr=494698 pi=[493206,494698)/1 luod=0'0 crt=494690'26564689 lcod 0'0 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,315], acting_prima ry 328 -> 328, up_primary 328 -> 328, role -1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.724 7fb43cd78700 1 osd.362 pg_epoch: 494698 pg[4.9d9( v 494690'26564689 (494679'26561689,494690'26564689] lb MIN (bitwise) local-lis/les=494694/494695 n=0 ec=2307/2302 lis/c 494696/493206 les/c/f 494697/493207/0 494692/494698/485476) [328,362,315]/[328,315] r=-1 lpr=494698 pi=[493206,494698)/1 crt=494690'26564689 lcod 0'0 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:10:04.724 7fb43bd76700 1 osd.362 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822984,494691'30825984] lb 4:9b8e3d6a:::rbd_data.c1ed8e3723dd3d.000000000001f461:head (bitwise) local-lis/les=494694/494695 n=5107 ec=2302/2302 lis/c 4 94696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315]/[328,315] r=-1 lpr=494698 pi=[493179,494698)/1 luod=0'0 crt=494691'30825984 active+remapped mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362, 315], acting [328,315] -> [328,315], acting_primary 328 -> 328, up_primary 328 -> 328, role -1 -> -1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:10:04.724 7fb43bd76700 1 osd.362 pg_epoch: 494698 pg[4.1d9( v 494691'30825984 (494673'30822984,494691'30825984] lb 4:9b8e3d6a:::rbd_data.c1ed8e3723dd3d.000000000001f461:head (bitwise) local-lis/les=494694/494695 n=5107 ec=2302/2302 lis/c 4 94696/493179 les/c/f 494697/493180/0 494692/494698/328335) [328,362,315]/[328,315] r=-1 lpr=494698 pi=[493179,494698)/1 crt=494691'30825984 remapped NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray 2021-09-05 12:13:51.923 7fb43e57b700 1 osd.362 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494694/494695 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=1 lpr=494716 pi=[493179,494716)/1 luod=0'0 lua=494699'30826084 crt=494713'30826259 active mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] start_peering_interval up [328,362,315] -> [328,362,315], acting [328,315] -> [328,362,315], acting_primary 328 -> 328, up_p rimary 328 -> 328, role -1 -> 1, features acting 4611087854035861503 upacting 4611087854035861503 2021-09-05 12:13:51.923 7fb43e57b700 1 osd.362 pg_epoch: 494716 pg[4.1d9( v 494713'30826259 (494675'30823255,494713'30826259] local-lis/les=494694/494695 n=5766 ec=2302/2302 lis/c 494698/493179 les/c/f 494699/493180/0 494692/494716/328335) [328,362,315] r=1 lpr=494716 pi=[493179,494716)/1 crt=494713'30826259 unknown NOTIFY mbc={} ps=[23f~1,408~1,4a2~1,4a4~1]] state<Start>: transitioning to Stray
This is some bug or operations issue (e.g. wrong mappigs)?
Files
Updated by Konstantin Shalygin over 2 years ago
- Affected Versions v14.2.22 added
Updated by Neha Ojha over 2 years ago
- Status changed from New to Need More Info
This is similar to https://tracker.ceph.com/issues/44684, which has already been fixed. It seems like the pgs are in backfill* states and therefore, we are waiting to merge, hence (+premerge). What's not clear is whether backfilling started after the merge targets were chosen by the mgr or before. Do you happen have the mgr logs from the same time, we are interested in logging from adjust_pgs? It'd be helpful if you could upload all the relevant logs using ceph-post-file.
Updated by Konstantin Shalygin over 2 years ago
Neha, sorry, logs already rotated by logrotate.
When I was removes all upmaps from this pool - all PG merges passed without problems. So now we just don't do upmaps + merges in one time to avoid problems like this
Updated by Konstantin Shalygin over 2 years ago
We can plan and spent time to setup staging cluster for this and try to reproduce it, if this a bug. With debug_mgr "10", as I can see, to see what mgr do
Updated by Neha Ojha over 2 years ago
Konstantin Shalygin wrote:
We can plan and spent time to setup staging cluster for this and try to reproduce it, if this a bug. With debug_mgr "10", as I can see, to see what mgr do
That would be great, thanks!
Updated by Markus Wennrich over 2 years ago
We had a similar outage.
We did try to increase the number of PGs on a bucket-index-pool:
$ ceph osd pool set s3.rgw.buckets.index pg_num 32
Then two PGs stuck inactive:
HEALTH_WARN Reduced data availability: 2 pgs inactive [WRN] PG_AVAILABILITY: Reduced data availability: 2 pgs inactive pg 5.e is stuck inactive for 23m, current state stale+remapped+premerge+backfill_wait+peered, last acting [10,24,26] pg 5.1e is stuck inactive for 23m, current state stale+remapped+premerge+backfill_wait+peered, last acting [10,24,26]
Some buckets were then unavailable.
After speeding things up with increasing osd-max-backfills and going to remapped+premerge+backfilling and then finally to remapped+backfilling, they got unstuck and buckets were available again.
It looks like pg_autoscaler (though in scale-up profile) tried to decrease immediately again:
debug 2021-10-27T08:39:57.310+0000 7fc6dcc00700 0 [progress INFO root] update: starting ev 6b2c4cbc-5d15-4046-b0fc-bd4ceb518c13 (PG autoscaler decreasing pool 5 PGs from 32 to 8)
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL s3.rgw.buckets.index 5 27 3.6 MiB 1.08M 11 MiB 0 95 TiB
# ceph osd pool autoscale-status POOL SIZE TARGET SIZE RATE RAW CAPACITY RATIO TARGET RATIO EFFECTIVE RATIO BIAS PG_NUM NEW PG_NUM AUTOSCALE PROFILE s3.rgw.buckets.index 3661k 3.0 330.2T 0.0000 1.0 8 on scale-up
Updated by Konstantin Shalygin over 2 years ago
@Markus just for the record, what is your Ceph version? And what is your hardware for OSD's? The actual issue was on AMD CPU's
Thanks
Updated by Konstantin Shalygin over 2 years ago
Neha Ojha wrote:
Konstantin Shalygin wrote:
We can plan and spent time to setup staging cluster for this and try to reproduce it, if this a bug. With debug_mgr "10", as I can see, to see what mgr do
That would be great, thanks!
Today I was trying PG split, then PG merge couple of times on staging cluster and can't find actual reproduces for this issue (Intel CPU's)
Updated by Konstantin Shalygin over 1 year ago
- Status changed from Need More Info to Can't reproduce
We have never experienced this problem again