The transcript below shows a test on a cluster with new OSDs during rebalancing.
The test shows the effect of
- stopping+starting a new OSD (osd-phy6, ID 289),
- stopping+starting an old OSD (osd-phy9, ID 74).
In each test, we wait for peering to complete before taking "ceph status"; a
scripted version of this stop/observe cycle is sketched after the first PG
query below.

We set noout and norebalance to avoid disturbances during the test and wait for
recovery to cease. Status at this point:

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_WARN
            noout,norebalance flag(s) set
            8235970/1498751416 objects misplaced (0.550%)
            1 pools nearfull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 272 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     8235970/1498751416 objects misplaced (0.550%)
             3163 active+clean
             40   active+remapped+backfill_wait
             6    active+remapped+backfilling
             5    active+clean+scrubbing+deep
             1    active+clean+snaptrim

  io:
    client: 74 MiB/s rd, 42 MiB/s wr, 1.19 kop/s rd, 889 op/s wr

# docker stop osd-phy6
osd-phy6

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_WARN
            noout,norebalance flag(s) set
            1 osds down
            8342724/1498792326 objects misplaced (0.557%)
            Degraded data redundancy: 5717609/1498792326 objects degraded (0.381%), 74 pgs degraded
            1 pools nearfull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 271 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     5717609/1498792326 objects degraded (0.381%)
             8342724/1498792326 objects misplaced (0.557%)
             3089 active+clean
             74   active+undersized+degraded
             31   active+remapped+backfill_wait
             11   active+remapped+backfilling
             5    active+clean+scrubbing+deep
             4    active+clean+remapped+snaptrim
             1    active+clean+scrubbing

  io:
    client: 69 MiB/s rd, 45 MiB/s wr, 1.28 kop/s rd, 838 op/s wr

# ceph health detail
HEALTH_WARN noout,norebalance flag(s) set; 1 osds down; 8342692/1498794289 objects misplaced (0.557%); Degraded data redundancy: 5717610/1498794289 objects degraded (0.381%), 74 pgs degraded, 74 pgs undersized; 1 pools nearfull
OSDMAP_FLAGS noout,norebalance flag(s) set
OSD_DOWN 1 osds down
    osd.289 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-05) is down
OBJECT_MISPLACED 8342692/1498794289 objects misplaced (0.557%)
PG_DEGRADED Degraded data redundancy: 5717610/1498794289 objects degraded (0.381%), 74 pgs degraded, 74 pgs undersized
    pg 11.2 is stuck undersized for 70.197385, current state active+undersized+degraded, last acting [87,292,2147483647,296,229,168,0,263]
    pg 11.16 is stuck undersized for 70.178478, current state active+undersized+degraded, last acting [2147483647,181,60,233,237,294,293,292]
    pg 11.1f is stuck undersized for 70.190040, current state active+undersized+degraded, last acting [230,238,182,292,84,2147483647,86,239]
    pg 11.39 is stuck undersized for 70.193683, current state active+undersized+degraded, last acting [158,148,293,73,168,2,2147483647,236]
    pg 11.3b is stuck undersized for 70.200823, current state active+undersized+degraded, last acting [2147483647,85,229,145,170,172,0,230]
    pg 11.47 is stuck undersized for 70.196419, current state active+undersized+degraded, last acting [3,296,2147483647,0,233,84,182,238]
    pg 11.59 is
stuck undersized for 70.190002, current state active+undersized+degraded, last acting [2147483647,76,73,235,156,263,234,172] pg 11.63 is stuck undersized for 70.160846, current state active+undersized+degraded, last acting [0,146,1,156,2147483647,228,172,238] pg 11.66 is stuck undersized for 70.086237, current state active+undersized+degraded, last acting [291,159,296,233,2147483647,293,170,145] pg 11.6d is stuck undersized for 70.210387, current state active+undersized+degraded, last acting [84,235,73,290,295,2147483647,0,183] pg 11.7b is stuck undersized for 70.202578, current state active+undersized+degraded, last acting [2147483647,146,293,294,296,181,0,263] pg 11.7d is stuck undersized for 70.178488, current state active+undersized+degraded, last acting [294,2,263,2147483647,170,237,292,235] pg 11.7f is active+undersized+degraded, acting [148,232,2147483647,230,87,236,168,72] pg 11.146 is stuck undersized for 70.197744, current state active+undersized+degraded, last acting [235,183,156,295,2147483647,294,146,260] pg 11.155 is stuck undersized for 70.203091, current state active+undersized+degraded, last acting [73,72,170,259,260,63,84,2147483647] pg 11.15d is stuck undersized for 70.135909, current state active+undersized+degraded, last acting [259,182,0,63,234,294,233,2147483647] pg 11.171 is stuck undersized for 70.209391, current state active+undersized+degraded, last acting [170,168,232,72,231,172,2147483647,237] pg 11.176 is stuck undersized for 70.202583, current state active+undersized+degraded, last acting [146,237,181,2147483647,294,72,236,293] pg 11.177 is stuck undersized for 70.192564, current state active+undersized+degraded, last acting [156,146,236,235,63,2147483647,3,291] pg 11.179 is stuck undersized for 70.190284, current state active+undersized+degraded, last acting [87,156,233,86,2147483647,172,259,158] pg 11.17e is stuck undersized for 70.188938, current state active+undersized+degraded, last acting [3,231,290,260,76,183,2147483647,293] pg 11.181 is stuck undersized for 70.175985, current state active+undersized+degraded, last acting [2147483647,290,239,148,1,228,145,2] pg 11.188 is stuck undersized for 70.208638, current state active+undersized+degraded, last acting [2147483647,170,237,172,291,168,232,85] pg 11.18b is stuck undersized for 70.186336, current state active+undersized+degraded, last acting [233,148,228,87,2147483647,182,235,0] pg 11.18f is stuck undersized for 70.197416, current state active+undersized+degraded, last acting [73,237,238,2147483647,156,0,292,182] pg 11.19d is stuck undersized for 70.083071, current state active+undersized+degraded, last acting [291,172,146,145,238,2147483647,296,231] pg 11.1a5 is stuck undersized for 70.184859, current state active+undersized+degraded, last acting [293,145,2,230,159,239,85,2147483647] pg 11.1a6 is stuck undersized for 70.209851, current state active+undersized+degraded, last acting [229,145,158,296,0,292,2147483647,239] pg 11.1ac is stuck undersized for 70.192130, current state active+undersized+degraded, last acting [234,84,2147483647,86,239,183,294,232] pg 11.1b0 is stuck undersized for 70.180993, current state active+undersized+degraded, last acting [168,293,290,2,2147483647,159,296,73] pg 11.1b1 is stuck undersized for 70.175329, current state active+undersized+degraded, last acting [172,259,168,260,73,2147483647,146,263] pg 11.1b7 is stuck undersized for 70.208713, current state active+undersized+degraded, last acting [263,172,2147483647,259,0,87,145,228] pg 11.1c1 is stuck undersized for 
70.170314, current state active+undersized+degraded, last acting [182,148,263,293,2,2147483647,228,294] pg 11.1c3 is stuck undersized for 70.192088, current state active+undersized+degraded, last acting [234,290,63,239,85,156,76,2147483647] pg 11.1c7 is stuck undersized for 70.192194, current state active+undersized+degraded, last acting [1,2147483647,263,232,86,234,84,172] pg 11.1dd is stuck undersized for 70.183525, current state active+undersized+degraded, last acting [293,172,295,156,170,237,2147483647,86] pg 11.1de is stuck undersized for 69.972952, current state active+undersized+degraded, last acting [296,293,76,63,231,146,2147483647,168] pg 11.1e8 is stuck undersized for 70.172003, current state active+undersized+degraded, last acting [172,3,290,229,236,156,2147483647,228] pg 11.1f2 is stuck undersized for 70.196870, current state active+undersized+degraded, last acting [234,0,159,2147483647,232,73,290,181] pg 11.1f5 is stuck undersized for 70.190841, current state active+undersized+degraded, last acting [238,234,73,2147483647,158,291,172,168] pg 11.1fc is stuck undersized for 70.181133, current state active+undersized+degraded, last acting [172,86,85,230,182,2147483647,238,233] pg 11.1fd is stuck undersized for 70.221124, current state active+undersized+degraded, last acting [72,145,237,293,2147483647,60,87,172] pg 11.203 is stuck undersized for 70.193700, current state active+undersized+degraded, last acting [2147483647,235,168,60,87,63,295,230] pg 11.20d is stuck undersized for 70.197909, current state active+undersized+degraded, last acting [236,172,73,182,228,168,2147483647,293] pg 11.20f is stuck undersized for 70.196571, current state active+undersized+degraded, last acting [85,84,76,60,238,233,159,2147483647] pg 11.211 is stuck undersized for 70.197522, current state active+undersized+degraded, last acting [156,2147483647,170,234,0,238,1,231] pg 11.212 is stuck undersized for 70.201683, current state active+undersized+degraded, last acting [148,2147483647,85,182,84,232,86,230] pg 11.21e is stuck undersized for 70.202044, current state active+undersized+degraded, last acting [146,156,159,2147483647,230,238,239,2] pg 11.224 is stuck undersized for 70.095494, current state active+undersized+degraded, last acting [291,148,237,2147483647,170,1,156,233] pg 11.22d is stuck undersized for 70.195735, current state active+undersized+degraded, last acting [3,168,296,158,292,236,0,2147483647] pg 11.22f is stuck undersized for 70.192480, current state active+undersized+degraded, last acting [1,2147483647,292,60,296,231,259,72] POOL_NEAR_FULL 1 pools nearfull pool 'sr-rbd-data-one-hdd' has 164 TiB (max 200 TiB) # ceph pg 11.2 query | jq ".acting,.up,.recovery_state" [ 87, 292, 2147483647, 296, 229, 168, 0, 263 ] [ 87, 292, 2147483647, 296, 229, 168, 0, 263 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:35:49.109129", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:35:48.137595" } ] 
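
For reference, the stop/observe cycle used in these tests could be scripted
roughly as follows. This is only a sketch, not part of the recorded test: it
assumes docker-managed OSD containers named osd-phyN as above, and it
approximates "wait for peering to complete" with a fixed sleep instead of
polling PG states.

    #!/bin/bash
    # Sketch of one stop/observe cycle (assumptions: docker-managed OSD
    # containers, jq installed, fixed sleep instead of a real peering check).
    set -euo pipefail
    container="$1"          # e.g. osd-phy6
    pgid="${2:-11.2}"       # a PG to inspect in detail, e.g. 11.2

    ceph osd set noout
    ceph osd set norebalance

    docker stop "$container"
    sleep 60                # crude stand-in for "wait for peering to complete"
    ceph status
    ceph health detail
    ceph pg "$pgid" query | jq ".acting,.up,.recovery_state"

    docker start "$container"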
# docker start osd-phy6
osd-phy6

# After starting the OSD again, the cluster almost recovers. The PG showing
# up as backfill_toofull is due to a known bug (fixed in 13.2.10?). No
# degraded objects, just misplaced ones.

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_ERR
            noout,norebalance flag(s) set
            8181843/1498795556 objects misplaced (0.546%)
            Degraded data redundancy (low space): 1 pg backfill_toofull
            1 pools nearfull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 272 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     8181843/1498795556 objects misplaced (0.546%)
             3163 active+clean
             39   active+remapped+backfill_wait
             6    active+remapped+backfilling
             5    active+clean+scrubbing+deep
             1    active+remapped+backfill_toofull
             1    active+clean+snaptrim

  io:
    client: 35 MiB/s rd, 23 MiB/s wr, 672 op/s rd, 686 op/s wr

# docker stop osd-phy9
osd-phy9

# After stopping an old OSD, we observe an immediate degradation. The cluster
# seems to lose track of objects already at this point. The recovery operation
# does not stop, in contrast to the situation after stopping a new OSD as seen
# above, where recovery showed up only temporarily.

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_WARN
            noout,norebalance flag(s) set
            1 osds down
            7967641/1498798381 objects misplaced (0.532%)
            Degraded data redundancy: 5763425/1498798381 objects degraded (0.385%), 75 pgs degraded
            1 pools nearfull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 271 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     5763425/1498798381 objects degraded (0.385%)
             7967641/1498798381 objects misplaced (0.532%)
             3092 active+clean
             70   active+undersized+degraded
             41   active+remapped+backfill_wait
             4    active+clean+scrubbing+deep
             4    active+undersized+degraded+remapped+backfilling
             2    active+clean+scrubbing
             1    active+undersized+degraded+remapped+backfill_wait
             1    active+clean+snaptrim

  io:
    client:   76 MiB/s rd, 76 MiB/s wr, 736 op/s rd, 881 op/s wr
    recovery: 93 MiB/s, 23 objects/s

# ceph health detail
HEALTH_WARN noout,norebalance flag(s) set; 1 osds down; 7966306/1498798501 objects misplaced (0.532%); Degraded data redundancy: 5762977/1498798501 objects degraded (0.385%), 75 pgs degraded; 1 pools nearfull
OSDMAP_FLAGS noout,norebalance flag(s) set
OSD_DOWN 1 osds down
    osd.74 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-05) is down
OBJECT_MISPLACED 7966306/1498798501 objects misplaced (0.532%)
PG_DEGRADED Degraded data redundancy: 5762977/1498798501 objects degraded (0.385%), 75 pgs degraded
    pg 11.4 is active+undersized+degraded, acting [86,2147483647,237,235,182,63,231,84]
    pg 11.5 is active+undersized+degraded, acting [1,2147483647,183,0,239,145,293,170]
    pg 11.a is active+undersized+degraded+remapped+backfilling, acting [170,156,148,2147483647,234,86,236,232]
    pg 11.b is active+undersized+degraded, acting [3,228,0,146,239,292,145,2147483647]
    pg 11.e is active+undersized+degraded, acting [237,181,73,183,72,290,2147483647,295]
    pg 11.1b is active+undersized+degraded, acting [85,183,72,2147483647,156,232,263,146]
    pg 11.1d is active+undersized+degraded, acting
[290,296,183,86,293,2147483647,236,3] pg 11.37 is active+undersized+degraded, acting [60,233,181,183,2147483647,296,87,86] pg 11.50 is active+undersized+degraded+remapped+backfill_wait, acting [231,259,228,87,182,156,2147483647,172] pg 11.52 is active+undersized+degraded, acting [237,60,1,2147483647,233,232,292,86] pg 11.57 is active+undersized+degraded, acting [231,259,230,170,72,87,181,2147483647] pg 11.5a is active+undersized+degraded, acting [290,2147483647,237,183,293,84,295,1] pg 11.5c is active+undersized+degraded, acting [84,2147483647,259,0,85,234,146,148] pg 11.62 is active+undersized+degraded, acting [182,294,293,2147483647,63,234,181,0] pg 11.68 is active+undersized+degraded, acting [158,296,229,168,76,3,159,2147483647] pg 11.6a is active+undersized+degraded, acting [2147483647,288,238,172,1,237,0,290] pg 11.73 is active+undersized+degraded, acting [236,234,259,2147483647,170,63,3,0] pg 11.77 is active+undersized+degraded, acting [2147483647,72,87,183,236,156,290,293] pg 11.78 is active+undersized+degraded, acting [172,230,236,156,294,60,2147483647,76] pg 11.84 is active+undersized+degraded, acting [86,84,239,296,294,182,2147483647,293] pg 11.87 is active+undersized+degraded+remapped+backfilling, acting [148,60,231,260,235,87,2147483647,181] pg 11.8b is active+undersized+degraded, acting [263,170,2147483647,259,296,172,73,76] pg 11.15a is active+undersized+degraded, acting [2147483647,260,182,0,263,73,159,288] pg 11.15f is active+undersized+degraded+remapped+backfilling, acting [146,233,2147483647,76,234,172,181,229] pg 11.162 is active+undersized+degraded, acting [84,294,230,2,293,290,2147483647,295] pg 11.16d is active+undersized+degraded, acting [236,230,2147483647,183,0,1,235,181] pg 11.172 is active+undersized+degraded, acting [181,148,237,3,231,293,76,2147483647] pg 11.185 is active+undersized+degraded, acting [296,0,236,238,2147483647,294,181,146] pg 11.18a is active+undersized+degraded, acting [0,2147483647,159,145,293,233,85,146] pg 11.192 is active+undersized+degraded, acting [148,76,170,296,295,2147483647,3,235] pg 11.193 is active+undersized+degraded, acting [2147483647,148,295,230,232,168,76,290] pg 11.198 is active+undersized+degraded, acting [260,76,87,2147483647,145,183,229,239] pg 11.19a is active+undersized+degraded, acting [146,294,230,238,2147483647,0,295,288] pg 11.1a1 is active+undersized+degraded, acting [84,183,294,2147483647,234,170,263,238] pg 11.1a7 is active+undersized+degraded, acting [63,236,158,84,86,237,87,2147483647] pg 11.1ae is active+undersized+degraded, acting [296,172,238,2147483647,170,288,294,295] pg 11.1c5 is active+undersized+degraded, acting [76,172,236,232,2147483647,296,288,170] pg 11.1c6 is active+undersized+degraded, acting [236,72,230,170,2147483647,238,181,148] pg 11.1d1 is active+undersized+degraded, acting [259,170,291,3,156,2147483647,292,296] pg 11.1d4 is active+undersized+degraded, acting [263,228,182,84,2,2147483647,259,87] pg 11.1e1 is active+undersized+degraded, acting [158,145,233,1,259,296,2,2147483647] pg 11.1ea is active+undersized+degraded, acting [84,183,260,259,85,60,2,2147483647] pg 11.1ec is active+undersized+degraded, acting [292,293,233,2,2147483647,85,288,146] pg 11.1ed is active+undersized+degraded, acting [156,237,293,233,148,2147483647,291,85] pg 11.1ee is active+undersized+degraded, acting [1,229,0,63,228,2147483647,233,156] pg 11.201 is active+undersized+degraded, acting [229,239,296,63,76,294,182,2147483647] pg 11.206 is active+undersized+degraded, acting [235,288,76,158,296,263,85,2147483647] pg 11.20a 
is active+undersized+degraded, acting [158,1,263,232,0,230,292,2147483647] pg 11.218 is active+undersized+degraded, acting [0,296,87,2147483647,263,148,156,232] pg 11.21a is active+undersized+degraded, acting [2147483647,230,159,231,60,235,73,291] pg 11.21b is active+undersized+degraded, acting [84,159,238,87,291,230,2147483647,182] POOL_NEAR_FULL 1 pools nearfull pool 'sr-rbd-data-one-hdd' has 164 TiB (max 200 TiB) # ceph pg 11.4 query | jq ".acting,.up,.recovery_state" [ 86, 2147483647, 237, 235, 182, 63, 231, 84 ] [ 86, 2147483647, 237, 235, 182, 63, 231, 84 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:41:55.983304", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [ { "tid": 8418509, "to_read": "{11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head=read_request_t(to_read=[1646592,24576,0], need={63(5)=[0,1],86(0)=[0,1],182(4)=[0,1],231(6)=[0,1],235(3)=[0,1],237(2)=[0,1]}, want_attrs=0)}", "complete": "{11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head=read_result_t(r=0, errors={}, noattrs, returned=(1646592, 24576, [63(5),4096, 86(0),4096, 231(6),4096, 235(3),4096, 237(2),4096]))}", "priority": 127, "obj_to_source": "{11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head=63(5),86(0),182(4),231(6),235(3),237(2)}", "source_to_obj": "{63(5)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head,86(0)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head,182(4)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head,231(6)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head,235(3)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head,237(2)=11:2003536f:::rbd_data.1.a508f96b8b4567.000000000000182c:head}", "in_progress": "182(4)" } ] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:41:55.202575" } ] # docker start osd-phy9 osd-phy9 # After starting the old OSD, a lot of objects remain degraded. It looks like PGs that were # in state "...+remapped+backfilling" are affected. All others seem to recover, see below. 
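
The PGs in question can be listed directly, without going through the full
health detail. A minimal sketch, assuming "ceph pg ls" with a state filter is
available in this release:

    ceph pg ls degraded | grep backfilling     # degraded PGs that are backfilling
    ceph pg 11.a query | jq ".recovery_state"  # inspect one of them in detail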
# ceph status cluster: id: xxx-x-xxx health: HEALTH_ERR noout,norebalance flag(s) set 7954306/1498800854 objects misplaced (0.531%) Degraded data redundancy: 208493/1498800854 objects degraded (0.014%), 3 pgs degraded, 3 pgs undersized Degraded data redundancy (low space): 4 pgs backfill_toofull 1 pools nearfull services: mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03 mgr: ceph-01(active), standbys: ceph-03, ceph-02 mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay osd: 297 osds: 272 up, 272 in; 46 remapped pgs flags noout,norebalance data: pools: 11 pools, 3215 pgs objects: 178.0 M objects, 491 TiB usage: 685 TiB used, 1.2 PiB / 1.9 PiB avail pgs: 208493/1498800854 objects degraded (0.014%) 7954306/1498800854 objects misplaced (0.531%) 3162 active+clean 39 active+remapped+backfill_wait 6 active+clean+scrubbing+deep 4 active+remapped+backfill_toofull 3 active+undersized+degraded+remapped+backfilling 1 active+clean+snaptrim io: client: 111 MiB/s rd, 42 MiB/s wr, 763 op/s rd, 750 op/s wr recovery: 66 MiB/s, 16 objects/s # ceph health detail HEALTH_ERR noout,norebalance flag(s) set; 7953632/1498800881 objects misplaced (0.531%); Degraded data redundancy: 208184/1498800881 objects degraded (0.014%), 3 pgs degraded, 3 pgs undersized; Degraded data redundancy (low space): 4 pgs backfill_toofull; 1 pools nearfull OSDMAP_FLAGS noout,norebalance flag(s) set OBJECT_MISPLACED 7953632/1498800881 objects misplaced (0.531%) PG_DEGRADED Degraded data redundancy: 208184/1498800881 objects degraded (0.014%), 3 pgs degraded, 3 pgs undersized pg 11.a is stuck undersized for 311.488352, current state active+undersized+degraded+remapped+backfilling, last acting [170,156,148,2147483647,234,86,236,232] pg 11.87 is stuck undersized for 311.487625, current state active+undersized+degraded+remapped+backfilling, last acting [148,60,231,260,235,87,2147483647,181] pg 11.ed is stuck undersized for 311.465765, current state active+undersized+degraded+remapped+backfilling, last acting [233,2147483647,156,259,159,182,230,85] PG_DEGRADED_FULL Degraded data redundancy (low space): 4 pgs backfill_toofull pg 11.8 is active+remapped+backfill_wait+backfill_toofull, acting [86,158,237,85,159,259,144,263] pg 11.5d is active+remapped+backfill_wait+backfill_toofull, acting [263,158,230,73,183,84,2,169] pg 11.165 is active+remapped+backfill_wait+backfill_toofull, acting [60,148,234,73,2,229,84,180] pg 11.1f0 is active+remapped+backfill_wait+backfill_toofull, acting [237,148,2,238,169,231,60,87] POOL_NEAR_FULL 1 pools nearfull pool 'sr-rbd-data-one-hdd' has 164 TiB (max 200 TiB) # ceph pg 11.4 query | jq ".acting,.up,.recovery_state" [ 86, 74, 237, 235, 182, 63, 231, 84 ] [ 86, 74, 237, 235, 182, 63, 231, 84 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:46:20.882683", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:46:19.736862" } ] # ceph pg 11.a query | jq ".acting,.up,.recovery_state" [ 170, 156, 148, 
2147483647, 234, 86, 236, 232 ] [ 170, 156, 292, 289, 234, 86, 236, 232 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:41:55.982261", "might_have_unfound": [ { "osd": "74(3)", "status": "already probed" } ], "recovery_progress": { "backfill_targets": [ "289(3)", "292(2)" ], "waiting_on_backfill": [], "last_backfill_started": "11:500368d3:::rbd_data.1.af0b536b8b4567.000000000062aef2:head", "backfill_info": { "begin": "11:5003695e:::rbd_data.1.318f016b8b4567.000000000004aa48:head", "end": "11:5003e7ba:::rbd_data.1.ac314b6b8b4567.00000000000a3032:head", "objects": [ { "object": "11:5003695e:::rbd_data.1.318f016b8b4567.000000000004aa48:head", "version": "66191'195037" }, [... many many similar entries removed ...] { "object": "11:5003e76b:::rbd_data.1.b023c26b8b4567.0000000000b8efd1:head", "version": "181835'908372" } ] }, "peer_backfill_info": [ "289(3)", { "begin": "MAX", "end": "MAX", "objects": [] }, "292(2)", { "begin": "MAX", "end": "MAX", "objects": [] } ], "backfills_in_flight": [ "11:500368d3:::rbd_data.1.af0b536b8b4567.000000000062aef2:head" ], "recovering": [ "11:500368d3:::rbd_data.1.af0b536b8b4567.000000000062aef2:head" ], "pg_backend": { "recovery_ops": [ { "hoid": "11:500368d3:::rbd_data.1.af0b536b8b4567.000000000062aef2:head", "v": "178993'819609", "missing_on": "289(3),292(2)", "missing_on_shards": "2,3", "recovery_info": "ObjectRecoveryInfo(11:500368d3:::rbd_data.1.af0b536b8b4567.000000000062aef2:head@178993'819609, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{})", "recovery_progress": "ObjectRecoveryProgress(!first, data_recovered_to:4202496, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)", "state": "WRITING", "waiting_on_pushes": "289(3),292(2)", "extent_requested": "0,8404992" } ], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:41:55.090085" } ] # Below we temporarily change placement of new OSDs back and forth. The # initiated peering leads to re-discovery of all objects. # ceph osd crush move osd.288 host=bb-04 moved item id 288 name 'osd.288' to location {host=bb-04} in crush map # ceph osd crush move osd.289 host=bb-05 moved item id 289 name 'osd.289' to location {host=bb-05} in crush map # ceph osd crush move osd.290 host=bb-06 moved item id 290 name 'osd.290' to location {host=bb-06} in crush map # ceph osd crush move osd.291 host=bb-21 moved item id 291 name 'osd.291' to location {host=bb-21} in crush map # ceph osd crush move osd.292 host=bb-07 moved item id 292 name 'osd.292' to location {host=bb-07} in crush map # ceph osd crush move osd.293 host=bb-18 moved item id 293 name 'osd.293' to location {host=bb-18} in crush map # ceph osd crush move osd.295 host=bb-19 moved item id 295 name 'osd.295' to location {host=bb-19} in crush map # ceph osd crush move osd.294 host=bb-22 moved item id 294 name 'osd.294' to location {host=bb-22} in crush map # ceph osd crush move osd.296 host=bb-20 moved item id 296 name 'osd.296' to location {host=bb-20} in crush map # All objects found at this point. Notice that a slow op shows up for # one of the mons (see very end). It does not clear itself, a restart # is required. 
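
For reference, the back-and-forth CRUSH moves used here could be wrapped in a
small helper along the following lines. This is a sketch only; it assumes the
bb-* and ceph-* host buckets exist as in the commands above, and the
OSD-to-host mapping is copied from this transcript.

    #!/bin/bash
    # Move the new OSDs to temporary CRUSH buckets and back to force re-peering.
    declare -A tmp_host=(  [288]=bb-04   [289]=bb-05   [290]=bb-06   [291]=bb-21   [292]=bb-07
                           [293]=bb-18   [294]=bb-22   [295]=bb-19   [296]=bb-20 )
    declare -A real_host=( [288]=ceph-04 [289]=ceph-05 [290]=ceph-06 [291]=ceph-21 [292]=ceph-07
                           [293]=ceph-18 [294]=ceph-22 [295]=ceph-19 [296]=ceph-20 )

    for id in "${!tmp_host[@]}"; do
        ceph osd crush move "osd.${id}" "host=${tmp_host[$id]}"
    done
    # ... wait for peering to settle and verify that all objects are found ...
    for id in "${!real_host[@]}"; do
        ceph osd crush move "osd.${id}" "host=${real_host[$id]}"
    done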
# ceph status cluster: id: xxx-x-xxx health: HEALTH_WARN noout,norebalance flag(s) set 59942033/1498816658 objects misplaced (3.999%) 1 pools nearfull 1 slow ops, oldest one blocked for 62 sec, mon.ceph-03 has slow ops services: mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03 mgr: ceph-01(active), standbys: ceph-03, ceph-02 mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay osd: 297 osds: 272 up, 272 in; 419 remapped pgs flags noout,norebalance data: pools: 11 pools, 3215 pgs objects: 178.0 M objects, 491 TiB usage: 685 TiB used, 1.2 PiB / 1.9 PiB avail pgs: 59942033/1498816658 objects misplaced (3.999%) 2747 active+clean 348 active+remapped+backfill_wait 71 active+remapped+backfilling 34 active+clean+snaptrim 12 active+clean+snaptrim_wait 3 active+clean+scrubbing+deep io: client: 130 MiB/s rd, 113 MiB/s wr, 1.38 kop/s rd, 1.53 kop/s wr # ceph pg 11.4 query | jq ".acting,.up,.recovery_state" [ 86, 74, 237, 235, 182, 63, 231, 84 ] [ 86, 74, 237, 235, 182, 63, 231, 84 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:50:26.411524", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:50:20.931555" } ] # ceph pg 11.a query | jq ".acting,.up,.recovery_state" [ 170, 156, 148, 74, 234, 86, 236, 232 ] [ 170, 156, 148, 74, 234, 86, 236, 232 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:50:18.781335", "might_have_unfound": [ { "osd": "74(3)", "status": "already probed" }, { "osd": "86(5)", "status": "already probed" }, { "osd": "148(2)", "status": "already probed" }, { "osd": "156(1)", "status": "already probed" }, { "osd": "232(7)", "status": "already probed" }, { "osd": "234(4)", "status": "already probed" }, { "osd": "236(6)", "status": "already probed" }, { "osd": "289(3)", "status": "not queried" }, { "osd": "292(2)", "status": "not queried" } ], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:50:18.043630" } ] # ceph osd crush move osd.288 host=ceph-04 moved item id 288 name 'osd.288' to location {host=ceph-04} in crush map # ceph osd crush move osd.289 host=ceph-05 moved item id 289 name 'osd.289' to location {host=ceph-05} in crush map # ceph osd crush move osd.290 host=ceph-06 moved item id 290 name 'osd.290' to location {host=ceph-06} in crush map # ceph osd crush move osd.291 host=ceph-21 moved item id 291 name 'osd.291' to location {host=ceph-21} in crush map # ceph osd crush move osd.292 
host=ceph-07
moved item id 292 name 'osd.292' to location {host=ceph-07} in crush map
# ceph osd crush move osd.293 host=ceph-18
moved item id 293 name 'osd.293' to location {host=ceph-18} in crush map
# ceph osd crush move osd.295 host=ceph-19
moved item id 295 name 'osd.295' to location {host=ceph-19} in crush map
# ceph osd crush move osd.294 host=ceph-22
moved item id 294 name 'osd.294' to location {host=ceph-22} in crush map
# ceph osd crush move osd.296 host=ceph-20
moved item id 296 name 'osd.296' to location {host=ceph-20} in crush map

# After these placement operations, we start observing slow ops. Not sure what
# is going on here, but something seems not to work the way it should. We
# recorded two "ceph status" reports here to show the transition. In between
# these two, a PG went down as well; it was shown as "1 pg inactive" for a
# short time. We didn't manage to catch that for the record.

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_WARN
            noout,norebalance flag(s) set
            8630330/1498837232 objects misplaced (0.576%)
            1 pools nearfull
            8 slow ops, oldest one blocked for 212 sec, daemons [osd.169,osd.234,osd.288,osd.63,mon.ceph-03] have slow ops.

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 272 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     0.156% pgs not active
             8630330/1498837232 objects misplaced (0.576%)
             3158 active+clean
             41   active+remapped+backfill_wait
             6    active+clean+scrubbing+deep
             4    active+remapped+backfilling
             4    activating
             1    activating+remapped
             1    active+clean+snaptrim

  io:
    client: 85 MiB/s rd, 127 MiB/s wr, 534 op/s rd, 844 op/s wr

# ceph status
  cluster:
    id:     xxx-x-xxx
    health: HEALTH_WARN
            noout,norebalance flag(s) set
            8630330/1498844491 objects misplaced (0.576%)
            1 pools nearfull
            1 slow ops, oldest one blocked for 247 sec, mon.ceph-03 has slow ops

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 297 osds: 272 up, 272 in; 46 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 178.0 M objects, 491 TiB
    usage:   685 TiB used, 1.2 PiB / 1.9 PiB avail
    pgs:     8630330/1498844491 objects misplaced (0.576%)
             3162 active+clean
             42   active+remapped+backfill_wait
             6    active+clean+scrubbing+deep
             4    active+remapped+backfilling
             1    active+clean+snaptrim

  io:
    client: 51 MiB/s rd, 66 MiB/s wr, 1.22 kop/s rd, 1.18 kop/s wr

# ceph pg 11.4 query | jq ".acting,.up,.recovery_state"
[ 86, 74, 237, 235, 182, 63, 231, 84 ]
[ 86, 74, 237, 235, 182, 63, 231, 84 ]
[ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:53:20.059512", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:53:08.947343" } ]
# ceph pg
11.a query | jq ".acting,.up,.recovery_state" [ 170, 156, 148, 74, 234, 86, 236, 232 ] [ 170, 156, 292, 289, 234, 86, 236, 232 ] [ { "name": "Started/Primary/Active", "enter_time": "2020-08-11 10:53:08.842425", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [ "289(3)", "292(2)" ], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2020-08-11 10:53:04.962749" } ] # This operation got stuck. In other experiments I saw hundreds stuck and # always needed a mon restart. # ceph daemon mon.ceph-03 ops { "ops": [ { "description": "osd_pgtemp(e193916 {11.43=[236,168,85,86,228,169,60,148]} v193916)", "initiated_at": "2020-08-11 10:50:20.712996", "age": 310.065493, "duration": 310.065506, "type_data": { "events": [ { "time": "2020-08-11 10:50:20.712996", "event": "initiated" }, { "time": "2020-08-11 10:50:20.712996", "event": "header_read" }, { "time": "2020-08-11 10:50:20.713012", "event": "throttled" }, { "time": "2020-08-11 10:50:20.713015", "event": "all_read" }, { "time": "2020-08-11 10:50:20.713110", "event": "dispatched" }, { "time": "2020-08-11 10:50:20.713113", "event": "mon:_ms_dispatch" }, { "time": "2020-08-11 10:50:20.713113", "event": "mon:dispatch_op" }, { "time": "2020-08-11 10:50:20.713113", "event": "psvc:dispatch" }, { "time": "2020-08-11 10:50:20.713125", "event": "osdmap:preprocess_query" }, { "time": "2020-08-11 10:50:20.713155", "event": "forward_request_leader" }, { "time": "2020-08-11 10:50:20.713184", "event": "forwarded" } ], "info": { "seq": 56785507, "src_is_mon": false, "source": "osd.85 192.168.32.69:6820/2538607", "forwarded_to_leader": true } } } ], "num_ops": 1 }
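
Whether such forwarded ops are piling up on a monitor can be checked from the
same admin-socket output. A sketch, assuming jq is available on the mon host;
the container name in the last line is a guess and not taken from this
transcript:

    # count stuck osd_pgtemp ops on mon.ceph-03 (run on its host)
    ceph daemon mon.ceph-03 ops | jq '[.ops[] | select(.description | startswith("osd_pgtemp"))] | length'
    # age (seconds) of the oldest op
    ceph daemon mon.ceph-03 ops | jq '[.ops[].age] | max'
    # if the ops never clear, the mon had to be restarted in these tests, e.g.:
    # docker restart mon-ceph-03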