Bug #62811 (open)
PGs stuck in backfilling state after their primary OSD is removed by setting its crush weight to 0
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I am pasting the problem description from the original BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2233777
Description of problem:

We are observing that OSDs get stuck in the draining process, and not all PGs are removed from the OSD. It looks like the drain is stuck on PGs where the OSD being removed is the primary OSD of that PG.

We had these configs set on the cluster during testing:

mon  advanced  mon_osd_down_out_subtree_limit  host
osd  advanced  osd_async_recovery_min_cost     1099511627776

Procedure:

1. Deploy a 4 node RHCS cluster with a 2+2 EC pool and fill it with data.

# ceph -s
  cluster:
    id:     66070a80-2f84-11ee-bc2c-0cc47af3ea56
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum argo012,argo013,argo014 (age 8d)
    mgr: argo013.akdhka(active, since 12d), standbys: argo012.odttqx, argo014.xfhnzv
    osd: 35 osds: 35 up (since 67m), 35 in (since 30h)
    rgw: 4 daemons active (4 hosts, 1 zones)
  data:
    pools:   7 pools, 673 pgs
    objects: 4.28M objects, 3.0 TiB
    usage:   6.6 TiB used, 7.6 TiB / 14 TiB avail
    pgs:     665 active+clean
             7   active+clean+scrubbing+deep
             1   active+clean+scrubbing
  io:
    client: 7.2 KiB/s rd, 0 B/s wr, 7 op/s rd, 4 op/s wr

[ceph: root@argo012 /]# ceph df
--- RAW STORAGE ---
CLASS  SIZE    AVAIL    USED     RAW USED  %RAW USED
hdd    14 TiB  7.6 TiB  6.6 TiB  6.6 TiB   46.42
TOTAL  14 TiB  7.6 TiB  6.6 TiB  6.6 TiB   46.42

--- POOLS ---
POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
.mgr                        1    1  23 MiB         5  68 MiB       0    1.8 TiB
.rgw.root                   2   32  1.3 KiB        4  48 KiB       0    1.8 TiB
default.rgw.log             3   32  3.6 KiB      209  408 KiB      0    1.8 TiB
default.rgw.control         4   32  0 B            8  0 B          0    1.8 TiB
default.rgw.meta            5   32  8.9 KiB       18  207 KiB      0    1.8 TiB
ec22-pool                   6  512  3.0 TiB    4.28M  6.1 TiB  52.35    2.8 TiB
default.rgw.buckets.index   7   32  1.1 GiB       90  3.2 GiB   0.06    1.8 TiB

# ceph osd df tree
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
-1         14.21661  -         14 TiB   6.6 TiB  6.1 TiB  3.2 GiB  61 GiB   7.6 TiB  46.42  1.00  -            root default
-7         3.65570   -         3.7 TiB  1.7 TiB  1.5 TiB  642 MiB  15 GiB   2.0 TiB  45.23  0.97  -            host argo012
3   hdd    0.40619   1.00000   416 GiB  183 GiB  167 GiB  111 MiB  1.6 GiB  233 GiB  43.89  0.95  74   up      osd.3
6   hdd    0.40619   1.00000   416 GiB  194 GiB  178 GiB  57 MiB   1.4 GiB  222 GiB  46.68  1.01  64   up      osd.6
10  hdd    0.40619   1.00000   416 GiB  176 GiB  160 GiB  23 MiB   1.4 GiB  240 GiB  42.32  0.91  61   up      osd.10
15  hdd    0.40619   1.00000   416 GiB  191 GiB  175 GiB  26 MiB   2.0 GiB  225 GiB  45.95  0.99  71   up      osd.15
19  hdd    0.40619   1.00000   416 GiB  189 GiB  173 GiB  180 MiB  1.5 GiB  227 GiB  45.32  0.98  69   up      osd.19
23  hdd    0.40619   1.00000   416 GiB  177 GiB  161 GiB  33 KiB   1.5 GiB  239 GiB  42.43  0.91  67   up      osd.23
27  hdd    0.40619   1.00000   416 GiB  185 GiB  169 GiB  2.9 MiB  1.4 GiB  231 GiB  44.58  0.96  69   up      osd.27
31  hdd    0.40619   1.00000   416 GiB  204 GiB  188 GiB  233 MiB  1.6 GiB  212 GiB  49.06  1.06  81   up      osd.31
35  hdd    0.40619   1.00000   416 GiB  195 GiB  179 GiB  9.8 MiB  2.2 GiB  221 GiB  46.87  1.01  75   up      osd.35
-3         3.65570   -         3.7 TiB  1.7 TiB  1.5 TiB  979 MiB  15 GiB   2.0 TiB  45.23  0.97  -            host argo013
0   hdd    0.40619   1.00000   416 GiB  179 GiB  163 GiB  312 MiB  1.8 GiB  237 GiB  43.05  0.93  67   up      osd.0
4   hdd    0.40619   1.00000   416 GiB  191 GiB  175 GiB  149 MiB  1.8 GiB  225 GiB  46.01  0.99  77   up      osd.4
8   hdd    0.40619   1.00000   416 GiB  186 GiB  170 GiB  134 MiB  1.3 GiB  230 GiB  44.63  0.96  72   up      osd.8
12  hdd    0.40619   1.00000   416 GiB  192 GiB  176 GiB  1 KiB    1.5 GiB  224 GiB  46.04  0.99  67   up      osd.12
16  hdd    0.40619   1.00000   416 GiB  188 GiB  172 GiB  10 MiB   1.9 GiB  228 GiB  45.27  0.98  71   up      osd.16
20  hdd    0.40619   1.00000   416 GiB  186 GiB  170 GiB  46 MiB   1.7 GiB  230 GiB  44.70  0.96  71   up      osd.20
24  hdd    0.40619   1.00000   416 GiB  195 GiB  179 GiB  77 MiB   1.5 GiB  221 GiB  46.78  1.01  72   up      osd.24
28  hdd    0.40619   1.00000   416 GiB  179 GiB  163 GiB  90 MiB   1.1 GiB  237 GiB  43.06  0.93  58   up      osd.28
32  hdd    0.40619   1.00000   416 GiB  198 GiB  182 GiB  161 MiB  1.9 GiB  218 GiB  47.56  1.02  76   up      osd.32
-5         3.65570   -         3.7 TiB  1.7 TiB  1.5 TiB  683 MiB  16 GiB   2.0 TiB  45.23  0.97  -            host argo014
2   hdd    0.40619   1.00000   416 GiB  183 GiB  167 GiB  136 MiB  1.8 GiB  233 GiB  44.02  0.95  72   up      osd.2
7   hdd    0.40619   1.00000   416 GiB  185 GiB  169 GiB  9.7 MiB  1.6 GiB  231 GiB  44.54  0.96  69   up      osd.7
11  hdd    0.40619   1.00000   416 GiB  206 GiB  190 GiB  1 KiB    2.1 GiB  210 GiB  49.59  1.07  78   up      osd.11
14  hdd    0.40619   1.00000   416 GiB  188 GiB  172 GiB  5.2 MiB  1.8 GiB  228 GiB  45.25  0.97  61   up      osd.14
18  hdd    0.40619   1.00000   416 GiB  188 GiB  172 GiB  176 MiB  1.4 GiB  228 GiB  45.27  0.98  77   up      osd.18
22  hdd    0.40619   1.00000   416 GiB  182 GiB  166 GiB  152 MiB  1.8 GiB  234 GiB  43.79  0.94  70   up      osd.22
26  hdd    0.40619   1.00000   416 GiB  194 GiB  178 GiB  25 MiB   1.9 GiB  222 GiB  46.66  1.01  66   up      osd.26
30  hdd    0.40619   1.00000   416 GiB  192 GiB  176 GiB  107 MiB  1.8 GiB  224 GiB  46.20  1.00  73   up      osd.30
34  hdd    0.40619   1.00000   416 GiB  174 GiB  158 GiB  72 MiB   1.5 GiB  242 GiB  41.77  0.90  68   up      osd.34
-9         3.24951   -         3.2 TiB  1.6 TiB  1.5 TiB  934 MiB  16 GiB   1.6 TiB  50.41  1.09  -            host argo017
5   hdd    0.40619   1.00000   416 GiB  234 GiB  218 GiB  162 MiB  2.2 GiB  182 GiB  56.26  1.21  86   up      osd.5
9   hdd    0.40619   1.00000   416 GiB  207 GiB  191 GiB  124 MiB  2.3 GiB  209 GiB  49.75  1.07  83   up      osd.9
13  hdd    0.40619   1.00000   416 GiB  194 GiB  178 GiB  45 MiB   1.9 GiB  222 GiB  46.61  1.00  72   up      osd.13
17  hdd    0.40619   1.00000   416 GiB  195 GiB  179 GiB  48 MiB   1.6 GiB  221 GiB  46.86  1.01  73   up      osd.17
21  hdd    0.40619   1.00000   416 GiB  219 GiB  203 GiB  112 MiB  2.1 GiB  197 GiB  52.60  1.13  84   up      osd.21
25  hdd    0.40619   1.00000   416 GiB  195 GiB  179 GiB  36 MiB   1.7 GiB  221 GiB  46.78  1.01  72   up      osd.25
29  hdd    0.40619   1.00000   416 GiB  203 GiB  187 GiB  221 MiB  2.2 GiB  213 GiB  48.92  1.05  81   up      osd.29
33  hdd    0.40619   1.00000   416 GiB  231 GiB  215 GiB  185 MiB  2.0 GiB  185 GiB  55.48  1.20  84   up      osd.33
TOTAL  14 TiB  6.6 TiB  6.1 TiB  3.2 GiB  61 GiB  7.6 TiB  46.42
MIN/MAX VAR: 0.90/1.21  STDDEV: 3.24

2. In a healthy cluster, where all PGs are in the active+clean state, try removing an OSD from one host. In the example below, OSD 0 from argo013 is being removed.

# date
Tue Aug 22 12:10:27 UTC 2023
# ceph orch osd rm 0 --force --zap
Scheduled OSD(s) for removal.
[ceph: root@argo012 /]# date
Tue Aug 22 12:11:28 UTC 2023
[ceph: root@argo012 /]# ceph orch osd rm status
OSD  HOST     STATE     PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
0    argo013  draining  58   False    True   True  2023-08-22 12:11:27.848424

3. Initially the PGs start draining, but the drain stops after roughly 75% of the PGs have been drained. Checking the PG dump at that point, the drain appears stuck on PGs where the OSD being removed is the primary OSD of that PG.

# ceph orch osd rm status
OSD  HOST     STATE     PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
0    argo013  draining  22   False    True   True  2023-08-22 12:11:27.848424
[root@argo012 ~]# date
Wed Aug 23 04:07:22 AM UTC 2023
# ceph orch osd rm status
OSD  HOST     STATE     PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
0    argo013  draining  22   False    True   True  2023-08-22 12:11:27.848424
[root@argo012 ~]# date
Wed Aug 23 05:38:27 AM UTC 2023
[root@argo012 ~]# ceph orch osd rm status
OSD  HOST     STATE     PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
0    argo013  draining  22   False    True   True  2023-08-22 12:11:27.848424
[root@argo012 ~]# date
Wed Aug 23 11:01:42 AM UTC 2023

At this point, we do not see any recovery activity in the ceph -s output either.
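The common pattern in the PG dump below is that each stuck PG still lists osd.0 in its acting set while its up set has already been remapped to other OSDs. As an illustration (not part of the original report), a small Python sketch that filters pg dump lines this way, assuming the column layout shown in this report where the first bracketed list on a line is the up set and the second is the acting set:

```python
import re

def pgs_stuck_on_osd(pg_dump_lines, osd_id):
    """Return pgids whose acting set still contains osd_id but whose
    up set does not, i.e. PGs that have not yet backfilled off that OSD.

    Assumes each line starts with the pgid and that the first bracketed
    list is the up set and the second is the acting set, as in the
    'ceph pg dump' output pasted in this report."""
    stuck = []
    for line in pg_dump_lines:
        sets = re.findall(r"\[([\d,]+)\]", line)
        if len(sets) < 2:
            continue  # not a PG data row
        up = {int(x) for x in sets[0].split(",")}
        acting = {int(x) for x in sets[1].split(",")}
        if osd_id in acting and osd_id not in up:
            stuck.append(line.split()[0])
    return stuck
```

Feeding it the lines of `ceph pg dump | grep backfill` together with the id of the draining OSD (0 here) reproduces the list of stuck PGs seen below.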
# ceph -s
  cluster:
    id:     66070a80-2f84-11ee-bc2c-0cc47af3ea56
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum argo012,argo013,argo014 (age 9d)
    mgr: argo013.akdhka(active, since 13d), standbys: argo012.odttqx, argo014.xfhnzv
    osd: 35 osds: 35 up (since 21h), 35 in (since 2d); 22 remapped pgs
    rgw: 4 daemons active (4 hosts, 1 zones)
  data:
    pools:   7 pools, 673 pgs
    objects: 4.42M objects, 3.0 TiB
    usage:   6.6 TiB used, 7.6 TiB / 14 TiB avail
    pgs:     171950/17688441 objects misplaced (0.972%)
             651 active+clean
             16  active+remapped+backfilling
             6   active+remapped+backfill_wait
  io:
    client: 10 KiB/s rd, 143 KiB/s wr, 14 op/s rd, 27 op/s wr

[root@argo012 ~]# ceph pg dump | grep backfill
6.1bd 8679 0 0 8679 0 6519914496 0 0 7605 7605 active+remapped+backfilling 2023-08-22T12:11:40.514461+0000 19194'35959 19194:222764 [4,3,9,18] 4 [0,3,9,18] 0 18421'35707 2023-08-20T18:45:07.674753+0000 18421'35707 2023-08-18T14:56:15.799803+0000 0 36 queued for scrub 8427 0
6.1b8 8517 0 0 17034 0 6200819712 0 0 7393 7393 active+remapped+backfilling 2023-08-22T12:11:41.238886+0000 19194'35586 19194:218781 [20,23,26,13] 20 [0,23,26,25] 0 18422'35313 2023-08-20T17:33:29.237283+0000 18422'35313 2023-08-20T17:33:29.237283+0000 0 175 queued for scrub 8244 0
6.1af 8479 0 0 8479 0 6370885632 0 0 7261 7261 active+remapped+backfilling 2023-08-22T12:11:40.475878+0000 19194'35370 19194:192817 [12,18,3,9] 12 [0,18,3,9] 0 18422'35131 2023-08-22T12:09:30.568001+0000 18422'35131 2023-08-22T12:09:30.568001+0000 0 560 periodic scrub scheduled @ 2023-08-23T15:52:04.077250+0000 8240 0
6.1aa 8662 0 0 8662 0 6562643968 0 0 7425 7425 active+remapped+backfilling 2023-08-22T12:11:41.434841+0000 19194'35518 19194:193262 [12,11,6,13] 12 [0,11,6,13] 0 18422'35250 2023-08-20T16:57:40.112433+0000 18337'33987 2023-08-14T17:18:34.663615+0000 0 36 queued for deep scrub 8394 0
6.19d 8627 0 0 8627 0 6525550592 0 0 7314 7314 active+remapped+backfilling 2023-08-22T12:11:39.263735+0000 19194'35434 19194:197745 [16,25,26,6] 16 [0,25,26,6] 0 18422'35151 2023-08-20T09:25:48.183947+0000 18396'34264 2023-08-16T11:51:33.787266+0000 0 35 queued for scrub 8344 0
6.180 8826 0 0 8826 0 6708264960 0 0 6784 6784 active+remapped+backfilling 2023-08-22T12:11:40.553126+0000 19194'34345 19194:182555 [32,30,17,31] 32 [0,30,17,31] 0 18421'34084 2023-08-20T01:22:52.567792+0000 18421'34084 2023-08-20T01:22:52.567792+0000 0 176 queued for scrub 8565 0
6.59 8661 0 0 8661 0 6445924352 0 0 7796 7796 active+remapped+backfilling 2023-08-22T14:53:10.041447+0000 19194'36355 19194:117635 [28,18,13,15] 28 [0,18,13,15] 0 18421'36082 2023-08-22T11:26:01.536220+0000 18421'36082 2023-08-22T11:26:01.536220+0000 0 382 periodic scrub scheduled @ 2023-08-23T22:36:35.973627+0000 8388 0
6.4d 8510 0 0 8510 0 6429343744 0 0 7680 7680 active+remapped+backfill_wait 2023-08-22T14:48:22.748410+0000 19194'35958 19194:159698 [12,25,18,19] 12 [0,25,18,19] 0 18422'35685 2023-08-22T11:38:43.676947+0000 18411'35524 2023-08-17T05:34:58.535235+0000 0 35 periodic scrub scheduled @ 2023-08-23T11:46:45.368400+0000 8237 0
6.4a 8603 0 0 17206 0 6389039104 0 0 7624 7624 active+remapped+backfill_wait 2023-08-22T14:52:58.531620+0000 19194'35895 19194:145778 [12,22,25,27] 12 [0,22,33,27] 0 18422'35640 2023-08-22T11:29:58.371470+0000 18422'35640 2023-08-19T23:11:14.163755+0000 0 35 periodic scrub scheduled @ 2023-08-23T21:00:28.665562+0000 8348 0
7.1b 3 0 0 3 0 0 41666444 189764 10048 10048 active+remapped+backfilling 2023-08-22T12:11:41.250313+0000 19194'812502 19194:916015 [28,3,13] 28 [0,3,13] 0 18951'800381 2023-08-22T11:26:05.828515+0000 18951'800381 2023-08-22T11:26:05.828515+0000 0 4 periodic scrub scheduled @ 2023-08-23T23:25:02.991379+0000 3 0
3.6 6 0 0 12 0 0 0 0 9495 9495 active+remapped+backfill_wait 2023-08-22T14:53:10.040326+0000 19194'153995 19194:252905 [26,13,27] 26 [0,13,2] 0 18951'148821 2023-08-22T11:37:46.197728+0000 18510'134991 2023-08-20T11:05:55.008644+0000 0 1 periodic scrub scheduled @ 2023-08-23T19:10:36.035866+0000 6 0
7.4 4 0 0 4 0 0 31724167 138831 10068 10068 active+remapped+backfilling 2023-08-22T12:11:43.275135+0000 19194'416071 19194:486357 [8,9,2] 8 [0,9,2] 0 18422'403982 2023-08-20T14:21:33.629927+0000 18396'351102 2023-08-16T12:45:54.473320+0000 0 1 queued for scrub 4 0
7.6 7 0 0 7 0 0 67901442 315891 4422 4422 active+remapped+backfilling 2023-08-22T12:11:41.434578+0000 18508'472722 19193:565304 [12,22,5] 12 [0,22,5] 0 18508'472722 2023-08-22T11:30:09.680199+0000 18508'472722 2023-08-20T09:18:04.840742+0000 0 1 periodic scrub scheduled @ 2023-08-23T20:39:20.602754+0000 7 0
6.16 8668 0 0 8668 0 6560940032 0 0 6920 6920 active+remapped+backfill_wait 2023-08-22T14:48:33.793550+0000 19194'34730 19194:69433 [28,7,23,29] 28 [0,7,23,29] 0 18422'34477 2023-08-21T02:38:11.482845+0000 18333'33321 2023-08-14T10:17:48.909163+0000 0 36 queued for deep scrub 8415 0
6.b3 8588 0 0 8588 0 6560677888 0 0 7315 7315 active+remapped+backfill_wait 2023-08-22T14:48:41.968101+0000 19194'35707 19194:159836 [4,3,17,34] 4 [0,3,17,34] 0 18422'35433 2023-08-21T00:43:44.822688+0000 18422'35433 2023-08-21T00:43:44.822688+0000 0 151 queued for scrub 8314 0
6.c2 8542 0 0 8542 0 6335627264 0 0 7566 7566 active+remapped+backfill_wait 2023-08-22T14:52:39.323031+0000 19194'35591 19194:187451 [16,3,2,13] 16 [0,3,2,13] 0 18422'35321 2023-08-21T00:15:30.088884+0000 18406'34902 2023-08-16T22:56:36.951863+0000 0 35 queued for scrub 8272 0
6.c8 8385 0 0 8385 0 6342639616 0 0 7740 7740 active+remapped+backfilling 2023-08-22T13:53:02.847391+0000 19194'36287 19194:155784 [8,9,3,22] 8 [0,9,3,22] 0 18422'36007 2023-08-20T18:07:00.914425+0000 18337'34831 2023-08-14T15:39:09.546106+0000 0 35 queued for deep scrub 8105 0
6.ea 8700 0 0 8700 0 6701318144 0 0 8400 8400 active+remapped+backfilling 2023-08-22T12:37:57.250097+0000 19194'36507 19194:215173 [12,2,29,27] 12 [0,2,29,27] 0 18422'36214 2023-08-20T08:30:19.565009+0000 18422'36214 2023-08-19T02:49:58.782983+0000 0 36 queued for scrub 8407 0
6.f5 8692 0 0 8692 0 6698369024 0 0 8315 8315 active+remapped+backfilling 2023-08-22T12:11:40.501888+0000 19194'36206 19194:305781 [12,30,31,17] 12 [0,30,31,17] 0 18422'35937 2023-08-20T17:30:28.074025+0000 18422'35937 2023-08-20T17:30:28.074025+0000 0 170 queued for scrub 8423 0
6.f7 8521 0 0 8521 0 6457851904 0 0 8230 8230 active+remapped+backfilling 2023-08-22T12:11:40.120079+0000 19194'36752 19194:206013 [4,23,2,5] 4 [0,23,2,5] 0 18422'36454 2023-08-19T23:33:43.864135+0000 17969'35091 2023-08-13T21:55:02.307873+0000 0 35 queued for deep scrub 8223 0
dumped all
6.173 8556 0 0 8556 0 6451757056 0 0 8113 8113 active+remapped+backfilling 2023-08-22T12:11:40.476004+0000 19194'36649 19194:163973 [24,31,5,22] 24 [0,31,5,22] 0 18422'36378 2023-08-20T06:28:28.826979+0000 18422'36378 2023-08-19T00:02:15.118486+0000 0 35 queued for scrub 8285 0
6.177 8588 0 0 8588 0 6445072384 0 0 8132 8132 active+remapped+backfilling 2023-08-22T12:11:40.491822+0000 19194'36654 19194:177980 [8,22,29,3] 8 [0,22,29,3] 0 18422'36388 2023-08-20T05:53:14.247617+0000 18422'36388 2023-08-20T05:53:14.247617+0000 0 155 queued for scrub 8322 0

4. The PGs remain in the same state, and there is no further progress in the drain or the OSD removal. The OSD service is running on the target host during the entire procedure.

Version-Release number of selected component (if applicable):
ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)

How reproducible:
2/2 OSDs on 1 cluster

Steps to Reproduce:
1. Deploy a 4 node RHCS cluster, deploy an EC 2+2 pool, and fill it with data.
2. Remove a running OSD from any host: ceph orch osd rm <id> --force
3. Observe that the drain does not complete and is stuck on a few PGs of the OSD.

Actual results:
PGs stuck draining and the removal does not go through.

Expected results:
PGs drained and OSD removed.

Additional info:
Observed that if I stop the OSD service, the drain completes within a few hours and the OSD is removed from the cluster.
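As a side note, the misplaced-object fraction reported by ceph -s above is internally consistent; a quick check (not part of the report, using only the two counts ceph printed) reproduces the 0.972% figure:

```python
# Figures taken from the "ceph -s" output in this report:
# 171950 misplaced object instances out of 17688441 total.
misplaced, total = 171_950, 17_688_441
pct = 100 * misplaced / total
print(f"{pct:.3f}%")  # -> 0.972%, matching the reported value
```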