Bug #41255
backfill_toofull seen on cluster where the most full OSD is at 1%
Status: Closed
Added by Bryan Stillwell over 4 years ago. Updated over 4 years ago.
Description
After upgrading both our test and staging clusters to Nautilus (14.2.2), we've seen both of them report some PGs as backfill_toofull since upgrading when marking a single OSD into the cluster.
The test cluster is currently using just 148 GiB of storage, yet it has over 100 4 TB drives (it would only be ~4% full if all the data were on a single drive!). The staging cluster is using 14 TiB of storage across more than 400 4 TB drives.
Both clusters are a mixture of FileStore and BlueStore OSDs; I'm wondering whether that has anything to do with it. The PGs do eventually finish backfilling, but they should never enter a toofull state on clusters this empty.
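For reference, an OSD is only supposed to reject backfill once its utilization crosses the backfillfull ratio (0.90 by default). A minimal sketch of that comparison, using the ~1% usage figure from this cluster, shows why the toofull state here looks spurious:

```shell
# Sketch of the backfillfull check, assuming the default
# osd_backfillfull_ratio of 0.90; this cluster's usage is ~1% (0.01).
usage=0.01
backfillfull_ratio=0.90
awk -v u="$usage" -v r="$backfillfull_ratio" \
    'BEGIN { if (u >= r) print "backfill_toofull"; else print "ok" }'
```

With any plausible usage on these clusters the check prints "ok", so the toofull flag must be coming from somewhere other than actual utilization.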
Updated by Bryan Stillwell over 4 years ago
Here's the output from ceph status:
# ceph -s
  cluster:
    id:     XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    health: HEALTH_ERR
            Degraded data redundancy (low space): 1 pg backfill_toofull

  services:
    mon: 3 daemons, quorum a1cephmon002,a1cephmon003,a1cephmon004 (age 21h)
    mgr: a1cephmon002(active, since 21h), standbys: a1cephmon003, a1cephmon004
    mds: cephfs:2 {0=a1cephmon002=up:active,1=a1cephmon003=up:active} 1 up:standby
    osd: 143 osds: 142 up, 142 in; 106 remapped pgs
    rgw: 11 daemons active (radosgw.a1cephrgw008, radosgw.a1cephrgw009, radosgw.a1cephrgw010, radosgw.a1cephrgw011, radosgw.a1tcephrgw002, radosgw.a1tcephrgw003, radosgw.a1tcephrgw004, radosgw.a1tcephrgw005, radosgw.a1tcephrgw006, radosgw.a1tcephrgw007, radosgw.a1tcephrgw008)

  data:
    pools:   19 pools, 5264 pgs
    objects: 1.45M objects, 148 GiB
    usage:   658 GiB used, 436 TiB / 437 TiB avail
    pgs:     44484/4351770 objects misplaced (1.022%)
             5158 active+clean
             104  active+remapped+backfill_wait
             1    active+remapped+backfilling
             1    active+remapped+backfill_wait+backfill_toofull

  io:
    client: 19 MiB/s rd, 13 MiB/s wr, 431 op/s rd, 509 op/s wr
Updated by Vikhyat Umrao over 4 years ago
- Related to Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR) added
Updated by David Zafman over 4 years ago
- Pull request ID changed from 29850 to 29857
Updated by David Zafman over 4 years ago
- Backport set to luminous, mimic, nautilus
Updated by Kefu Chai over 4 years ago
- Status changed from In Progress to Pending Backport
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41582: luminous: backfill_toofull seen on cluster where the most full OSD is at 1% added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41583: nautilus: backfill_toofull seen on cluster where the most full OSD is at 1% added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41584: mimic: backfill_toofull seen on cluster where the most full OSD is at 1% added
Updated by Bryan Stillwell over 4 years ago
We didn't see this problem on any of our clusters with the 12.2.12 release, so maybe this isn't the fix if a backport to Luminous is needed? Is there a way we could test the fix in Nautilus on our test cluster?
Updated by David Zafman over 4 years ago
- Backport changed from luminous, mimic, nautilus to mimic, nautilus
Luminous doesn't have the issue!
Updated by Nathan Cutler over 4 years ago
- Backport changed from mimic, nautilus to mimic, nautilus, luminous
Addressing backport-create-issue script complaint:
ERROR:root:https://tracker.ceph.com/issues/41255 has more backport issues (luminous,mimic,nautilus) than expected (mimic,nautilus)
Alternatively, if you really want to have "Backport: mimic, nautilus" I can delete the luminous backport tracker issue, but it has comments in it, so I'm reluctant to do that.
(In the past leaving the Backport field unchanged and setting unwanted backport tracker issue(s) to "Rejected" has worked fine, so I see no reason why it shouldn't work here, too.)
Updated by Florian Haas over 4 years ago
- Affected Versions v14.2.4 added
Adding 14.2.4 as an affected version, as I am seeing the same issue on a 14.2.4 cluster that has recently had 9 OSDs added:
$ ceph pg ls backfill_toofull
PG     OBJECTS DEGRADED MISPLACED UNFOUND BYTES     OMAP_BYTES* OMAP_KEYS* LOG  STATE                                          SINCE VERSION      REPORTED     UP        ACTING  SCRUB_STAMP                DEEP_SCRUB_STAMP
5.ea   3422    0        6844      0       804144    0           0          2088 active+remapped+backfill_wait+backfill_toofull 14h   10049'23516  10098:58621  [9,10]p9  [5,7]p5 2019-10-08 09:35:18.312555 2019-10-03 13:33:19.653331
6.11c  1       0        2         0       0         0           0          2818 active+remapped+backfill_wait+backfill_toofull 12h   10089'100705 10098:182853 [11,9]p11 [7,8]p7 2019-10-09 13:08:50.460959 2019-10-04 12:31:27.445849
11.36  23      0        46        0       0         19985       88         1871 active+remapped+backfill_wait+backfill_toofull 14h   7735'17002   10098:75007  [10,9]p10 [5,7]p5 2019-10-08 10:39:51.692146 2019-10-08 10:39:51.692146
11.e6  30      0        60        0       0         1520615     6510       2285 active+remapped+backfill_wait+backfill_toofull 15h   10045'50499  10098:130940 [9,14]p9  [1,2]p1 2019-10-08 10:26:08.609821 2019-10-08 10:26:08.609821
13.61  792     0        1584      0       102544305 0           0          1307 active+remapped+backfill_wait+backfill_toofull 13h   7300'3470    10098:16818  [10,9]p10 [7,8]p7 2019-10-09 12:20:43.443785 2019-10-09 12:20:43.443785
13.97  841     0        1682      0       129795495 0           0          1533 active+remapped+backfill_wait+backfill_toofull 14h   7263'7775    10098:30896  [13,9]p13 [1,2]p1 2019-10-09 02:14:44.016217 2019-10-02 18:00:26.111812
13.b0  807     0        1614      0       130275254 0           0          3030 active+remapped+backfill_wait+backfill_toofull 15h   7263'3463    10098:18411  [9,16]p9  [1,2]p1 2019-10-08 13:41:33.267561 2019-10-01 18:24:57.382308
What those have in common is they all map to OSD 9, which currently is only 20.64% full.
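That observation can be checked mechanically. A rough sketch that tallies how often each OSD id appears in the bracketed up/acting sets of a `ceph pg ls` listing; the two sample rows below are abbreviated copies of lines from the listing above, and on a live cluster you would pipe the real command output through the same filter instead:

```shell
# Tally OSD ids appearing in the [up]pN / [acting]pN columns of
# `ceph pg ls` output. Sample rows are abbreviated from the listing above.
printf '%s\n' \
  '5.ea  3422 ... [9,10]p9  [5,7]p5' \
  '6.11c 1    ... [11,9]p11 [7,8]p7' |
  grep -o '\[[0-9,]*\]' | tr -d '[]' | tr ',' '\n' | sort -n | uniq -c
```

On the full listing, OSD 9 tops the count, matching the observation that every stuck PG maps to it.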
Updated by Stefan Kooman over 4 years ago
We added a CRUSH rule (replicated_nvme) and applied it to our CephFS metadata pool (with 1.2 billion objects), and hit this issue for 4 PGs:
ceph pg ls backfill_toofull
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE STATE_STAMP VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP
6.41 824816 0 2474448 0 0 0 0 3005 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.794511 65444'10436893 65444:16341088 [107,97,113]p107 [27,59,64]p27 2019-12-17 08:16:34.458092 2019-12-17 08:16:34.458092
6.8c 826851 0 2480553 0 0 0 0 3093 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.458275 65444'10509038 65444:16433228 [118,109,91]p118 [12,35,55]p12 2019-12-18 06:39:12.263870 2019-12-14 07:51:37.046587
6.e3 826205 0 2478615 0 0 0 0 3056 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.444298 65444'10579374 65444:16499289 [116,90,109]p116 [16,40,62]p16 2019-12-18 05:18:42.368150 2019-12-14 08:18:05.110985
6.1f4 825482 0 2476446 0 0 0 0 3030 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.447672 65444'10540620 65444:16469557 [119,90,104]p119 [22,54,92]p22 2019-12-17 07:54:32.545703 2019-12-17 07:54:32.545703
- NOTE: Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilisation. See http://docs.ceph.com/docs/master/dev/placement-group/#omap-statistics for further details.
We used the "upmap" trick of Rene Diepstraten from ticket #39555 (set an upmap mapping the problematic PG to a different OSD on the same host) and removed it later.
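For anyone wanting to reproduce that workaround, the upmap commands look roughly like this. This is a dry-run sketch that only prints the commands, since they need a live cluster; PG 6.41 and source OSD 27 are taken from the listing above, while destination OSD 30 is purely illustrative:

```shell
# Dry-run sketch of the upmap workaround: print the commands instead of
# executing them. The destination OSD id (30) is an example only.
pg='6.41'; src_osd=27; dst_osd=30
printf 'ceph osd pg-upmap-items %s %d %d\n' "$pg" "$src_osd" "$dst_osd"
printf 'ceph osd rm-pg-upmap-items %s\n' "$pg"
```

Note that upmap requires all clients to be Luminous or newer (`ceph osd set-require-min-compat-client luminous`).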
Running Ceph Mimic 13.2.8
This bug manifests itself in different Ceph versions under different circumstances. Is there any info we (users) can provide when this occurs to gather more data so this bug can get properly addressed?
Updated by David Zafman over 4 years ago
A backport to Mimic of the fix can be found here:
https://github.com/ceph/ceph/pull/32361
Or, if you are able to build from source, apply https://github.com/ceph/ceph/pull/32361/commits/c71bfebf37b61bde0ffa1c8b816a07a39ab5add8 on top of v13.2.8.
Updated by Stefan Kooman over 4 years ago
Hi David:
Good to know the bug is indeed fixed ... too bad it didn't make it into 13.2.8. Anyway ... building patched packages now ...
Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".