Bug #41255

backfill_toofull seen on cluster where the most full OSD is at 1%

Added by Bryan Stillwell over 3 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
mimic, nautilus, luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After upgrading both our test and staging clusters to Nautilus (14.2.2), we've seen both of them report some PGs as backfill_toofull when marking a single OSD into the cluster.

The test cluster is currently using just 148 GiB of storage, yet it has over 100 4TB drives (it would be only ~4% full if all the data were on a single drive!). The staging cluster is using 14 TiB of storage, yet it has over 400 4TB drives.

Both clusters are a mixture of FileStore and BlueStore OSDs, and I'm wondering if that has anything to do with it. The PGs end up backfilling eventually, but they shouldn't ever go into a toofull state on either of these clusters.
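For anyone trying to reproduce this, the per-OSD utilization and the configured fullness thresholds can be double-checked from the CLI (a sketch; the ratios shown in the comment are the upstream defaults and may differ on your cluster):

```shell
# Per-OSD utilization; the %USE column shows how far each OSD is
# from the backfillfull threshold.
ceph osd df

# The fullness ratios are printed in the OSD map dump
# (defaults: full 0.95, backfillfull 0.90, nearfull 0.85).
ceph osd dump | grep -E 'full_ratio|backfillfull_ratio|nearfull_ratio'
```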


Related issues

Related to RADOS - Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR) Resolved
Copied to RADOS - Backport #41582: luminous: backfill_toofull seen on cluster where the most full OSD is at 1% Rejected
Copied to RADOS - Backport #41583: nautilus: backfill_toofull seen on cluster where the most full OSD is at 1% Resolved
Copied to RADOS - Backport #41584: mimic: backfill_toofull seen on cluster where the most full OSD is at 1% Resolved

History

#1 Updated by Bryan Stillwell over 3 years ago

Here's the output from ceph status:

# ceph -s
  cluster:
    id:     XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    health: HEALTH_ERR
            Degraded data redundancy (low space): 1 pg backfill_toofull

  services:
    mon: 3 daemons, quorum a1cephmon002,a1cephmon003,a1cephmon004 (age 21h)
    mgr: a1cephmon002(active, since 21h), standbys: a1cephmon003, a1cephmon004
    mds: cephfs:2 {0=a1cephmon002=up:active,1=a1cephmon003=up:active} 1 up:standby
    osd: 143 osds: 142 up, 142 in; 106 remapped pgs
    rgw: 11 daemons active (radosgw.a1cephrgw008, radosgw.a1cephrgw009, radosgw.a1cephrgw010, radosgw.a1cephrgw011, radosgw.a1tcephrgw002, radosgw.a1tcephrgw003, radosgw.a1tcephrgw004, radosgw.a1tcephrgw005, radosgw.a1tcephrgw006, radosgw.a1tcephrgw007, radosgw.a1tcephrgw008)

  data:
    pools:   19 pools, 5264 pgs
    objects: 1.45M objects, 148 GiB
    usage:   658 GiB used, 436 TiB / 437 TiB avail
    pgs:     44484/4351770 objects misplaced (1.022%)
             5158 active+clean
             104  active+remapped+backfill_wait
             1    active+remapped+backfilling
             1    active+remapped+backfill_wait+backfill_toofull

  io:
    client:   19 MiB/s rd, 13 MiB/s wr, 431 op/s rd, 509 op/s wr

#2 Updated by Vikhyat Umrao over 3 years ago

  • Related to Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR) added

#3 Updated by David Zafman over 3 years ago

  • Assignee set to David Zafman

#4 Updated by David Zafman over 3 years ago

  • Status changed from New to In Progress

#5 Updated by David Zafman over 3 years ago

  • Pull request ID set to 29850

#6 Updated by David Zafman over 3 years ago

  • Pull request ID changed from 29850 to 29857

#7 Updated by David Zafman over 3 years ago

  • Backport set to luminous, mimic, nautilus

#8 Updated by Kefu Chai over 3 years ago

  • Status changed from In Progress to Pending Backport

#9 Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #41582: luminous: backfill_toofull seen on cluster where the most full OSD is at 1% added

#10 Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #41583: nautilus: backfill_toofull seen on cluster where the most full OSD is at 1% added

#11 Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #41584: mimic: backfill_toofull seen on cluster where the most full OSD is at 1% added

#12 Updated by Bryan Stillwell over 3 years ago

We didn't see this problem on any of our clusters with the 12.2.12 release, so if a backport to Luminous is needed, maybe this isn't the right fix? Is there a way we could test the fix in Nautilus on our test cluster?

#13 Updated by David Zafman over 3 years ago

  • Backport changed from luminous, mimic, nautilus to mimic, nautilus

Luminous doesn't have the issue!

#14 Updated by Nathan Cutler over 3 years ago

  • Backport changed from mimic, nautilus to mimic, nautilus, luminous

Addressing backport-create-issue script complaint:

ERROR:root:https://tracker.ceph.com/issues/41255 has more backport issues
(luminous,mimic,nautilus) than expected (mimic,nautilus)

Alternatively, if you really want to have "Backport: mimic, nautilus" I can delete the luminous backport tracker issue, but it has comments in it, so I'm reluctant to do that.

(In the past leaving the Backport field unchanged and setting unwanted backport tracker issue(s) to "Rejected" has worked fine, so I see no reason why it shouldn't work here, too.)

#15 Updated by Florian Haas over 3 years ago

  • Affected Versions v14.2.4 added

Adding 14.2.4 as an affected version, as I am seeing the same issue on a 14.2.4 cluster that has recently had 9 OSDs added:

$ ceph pg ls backfill_toofull
PG    OBJECTS DEGRADED MISPLACED UNFOUND BYTES     OMAP_BYTES* OMAP_KEYS* LOG  STATE                                          SINCE VERSION      REPORTED     UP        ACTING  SCRUB_STAMP                DEEP_SCRUB_STAMP           
5.ea     3422        0      6844       0    804144           0          0 2088 active+remapped+backfill_wait+backfill_toofull   14h  10049'23516  10098:58621  [9,10]p9 [5,7]p5 2019-10-08 09:35:18.312555 2019-10-03 13:33:19.653331 
6.11c       1        0         2       0         0           0          0 2818 active+remapped+backfill_wait+backfill_toofull   12h 10089'100705 10098:182853 [11,9]p11 [7,8]p7 2019-10-09 13:08:50.460959 2019-10-04 12:31:27.445849 
11.36      23        0        46       0         0       19985         88 1871 active+remapped+backfill_wait+backfill_toofull   14h   7735'17002  10098:75007 [10,9]p10 [5,7]p5 2019-10-08 10:39:51.692146 2019-10-08 10:39:51.692146 
11.e6      30        0        60       0         0     1520615       6510 2285 active+remapped+backfill_wait+backfill_toofull   15h  10045'50499 10098:130940  [9,14]p9 [1,2]p1 2019-10-08 10:26:08.609821 2019-10-08 10:26:08.609821 
13.61     792        0      1584       0 102544305           0          0 1307 active+remapped+backfill_wait+backfill_toofull   13h    7300'3470  10098:16818 [10,9]p10 [7,8]p7 2019-10-09 12:20:43.443785 2019-10-09 12:20:43.443785 
13.97     841        0      1682       0 129795495           0          0 1533 active+remapped+backfill_wait+backfill_toofull   14h    7263'7775  10098:30896 [13,9]p13 [1,2]p1 2019-10-09 02:14:44.016217 2019-10-02 18:00:26.111812 
13.b0     807        0      1614       0 130275254           0          0 3030 active+remapped+backfill_wait+backfill_toofull   15h    7263'3463  10098:18411  [9,16]p9 [1,2]p1 2019-10-08 13:41:33.267561 2019-10-01 18:24:57.382308

What those have in common is they all map to OSD 9, which currently is only 20.64% full.

#16 Updated by Stefan Kooman about 3 years ago

We added a CRUSH rule (replicated_nvme) and set it on our CephFS metadata pool (with 1.2 billion objects), and hit this issue for 4 PGs:

$ ceph pg ls backfill_toofull
PG    OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG  STATE                                          STATE_STAMP                VERSION        REPORTED       UP               ACTING        SCRUB_STAMP                DEEP_SCRUB_STAMP
6.41   824816        0   2474448       0     0           0          0 3005 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.794511 65444'10436893 65444:16341088 [107,97,113]p107 [27,59,64]p27 2019-12-17 08:16:34.458092 2019-12-17 08:16:34.458092
6.8c   826851        0   2480553       0     0           0          0 3093 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.458275 65444'10509038 65444:16433228 [118,109,91]p118 [12,35,55]p12 2019-12-18 06:39:12.263870 2019-12-14 07:51:37.046587
6.e3   826205        0   2478615       0     0           0          0 3056 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.444298 65444'10579374 65444:16499289 [116,90,109]p116 [16,40,62]p16 2019-12-18 05:18:42.368150 2019-12-14 08:18:05.110985
6.1f4  825482        0   2476446       0     0           0          0 3030 active+remapped+backfill_wait+backfill_toofull 2019-12-19 08:44:39.447672 65444'10540620 65444:16469557 [119,90,104]p119 [22,54,92]p22 2019-12-17 07:54:32.545703 2019-12-17 07:54:32.545703

We used the "upmap" trick from Rene Diepstraten in ticket #39555 (set an upmap to place the problematic PG on a different OSD on the same host) and removed it later.
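For reference, that workaround looks roughly like this (the PG id and OSD ids below are hypothetical; pick a replacement OSD on the same host that CRUSH would accept):

```shell
# Temporarily remap the stuck PG's mapping from OSD 27 to OSD 30,
# which lets the backfill proceed past the spurious toofull check:
ceph osd pg-upmap-items 6.41 27 30

# After backfill completes, drop the exception again:
ceph osd rm-pg-upmap-items 6.41
```

Note that pg-upmap requires all clients to be Luminous or newer (`ceph osd set-require-min-compat-client luminous`).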

Running Ceph Mimic 13.2.8

This bug manifests itself in different Ceph versions under different circumstances. Is there any info we (users) can provide when this occurs to gather more data so this bug can get properly addressed?

#17 Updated by David Zafman about 3 years ago

A backport to Mimic of the fix can be found here:
https://github.com/ceph/ceph/pull/32361

Or, if you can build from source, apply https://github.com/ceph/ceph/pull/32361/commits/c71bfebf37b61bde0ffa1c8b816a07a39ab5add8 on top of v13.2.8.
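A sketch of what that looks like (the branch name is arbitrary; the usual Ceph build steps follow from there):

```shell
# Check out the v13.2.8 release tag and cherry-pick the fix commit
# from the PR linked above:
git clone https://github.com/ceph/ceph.git
cd ceph
git checkout -b mimic-41255-fix v13.2.8
git cherry-pick c71bfebf37b61bde0ffa1c8b816a07a39ab5add8
```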

#18 Updated by Stefan Kooman about 3 years ago

Hi David:

Good to know the bug is indeed fixed ... too bad it didn't make it into 13.2.8. Anyway ... building patched packages now ...

#19 Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
