Bug #62248

open

upstream Quincy incorrectly reporting pgs backfill_toofull

Added by Tim Wilkinson 10 months ago. Updated 9 months ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Our fault-insertion testing with 17.2.6 is showing a cluster status reporting PGs backfill_toofull, but when we dig into the PGs, they don't appear full at all.

health: HEALTH_WARN
noscrub,nodeep-scrub flag(s) set
Low space hindering backfill (add storage if this doesn't resolve itself): 12 pgs backfill_toofull
Degraded data redundancy: 16006187/201134076 objects degraded (7.958%), 2498 pgs degraded, 2482 pgs undersized
# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    361 TiB  174 TiB  187 TiB   187 TiB      51.73
TOTAL  361 TiB  174 TiB  187 TiB   187 TiB      51.73

--- POOLS ---
POOL                        ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                         1     1  1.8 MiB        2  5.3 MiB      0     38 TiB
.rgw.root                    2    32  1.3 KiB        5   48 KiB      0     38 TiB
default.rgw.log              3   128   29 MiB      212   84 MiB      0     40 TiB
default.rgw.control          4   128      0 B       10      0 B      0     38 TiB
default.rgw.meta             5   128   18 KiB       47  579 KiB      0     38 TiB
default.rgw.buckets.index    6   256  2.8 GiB      220  8.0 GiB      0     40 TiB
default.rgw.buckets.data     7  4096  127 TiB   33.52M  175 TiB  60.26     83 TiB
default.rgw.buckets.non-ec   8    32   13 MiB        0   38 MiB      0     38 TiB


# ceph pg dump |grep _toofull
7.f54       8253                   0      8253          0        0  30667702272            0           0  1457      1457  active+undersized+degraded+remapped+backfill_wait+backfill_toofull  2023-07-31T05:40:30.871799+0000   3416'11018   6521:32310  [129,124,181,166,139,165]         129   [NONE,124,181,166,139,165]             124         0'0  2023-07-30T12:52:05.672341+0000              0'0  2023-07-30T12:48:01.120327+0000              0                    1  periodic scrub scheduled @ 2023-07-31T22:53:28.604484+0000                 0                0
  ...
7.731       8173                   0      8173          0        0  30332157952            0           0  1648      1648  active+undersized+degraded+remapped+backfill_wait+backfill_toofull  2023-07-31T05:40:26.339303+0000   3416'11028   6521:24442    [18,115,116,174,169,23]          18     [18,115,116,174,NONE,23]              18         0'0  2023-07-30T12:52:03.861531+0000              0'0  2023-07-30T12:44:17.866343+0000              0                    1  periodic scrub scheduled @ 2023-07-31T20:38:16.524162+0000                 0                0
dumped all


# ceph pg 7.731 query 
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+undersized+degraded+remapped+backfill_wait+backfill_toofull",
    "epoch": 6527,
    "up": [
        18,
        115,
        116,
        174,
        169,
        23
    ],
    "acting": [
        18,
        115,
        116,
        174,
        2147483647,
        23
    ],
    "backfill_targets": [
        "169(4)" 
    ],
    "acting_recovery_backfill": [
        "18(0)",
        "23(5)",
        "115(1)",
        "116(2)",
        "169(4)",
        "174(3)" 
   ...


# ceph osd df tree | grep 169
ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE   DATA      OMAP     META     AVAIL     %USE   VAR   PGS  STATUS  TYPE NAME
169    hdd    1.87949   1.00000  1.9 TiB   834 GiB   772 GiB   11 MiB  3.2 GiB   1.1 TiB  43.36  0.84   59      up          osd.169
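As a rough sanity check on the report above: even if the full ~28 GiB of PG 7.731 were backfilled onto osd.169, its projected utilization stays far below the default backfillfull_ratio of 0.90 (the 0.90 value is an assumption here; the cluster's actual ratio is not shown in this output). The numbers come from the `ceph osd df tree` and `ceph pg dump` output above:

```python
# Sanity check: would backfilling PG 7.731 onto osd.169 push it past the
# backfillfull_ratio?  Figures taken from the command output in this report;
# the 0.90 ratio is the Ceph default and assumed, not confirmed, for this cluster.
GiB = 1024 ** 3

osd_size_bytes = int(1.87949 * 1024 * GiB)  # osd.169 SIZE (~1.9 TiB)
osd_used_bytes = 834 * GiB                  # osd.169 RAW USE (834 GiB, 43.36%)
pg_bytes = 30332157952                      # PG 7.731 BYTES from `ceph pg dump`
backfillfull_ratio = 0.90                   # assumed default

projected = (osd_used_bytes + pg_bytes) / osd_size_bytes
print(f"projected use after backfill: {projected:.2%}")
print("exceeds backfillfull_ratio:", projected > backfillfull_ratio)
```

With these figures the projected use is roughly 45%, nowhere near the threshold, which is consistent with the claim that the backfill_toofull state is being reported incorrectly.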
