Bug #54347 (closed)

ceph df stats break when there is an OSD with CRUSH weight == 0

Added by Ben Crisp about 2 years ago. Updated over 1 year ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

OSD is out:

{
  "osd": 7,
  "uuid": "eea11d98-f027-4b81-8894-32adea02fee0",
  "up": 1,
  "in": 0,
  "weight": 0,
  "primary_affinity": 1,
  "last_clean_begin": 140035,
  "last_clean_end": 215828,
  "up_from": 215829,
  "up_thru": 215824,
  "down_at": 215827,
  "lost_at": 0,
  "public_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6827",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6832",
        "nonce": 1926154309
      }
    ]
  },
  "cluster_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6960",
        "nonce": 1927154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6961",
        "nonce": 1927154309
      }
    ]
  },
  "heartbeat_back_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6879",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6883",
        "nonce": 1926154309
      }
    ]
  },
  "heartbeat_front_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6867",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6873",
        "nonce": 1926154309
      }
    ]
  },
  "public_addr": "172.19.208.1:6832/1926154309",
  "cluster_addr": "172.19.208.1:6961/1927154309",
  "heartbeat_back_addr": "172.19.208.1:6883/1926154309",
  "heartbeat_front_addr": "172.19.208.1:6873/1926154309",
  "state": [
    "exists",
    "up" 
  ]
}

OSD has CRUSH weight set to 0:

{
  "id": 7,
  "device_class": "hdd",
  "name": "osd.7",
  "type": "osd",
  "type_id": 0,
  "crush_weight": 0,
  "depth": 5,
  "pool_weights": {},
  "reweight": 0,
  "kb": 0,
  "kb_used": 0,
  "kb_used_data": 0,
  "kb_used_omap": 0,
  "kb_used_meta": 0,
  "kb_avail": 0,
  "utilization": 0,
  "var": 0,
  "pgs": 0,
  "status": "up" 
}

`ceph df` looks like:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.7 PiB  925 TiB  1.8 PiB   1.8 PiB      66.21
TOTAL  2.7 PiB  925 TiB  1.8 PiB   1.8 PiB      66.21

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.7 GiB      864  8.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB    1.12G  1.6 PiB  70.66    524 TiB
test                    3   512  2.6 TiB   19.40M  3.4 TiB   0.49    524 TiB

STORED != USED

I set the OSD "in":

$ ceph osd in 7

And now `ceph df` looks like:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.7 PiB  929 TiB  1.8 PiB   1.8 PiB      66.12
TOTAL  2.7 PiB  929 TiB  1.8 PiB   1.8 PiB      66.12

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.2 GiB      864  2.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB    1.12G  1.2 PiB  64.13    524 TiB
test                    3   512  2.3 TiB   19.40M  2.3 TiB   0.33    524 TiB

Now, STORED == USED.

As a result, %USED drops. This is quite misleading, as it suggests the pool has more free space than it actually does.

If I mark the OSD "out" again, the output returns to normal.

I have seen this same behaviour in two different Ceph clusters, one running Nautilus and one running Octopus. The cluster in this example is running Octopus 15.2.11.
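
For anyone who wants to check whether a cluster is in this state, the trigger appears to be an OSD that is up while its CRUSH weight is 0. Below is a minimal sketch of such a check, assuming the node layout of `ceph osd df tree -f json` shown above (the "crush_weight", "reweight" and "status" fields come from that output); the script itself is only an illustration, not part of Ceph:

#!/usr/bin/env python3
# Illustrative only: list OSDs that are up but carry a CRUSH weight of 0,
# i.e. the condition that triggers the misleading `ceph df` numbers above.
import json
import subprocess

out = subprocess.check_output(["ceph", "osd", "df", "tree", "-f", "json"])
nodes = json.loads(out)["nodes"]

for n in nodes:
    if n.get("type") == "osd" and n.get("status") == "up" and n.get("crush_weight", 0) == 0:
        print(f'{n["name"]}: up with crush_weight=0 (reweight={n["reweight"]})')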


Related issues: 1 (0 open, 1 closed)

Related to Ceph - Bug #57121: STORE==USED in ceph df (Resolved)

Actions #1

Updated by Renaud Miel about 2 years ago

Same issue observed with:
"ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)"

but for an unknown and apparently different reason: all osds have crush_weight > 0.

In ceph df output:
--- RAW STORAGE --- section: USED RAW USED
--- POOLS --- section: STORED USED

Actions #2

Updated by Renaud Miel about 2 years ago

The equals signs were removed by Redmine in the previous comment; here it is again:

Same issue observed with:
"ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)" 

but for an unknown and apparently different reason: all osds have crush_weight > 0.

In ceph df output:
--- RAW STORAGE --- section: USED == RAW USED
--- POOLS --- section: STORED == USED

Actions #3

Updated by Renaud Miel about 2 years ago

Additional notes:
  • As usual, the problem occurs only in our production environment: 5 osd servers, 2 mds servers, Ubuntu 20.04.4 LTS, ceph 16.2.7, 1 cephfs, 819 TiB hdd (cephfs data pool), 29 TiB ssd (cephfs metadata pool), 183 osds, all bare-metal servers.
  • The problem does NOT occur in our test environment: 5 osd servers, 2 mds servers, Ubuntu 20.04.4 LTS, ceph 16.2.7, 1 cephfs, 50 GiB hdd, 5 osds, all VM servers.

Actions #4

Updated by Snow Si about 2 years ago

As "https://tracker.ceph.com/issues/48385" says, "STORED == USED" appears because "osd_sum.num_osds != osd_sum.num_per_pool_osds":

My best guess is it's related to this in PGMap.h:
  bool use_per_pool_stats() const {
    return osd_sum.num_osds == osd_sum.num_per_pool_osds;
  }

You can use "ceph df -f json" to get "osd_sum.num_osds" and "osd_sum.num_per_pool_osds".
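
If it helps, here is a small sketch of that check from the command line, assuming the counters are exposed under the top-level "stats" object of `ceph df -f json` as described above; the script is only an illustration:

#!/usr/bin/env python3
# Illustrative only: compare num_osds with num_per_pool_osds as reported by
# `ceph df -f json`. When they differ, use_per_pool_stats() returns false and
# `ceph df` falls back to the legacy accounting where STORED == USED.
import json
import subprocess

stats = json.loads(subprocess.check_output(["ceph", "df", "-f", "json"]))["stats"]

if stats["num_osds"] != stats["num_per_pool_osds"]:
    print(f'per-pool stats disabled: num_osds={stats["num_osds"]}, '
          f'num_per_pool_osds={stats["num_per_pool_osds"]}')
else:
    print("per-pool stats in use (num_osds == num_per_pool_osds)")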

Actions #6

Updated by Renaud Miel almost 2 years ago

Thank you for your feedback, Snow Si: this was helpful to work around the "STORED == USED" issue in ceph df output.

You were right: we had osd_sum.num_osds != osd_sum.num_per_pool_osds.
But this is not our doing: ceph itself computed and found that osd_sum.num_osds != osd_sum.num_per_pool_osds.

It looks like this was because 1 osd of the cephfs metadata pool (replicated 3, 2.8 GiB stored, 880k objects, 32 pgs backed by 33 ssds) was not storing any pg even though it was in and up.

Stopping this osd allowed us to work around the issue: we now have the expected "STORED != USED" in ceph df output.
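
For reference, an osd in that state (up and in, but holding no pgs) can be spotted with a check similar to the earlier sketch, using the "status", "reweight" and "pgs" fields of `ceph osd df -f json` shown in the description; again, this is only an illustrative sketch:

#!/usr/bin/env python3
# Illustrative only: list OSDs that are up and weighted in but hold no PGs,
# the situation described above.
import json
import subprocess

nodes = json.loads(subprocess.check_output(["ceph", "osd", "df", "-f", "json"]))["nodes"]

for n in nodes:
    if n.get("status") == "up" and n.get("reweight", 0) > 0 and n.get("pgs", 0) == 0:
        print(f'{n["name"]} is up and in but holds no PGs')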

This raises a new question: do you have any idea why ceph failed to place data on this specific osd?
Note: we have 5 ceph servers; 3 of them each have 11 ssds and 30 hdds, and 2 of them have no ssds, only hdds.

Actions #7

Updated by Igor Fedotov over 1 year ago

  • Status changed from New to Duplicate

Actions #8

Updated by Igor Fedotov over 1 year ago

  • Related to Bug #57121: STORE==USED in ceph df added