Bug #54347 (closed)

ceph df stats break when there is an OSD with CRUSH weight == 0

Added by Ben Crisp about 2 years ago. Updated over 1 year ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The OSD is up but marked "out":

{
  "osd": 7,
  "uuid": "eea11d98-f027-4b81-8894-32adea02fee0",
  "up": 1,
  "in": 0,
  "weight": 0,
  "primary_affinity": 1,
  "last_clean_begin": 140035,
  "last_clean_end": 215828,
  "up_from": 215829,
  "up_thru": 215824,
  "down_at": 215827,
  "lost_at": 0,
  "public_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6827",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6832",
        "nonce": 1926154309
      }
    ]
  },
  "cluster_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6960",
        "nonce": 1927154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6961",
        "nonce": 1927154309
      }
    ]
  },
  "heartbeat_back_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6879",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6883",
        "nonce": 1926154309
      }
    ]
  },
  "heartbeat_front_addrs": {
    "addrvec": [
      {
        "type": "v2",
        "addr": "172.19.208.1:6867",
        "nonce": 1926154309
      },
      {
        "type": "v1",
        "addr": "172.19.208.1:6873",
        "nonce": 1926154309
      }
    ]
  },
  "public_addr": "172.19.208.1:6832/1926154309",
  "cluster_addr": "172.19.208.1:6961/1927154309",
  "heartbeat_back_addr": "172.19.208.1:6883/1926154309",
  "heartbeat_front_addr": "172.19.208.1:6873/1926154309",
  "state": [
    "exists",
    "up" 
  ]
}
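This up-but-out state is easy to flag programmatically from the JSON dump; a minimal sketch, using an abbreviated copy of the entry above (the helper name is my own, not a Ceph API):

```python
import json

# Abbreviated copy of the OSD entry shown above.
osd_json = """
{
  "osd": 7,
  "up": 1,
  "in": 0,
  "weight": 0,
  "primary_affinity": 1
}
"""

def flag_up_but_out(osd):
    """Return True when an OSD is running ("up") but excluded
    from data placement ("out")."""
    return osd["up"] == 1 and osd["in"] == 0

osd = json.loads(osd_json)
print(flag_up_but_out(osd))  # True for osd.7 in this report
```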

OSD has CRUSH weight set to 0:

{
  "id": 7,
  "device_class": "hdd",
  "name": "osd.7",
  "type": "osd",
  "type_id": 0,
  "crush_weight": 0,
  "depth": 5,
  "pool_weights": {},
  "reweight": 0,
  "kb": 0,
  "kb_used": 0,
  "kb_used_data": 0,
  "kb_used_omap": 0,
  "kb_used_meta": 0,
  "kb_avail": 0,
  "utilization": 0,
  "var": 0,
  "pgs": 0,
  "status": "up" 
}
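The condition named in the title (an up OSD with CRUSH weight 0) can be flagged the same way; a sketch assuming input shaped like the tree entry above (the function name is hypothetical, not part of Ceph):

```python
import json

# Abbreviated copy of the tree entry shown above.
node_json = """
{
  "id": 7,
  "name": "osd.7",
  "type": "osd",
  "crush_weight": 0,
  "reweight": 0,
  "status": "up"
}
"""

def zero_weight_but_up(node):
    # An OSD with CRUSH weight 0 receives no data, yet it is still
    # "up"; per this report that combination skews the pool stats.
    return (node["type"] == "osd"
            and node["crush_weight"] == 0
            and node["status"] == "up")

node = json.loads(node_json)
print(zero_weight_but_up(node))  # True for osd.7 in this report
```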

`ceph df` looks like:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.7 PiB  925 TiB  1.8 PiB   1.8 PiB      66.21
TOTAL  2.7 PiB  925 TiB  1.8 PiB   1.8 PiB      66.21

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.7 GiB      864  8.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB    1.12G  1.6 PiB  70.66    524 TiB
test                    3   512  2.6 TiB   19.40M  3.4 TiB   0.49    524 TiB

Here STORED != USED, as expected: USED accounts for raw space, including replication or erasure-coding overhead.
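The gap between STORED and USED is the raw-space overhead. For the qxdata pool the ratio works out to roughly 1.33x, which would be consistent with, e.g., an EC profile carrying 33% overhead; the actual pool profile is not stated in the report, so that reading is an assumption:

```python
# Values taken from the first `ceph df` output above, in TiB.
PIB = 1024  # TiB per PiB

qxdata_stored = 1.2 * PIB   # STORED column
qxdata_used   = 1.6 * PIB   # USED column

overhead = qxdata_used / qxdata_stored
print(round(overhead, 2))  # ~1.33: USED carries the raw-space overhead
```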

I set the OSD "in":

$ ceph osd in 7

And now `ceph df` looks like:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.7 PiB  929 TiB  1.8 PiB   1.8 PiB      66.12
TOTAL  2.7 PiB  929 TiB  1.8 PiB   1.8 PiB      66.12

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.2 GiB      864  2.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB    1.12G  1.2 PiB  64.13    524 TiB
test                    3   512  2.3 TiB   19.40M  2.3 TiB   0.33    524 TiB

Now STORED == USED.

As a result, %USED drops, which is misleading: it suggests the pool has more free space than it actually does.
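One way to catch this regression in monitoring is to flag pools whose raw USED equals STORED exactly, since with replication or EC the raw figure should be strictly larger. A sketch; the field names follow the `pools[].stats` shape of `ceph df -f json` as I understand it, and the byte values are simplified illustrations, so treat both as assumptions:

```python
import json

# Simplified `ceph df -f json` pool stats (field names and values assumed).
df_json = """
{
  "pools": [
    {"name": "qxdata", "stats": {"stored": 1319413953331200, "bytes_used": 1319413953331200}},
    {"name": "test",   "stats": {"stored": 2528876743884,    "bytes_used": 3738302461132}}
  ]
}
"""

def suspicious_pools(df):
    """Pools where raw USED == STORED; per this report that can
    indicate a mis-accounted OSD rather than genuinely zero overhead."""
    return [p["name"] for p in df["pools"]
            if p["stats"]["bytes_used"] == p["stats"]["stored"] > 0]

df = json.loads(df_json)
print(suspicious_pools(df))  # ['qxdata']
```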

If I mark the OSD "out" again, the stats return to normal.

I have seen this same behaviour on two different Ceph clusters, one running Nautilus and one running Octopus. The cluster in this example runs Octopus 15.2.11.


Related issues (0 open, 1 closed)

Related to Ceph - Bug #57121: STORE==USED in ceph df (Resolved)
