Bug #54347 (closed): ceph df stats break when there is an OSD with CRUSH weight == 0
Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor
Description
The OSD is marked out:
{
    "osd": 7,
    "uuid": "eea11d98-f027-4b81-8894-32adea02fee0",
    "up": 1,
    "in": 0,
    "weight": 0,
    "primary_affinity": 1,
    "last_clean_begin": 140035,
    "last_clean_end": 215828,
    "up_from": 215829,
    "up_thru": 215824,
    "down_at": 215827,
    "lost_at": 0,
    "public_addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "172.19.208.1:6827",
                "nonce": 1926154309
            },
            {
                "type": "v1",
                "addr": "172.19.208.1:6832",
                "nonce": 1926154309
            }
        ]
    },
    "cluster_addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "172.19.208.1:6960",
                "nonce": 1927154309
            },
            {
                "type": "v1",
                "addr": "172.19.208.1:6961",
                "nonce": 1927154309
            }
        ]
    },
    "heartbeat_back_addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "172.19.208.1:6879",
                "nonce": 1926154309
            },
            {
                "type": "v1",
                "addr": "172.19.208.1:6883",
                "nonce": 1926154309
            }
        ]
    },
    "heartbeat_front_addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "172.19.208.1:6867",
                "nonce": 1926154309
            },
            {
                "type": "v1",
                "addr": "172.19.208.1:6873",
                "nonce": 1926154309
            }
        ]
    },
    "public_addr": "172.19.208.1:6832/1926154309",
    "cluster_addr": "172.19.208.1:6961/1927154309",
    "heartbeat_back_addr": "172.19.208.1:6883/1926154309",
    "heartbeat_front_addr": "172.19.208.1:6873/1926154309",
    "state": [
        "exists",
        "up"
    ]
}
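As a side note, this up-but-out state can be spotted programmatically. A minimal sketch, assuming the `ceph osd dump --format json` schema shown above (the script itself is illustrative, not part of Ceph):

#!/usr/bin/env python3
# Illustrative sketch: list OSDs that are up but marked out (osdmap weight 0),
# the state osd.7 is in above. Field names follow the dump in this report.
import json
import subprocess

dump = json.loads(
    subprocess.check_output(["ceph", "osd", "dump", "--format", "json"])
)
for osd in dump["osds"]:
    if osd["up"] == 1 and osd["in"] == 0:
        print(f"osd.{osd['osd']} is up but out (weight={osd['weight']})")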
The OSD has its CRUSH weight set to 0:
{
    "id": 7,
    "device_class": "hdd",
    "name": "osd.7",
    "type": "osd",
    "type_id": 0,
    "crush_weight": 0,
    "depth": 5,
    "pool_weights": {},
    "reweight": 0,
    "kb": 0,
    "kb_used": 0,
    "kb_used_data": 0,
    "kb_used_omap": 0,
    "kb_used_meta": 0,
    "kb_avail": 0,
    "utilization": 0,
    "var": 0,
    "pgs": 0,
    "status": "up"
}
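Similarly, OSDs with a zero CRUSH weight can be listed from `ceph osd df --format json`. A minimal sketch, assuming the "nodes" schema matches the entry shown above:

#!/usr/bin/env python3
# Illustrative sketch: flag OSDs whose CRUSH weight is 0, the trigger for
# this bug. Assumes `ceph osd df --format json` returns a "nodes" list with
# entries shaped like the one above.
import json
import subprocess

df = json.loads(
    subprocess.check_output(["ceph", "osd", "df", "--format", "json"])
)
for node in df.get("nodes", []):
    if node["type"] == "osd" and node["crush_weight"] == 0:
        print(f"{node['name']}: crush_weight=0, status={node['status']}")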
`ceph df` looks like:
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd      2.7 PiB  925 TiB  1.8 PiB  1.8 PiB       66.21
TOTAL    2.7 PiB  925 TiB  1.8 PiB  1.8 PiB       66.21

--- POOLS ---
POOL                   ID   PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.7 GiB     864   8.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB   1.12G   1.6 PiB  70.66    524 TiB
test                    3   512  2.6 TiB  19.40M   3.4 TiB   0.49    524 TiB
Note that STORED != USED, as expected: USED accounts for the raw replication/EC overhead on top of what clients stored.
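The broken state can also be detected from `ceph df --format json`. A minimal sketch, assuming the per-pool "stats" fields ("stored", "bytes_used") present in Nautilus/Octopus:

#!/usr/bin/env python3
# Illustrative sketch: detect the symptom described in this report, i.e.
# pools whose raw USED does not exceed STORED (no replication/EC overhead
# accounted). Assumes the `ceph df --format json` schema with per-pool
# "stats" containing "stored" and "bytes_used".
import json
import subprocess

df = json.loads(
    subprocess.check_output(["ceph", "df", "--format", "json"])
)
for pool in df["pools"]:
    stats = pool["stats"]
    if stats["stored"] and stats["bytes_used"] <= stats["stored"]:
        print(f"pool {pool['name']}: USED ({stats['bytes_used']}) <= "
              f"STORED ({stats['stored']}) -- stats likely broken")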
I set the OSD "in":
$ ceph osd in 7
And now `ceph df` looks like:
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd      2.7 PiB  929 TiB  1.8 PiB  1.8 PiB       66.12
TOTAL    2.7 PiB  929 TiB  1.8 PiB  1.8 PiB       66.12

--- POOLS ---
POOL                   ID   PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  2.2 GiB     864   2.2 GiB      0    233 TiB
qxdata                  2  8192  1.2 PiB   1.12G   1.2 PiB  64.13    524 TiB
test                    3   512  2.3 TiB  19.40M   2.3 TiB   0.33    524 TiB
Now STORED == USED, and %USED drops accordingly (qxdata goes from 70.66 to 64.13). This is quite misleading: it looks as if the raw replication/EC overhead is no longer applied to the pool stats, so the pools appear to have more free space than they really do.
If I mark the OSD "out" again, the stats return to normal.
I have seen this same behaviour on two different Ceph clusters, one running Nautilus and one running Octopus. The cluster in this example runs Octopus 15.2.11.
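For monitoring purposes, the two conditions can be combined into one read-only check: warn whenever an OSD is "in" while its CRUSH weight is 0, since that is the state in which `ceph df` pool stats go wrong here. A sketch under the same schema assumptions as above (illustrative, not a Ceph tool):

#!/usr/bin/env python3
# Illustrative sketch: warn when any OSD is "in" while its CRUSH weight is 0,
# the combination that breaks `ceph df` pool stats in this report.
import json
import subprocess


def ceph_json(*args):
    """Run a ceph CLI subcommand and parse its JSON output."""
    return json.loads(
        subprocess.check_output(["ceph", *args, "--format", "json"])
    )


in_osds = {o["osd"] for o in ceph_json("osd", "dump")["osds"] if o["in"] == 1}
for node in ceph_json("osd", "df")["nodes"]:
    if (node["type"] == "osd" and node["crush_weight"] == 0
            and node["id"] in in_osds):
        print(f"{node['name']} is 'in' with crush_weight=0; "
              "`ceph df` pool stats may be wrong")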