Bug #8943

"ceph df" cannot show pool available space correctly

Added by Xiaoxi Chen over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Community (dev)
Tags:
Backport:
firefly
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Currently, when a user has two pools with different rulesets rooted at different CRUSH subtrees, the pools effectively use different OSDs, yet the ceph df output shows the same MAX AVAIL value for both pools.
See the example below:
I have 2 OSDs assigned to different hosts, and I create 2 pools (pool1 and pool2) that use different crush rules, so pool1 maps to osd.0 and pool2 maps to osd.1. Each OSD is backed by a 1TB disk. I would expect ceph df to report a separate MAX AVAIL value for each of the two pools, but it does not. This bug was identified when I wanted to use ceph df to see what percentage of my cache-tiering pool was used.

root@ChendiServer01:~/xiaoxi_ceph/src# ceph df
GLOBAL:
    SIZE   AVAIL  RAW USED  %RAW USED
    1862G  1862G  73520k    0
POOLS:
    NAME   ID  USED  %USED  MAX AVAIL  OBJECTS
    pool1  6   0     0      1862G      0
    pool2  7   0     0      1862G      0

root@ChendiServer01:~/xiaoxi_ceph/src# ceph osd tree
# id    weight  type name                       up/down  reweight
-1      2       root default
-3      2           rack unknownrack
-2      1               host ChendiServer01
0       1                   osd.0               up       1
-4      1               host ChendiServer012
1       1                   osd.1               up       1

root@ChendiServer01:~/xiaoxi_ceph/src# ceph osd pool get pool1 crush_ruleset
crush_ruleset: 2
root@ChendiServer01:~/xiaoxi_ceph/src# ceph osd pool get pool2 crush_ruleset
crush_ruleset: 3

root@ChendiServer01:~/xiaoxi_ceph/src# ceph osd crush rule dump rule1
{ "rule_id": 2,
  "rule_name": "rule1",
  "ruleset": 2,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
        { "op": "take",
          "item": -2,
          "item_name": "ChendiServer01"},
        { "op": "choose_firstn",
          "num": 0,
          "type": "osd"},
        { "op": "emit"}]}
root@ChendiServer01:~/xiaoxi_ceph/src# ceph osd crush rule dump rule2
{ "rule_id": 3,
  "rule_name": "rule2",
  "ruleset": 3,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
        { "op": "take",
          "item": -4,
          "item_name": "ChendiServer012"},
        { "op": "choose_firstn",
          "num": 0,
          "type": "osd"},
        { "op": "emit"}]}

The bug is caused by two issues:
1. CrushWrapper::get_rule_weight_map should return a map of OSD id to weight, but in most cases (a hierarchy of depth >= 2 above the OSDs, e.g. root-rack-host-osd) it does not descend to the OSD leaves and just returns weights at the host/rack level.

2. PGMonitor::get_rule_avail should use each OSD's own available-space statistic to calculate the projection, but the current code uses osd_sum.kb_avail (the cluster-wide total); see the sketch after this list.
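For reference, here is a minimal, self-contained sketch of the intended calculation, not the actual Ceph code: once the rule's weight map goes all the way down to the OSDs, the per-rule MAX AVAIL is limited by the fullest OSD, i.e. the minimum over the rule's OSDs of that OSD's own free space divided by its weight. The names rule_avail_kb, weight_map and osd_kb_avail below are hypothetical.

#include <cstdint>
#include <map>

// weight_map:   osd id -> normalized CRUSH weight within the rule's subtree
//               (per-OSD, not per-host or per-rack).
// osd_kb_avail: osd id -> free space of that OSD in KB.
//
// For each OSD, kb_avail / weight estimates how much data pools using this
// rule could hold before that OSD fills up; the minimum over all OSDs is the
// usable space to report as MAX AVAIL.
int64_t rule_avail_kb(const std::map<int, double>& weight_map,
                      const std::map<int, int64_t>& osd_kb_avail)
{
  int64_t min_proj = -1;
  for (const auto& [osd, weight] : weight_map) {
    if (weight <= 0)
      continue;                       // OSD carries no weight in this rule
    auto it = osd_kb_avail.find(osd);
    if (it == osd_kb_avail.end())
      continue;                       // no stats reported for this OSD yet
    int64_t proj = (int64_t)((double)it->second / weight);
    if (min_proj < 0 || proj < min_proj)
      min_proj = proj;
  }
  return min_proj;                    // -1 if no OSDs matched the rule
}

With the setup above, pool1's rule would only see osd.0 and pool2's rule only osd.1, so each pool would get a MAX AVAIL derived from its own 1TB disk rather than the cluster-wide total.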

History

#1 Updated by Sage Weil over 9 years ago

  • Status changed from New to Pending Backport
  • Priority changed from Normal to High
  • Source changed from other to Community (dev)
  • Backport set to firefly

#2 Updated by Sage Weil over 9 years ago

  • Status changed from Pending Backport to Resolved
