Bug #56650


ceph df reports invalid MAX AVAIL value for stretch mode crush rule

Added by Prashant D almost 2 years ago. Updated 9 months ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If we define a crush rule for a stretch mode cluster with multiple take steps, then MAX AVAIL for the pools associated with that crush rule reports an available size equal to the available space of a single datacenter only.

Consider a crush rule stretch_rule defined as per the https://docs.ceph.com/en/latest/rados/operations/stretch-mode/ documentation:

rule stretch_rule {
        id 1
        type replicated
        step take DC1
        step chooseleaf firstn 2 type host
        step emit
        step take DC2
        step chooseleaf firstn 2 type host
        step emit
}

and another crush rule stretch_replicated_rule with a similar placement strategy:

rule stretch_replicated_rule {
        id 2
        type replicated
        step take default
        step choose firstn 0 type datacenter
        step chooseleaf firstn 2 type host
        step emit
}

then "MAX AVAIL" for pools from stretch_rule show incorrect value whereas pools from stretch_replicated_rule shows correct value.

Because of the way stretch_rule is written (two separate take ... emit blocks), PGMap::get_rule_avail considers only one datacenter's available size rather than the combined available size of both datacenters.
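
For illustration, the following standalone C++ sketch (NOT the actual Ceph code; the OSD counts, capacities and the full-ratio handling are made-up assumptions) models the suspected behaviour: if the weight map that PGMap::get_rule_avail consumes is normalized per take step, every OSD in a two-take rule ends up with twice the relative weight it would get from a single take over the whole tree, so the projected MAX AVAIL comes out at roughly half.

// Standalone sketch (NOT the actual Ceph code) of the suspected behaviour.
// OSD counts, capacities and the full-ratio handling are made-up assumptions.
#include <iostream>
#include <map>
#include <vector>

static const double OSD_AVAIL_GIB = 120.0;  // hypothetical free space per OSD
static const double FULL_RATIO    = 0.95;   // treat 95% of that as usable
static const int    POOL_SIZE     = 4;      // replicated size 4, as in the pools below

// Build a weight map where every OSD's weight is normalized against the sum
// of weights under *its own* take step, then merged across take steps.
std::map<int, double> weight_map(const std::vector<std::vector<int>>& takes)
{
  std::map<int, double> wm;
  for (const auto& take : takes) {
    double sum = static_cast<double>(take.size());  // equal CRUSH weights of 1.0
    for (int osd : take)
      wm[osd] += 1.0 / sum;
  }
  return wm;
}

// Project the rule's availability the way PGMap::get_rule_avail conceptually
// does: min over OSDs of (usable avail / weight), divided by the pool size.
double projected_max_avail(const std::map<int, double>& wm)
{
  double min_proj = -1.0;
  for (const auto& entry : wm) {
    double proj = (OSD_AVAIL_GIB * FULL_RATIO) / entry.second;
    if (min_proj < 0.0 || proj < min_proj)
      min_proj = proj;
  }
  return min_proj / POOL_SIZE;
}

int main()
{
  // One take over the whole tree (like stretch_replicated_rule):
  // all 8 OSDs in one take -> each normalized weight is 1/8.
  auto single = weight_map({{0, 1, 2, 3, 4, 5, 6, 7}});
  // Two takes, one per datacenter (like stretch_rule):
  // each take is normalized on its own -> each weight is 1/4, i.e. doubled.
  auto two = weight_map({{0, 1, 2, 3}, {4, 5, 6, 7}});

  std::cout << "single take: MAX AVAIL ~ " << projected_max_avail(single) << " GiB\n";
  std::cout << "two takes:   MAX AVAIL ~ " << projected_max_avail(two) << " GiB\n";
  return 0;
}

Compiled with g++ -std=c++17, this prints ~228 GiB for the single-take layout and ~114 GiB for the two-take layout, the same 2x discrepancy visible in the ceph df output below.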

More details:

$ ceph osd crush rule ls
replicated_rule
stretch_rule
stretch_replicated_rule

$ ceph osd crush rule dump stretch_rule
{
    "rule_id": 1,
    "rule_name": "stretch_rule",
    "type": 1,
    "steps": [
        {
            "op": "take",
            "item": -5,
            "item_name": "DC1" 
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host" 
        },
        {
            "op": "emit" 
        },
        {
            "op": "take",
            "item": -6,
            "item_name": "DC2" 
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host" 
        },
        {
            "op": "emit" 
        }
    ]
}

$ ceph osd crush rule dump stretch_replicated_rule
{
    "rule_id": 2,
    "rule_name": "stretch_replicated_rule",
    "type": 1,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default" 
        },
        {
            "op": "choose_firstn",
            "num": 0,
            "type": "datacenter" 
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host" 
        },
        {
            "op": "emit" 
        }
    ]
}

$ ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 19 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 88 lfor 0/0/62 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 64 lfor 0/0/62 flags hashpspool stripe_width 0 application cephfs
pool 4 'rbdpool' replicated size 4 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 126 flags hashpspool stripe_width 0
pool 5 'rbdtest' replicated size 4 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 139 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 6 'stretched_rbdpool' replicated size 4 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 130 flags hashpspool stripe_width 0
pool 7 'stretched_rbdtest' replicated size 4 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 143 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 8 'stretched_replicated_rbdpool' replicated size 4 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 134 flags hashpspool stripe_width 0
pool 9 'stretched_replicated_rbdtest' replicated size 4 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 147 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    1.2 TiB  960 GiB  252 GiB   252 GiB      20.81
TOTAL  1.2 TiB  960 GiB  252 GiB   252 GiB      20.81

--- POOLS ---
POOL                          ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                           1    1  1.5 MiB        2  4.5 MiB      0    289 GiB
cephfs.a.meta                  2   16  2.3 KiB       22   96 KiB      0    289 GiB
cephfs.a.data                  3   32      0 B        0      0 B      0    289 GiB
rbdpool                        4   32      0 B        0      0 B      0    216 GiB
rbdtest                        5   32   20 GiB    5.14k   80 GiB   8.46    216 GiB
stretched_rbdpool              6   32      0 B        0      0 B      0    108 GiB
stretched_rbdtest              7   32   20 GiB    5.14k   80 GiB  15.60    108 GiB                  
stretched_replicated_rbdpool   8   32      0 B        0      0 B      0    216 GiB
stretched_replicated_rbdtest   9   32   20 GiB    5.14k   80 GiB   8.46    216 GiB
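
As a rough consistency check (approximate only, since MAX AVAIL is really computed per OSD against the fullest OSD): back-calculating from the 289 GiB shown for the size-3 pools gives about 867 GiB of effectively usable space, and

867 GiB / 3      ~= 289 GiB  (size-3 pools, replicated_rule)
867 GiB / 4      ~= 216 GiB  (size-4 pools, replicated_rule / stretch_replicated_rule)
867 GiB / 4 / 2  ~= 108 GiB  (size-4 pools, stretch_rule, per-take weights doubled)

i.e. the stretch_rule pools report exactly half of what the equivalent single-take rule reports for the same OSDs.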

Refer: https://bugzilla.redhat.com/show_bug.cgi?id=2100920
