Bug #56650 (open)
ceph df reports invalid MAX AVAIL value for stretch mode crush rule
% Done: 0%
Source:
Tags:
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
If we define a crush rule for a stretch mode cluster with multiple take steps, then MAX AVAIL for the pools associated with that crush rule reports an available size equal to the available space of a single datacenter only.
Consider a crush rule stretch_rule defined as per the stretch mode documentation at https://docs.ceph.com/en/latest/rados/operations/stretch-mode/:
rule stretch_rule {
    id 1
    type replicated
    step take DC1
    step chooseleaf firstn 2 type host
    step emit
    step take DC2
    step chooseleaf firstn 2 type host
    step emit
}

and another crush rule stretch_replicated_rule with a similar placement strategy:

rule stretch_replicated_rule {
    id 2
    type replicated
    step take default
    step choose firstn 0 type datacenter
    step chooseleaf firstn 2 type host
    step emit
}
then "MAX AVAIL" for pools from stretch_rule show incorrect value whereas pools from stretch_replicated_rule shows correct value.
The way crush rule stretch_rule is defined, PGMap::get_rule_avail considers only one datacenter's available space rather than the total available space of both datacenters. This is visible in the ceph df output below: the stretch_rule pools (size 4) report a MAX AVAIL of 108 GiB, exactly half of the 216 GiB reported for the stretch_replicated_rule pools, even though both rules place two replicas in each of the two datacenters.
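As a rough illustration (a simplified Python sketch, not the actual PGMap::get_rule_avail C++ code), the arithmetic behind those two numbers can be modelled as follows; the ~432 GiB usable space per datacenter is inferred from the stretch_replicated_rule pools (216 GiB MAX AVAIL x size 4, split over two DCs) and is only an assumption for illustration:

# Simplified model, NOT the actual PGMap::get_rule_avail implementation.
# Figures are taken from the `ceph df` output below; the per-datacenter
# usable space is an inferred/assumed value for illustration.

REPLICA_SIZE = 4          # pool size of the stretched pools
USABLE_PER_DC_GIB = 432   # assumed usable raw space in each of DC1 and DC2

def max_avail_gib(usable_raw_gib: float, size: int) -> float:
    # MAX AVAIL is roughly the usable raw space divided by the replication factor.
    return usable_raw_gib / size

# Expected: a rule that emits OSDs from both datacenters should be able to
# draw on the free space of both DCs.
expected = max_avail_gib(2 * USABLE_PER_DC_GIB, REPLICA_SIZE)   # -> 216 GiB

# Reported for stretch_rule (two separate `step take` blocks): only a single
# datacenter's free space appears to be counted.
reported = max_avail_gib(1 * USABLE_PER_DC_GIB, REPLICA_SIZE)   # -> 108 GiB

print(f"expected ~{expected:.0f} GiB, reported ~{reported:.0f} GiB")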
More details:
$ ceph osd crush rule ls
replicated_rule
stretch_rule
stretch_replicated_rule

$ ceph osd crush rule dump stretch_rule
{
    "rule_id": 1,
    "rule_name": "stretch_rule",
    "type": 1,
    "steps": [
        {
            "op": "take",
            "item": -5,
            "item_name": "DC1"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        },
        {
            "op": "take",
            "item": -6,
            "item_name": "DC2"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

$ ceph osd crush rule dump stretch_replicated_rule
{
    "rule_id": 2,
    "rule_name": "stretch_replicated_rule",
    "type": 1,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "choose_firstn",
            "num": 0,
            "type": "datacenter"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

$ ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 19 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 88 lfor 0/0/62 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 64 lfor 0/0/62 flags hashpspool stripe_width 0 application cephfs
pool 4 'rbdpool' replicated size 4 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 126 flags hashpspool stripe_width 0
pool 5 'rbdtest' replicated size 4 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 139 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 6 'stretched_rbdpool' replicated size 4 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 130 flags hashpspool stripe_width 0
pool 7 'stretched_rbdtest' replicated size 4 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 143 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 8 'stretched_replicated_rbdpool' replicated size 4 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 134 flags hashpspool stripe_width 0
pool 9 'stretched_replicated_rbdtest' replicated size 4 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 147 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    1.2 TiB  960 GiB  252 GiB   252 GiB      20.81
TOTAL  1.2 TiB  960 GiB  252 GiB   252 GiB      20.81

--- POOLS ---
POOL                          ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                           1    1  1.5 MiB        2  4.5 MiB      0    289 GiB
cephfs.a.meta                  2   16  2.3 KiB       22   96 KiB      0    289 GiB
cephfs.a.data                  3   32      0 B        0      0 B      0    289 GiB
rbdpool                        4   32      0 B        0      0 B      0    216 GiB
rbdtest                        5   32   20 GiB    5.14k   80 GiB   8.46    216 GiB
stretched_rbdpool              6   32      0 B        0      0 B      0    108 GiB
stretched_rbdtest              7   32   20 GiB    5.14k   80 GiB  15.60    108 GiB
stretched_replicated_rbdpool   8   32      0 B        0      0 B      0    216 GiB
stretched_replicated_rbdtest   9   32   20 GiB    5.14k   80 GiB   8.46    216 GiB
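For completeness, a small check script that reproduces the comparison programmatically; it assumes `ceph df --format json` exposes each pool's MAX AVAIL as pools[].stats.max_avail in bytes and that the pool names match the ones above (adjust if your release or naming differs):

import json
import subprocess

# Fetch pool statistics as JSON (assumed layout:
# {"pools": [{"name": ..., "stats": {"max_avail": ...}}, ...]}).
out = subprocess.run(["ceph", "df", "--format", "json"],
                     check=True, capture_output=True, text=True).stdout
pools = {p["name"]: p["stats"] for p in json.loads(out)["pools"]}

GIB = 1024 ** 3
for name in ("stretched_rbdpool", "stretched_replicated_rbdpool"):
    print(f"{name}: MAX AVAIL {pools[name]['max_avail'] / GIB:.0f} GiB")

# Both pools use size 4 and an equivalent two-datacenter placement, so their
# MAX AVAIL should match; on an affected cluster the stretch_rule pool reports
# roughly half of the stretch_replicated_rule pool's value.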