Bug #23921
pg-upmap cannot balance in some case
Status: closed
% Done: 0%
Backport: luminous, mimic
Regression: No
Severity: 2 - major
Description
I have a cluster with 21 OSDs. The cluster topology is:
 ID CLASS   WEIGHT TYPE NAME                  STATUS REWEIGHT PRI-AFF
 -5       21.00000 root test
 -7       11.00000     datacenter dc-1
 -9       11.00000         rack rack-1
-11        5.00000             host host-1
  5   hdd  1.00000                 osd.5          up  0.50000 1.00000
  6   hdd  1.00000                 osd.6          up  1.00000 1.00000
  7   hdd  1.00000                 osd.7          up  1.00000 1.00000
  8   hdd  1.00000                 osd.8          up  1.00000 1.00000
  9   hdd  1.00000                 osd.9          up  1.00000 1.00000
-12        2.00000             host host-2
 16   hdd  1.00000                 osd.16         up  1.00000 1.00000
 17   hdd  1.00000                 osd.17         up  1.00000 1.00000
-13        2.00000             host host-3
 15   hdd  1.00000                 osd.15         up  1.00000 1.00000
 18   hdd  1.00000                 osd.18         up  1.00000 1.00000
-14        2.00000             host host-4
 19   hdd  1.00000                 osd.19         up  1.00000 1.00000
 20   hdd  1.00000                 osd.20         up  1.00000 1.00000
 -8       10.00000     datacenter dc-2
-10       10.00000         rack rack-2
-15        5.00000             host host-5
 10   hdd  1.00000                 osd.10         up  1.00000 1.00000
 11   hdd  1.00000                 osd.11         up  1.00000 1.00000
 12   hdd  1.00000                 osd.12         up  1.00000 1.00000
 13   hdd  1.00000                 osd.13         up  1.00000 1.00000
 14   hdd  1.00000                 osd.14         up  1.00000 1.00000
-16        5.00000             host host-6
  0   hdd  1.00000                 osd.0          up  1.00000 1.00000
  1   hdd  1.00000                 osd.1          up  1.00000 1.00000
  2   hdd  1.00000                 osd.2          up  1.00000 1.00000
  3   hdd  1.00000                 osd.3          up  1.00000 1.00000
  4   hdd  1.00000                 osd.4          up  1.00000 1.00000
 -1       21.00000 root default
 -2       21.00000     host huangjun
  0   hdd  1.00000         osd.0                  up  1.00000 1.00000
  1   hdd  1.00000         osd.1                  up  1.00000 1.00000
  2   hdd  1.00000         osd.2                  up  1.00000 1.00000
  3   hdd  1.00000         osd.3                  up  1.00000 1.00000
  4   hdd  1.00000         osd.4                  up  1.00000 1.00000
  5   hdd  1.00000         osd.5                  up  0.50000 1.00000
  6   hdd  1.00000         osd.6                  up  1.00000 1.00000
  7   hdd  1.00000         osd.7                  up  1.00000 1.00000
  8   hdd  1.00000         osd.8                  up  1.00000 1.00000
  9   hdd  1.00000         osd.9                  up  1.00000 1.00000
 10   hdd  1.00000         osd.10                 up  1.00000 1.00000
 11   hdd  1.00000         osd.11                 up  1.00000 1.00000
 12   hdd  1.00000         osd.12                 up  1.00000 1.00000
 13   hdd  1.00000         osd.13                 up  1.00000 1.00000
 14   hdd  1.00000         osd.14                 up  1.00000 1.00000
 15   hdd  1.00000         osd.15                 up  1.00000 1.00000
 16   hdd  1.00000         osd.16                 up  1.00000 1.00000
 17   hdd  1.00000         osd.17                 up  1.00000 1.00000
 18   hdd  1.00000         osd.18                 up  1.00000 1.00000
 19   hdd  1.00000         osd.19                 up  1.00000 1.00000
 20   hdd  1.00000         osd.20                 up  1.00000 1.00000
I created a pool with 1024 PGs and a replicated size of 2.
After the remap, the PG distribution shows no change:
ceph osd df
ID CLASS  WEIGHT REWEIGHT SIZE    USE AVAIL %USE  VAR PGS
 5   hdd 1.00000  0.50000 981M 34176k  948M 3.40 1.00  40
 6   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  99
 7   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 109
 8   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 121
 9   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  95
16   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  82
17   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  91
15   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  95
18   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  93
19   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 100
20   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  99
10   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  85
11   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  94
12   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00  81
13   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 118
14   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 102
 0   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 107
 1   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 113
 2   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 106
 3   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 110
 4   hdd 1.00000  1.00000 981M 34176k  948M 3.40 1.00 108
I checked the log:
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.0 weight 0.1 pgs 107
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.1 weight 0.1 pgs 113
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.2 weight 0.1 pgs 106
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.3 weight 0.1 pgs 110
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.4 weight 0.1 pgs 108
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.5 weight 0.0454545 pgs 40
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.6 weight 0.0909091 pgs 99
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.7 weight 0.0909091 pgs 109
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.8 weight 0.0909091 pgs 121
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.9 weight 0.0909091 pgs 95
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.10 weight 0.1 pgs 85
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.11 weight 0.1 pgs 94
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.12 weight 0.1 pgs 81
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.13 weight 0.1 pgs 118
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.14 weight 0.1 pgs 102
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.15 weight 0.0909091 pgs 95
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.16 weight 0.0909091 pgs 82
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.17 weight 0.0909091 pgs 91
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.18 weight 0.0909091 pgs 93
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.19 weight 0.0909091 pgs 100
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.20 weight 0.0909091 pgs 99
2018-04-28 11:50:39.661 7f87a8cfd700 10 osd_weight_total 1.95455
2018-04-28 11:50:39.661 7f87a8cfd700 10 pgs_per_weight 1047.81
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.0 pgs 107 target 104.781 deviation 2.21863
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.1 pgs 113 target 104.781 deviation 8.21863
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.2 pgs 106 target 104.781 deviation 1.21863
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.3 pgs 110 target 104.781 deviation 5.21863
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.4 pgs 108 target 104.781 deviation 3.21863
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.5 pgs 40 target 47.6279 deviation -7.6279
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.6 pgs 99 target 95.2558 deviation 3.7442
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.7 pgs 109 target 95.2558 deviation 13.7442
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.8 pgs 121 target 95.2558 deviation 25.7442
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.9 pgs 95 target 95.2558 deviation -0.255798
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.10 pgs 85 target 104.781 deviation -19.7814
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.11 pgs 94 target 104.781 deviation -10.7814
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.12 pgs 81 target 104.781 deviation -23.7814
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.13 pgs 118 target 104.781 deviation 13.2186
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.14 pgs 102 target 104.781 deviation -2.78137
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.15 pgs 95 target 95.2558 deviation -0.255798
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.16 pgs 82 target 95.2558 deviation -13.2558
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.17 pgs 91 target 95.2558 deviation -4.2558
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.18 pgs 93 target 95.2558 deviation -2.2558
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.19 pgs 100 target 95.2558 deviation 4.7442
2018-04-28 11:50:39.661 7f87a8cfd700 20 osd.20 pgs 99 target 95.2558 deviation 3.7442
2018-04-28 11:50:39.661 7f87a8cfd700 10 total_deviation 170.065 overfull 0,1,2,3,4,6,7,8,13,19,20 underfull [12,10,16,11,5,17,14,18]
2018-04-28 11:50:39.661 7f87a8cfd700 10 osd.8 move 25
2018-04-28 11:50:39.661 7f87a8cfd700 10 trying 1.0
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_pg_upmap
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule ruleno 1 numrep 2 overfull 0,1,2,3,4,6,7,8,13,19,20 underfull [12,10,16,11,5,17,14,18] orig [8,13]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 0 w []
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule take [-9]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 1 w [-9]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack stack [1,1,0,1] orig [8,13] at 8 pw [-9]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack cumulative_fanout [1,1]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 12 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 10 type 1 is -15
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 16 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 11 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 5 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 17 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 14 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 18 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 20 _choose_type_stack underfull_buckets [-15,-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 level 0: type 1 fanout 1 cumulative 1 w [-9]
2018-04-28 11:50:39.661 7f87a8cfd700 10 from -9
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack from 13 got -2 of type 1 over leaves 8
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack w <- [-2] was [-9]
2018-04-28 11:50:39.661 7f87a8cfd700 10 level 1: type 0 fanout 1 cumulative 1 w [-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 from -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 was 8 considering 12
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 replace 8 -> 12
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack w <- [12] was [-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 2 w [12]
2018-04-28 11:50:39.661 7f87a8cfd700 10 emit [12]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 3 w []
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule take [-10]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 4 w [-10]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack stack [1,1,0,1] orig [8,13] at 13 pw [-10]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack cumulative_fanout [1,1]
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 12 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 10 type 1 is -15
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 16 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 11 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 5 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 17 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 14 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack underfull 18 type 1 is -2
2018-04-28 11:50:39.661 7f87a8cfd700 20 _choose_type_stack underfull_buckets [-15,-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 level 0: type 1 fanout 1 cumulative 1 w [-10]
2018-04-28 11:50:39.661 7f87a8cfd700 10 from -10
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack from -1142358840 got -2 of type 1 over leaves 13
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack w <- [-2] was [-10]
2018-04-28 11:50:39.661 7f87a8cfd700 10 level 1: type 0 fanout 1 cumulative 1 w [-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 from -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 was 13 considering 12
2018-04-28 11:50:39.661 7f87a8cfd700 20 _choose_type_stack in used 12
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 was 13 considering 10
2018-04-28 11:50:39.661 7f87a8cfd700 20 _choose_type_stack not in subtree -2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 was 13 considering 16
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack pos 0 replace 13 -> 16
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack end of orig, break 1
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack end of orig, break 2
2018-04-28 11:50:39.661 7f87a8cfd700 10 _choose_type_stack w <- [16] was [-2]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_remap_rule step 5 w [16]
2018-04-28 11:50:39.661 7f87a8cfd700 10 emit [16]
2018-04-28 11:50:39.661 7f87a8cfd700 10 try_pg_upmap orig [8,13], out [12,16]
2018-04-28 11:50:39.661 7f87a8cfd700 10 1.0 [8,13] -> [12,16]
2018-04-28 11:50:39.661 7f87a8cfd700 10 1.0 pg_upmap_items [8,12,13,16]
2018-04-28 11:50:39.716 7f87ab502700 10 maybe_remove_pg_upmaps
2018-04-28 11:50:39.716 7f87ab502700 10 maybe_remove_pg_upmaps pg 1.0 crush-rule-id 1 weight_map {0=0.1,1=0.1,2=0.1,3=0.1,4=0.1,5=0.0909091,6=0.0909091,7=0.0909091,8=0.0909091,9=0.0909091,10=0.1,11=0.1,12=0.1,13=0.1,14=0.1,15=0.0909091,16=0.0909091,17=0.0909091,18=0.0909091,19=0.0909091,20=0.0909091} failure-domain-type 1
2018-04-28 11:50:39.716 7f87ab502700 10 maybe_remove_pg_upmaps pg 1.0 osd 12 parent -2
2018-04-28 11:50:39.716 7f87ab502700 10 maybe_remove_pg_upmaps pg 1.0 osd 16 parent -2
2018-04-28 11:50:39.717 7f87ab502700 10 maybe_remove_pg_upmaps cancel invalid pending pg_upmap_items entry 1.0->[8,12,13,16]
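The target and deviation figures in the log can be reproduced by hand. The sketch below is only a reading of the log, not the balancer's code: it assumes each weight is the OSD's share of its rack (1/10 in rack-2, 1/11 in rack-1, 0.5/11 for the reweighted osd.5), and it assumes a threshold of 1 PG for the overfull and underfull sets, which happens to match the sets printed above.

```python
# Per-OSD weights as they appear in the "weight" log lines (assumed to be rack shares).
weights = {osd: 0.1 for osd in list(range(0, 5)) + list(range(10, 15))}
weights.update({osd: 1 / 11 for osd in [6, 7, 8, 9] + list(range(15, 21))})
weights[5] = 0.5 / 11  # osd.5 has reweight 0.5

# PG counts per OSD, copied from the log.
pgs = {0: 107, 1: 113, 2: 106, 3: 110, 4: 108, 5: 40, 6: 99, 7: 109,
       8: 121, 9: 95, 10: 85, 11: 94, 12: 81, 13: 118, 14: 102, 15: 95,
       16: 82, 17: 91, 18: 93, 19: 100, 20: 99}

total_pg_replicas = 1024 * 2               # pg_num * replicated size
osd_weight_total = sum(weights.values())   # log: 1.95455
pgs_per_weight = total_pg_replicas / osd_weight_total  # log: 1047.81

# deviation = actual PG count minus the weight-proportional target
deviation = {osd: pgs[osd] - weights[osd] * pgs_per_weight for osd in pgs}
total_deviation = sum(abs(d) for d in deviation.values())  # log: 170.065

THRESHOLD = 1.0  # assumption for illustration; chosen because it matches the log
overfull = sorted(osd for osd, d in deviation.items() if d >= THRESHOLD)
underfull = sorted((osd for osd, d in deviation.items() if d <= -THRESHOLD),
                   key=lambda osd: deviation[osd])  # most underfull first

print(round(pgs_per_weight, 2), round(total_deviation, 3))
print("overfull", overfull)
print("underfull", underfull)
```

With these inputs osd.8 has the largest positive deviation (about 25.7), which is consistent with the "osd.8 move 25" line.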
PG 1.0 was remapped from [8,13] to [12,16].
In root bucket 'test', osd.12 and osd.16 are not in the same host,
but both resolve to the same parent, -2, which is weird. Bucket -2 is host 'huangjun',
which does contain both osd.12 and osd.16 but is not used by pool 'test'. As a result, the upmap items are treated as invalid and cleared.
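To make the observation concrete, here is a small model of the hierarchy from the description. It is a hypothetical illustration, not Ceph's implementation: BUCKETS, naive_host_of and host_of_under are names invented for this sketch, and the scan order in the naive lookup is an assumption picked so that it lands on host huangjun as the log does.

```python
# Hypothetical model of the reported CRUSH map: bucket id -> (type, children).
BUCKETS = {
    -5: ("root", [-7, -8]),            # root test
    -7: ("datacenter", [-9]),
    -9: ("rack", [-11, -12, -13, -14]),
    -11: ("host", [5, 6, 7, 8, 9]),
    -12: ("host", [16, 17]),
    -13: ("host", [15, 18]),
    -14: ("host", [19, 20]),
    -8: ("datacenter", [-10]),
    -10: ("rack", [-15, -16]),
    -15: ("host", [10, 11, 12, 13, 14]),
    -16: ("host", [0, 1, 2, 3, 4]),
    -1: ("root", [-2]),                # root default
    -2: ("host", list(range(21))),     # host huangjun holds every OSD
}


def naive_host_of(osd):
    """Return the first host bucket listing the OSD, ignoring which root it is under.

    The scan order (highest bucket id first) is an assumption for illustration.
    """
    for bucket_id in sorted(BUCKETS, reverse=True):
        kind, children = BUCKETS[bucket_id]
        if kind == "host" and osd in children:
            return bucket_id
    return None


def host_of_under(osd, root):
    """Return the host bucket containing the OSD, searching only below `root`."""
    kind, children = BUCKETS[root]
    if kind == "host":
        return root if osd in children else None
    for child in children:
        found = host_of_under(osd, child)
        if found is not None:
            return found
    return None


for osd in (12, 16):
    print(osd, "naive:", naive_host_of(osd), "under root test:", host_of_under(osd, -5))
```

In this model the root-agnostic lookup reports -2 for both OSDs, so they look like one failure domain, while the lookup restricted to root test reports -15 and -12, two different hosts.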
pool 1 'test' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 104 lfor 0/102 flags hashpspool stripe_width 0 async_recovery_max_updates 200 osd_full_ratio 0.9
crush rule dump is
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            { "op": "take", "item": -1, "item_name": "default" },
            { "op": "choose_firstn", "num": 0, "type": "osd" },
            { "op": "emit" }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "test",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            { "op": "take", "item": -9, "item_name": "rack-1" },
            { "op": "chooseleaf_firstn", "num": 1, "type": "host" },
            { "op": "emit" },
            { "op": "take", "item": -10, "item_name": "rack-2" },
            { "op": "chooseleaf_firstn", "num": 1, "type": "host" },
            { "op": "emit" }
        ]
    }
]
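For reference, the steps of rule 1 can be grouped per emit to see which buckets the rule draws from. This is a minimal sketch that only walks the JSON from the dump; emit_groups is a helper name invented here, not a Ceph API.

```python
import json

# Rule 1 ("test"), copied from the crush rule dump in this report.
RULE_TEST = json.loads("""
{
  "rule_id": 1, "rule_name": "test", "ruleset": 1, "type": 1,
  "min_size": 1, "max_size": 10,
  "steps": [
    {"op": "take", "item": -9, "item_name": "rack-1"},
    {"op": "chooseleaf_firstn", "num": 1, "type": "host"},
    {"op": "emit"},
    {"op": "take", "item": -10, "item_name": "rack-2"},
    {"op": "chooseleaf_firstn", "num": 1, "type": "host"},
    {"op": "emit"}
  ]
}
""")


def emit_groups(rule):
    """Group a rule's steps into one record per take ... emit sequence."""
    groups, current = [], None
    for step in rule["steps"]:
        op = step["op"]
        if op == "take":
            current = {"take_id": step["item"], "take_name": step["item_name"]}
        elif op.startswith("choose") and current is not None:
            current.update(op=op, num=step["num"], type=step["type"])
        elif op == "emit" and current is not None:
            groups.append(current)
            current = None
    return groups


for group in emit_groups(RULE_TEST):
    print(group)
```

The rule takes one host-level leaf from rack-1 (-9) and one from rack-2 (-10); neither take step references root default (-1) or host huangjun (-2).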
Updated by huang jun almost 6 years ago
But if I unlink all OSDs from 'root default / host huangjun', everything works OK.
for i in `seq 0 20`; do ./bin/ceph osd crush unlink osd.$i huangjun; done
Updated by xie xingguo almost 6 years ago
- Project changed from mgr to RADOS
- Category set to Correctness/Safety
- Assignee set to xie xingguo
- Severity changed from 3 - minor to 2 - major
Updated by xie xingguo almost 6 years ago
Updated by xie xingguo almost 6 years ago
- Status changed from New to Fix Under Review
Updated by Kefu Chai almost 6 years ago
- Copied to Backport #24026: mimic: pg-upmap cannot balance in some case added
Updated by xie xingguo almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to luminous
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24048: luminous: pg-upmap cannot balance in some case added
Updated by Nathan Cutler almost 6 years ago
- Backport changed from luminous to luminous, mimic
Updated by Nathan Cutler almost 6 years ago
- Status changed from Pending Backport to Resolved