Bug #24005: pg-upmap can break the crush rule in some cases
Status: Closed
% Done: 0%
Regression: No
Severity: 3 - minor
Description
The test script is:

./bin/init-ceph stop
killall ceph-mon ceph-osd
killall ceph-mon ceph-osd
OSD=9 MON=1 MGR=0 MDS=0 ../src/vstart.sh -X -n
./bin/ceph osd crush add-bucket test root
./bin/ceph osd crush add-bucket huangjun-1 host
./bin/ceph osd crush add-bucket huangjun-2 host
./bin/ceph osd crush add-bucket huangjun-3 host
./bin/ceph osd crush move huangjun-1 root=test
./bin/ceph osd crush move huangjun-2 root=test
./bin/ceph osd crush move huangjun-3 root=test
./bin/ceph osd crush add osd.0 1.0 host=huangjun-1
./bin/ceph osd crush add osd.1 1.0 host=huangjun-1
./bin/ceph osd crush add osd.2 1.0 host=huangjun-1
./bin/ceph osd crush add osd.3 1.0 host=huangjun-2
./bin/ceph osd crush add osd.4 1.0 host=huangjun-2
./bin/ceph osd crush add osd.5 1.0 host=huangjun-2
./bin/ceph osd crush add osd.7 1.0 host=huangjun-3
./bin/ceph osd crush add osd.6 1.0 host=huangjun-3
./bin/ceph osd crush add osd.8 1.0 host=huangjun-3
# unlink all osds from default crush bucket
for i in `seq 0 8`; do
  ./bin/ceph osd crush unlink osd.$i huangjun
done
./bin/ceph osd erasure-code-profile set test k=4 m=2 crush-failure-domain=osd
./bin/ceph osd getcrushmap -o crush
./bin/crushtool -d crush -o crush.txt
echo "
rule test {
  id 1
  type erasure
  min_size 1
  max_size 10
  step take huangjun-1
  step chooseleaf indep 2 type osd
  step emit
  step take huangjun-2
  step chooseleaf indep 2 type osd
  step emit
  step take huangjun-3
  step chooseleaf indep 2 type osd
  step emit
}
" >> crush.txt
./bin/crushtool -c crush.txt -o crush.new
./bin/ceph osd setcrushmap -i crush.new
./bin/ceph osd pool create test 256 256 erasure test test
./bin/ceph osd set-require-min-compat-client luminous
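Before the pool is created, the intended placement of this rule can be sanity-checked offline with crushtool's standard --test options (the rule id 1 and the 6 replicas match the rule and k+m above):

# Replay rule id 1 against the new map with 6 placements per pg;
# every emitted mapping should contain 2 OSDs from each of the 3 hosts.
./bin/crushtool -i crush.new --test --rule 1 --num-rep 6 --show-mappings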
The cluster topology (output of ./bin/ceph osd tree) is:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-5       9.00000 root test
-7       3.00000     host huangjun-1
 0   hdd 1.00000         osd.0           up  1.00000 1.00000
 1   hdd 1.00000         osd.1           up  1.00000 1.00000
 2   hdd 1.00000         osd.2           up  1.00000 1.00000
-8       3.00000     host huangjun-2
 3   hdd 1.00000         osd.3           up  1.00000 1.00000
 4   hdd 1.00000         osd.4           up  1.00000 1.00000
 5   hdd 1.00000         osd.5           up  1.00000 1.00000
-9       3.00000     host huangjun-3
 6   hdd 1.00000         osd.6           up  1.00000 1.00000
 7   hdd 1.00000         osd.7           up  1.00000 1.00000
 8   hdd 1.00000         osd.8           up  1.00000 1.00000
-1             0 root default
-2             0     host huangjun
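The injected rule can also be read back from the cluster to confirm it compiled as intended ('test' is the rule name from the script above):

# Dump the compiled 'test' rule; its steps should list the three
# take/chooseleaf/emit sequences, one per host bucket.
./bin/ceph osd crush rule dump test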
I chose pg 1.1 for the test; the original mapping is:
osdmap e50 pg 1.1 (1.1) -> up [2,0,5,4,7,8] acting [2,0,5,4,7,8]
Then I used 'ceph osd pg-upmap-items' to set an upmap, remapping osd.2 -> osd.1 and osd.4 -> osd.2 for pg 1.1:
./bin/ceph osd pg-upmap-items 1.1 2 1 4 2
and checked the pg 1.1 mapping again:
[root@huangjun /usr/src/ceph-int/build]$ ./bin/ceph pg map 1.1
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osdmap e52 pg 1.1 (1.1) -> up [1,0,5,2,7,8] acting [1,0,5,2,7,8]
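For anyone reproducing this: the exception is recorded in the osdmap and can be reverted with the standard upmap commands (shown for convenience, using the pg id 1.1 from above):

# Show the recorded upmap exception for pg 1.1
./bin/ceph osd dump | grep pg_upmap_items
# Remove it; pg 1.1 should fall back to its CRUSH-computed mapping
./bin/ceph osd rm-pg-upmap-items 1.1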
My question is:
1. The crush rule says to take 2 OSDs from each host, but after this upmap pg 1.1 has 3 OSDs in host huangjun-1 and only 1 OSD in host huangjun-2, which I think breaks the crush rule. If host huangjun-1 halts, 3 OSDs go down at once; since k=4, m=2 tolerates only 2 failures, pg 1.1 would end up in the down state.
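The per-host imbalance is easy to confirm by looking up each OSD of the new up set; 'ceph osd find' reports the crush location (the OSD ids below are taken from the mapping above):

# Print the host of every OSD in pg 1.1's new up set [1,0,5,2,7,8];
# huangjun-1 appears three times, huangjun-2 only once.
for osd in 1 0 5 2 7 8; do
  ./bin/ceph osd find $osd | grep '"host"'
done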
Updated by huang jun almost 6 years ago
@xie xingguo
This PR, https://github.com/ceph/ceph/pull/21815, didn't address this issue.
Updated by xie xingguo almost 6 years ago
I've just noticed that the failure-domain of your example is set to OSD level. Thus the result is expected, right?
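For comparison: with crush-failure-domain=host, the profile's autogenerated rule separates chunks by host rather than by OSD. A minimal sketch, using hypothetical names test-host and test-host-pool:

# Hypothetical alternative profile whose default rule separates by host
./bin/ceph osd erasure-code-profile set test-host k=4 m=2 crush-failure-domain=host
./bin/ceph osd pool create test-host-pool 256 256 erasure test-host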