Bug #24005

closed

pg-upmap can break the crush rule in some cases

Added by huang jun almost 6 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
OSDMap
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The test script is

./bin/init-ceph stop
killall ceph-mon ceph-osd
killall ceph-mon ceph-osd
OSD=9 MON=1 MGR=0 MDS=0 ../src/vstart.sh -X -n

./bin/ceph osd crush add-bucket test root
./bin/ceph osd crush add-bucket huangjun-1 host 
./bin/ceph osd crush add-bucket huangjun-2 host 
./bin/ceph osd crush add-bucket huangjun-3 host

./bin/ceph osd crush move huangjun-1 root=test
./bin/ceph osd crush move huangjun-2 root=test
./bin/ceph osd crush move huangjun-3 root=test

./bin/ceph osd crush add osd.0 1.0 host=huangjun-1
./bin/ceph osd crush add osd.1 1.0 host=huangjun-1
./bin/ceph osd crush add osd.2 1.0 host=huangjun-1

./bin/ceph osd crush add osd.3 1.0 host=huangjun-2
./bin/ceph osd crush add osd.4 1.0 host=huangjun-2
./bin/ceph osd crush add osd.5 1.0 host=huangjun-2

./bin/ceph osd crush add osd.7 1.0 host=huangjun-3
./bin/ceph osd crush add osd.6 1.0 host=huangjun-3
./bin/ceph osd crush add osd.8 1.0 host=huangjun-3

# unlink all OSDs from the default crush bucket
for i in `seq 0 8`; do
./bin/ceph osd crush unlink osd.$i huangjun
done

./bin/ceph osd erasure-code-profile set test k=4 m=2 crush-failure-domain=osd
./bin/ceph osd getcrushmap -o crush
./bin/crushtool -d crush -o crush.txt
echo " 
rule test {
id 1
type erasure
min_size 1
max_size 10
step take huangjun-1
step chooseleaf indep 2 type osd
step emit
step take huangjun-2
step chooseleaf indep 2 type osd
step emit
step take huangjun-3
step chooseleaf indep 2 type osd
step emit
}
" >> crush.txt
./bin/crushtool -c crush.txt -o crush.new
./bin/ceph osd setcrushmap -i crush.new

./bin/ceph osd pool create test 256 256 erasure test test

./bin/ceph osd set-require-min-compat-client luminous
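Before creating the pool, the custom rule can be sanity-checked with crushtool's test mode. This is a sketch added for illustration (not part of the original reproducer); it assumes the compiled map is still in crush.new and that the rule kept id 1 as written in crush.txt:

# simulate placements for rule id 1 with 6 shards (illustrative, not in the original script)
./bin/crushtool -i crush.new --test --rule 1 --num-rep 6 --show-mappings | head
# summarize how many placements land on each OSD
./bin/crushtool -i crush.new --test --rule 1 --num-rep 6 --show-utilization

Every mapping produced this way should contain exactly two OSDs from each of the three hosts, which is the property the upmap below ends up violating.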

The cluster topology is
ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF 
-5       9.00000 root test                                   
-7       3.00000     host huangjun-1                         
 0   hdd 1.00000         osd.0           up  1.00000 1.00000 
 1   hdd 1.00000         osd.1           up  1.00000 1.00000 
 2   hdd 1.00000         osd.2           up  1.00000 1.00000 
-8       3.00000     host huangjun-2                         
 3   hdd 1.00000         osd.3           up  1.00000 1.00000 
 4   hdd 1.00000         osd.4           up  1.00000 1.00000 
 5   hdd 1.00000         osd.5           up  1.00000 1.00000 
-9       3.00000     host huangjun-3                         
 6   hdd 1.00000         osd.6           up  1.00000 1.00000 
 7   hdd 1.00000         osd.7           up  1.00000 1.00000 
 8   hdd 1.00000         osd.8           up  1.00000 1.00000 
-1             0 root default                                
-2             0     host huangjun 

I chose pg 1.1 for the test; its original mapping is

osdmap e50 pg 1.1 (1.1) -> up [2,0,5,4,7,8] acting [2,0,5,4,7,8]

Then I used 'ceph osd pg-upmap-items' to set upmap entries

./bin/ceph osd pg-upmap-items 1.1 2 1 4 2

and checked the pg 1.1 mapping again
[root@huangjun /usr/src/ceph-int/build]$ ./bin/ceph pg map 1.1
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osdmap e52 pg 1.1 (1.1) -> up [1,0,5,2,7,8] acting [1,0,5,2,7,8]
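The recorded remapping can be read back from the osdmap, and dropped again if the placement it produces is not acceptable. This is an aside added for illustration, not part of the original report; the grep pattern and exact dump format are assumptions:

# the pg_upmap_items entry added above appears in the osdmap dump
./bin/ceph osd dump | grep pg_upmap_items
# remove all upmap entries for pg 1.1, reverting to the crush-computed mapping
./bin/ceph osd rm-pg-upmap-items 1.1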

My question is:
1. The CRUSH rule takes 2 OSDs from each host, but after this upmap pg 1.1 has 3 OSDs on host huangjun-1 and only 1 OSD on host huangjun-2, which I think breaks the CRUSH rule (a quick per-host check is sketched below). If host huangjun-1 goes down, 3 OSDs go down with it, and with k=4, m=2 the PG cannot survive losing 3 shards, so pg 1.1 ends up in the down state.
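A quick way to confirm the per-host distribution (a check added for illustration, not part of the original report; the exact JSON layout of 'ceph osd find' output is an assumption) is to resolve each OSD of the new up set to its CRUSH host:

# map each OSD of the up set [1,0,5,2,7,8] to its host
for osd in 1 0 5 2 7 8; do
    echo -n "osd.$osd -> "
    ./bin/ceph osd find $osd | grep '"host"'   # matches the host field of crush_location
done

With the upmap in place this shows three OSDs under huangjun-1 and only one under huangjun-2.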

Actions #1

Updated by xie xingguo almost 6 years ago

  • Assignee set to xie xingguo
Actions #2

Updated by huang jun almost 6 years ago

@xie xingguo
This PR https://github.com/ceph/ceph/pull/21815 didn't address this issue.

Actions #3

Updated by xie xingguo almost 6 years ago

I've just noticed that the failure-domain of your example is set to OSD level. Thus the result is expected, right?
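For reference, the failure domain this comment refers to, and the rule the pool actually uses, can be read back like this (a sketch added for illustration, assuming the profile and pool names from the reproducer above):

# the EC profile created in the reproducer sets crush-failure-domain=osd
./bin/ceph osd erasure-code-profile get test
# the pool was created against the hand-written rule named "test"
./bin/ceph osd pool get test crush_rule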

Actions #4

Updated by xie xingguo over 5 years ago

  • Status changed from New to Closed
