Bug #15653

Updated by Sage Weil 11 months ago

CRUSH chooses items with the right probabilities, in proportion to their relative weights, for each independent choice. However, when choosing multiple replicas, the choices are *not* independent, since each replica has to be unique. The result is that low-weighted devices get too many items.

Simple example:
<pre>

maetl:src (master) 03:20 PM $ cat cm.txt
# begin crush map

# devices
device 0 device0
device 1 device1
device 2 device2
device 3 device3
device 4 device4

# types
type 0 osd
type 1 domain
type 2 pool

# buckets
domain root {
id -1 # do not change unnecessarily
# weight 5.000
alg straw2
hash 0 # rjenkins1
item device0 weight 10.00
item device1 weight 10.0
item device2 weight 10.0
item device3 weight 10.0
item device4 weight 1.000
}

# rules
rule data {
ruleset 0
type replicated
min_size 1
max_size 10
step take root
step choose firstn 0 type osd
step emit
}

# end crush map
maetl:src (master) 03:20 PM $ ./crushtool -c cm.txt -o cm
maetl:src (master) 03:20 PM $ ./crushtool -i cm --test --show-utilization --num-rep 1 --min-x 1 --max-x 1000000 --num-rep 1
rule 0 (data), x = 1..1000000, numrep = 1..1
rule 0 (data) num_rep 1 result size == 1: 1000000/1000000
device 0: stored : 243456 expected : 200000
device 1: stored : 243624 expected : 200000
device 2: stored : 244486 expected : 200000
device 3: stored : 243881 expected : 200000
device 4: stored : 24553 expected : 200000
maetl:src (master) 03:20 PM $ ./crushtool -i cm --test --show-utilization --num-rep 1 --min-x 1 --max-x 1000000 --num-rep 3
rule 0 (data), x = 1..1000000, numrep = 3..3
rule 0 (data) num_rep 3 result size == 3: 1000000/1000000
device 0: stored : 723984 expected : 600000
device 1: stored : 722923 expected : 600000
device 2: stored : 723153 expected : 600000
device 3: stored : 723394 expected : 600000
device 4: stored : 106546 expected : 600000
</pre>

Note that in the 1x case, device 4 gets 1/10th as many items as each of the other devices, as expected from its weight. For 3x, that grows to about 1/7th. For lower weights the amplification is more pronounced.
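The same effect shows up in a much simpler model than CRUSH itself. The sketch below is not the straw2 code; it assumes an idealized process where each replica is drawn in proportion to weight from the devices not yet chosen, and it computes the exact chance that each device ends up in the result. The function name `inclusion_probabilities` is made up for this illustration.

```python
from fractions import Fraction


def inclusion_probabilities(weights, num_rep):
    """Exact chance that each item appears among num_rep distinct picks.

    Idealized model (an assumption, not CRUSH's implementation): every pick
    is weight-proportional over the items that have not been chosen yet.
    """
    probs = [Fraction(0)] * len(weights)

    def recurse(chosen, path_prob):
        # A full ordered selection: credit its probability to every member.
        if len(chosen) == num_rep:
            for i in chosen:
                probs[i] += path_prob
            return
        remaining = sum(w for i, w in enumerate(weights) if i not in chosen)
        for i, w in enumerate(weights):
            if i not in chosen:
                recurse(chosen + (i,), path_prob * Fraction(w, remaining))

    recurse((), Fraction(1))
    return probs


# Same weights as the crush map above: four devices at 10, one at 1.
weights = [10, 10, 10, 10, 1]
for num_rep in (1, 3):
    probs = inclusion_probabilities(weights, num_rep)
    print(num_rep, float(probs[4]), float(probs[4] / probs[0]))
```

Under this model, device 4 receives exactly 1/10th as many items as a weight-10 device at 1x, and roughly 1/7.2 as many at 3x, which is the same direction and about the same size as the shift in the crushtool output. The numbers are not expected to match crushtool exactly, since CRUSH's hashing and retry behaviour are not modelled here.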
