Bug #11119

closed

data placement is a function of OSD id

Added by Dan van der Ster about 9 years ago. Updated about 7 years ago.

Status: Won't Fix
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Severity: 3 - minor

Description

While looking closely at straw vs. straw2 buckets, I realized that one property of CRUSH/straw that I thought was true is in fact not true. What I expected is this. Given the following:

- two OSDs with ids x and y
- OSD x fails and is replaced
- the replacement OSD gets a new id y
- OSD x is removed from CRUSH
- OSD y is added to CRUSH at the same location and with the same weight that x had

then:

- OSD y should get the same PGs that x had
- there should be no data movement on other OSDs in the cluster

But this turns out not to be true. And since our operations procedures rely on this false assumption, our disk replacements are moving a lot more data than they should.
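One way to see why the expectation fails: in a hash-based bucket, each item's draw is computed from the placement input and the item's own id, so a replacement with a different id gets unrelated draws even at the same location and weight. The toy sketch below illustrates this under stated assumptions: `draw` and `choose` are hypothetical names, SHA-256 stands in for CRUSH's rjenkins hash, and the scoring is a simplified straw2-style rule, not the actual straw computation.

```python
import hashlib
import math


def draw(x, item_id, weight):
    """Toy draw: hash (x, item id) to u in (0, 1], then scale by weight.

    Assumption: SHA-256 replaces CRUSH's rjenkins hash; this is a sketch only.
    """
    digest = hashlib.sha256(f"{x}:{item_id}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64  # u in (0, 1]
    return math.log(u) / weight  # values closer to 0 win


def choose(x, items):
    """Return the item id with the largest draw; items maps id -> weight."""
    return max(items, key=lambda item_id: draw(x, item_id, items[item_id]))


before = {0: 1.0, 1: 1.0}  # host0 as in crush.txt.orig
after = {4: 1.0, 1: 1.0}   # host0 after osd.0 is replaced by osd.4

inputs = range(1024)
# Expected placement if ids did not matter: relabel 0 -> 4, nothing else moves.
expected = [4 if choose(x, before) == 0 else 1 for x in inputs]
actual = [choose(x, after) for x in inputs]
moved = sum(e != a for e, a in zip(expected, actual))
print(f"{moved} of {len(inputs)} inputs changed placement")
```

Because the draw for id 4 is unrelated to the draw for id 0, some inputs that previously chose the old OSD now choose osd.1, and vice versa. The fraction seen in a real cluster depends on the bucket algorithm and the map, so the number printed here says nothing about the 344 reported below.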

Here is my example.
We start with crush.txt.orig:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3

# types
type 0 device
type 1 host
type 2 default

# buckets
host host0 {
        id -1           # do not change unnecessarily
        # weight 2.000
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 1.000
        item osd.1 weight 1.000
}
host host1 {
        id -2           # do not change unnecessarily
        # weight 2.000
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 1.000
        item osd.3 weight 1.000
}
default default {
        id -3           # do not change unnecessarily
        # weight 4.000
        alg straw
        hash 0  # rjenkins1
        item host0 weight 2.000
        item host1 weight 2.000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map

Then after replacing osd.0 with osd.4 (to make crush.txt.new):

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 device0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4

# types
type 0 device
type 1 host
type 2 default

# buckets
host host0 {
        id -1           # do not change unnecessarily
        # weight 2.000
        alg straw
        hash 0  # rjenkins1
        item osd.4 weight 1.000
        item osd.1 weight 1.000
}
host host1 {
        id -2           # do not change unnecessarily
        # weight 2.000
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 1.000
        item osd.3 weight 1.000
}
default default {
        id -3           # do not change unnecessarily
        # weight 4.000
        alg straw
        hash 0  # rjenkins1
        item host0 weight 2.000
        item host1 weight 2.000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map

Then we compare the mappings produced by the new map against the expected ones:

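# compile both text maps
crushtool -c crush.txt.orig -o cm.orig
crushtool -c crush.txt.new -o cm.new
# mappings from the original map
crushtool -i cm.orig --num-rep 2 --test --show-mappings > orig.mappings 2>&1
# expected mappings: the original ones with osd.0 relabelled as osd.4
cat orig.mappings | sed -e 's/\[0/\[4/' | sed -e 's/0\]/4\]/' > expected.mappings
# mappings from the new map
crushtool -i cm.new --num-rep 2 --test --show-mappings > actual.mappings 2>&1
wc -l orig.mappings
# count added lines (note: ^+ also matches the "+++" header line of the diff)
diff -u expected.mappings actual.mappings  | grep -c ^+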

I get 344 of 1024 PGs moving. Comments?
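The same comparison can be sketched in Python, which avoids counting the diff header and relabels whole OSD ids rather than single characters. This is only a sketch: `relabel` and `count_moved` are hypothetical helpers, and it assumes each mapping line ends in a bracketed, comma-separated list of OSD ids such as `[0,2]`.

```python
import re


def relabel(line, old, new):
    """Rewrite OSD id `old` to `new` inside the bracketed id list only."""
    def fix(match):
        ids = match.group(1).split(",")
        ids = [str(new) if osd == str(old) else osd for osd in ids]
        return "[" + ",".join(ids) + "]"
    # Assumption: the mapping is a bracketed list of digits and commas.
    return re.sub(r"\[([0-9,]*)\]", fix, line)


def count_moved(orig_lines, actual_lines, old, new):
    """Count lines whose actual mapping differs from the relabelled original."""
    expected = [relabel(line, old, new) for line in orig_lines]
    return sum(e != a for e, a in zip(expected, actual_lines))


# Made-up sample lines for illustration, not real crushtool output.
orig = ["CRUSH rule 0 x 0 [0,2]", "CRUSH rule 0 x 1 [3,1]"]
actual = ["CRUSH rule 0 x 0 [4,2]", "CRUSH rule 0 x 1 [2,1]"]
print(count_moved(orig, actual, old=0, new=4))
```

To apply it to the files above, one could read `orig.mappings` and `actual.mappings` with `open(...).read().splitlines()` and pass the two lists to `count_moved`.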


Files

straw1.before.txt (328 KB), Dan van der Ster, 03/20/2015 11:04 AM
straw1.after.txt (328 KB), Dan van der Ster, 03/20/2015 11:04 AM