Bug #51953

imbalanced data distribution for OSDs with a custom device class

Added by xiaoyao ren over 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

(1) The cluster's per-OSD disk usage is as follows:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
2 hdd 7.27689 1.00000 7.28TiB 145GiB 7.14TiB 1.94 0.84 85
3 hdd 7.27689 1.00000 7.28TiB 182GiB 7.10TiB 2.44 1.05 94
4 hdd 7.27689 1.00000 7.28TiB 165GiB 7.12TiB 2.22 0.96 98
5 hdd 7.27689 1.00000 7.28TiB 138GiB 7.14TiB 1.85 0.80 88
6 hdd 7.27689 1.00000 7.28TiB 122GiB 7.16TiB 1.64 0.71 90
7 hdd 7.27689 1.00000 7.28TiB 169GiB 7.11TiB 2.26 0.98 96
8 hdd 7.27689 1.00000 7.28TiB 161GiB 7.12TiB 2.16 0.93 91
9 hdd 7.27689 1.00000 7.28TiB 115GiB 7.16TiB 1.54 0.67 89
18 rgw-index-ssd 0.72769 1.00000 745GiB 531GiB 215GiB 71.20 30.74 46
0 ssd 1.74660 1.00000 1.75TiB 1.79GiB 1.74TiB 0.10 0.04 70
10 hdd 7.27689 1.00000 7.28TiB 138GiB 7.14TiB 1.86 0.80 77
11 hdd 7.27689 1.00000 7.28TiB 165GiB 7.12TiB 2.21 0.95 90
12 hdd 7.27689 1.00000 7.28TiB 101GiB 7.18TiB 1.35 0.58 59
13 hdd 7.27689 1.00000 7.28TiB 150GiB 7.13TiB 2.02 0.87 78
14 hdd 7.27689 1.00000 7.28TiB 166GiB 7.11TiB 2.23 0.96 97
15 hdd 7.27689 1.00000 7.28TiB 184GiB 7.10TiB 2.46 1.06 111
16 hdd 7.27689 1.00000 7.28TiB 131GiB 7.15TiB 1.76 0.76 93
17 hdd 7.27689 1.00000 7.28TiB 115GiB 7.16TiB 1.54 0.67 79
19 rgw-index-ssd 0.72769 1.00000 745GiB 523GiB 223GiB 70.12 30.28 43
1 ssd 1.74660 1.00000 1.75TiB 1.76GiB 1.74TiB 0.10 0.04 65
20 hdd 7.27689 1.00000 7.28TiB 124GiB 7.16TiB 1.67 0.72 85
21 hdd 7.27689 1.00000 7.28TiB 122GiB 7.16TiB 1.64 0.71 82
22 hdd 7.27689 1.00000 7.28TiB 144GiB 7.14TiB 1.93 0.83 90
23 hdd 7.27689 1.00000 7.28TiB 176GiB 7.11TiB 2.36 1.02 96
24 hdd 7.27689 1.00000 7.28TiB 178GiB 7.10TiB 2.38 1.03 96
25 hdd 7.27689 1.00000 7.28TiB 171GiB 7.11TiB 2.29 0.99 94
26 hdd 7.27689 1.00000 7.28TiB 157GiB 7.12TiB 2.10 0.91 100
27 hdd 7.27689 1.00000 7.28TiB 160GiB 7.12TiB 2.15 0.93 93
29 rgw-index-ssd 0.72769 1.00000 745GiB 1.03GiB 744GiB 0.14 0.06 41
28 ssd 1.74660 1.00000 1.75TiB 2.20GiB 1.74TiB 0.12 0.05 67
30 hdd 7.27689 1.00000 7.28TiB 114GiB 7.17TiB 1.53 0.66 78
31 hdd 7.27689 1.00000 7.28TiB 186GiB 7.10TiB 2.49 1.08 97
32 hdd 7.27689 1.00000 7.28TiB 143GiB 7.14TiB 1.92 0.83 77
33 hdd 7.27689 1.00000 7.28TiB 169GiB 7.11TiB 2.26 0.98 95
34 hdd 7.27689 1.00000 7.28TiB 153GiB 7.13TiB 2.05 0.89 84
35 hdd 7.27689 1.00000 7.28TiB 101GiB 7.18TiB 1.36 0.59 77
36 hdd 7.27689 1.00000 7.28TiB 111GiB 7.17TiB 1.49 0.65 79
37 hdd 7.27689 1.00000 7.28TiB 106GiB 7.17TiB 1.43 0.62 74
38 hdd 7.27689 1.00000 7.28TiB 151GiB 7.13TiB 2.03 0.88 88
TOTAL 248TiB 5.73TiB 242TiB 2.32
MIN/MAX VAR: 0.04/30.74 STDDEV: 15.50

Here, rgw-index-ssd is a custom device class used for the RGW bucket index via the rgw-index-rule CRUSH rule in the map below (a quick way to verify what the class-restricted rules actually select is sketched after the map):
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class rgw-index-ssd
device 19 osd.19 class rgw-index-ssd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
device 25 osd.25 class hdd
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class ssd
device 29 osd.29 class rgw-index-ssd
device 30 osd.30 class hdd
device 31 osd.31 class hdd
device 32 osd.32 class hdd
device 33 osd.33 class hdd
device 34 osd.34 class hdd
device 35 osd.35 class hdd
device 36 osd.36 class hdd
device 37 osd.37 class hdd
device 38 osd.38 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host ceph-dl-service-01 {
id -3 # do not change unnecessarily
id -4 class ssd # do not change unnecessarily
id -7 class hdd # do not change unnecessarily
id -10 class rgw-index-ssd # do not change unnecessarily
# weight 60.689
alg straw2
hash 0 # rjenkins1
item osd.0 weight 1.747
item osd.2 weight 7.277
item osd.3 weight 7.277
item osd.4 weight 7.277
item osd.5 weight 7.277
item osd.6 weight 7.277
item osd.7 weight 7.277
item osd.8 weight 7.277
item osd.9 weight 7.277
item osd.18 weight 0.728
}
host ceph-dl-service-02 {
id -5 # do not change unnecessarily
id -6 class ssd # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
id -11 class rgw-index-ssd # do not change unnecessarily
# weight 60.689
alg straw2
hash 0 # rjenkins1
item osd.1 weight 1.747
item osd.10 weight 7.277
item osd.11 weight 7.277
item osd.12 weight 7.277
item osd.13 weight 7.277
item osd.14 weight 7.277
item osd.15 weight 7.277
item osd.16 weight 7.277
item osd.17 weight 7.277
item osd.19 weight 0.728
}
host ceph-dl-service-03 {
id -13 # do not change unnecessarily
id -14 class ssd # do not change unnecessarily
id -15 class hdd # do not change unnecessarily
id -16 class rgw-index-ssd # do not change unnecessarily
# weight 60.689
alg straw2
hash 0 # rjenkins1
item osd.20 weight 7.277
item osd.21 weight 7.277
item osd.22 weight 7.277
item osd.23 weight 7.277
item osd.24 weight 7.277
item osd.25 weight 7.277
item osd.26 weight 7.277
item osd.27 weight 7.277
item osd.28 weight 1.747
item osd.29 weight 0.728
}
host ceph-dl-service-04 {
id -17 # do not change unnecessarily
id -18 class ssd # do not change unnecessarily
id -19 class hdd # do not change unnecessarily
id -20 class rgw-index-ssd # do not change unnecessarily
# weight 65.492
alg straw2
hash 0 # rjenkins1
item osd.30 weight 7.277
item osd.31 weight 7.277
item osd.32 weight 7.277
item osd.33 weight 7.277
item osd.34 weight 7.277
item osd.35 weight 7.277
item osd.36 weight 7.277
item osd.37 weight 7.277
item osd.38 weight 7.277
}
root default {
id -1 # do not change unnecessarily
id -2 class ssd # do not change unnecessarily
id -9 class hdd # do not change unnecessarily
id -12 class rgw-index-ssd # do not change unnecessarily
# weight 247.560
alg straw2
hash 0 # rjenkins1
item ceph-dl-service-01 weight 60.689
item ceph-dl-service-02 weight 60.689
item ceph-dl-service-03 weight 60.689
item ceph-dl-service-04 weight 65.492
}
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule rgw-index-rule {
id 1
type replicated
min_size 1
max_size 10
step take default class rgw-index-ssd
step chooseleaf firstn 0 type host
step emit
}
rule hdd-rule {
id 2
type replicated
min_size 1
max_size 10
step take default class hdd
step chooseleaf firstn 0 type host
step emit
}
rule ssd-rule {
id 3
type replicated
min_size 1
max_size 10
step take default class ssd
step chooseleaf firstn 0 type host
step emit
}
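
For reference, a quick way to double-check that the class-restricted rules really select only the intended OSDs is to replay them against the compiled map with crushtool (a sketch only; the /tmp paths are arbitrary):

# fetch and decompile the current CRUSH map
ceph osd getcrushmap -o /tmp/crushmap
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt

# simulate placements for rgw-index-rule (rule id 1) with 3 replicas;
# every mapping should contain only osd.18, osd.19 and osd.29
crushtool -i /tmp/crushmap --test --rule 1 --num-rep 3 --show-mappings | head

# the per-class shadow hierarchy can also be inspected directly
ceph osd crush tree --show-shadow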
(2) Versions:
Ceph: ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
OS: Linux *** 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3 (2019-09-02) x86_64 GNU/Linux

From analysis, the PGs that include osd.18, osd.19 and osd.29 are mapped only to those three OSDs, yet they are apparently receiving data that belongs to other data pools such as CephFS or RBD, even though the pool that actually uses them (.rgw.index) is still not in use and holds almost no data. That is confusing.
So the reported data usage of osd.18, osd.19 and osd.29 appears to be wrong, and the usage of the first two is still increasing slowly over time, but it is not clear why.
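
One way to narrow this down (a sketch only; the awk filter assumes the usual "pool-id.hex" form of PG ids in the first column of ceph pg ls output) is to count the PGs on the suspicious OSDs per pool and compare with the index pool itself:

# count the PGs on osd.18 per pool id (the prefix before the dot)
ceph pg ls-by-osd osd.18 | awk '$1 ~ /^[0-9]+\./ {print $1}' | cut -d. -f1 | sort | uniq -c

# map pool ids back to pool names
ceph osd lspools

# confirm that the index pool itself is still nearly empty
ceph df detail
ceph pg ls-by-pool .rgw.index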
