Bug #2214

crush: pgs only mapped to 2 devices with replication level 3

Added by Josh Durgin about 12 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is from #2173. Note that all 3 osds are up.

./osdmaptool --print osdmap
./osdmaptool: osdmap file 'osdmap'
epoch 3212
fsid a743a194-fa91-48fb-8778-e294483273d9
created 2012-03-01 02:06:10.677024
modified 2012-03-15 17:31:02.260488
flags 

pool 0 'data' rep size 3 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 3172 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 3 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 3162 owner 0
pool 2 'rbd' rep size 3 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 3160 owner 0

max_osd 3
osd.0 up   in  weight 1 up_from 3037 up_thru 3211 down_at 3035 last_clean_interval [2865,3034) 192.168.10.205:6800/6301 192.168.10.205:6801/6301 192.168.10.205:6802/6301 exists,up
osd.1 up   in  weight 1 up_from 3055 up_thru 3211 down_at 3054 last_clean_interval [3013,3053) lost_at 358 192.168.10.201:6800/20518 192.168.10.201:6801/20518 192.168.10.201:6802/20518 exists,up
osd.2 up   in  weight 1 up_from 3211 up_thru 3211 down_at 3209 last_clean_interval [3207,3208) 192.168.10.201:6803/26378 192.168.10.201:6806/26378 192.168.10.201:6807/26378 exists,up

pg_temp 0.7 [0,2,1]
pg_temp 1.6 [0,2,1]
pg_temp 1.1c [0,2,1]

$ ./osdmaptool --test-map-pg 0.6 osdmap
./osdmaptool: osdmap file 'osdmap'
 parsed '0.6' -> 0.6
0.6 raw [2,1] up [2,1] acting [2,1]

$ ./osdmaptool --test-map-pg 2.4 osdmap
./osdmaptool: osdmap file 'osdmap'
 parsed '2.4' -> 2.4
2.4 raw [2,1] up [2,1] acting [2,1]

$ ./osdmaptool --test-map-pg 2.6 osdmap
./osdmaptool: osdmap file 'osdmap'
 parsed '2.6' -> 2.6
2.6 raw [0,2,1] up [0,2,1] acting [0,2,1]
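
For reference, the crushmap below can be extracted from the osdmap and decompiled with the standard tools; a minimal sketch, with 'crushmap' and 'crushmap.txt' as placeholder file names:

$ ./osdmaptool osdmap --export-crush crushmap
$ ./crushtool -d crushmap -o crushmap.txt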

The crushmap in the osdmap is:

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2

# types
type 0 osd
type 1 host
type 2 rack
type 3 pool

# buckets
host server01 {
    id -4        # do not change unnecessarily
    # weight 2.000
    alg straw
    hash 0    # rjenkins1
    item osd.1 weight 1.000
    item osd.2 weight 1.000
}
host server02 {
    id -2        # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0    # rjenkins1
    item osd.0 weight 1.000
}
rack unknownrack {
    id -3        # do not change unnecessarily
    # weight 2.000
    alg straw
    hash 0    # rjenkins1
    item server01 weight 1.000
    item server02 weight 1.000
}
pool default {
    id -1        # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0    # rjenkins1
    item unknownrack weight 1.000
}

# rules
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}
rule metadata {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}
rule rbd {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}
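
All three rules are identical apart from the ruleset number, so the shortfall should show up for any of them when the rule is exercised over many inputs. A hedged sketch with crushtool, using the exported binary crushmap from above (--show-bad-mappings is only present in newer crushtool builds):

$ ./crushtool -i crushmap --test --rule 0 --num-rep 3 --min-x 0 --max-x 1023 --show-bad-mappings

Each line it prints would be an input whose mapping came back with fewer than 3 OSDs.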

Is this due to the local retry behavior?
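
If so, later crushtool releases expose the retry limits as CRUSH tunables, so the same map could be retested with local retries disabled and a larger total-retry budget; a sketch assuming one of those newer builds (the --set-choose-* flags are not in the version this report was filed against):

$ ./crushtool -i crushmap --test --rule 0 --num-rep 3 \
      --set-choose-local-tries 0 --set-choose-total-tries 50 \
      --show-bad-mappings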


Files

osdmap (2.28 KB), Josh Durgin, 03/26/2012 09:58 AM

Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #2047: crush: with a rack->host->device hierarchy, several down devices are likely to cause bad mappings (Resolved, 02/08/2012)

#1 Updated by Sage Weil almost 12 years ago

  • Status changed from New to Resolved
#2 Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (10)