Bug #1017

ceph 0.26, mkcephfs --crushmap crush.new: waits for a very long time, mds state is still "creating"

Added by changping Wu almost 13 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,
ceph 0.26 + btrfs + Ubuntu 10.04 x86_64.
We want to use a custom crushmap when running mkcephfs, but even after waiting a very long time the mds is still in the "creating" state.

Steps to reproduce:
1. $mkcephfs -c /etc/ceph/ceph.conf -a -v -k adminkeyring --crushmap crush.new
2. $init-ceph -c /etc/ceph/ceph.conf -a -v start

root@ubuntu-mon0:/etc/ceph# ceph -s
2011-04-19 13:39:43.442162 pg v10: 1584 pgs: 1584 creating; 0 KB data, 448 KB used, 1200 GB / 1200 GB avail
2011-04-19 13:39:43.443851 mds e4: 1/1/1 up {0=up:creating}, 1 up:standby
2011-04-19 13:39:43.443899 osd e6: 6 osds: 6 up, 6 in
2011-04-19 13:39:43.443962 log 2011-04-19 13:38:42.159058 mon0 172.16.35.10:6789/0 10 : [INF] osd5 172.16.35.77:6803/6383 boot
2011-04-19 13:39:43.444026 mon e1: 3 mons at {0=172.16.35.10:6789/0,1=172.16.35.10:6790/0,2=172.16.35.10:6791/0}
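
A quick way to see what the attached crush.new actually defines is to decompile it and read the rules (a sketch using the same crushtool -d step that appears in update #1 below; the output file name is arbitrary):

    $ crushtool -d crush.new -o crush.new.txt
    $ cat crush.new.txt

In particular, each rule should end with a step that reaches devices (for example chooseleaf ... type host) before step emit, otherwise PGs cannot be mapped to OSDs.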

ceph.conf (3.61 KB) changping Wu, 04/19/2011 12:18 AM

crush.new.txt (1.42 KB) changping Wu, 04/19/2011 12:18 AM

crush.origin.txt (982 Bytes) changping Wu, 04/19/2011 12:18 AM

crush.new (729 Bytes) changping Wu, 04/19/2011 12:18 AM

History

#1 Updated by changping Wu almost 13 years ago

Hi,
with ceph 0.27 and the steps below, this issue no longer occurs.

1. # crushtool --num_osds 6 -o file --build host straw 2 root straw 0
2011-04-27 10:12:16.855925 7f62ff3df720 layer 1 host bucket type straw 2
2011-04-27 10:12:16.855983 7f62ff3df720 lower_items [0,1,2,3,4,5]
2011-04-27 10:12:16.855989 7f62ff3df720 lower_weights [65536,65536,65536,65536,65536,65536]
2011-04-27 10:12:16.855993 7f62ff3df720 item 0 weight 65536
2011-04-27 10:12:16.855997 7f62ff3df720 item 1 weight 65536
2011-04-27 10:12:16.856008 7f62ff3df720 in bucket -1 'host0' size 2 weight 131072
2011-04-27 10:12:16.856013 7f62ff3df720 item 2 weight 65536
2011-04-27 10:12:16.856017 7f62ff3df720 item 3 weight 65536
2011-04-27 10:12:16.856023 7f62ff3df720 in bucket -2 'host1' size 2 weight 131072
2011-04-27 10:12:16.856028 7f62ff3df720 item 4 weight 65536
2011-04-27 10:12:16.856032 7f62ff3df720 item 5 weight 65536
2011-04-27 10:12:16.856038 7f62ff3df720 in bucket -3 'host2' size 2 weight 131072
2011-04-27 10:12:16.856043 7f62ff3df720 layer 2 root bucket type straw 0
2011-04-27 10:12:16.856048 7f62ff3df720 lower_items [-1,-2,-3]
2011-04-27 10:12:16.856053 7f62ff3df720 lower_weights [131072,131072,131072]
2011-04-27 10:12:16.856057 7f62ff3df720 item -1 weight 131072
2011-04-27 10:12:16.856061 7f62ff3df720 item -2 weight 131072
2011-04-27 10:12:16.856065 7f62ff3df720 item -3 weight 131072
2011-04-27 10:12:16.856072 7f62ff3df720 in bucket -4 'root' size 3 weight 393216
2011-04-27 10:12:16.856080 7f62ff3df720 crush max_devices 6
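
Read loosely from the output above (a hedged gloss, not taken from crushtool documentation), the --build arguments describe two layers: straw buckets of type host holding 2 devices each, then a single straw bucket of type root holding all of the hosts (a size of 0 meaning "all remaining items"):

    # layer 1: "host straw 2" -> one straw host bucket per 2 devices (host0..host2)
    # layer 2: "root straw 0" -> one straw root bucket containing all host buckets
    $ crushtool --num_osds 6 -o file --build host straw 2 root straw 0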

2. root@ubuntu-mon0:/etc/ceph/crushmap# crushtool -d file -o file.txt
file.txt:

# begin crush map

# devices
device 0 device0
device 1 device1
device 2 device2
device 3 device3
device 4 device4
device 5 device5

# types
type 0 device
type 1 host
type 2 root

# buckets
host host0 {
    id -1        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device0 weight 1.000
    item device1 weight 1.000
}
host host1 {
    id -2        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device2 weight 1.000
    item device3 weight 1.000
}
host host2 {
    id -3        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device4 weight 1.000
    item device5 weight 1.000
}
root root {
    id -4        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item host0 weight 2.000
    item host1 weight 2.000
    item host2 weight 2.000
}

# rules
rule data {
    ruleset 1
    type replicated
    min_size 2
    max_size 2
    step take root
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map

3. Modify file.txt, adding rules for the metadata, casdata, and rbd pools (the generated map only contained the data rule):
root@ubuntu-mon0:/etc/ceph/crushmap# vim file.txt

# begin crush map

# devices
device 0 device0
device 1 device1
device 2 device2
device 3 device3
device 4 device4
device 5 device5

# types
type 0 device
type 1 host
type 2 root

# buckets
host host0 {
    id -1        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device0 weight 1.000
    item device1 weight 1.000
}
host host1 {
    id -2        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device2 weight 1.000
    item device3 weight 1.000
}
host host2 {
    id -3        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item device4 weight 1.000
    item device5 weight 1.000
}
root root {
    id -4        # do not change unnecessarily
    alg straw
    hash 0       # rjenkins1
    item host0 weight 2.000
    item host1 weight 2.000
    item host2 weight 2.000
}

# rules
rule data {
    ruleset 1
    type replicated
    min_size 2
    max_size 2
    step take root
    step chooseleaf firstn 0 type host
    step emit
}
rule metadata {
    ruleset 2
    type replicated
    min_size 2
    max_size 2
    step take root
    step chooseleaf firstn 0 type host
    step emit
}
rule casdata {
    ruleset 3
    type replicated
    min_size 2
    max_size 2
    step take root
    step chooseleaf firstn 0 type host
    step emit
}
rule rbd {
    ruleset 3
    type replicated
    min_size 2
    max_size 2
    step take root
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
4. # crushtool -c file.txt -o file.new
5. # mkcephfs -c ceph.conf -a -v --crushmap ./file.new -k /etc/ceph/adminkeyring
6. # init-ceph -c ceph.conf -a -v start
7. root@ubuntu-mon0:/etc/ceph/crushmap# ceph -w
    2011-04-27 10:40:48.932643 pg v20: 1584 pgs: 396 creating, 19 active, 1169 active+clean; 24 KB data, 20264 KB used, 600 GB / 600 GB avail; 33/42 degraded (78.571%)
    2011-04-27 10:40:48.934483 mds e5: 1/1/1 up {0=up:active}, 1 up:standby
    2011-04-27 10:40:48.934566 osd e10: 6 osds: 6 up, 6 in
    2011-04-27 10:40:48.934611 log 2011-04-27 19:40:43.689547 osd4 172.16.35.77:6800/2743 1 : [INF] 1.2 scrub ok
    2011-04-27 10:40:48.934647 mon e1: 3 mons at {0=172.16.35.10:6789/0,1=172.16.35.10:6790/0,2=172.16.35.10:6791/0}
    2011-04-27 10:40:49.238229 log 2011-04-27 19:37:22.509102 osd2 172.16.35.76:6800/6630 3 : [INF] 1.2f scrub ok
    2011-04-27 10:40:49.499371 pg v21: 1584 pgs: 396 creating, 19 active, 1169 active+clean; 24 KB data, 20264 KB used, 600 GB / 600 GB avail; 33/42 degraded (78.571%)
    2011-04-27 10:40:50.588586 pg v22: 1584 pgs: 396 creating, 19 active, 1169 active+clean; 24 KB data, 20360 KB used, 600 GB / 600 GB avail; 33/42 degraded (78.571%)
    2011-04-27 10:40:50.855016 log 2011-04-27 19:37:24.624734 osd3 172.16.35.76:6803/7011 3 : [INF] 1.16 scrub ok
    2011-04-27 10:40:51.641037 pg v23: 1584 pgs: 396 creating, 19 active, 1169 active+clean; 24 KB data, 20364 KB used, 600 GB / 600 GB avail; 33/42 degraded (78.571%)
    2011-04-27 10:40:56.530292 log 2011-04-27 19:37:30.509995 osd2 172.16.35.76:6800/6630 4 : [INF] 1.30 scrub ok
    2011-04-27 10:40:57.613678 log 2011-04-27 19:37:31.520387 osd2 172.16.35.76:6800/6630 5 : [INF] 1.3a scrub ok
    2011-04-27 10:40:57.613678 log 2011-04-27 19:40:52.628756 osd4 172.16.35.77:6800/2743 2 : [INF] 1.e scrub ok
    2011-04-27 10:40:58.905384 log 2011-04-27 19:40:54.697618 osd4 172.16.35.77:6800/2743 3 : [INF] 1.1b scrub ok
    2011-04-27 10:40:59.508217 pg v24: 1584 pgs: 396 creating, 14 active, 1174 active+clean; 24 KB data, 20492 KB used, 600 GB / 600 GB avail; 22/42 degraded (52.381%)
    2011-04-27 10:41:00.589165 pg v25: 1584 pgs: 396 creating, 6 active, 1182 active+clean; 24 KB data, 20400 KB used, 600 GB / 600 GB avail; 12/42 degraded (28.571%)
    2011-04-27 10:41:01.705612 log 2011-04-27 19:37:35.626472 osd3 172.16.35.76:6803/7011 4 : [INF] 1.14 scrub ok
    2011-04-27 10:41:01.855947 pg v26: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 19752 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:02.780621 log 2011-04-27 19:37:36.625676 osd3 172.16.35.76:6803/7011 5 : [INF] 1.18 scrub ok
    2011-04-27 10:41:06.600237 pg v27: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 19744 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:07.080943 log 2011-04-27 19:41:02.762970 osd5 172.16.35.77:6803/2833 2 : [INF] 1.12 scrub ok
    2011-04-27 10:41:10.022937 pg v28: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20024 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:13.914529 log 2011-04-27 19:41:09.805932 osd4 172.16.35.77:6800/2743 4 : [INF] 1.2c scrub ok
    2011-04-27 10:41:14.873189 pg v29: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20144 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:15.181235 log 2011-04-27 19:41:10.763399 osd5 172.16.35.77:6803/2833 3 : [INF] 1.15 scrub ok
    2011-04-27 10:41:15.914893 pg v30: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20160 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:16.281337 log 2011-04-27 19:41:11.695613 osd4 172.16.35.77:6800/2743 5 : [INF] 1.33 scrub ok
    2011-04-27 10:41:17.348134 log 2011-04-27 19:41:12.764220 osd5 172.16.35.77:6803/2833 4 : [INF] 1.1c scrub ok
    2011-04-27 10:41:18.389738 log 2011-04-27 19:41:13.610640 osd4 172.16.35.77:6800/2743 6 : [INF] 1.35 scrub ok
    2011-04-27 10:41:18.389738 log 2011-04-27 19:41:13.855729 osd5 172.16.35.77:6803/2833 5 : [INF] 1.1e scrub ok
    2011-04-27 10:41:19.873573 pg v31: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20248 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:20.923537 pg v32: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20368 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:21.289926 log 2011-04-27 19:41:16.886773 osd5 172.16.35.77:6803/2833 6 : [INF] 1.27 scrub ok
    2011-04-27 10:41:22.406625 log 2011-04-27 19:41:17.764236 osd5 172.16.35.77:6803/2833 7 : [INF] 1.37 scrub ok
    2011-04-27 10:41:23.381661 log 2011-04-27 19:41:18.672749 osd4 172.16.35.77:6800/2743 7 : [INF] 1.36 scrub ok
    2011-04-27 10:41:24.556802 log 2011-04-27 19:37:57.512293 osd2 172.16.35.76:6800/6630 6 : [INF] 1.1d scrub ok
    2011-04-27 10:41:24.556802 log 2011-04-27 19:37:57.700372 osd3 172.16.35.76:6803/7011 6 : [INF] 1.1a scrub ok
    2011-04-27 10:41:24.848691 pg v33: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 19808 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:25.590208 log 2011-04-27 19:37:58.627204 osd3 172.16.35.76:6803/7011 7 : [INF] 1.23 scrub ok
    2011-04-27 10:41:25.590208 log 2011-04-27 19:37:59.512084 osd2 172.16.35.76:6800/6630 7 : [INF] 1.3e scrub ok
    2011-04-27 10:41:25.590208 log 2011-04-27 19:41:20.797468 osd5 172.16.35.77:6803/2833 8 : [INF] 1.4c scrub ok
    2011-04-27 10:41:25.932134 pg v34: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 19968 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:26.698581 log 2011-04-27 19:37:59.627149 osd3 172.16.35.76:6803/7011 8 : [INF] 1.25 scrub ok
    2011-04-27 10:41:26.698581 log 2011-04-27 19:38:00.628152 osd3 172.16.35.76:6803/7011 9 : [INF] 1.26 scrub ok
    2011-04-27 10:41:26.698581 log 2011-04-27 19:41:21.856700 osd4 172.16.35.77:6800/2743 8 : [INF] 1.3d scrub ok
    2011-04-27 10:41:26.698581 log 2011-04-27 19:41:21.911784 osd5 172.16.35.77:6803/2833 9 : [INF] 1.51 scrub ok
    2011-04-27 10:41:26.982145 pg v35: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20012 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:28.948628 log 2011-04-27 19:41:24.611477 osd4 172.16.35.77:6800/2743 9 : [INF] 1.40 scrub ok
    2011-04-27 10:41:29.515670 pg v36: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 20028 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:30.023697 log 2011-04-27 19:38:03.610235 osd2 172.16.35.76:6800/6630 8 : [INF] 1.42 scrub ok
    2011-04-27 10:41:30.565685 pg v37: 1584 pgs: 396 creating, 1188 active+clean; 24 KB data, 18556 KB used, 600 GB / 600 GB avail
    2011-04-27 10:41:32.115523 log 2011-04-27 19:41:27.914125 osd5 172.16.35.77:6803/2833 10 : [INF] 1.53 scrub ok

#2 Updated by Sage Weil almost 13 years ago

  • Status changed from New to Closed

Looks like you need 'chooseleaf' instead of 'choose' in the crush rules.
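
For illustration, a sketch of the difference (the failing rule shown here is an assumption, since the attached crush.new is not reproduced inline): a step that stops at choose ... type host selects the host buckets themselves rather than devices under them, so the PGs never map to OSDs and stay in the creating state, whereas chooseleaf ... type host picks hosts and then descends to a leaf device under each one.

    # assumed failing form: selects host buckets, never reaches devices
    step take root
    step choose firstn 0 type host
    step emit

    # working form, matching the rules in the rebuilt map above
    step take root
    step chooseleaf firstn 0 type host
    step emit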

#3 Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (10)
