Bug #2210

closed

osd: some PGs remain remapped or degraded

Added by soft crack about 12 years ago. Updated about 12 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Some PGs remain in 'remapped' or 'degraded' status after adding an osd server.

The steps to reproduce the bug:
1. Create a new ceph cluster in one server. The ceph.conf file is attached.

mkdir /tmp/ceph
mkcephfs -c /etc/ceph/ceph.conf --prepare-monmap -d /tmp/ceph
mkfs.btrfs /dev/sdb
sudo mount /dev/sdb /mnt/osd.0
sudo mkcephfs --init-local-daemons osd -d /tmp/ceph/
sudo mkcephfs --init-local-daemons mds -d /tmp/ceph/
mkcephfs --prepare-mon -d /tmp/ceph
sudo mkcephfs --init-local-daemons mon -d /tmp/ceph

sudo service ceph start
sudo cp /tmp/ceph/keyring.admin /etc/ceph/keyring.client.admin
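
For reference (the attached ceph.conf is not reproduced here), a single-host configuration of the kind these steps assume looks roughly like the following; host names, the mon data path, and other details are illustrative only and the actual attached file may differ:

[global]
        auth supported = cephx
        keyring = /etc/ceph/keyring.$name

[mon]
        mon data = /srv/mon.$id

[mon.0]
        host = server01
        mon addr = 192.168.12.201:6789

[mds.0]
        host = server01

[osd]
        osd data = /mnt/osd.$id

[osd.0]
        host = server01
        btrfs devs = /dev/sdb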

As we have only one osd server and the replication level is set to 2 by default, all PGs are degraded. Now 'sudo ceph -s' outputs:

2012-03-26 10:43:24.101281    pg v6: 198 pgs: 198 active+degraded; 8730 bytes data, 1268 KB used, 67930 MB / 70006 MB avail
2012-03-26 10:43:24.102380   mds e4: 1/1/1 up {0=0=up:active}
2012-03-26 10:43:24.102462   osd e3: 1 osds: 1 up, 1 in
2012-03-26 10:43:24.102616   log 2012-03-26 10:41:14.676241 mon.0 192.168.12.201:6789/0 5 : [INF] mds.0 192.168.12.201:6800/18839 up:active
2012-03-26 10:43:24.102731   mon e1: 1 mons at {0=192.168.12.201:6789/0}
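
As noted above, the degraded state is expected with a single osd and a pool replication size of 2; this can be confirmed with something like the following (a rough sketch, output formats vary between releases):

ceph osd dump | grep 'rep size'      # shows 'rep size 2' for each pool
ceph pg dump | grep degraded         # lists every PG that is missing replicas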

2. Add a new osd server on another computer to the cluster

a. copy ceph.conf to the new osd server and add a section:

[osd.1]
        host = server03
        btrfs devs = /dev/sdb2
        osd journal = /dev/sda3

b. get the monmap on the mon server with 'sudo ceph mon getmap -o /tmp/monmap' and copy it to '/tmp/monmap' on the new osd server
c. create the new osd on the new server

sudo mkfs.btrfs /dev/sdb2
sudo mount /dev/sdb2 /mnt/osd.1
sudo ceph-osd -i 1 --mkfs --monmap /tmp/monmap --mkkey

d. copy /etc/ceph/keyring.osd.1 to /tmp on the mon host, then run on the mon host:

sudo ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i /tmp/keyring.osd.1
sudo ceph osd setmaxosd 2
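
Before starting the daemon, the registered key and caps can be double-checked on the mon host (an optional aside, assuming this release provides 'ceph auth list'):

sudo ceph auth list          # osd.1 should appear with the osd/mon caps added above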

e. start the osd daemon on the osd host

Now, 'ceph -s' outputs:

2012-03-26 11:23:20.080953    pg v30: 198 pgs: 198 active+degraded; 8730 bytes data, 1884 KB used, 1923 GB / 1927 GB avail
2012-03-26 11:23:20.082102   mds e9: 1/1/1 up {0=0=up:active}
2012-03-26 11:23:20.082184   osd e14: 2 osds: 2 up, 2 in
2012-03-26 11:23:20.082339   log 2012-03-26 11:21:54.361487 mon.0 192.168.12.201:6789/0 7 : [INF] osd.1 192.168.12.203:6800/9713 boot
2012-03-26 11:23:20.082484   mon e1: 1 mons at {0=192.168.12.201:6789/0}

3. Include the new osd in data placement
sudo ceph osd getcrushmap -o /tmp/crush
crushtool -d /tmp/crush -o /tmp/crush.txt
# edit crush.txt (the edited file is attached to this report)
vi /tmp/crush.txt
crushtool -c /tmp/crush.txt -o /tmp/crush.new
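
The edit itself is in the attached crush.txt; roughly speaking, an edit of this kind adds the new osd under a bucket for server03 so the placement rules can choose replicas across both hosts. A purely illustrative sketch of such a bucket (ids and weights are made up, the attached file may differ):

host server03 {
        id -4                    # illustrative bucket id
        alg straw
        hash 0                   # rjenkins1
        item osd.1 weight 1.000
}

The listed commands stop at compiling the map; presumably the new map was then injected back into the cluster, e.g.:

sudo ceph osd setcrushmap -i /tmp/crush.new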

4. Watch cluster activity with 'ceph -w'. Finally, 'ceph -s' outputs:
2012-03-26 11:38:25.954504    pg v121: 198 pgs: 174 active+clean, 9 active+remapped, 15 active+degraded; 8730 bytes data, 2784 KB used, 1923 GB / 1927 GB avail
2012-03-26 11:38:25.956305   mds e9: 1/1/1 up {0=0=up:active}
2012-03-26 11:38:25.956522   osd e18: 2 osds: 2 up, 2 in
2012-03-26 11:38:25.956814   log 2012-03-26 11:35:23.955189 osd.1 192.168.12.203:6800/9713 97 : [INF] 2.3d scrub ok
2012-03-26 11:38:25.957011   mon e1: 1 mons at {0=192.168.12.201:6789/0}

'ceph osd dump' outputs:

2012-03-26 11:41:00.413699 mon <- [osd,dump]
2012-03-26 11:41:00.414630 mon.0 -> 'dumped osdmap epoch 18' (0)
epoch 18
fsid 3a82be90-56e1-4f57-ae53-94c46ef325aa
created 2012-03-26 10:41:11.824618
modifed 2012-03-26 11:30:45.981408
flags 

pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0

max_osd 2
osd.0 up   in  weight 1 up_from 11 up_thru 15 down_at 10 last_clean_interval [2,9) 192.168.12.201:6801/20205 192.168.12.201:6802/20205 192.168.12.201:6803/20205 exists,up
osd.1 up   in  weight 1 up_from 14 up_thru 17 down_at 8 last_clean_interval [5,7) 192.168.12.203:6800/9713 192.168.12.203:6801/9713 192.168.12.203:6802/9713 exists,up

pg_temp 0.e [1,0]
pg_temp 0.14 [1,0]
pg_temp 0.21 [1,0]
pg_temp 1.d [1,0]
pg_temp 1.13 [1,0]
pg_temp 1.20 [1,0]
pg_temp 2.c [1,0]
pg_temp 2.12 [1,0]
pg_temp 2.1f [1,0]
blacklist 192.168.12.201:6800/18839 expires 2012-03-26 11:44:27.224895
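
To pin down which PGs stay remapped or degraded, and which osds they currently map to, the PG stats can be filtered (a hedged aside; the exact columns depend on the release):

ceph pg dump | grep -E 'remapped|degraded'    # pgid, state and the up/acting osd sets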

Files

ceph.conf (368 Bytes) - soft crack, 03/25/2012 08:39 PM
crush.txt (1.07 KB) - soft crack, 03/25/2012 08:39 PM

Related issues: 1 (0 open, 1 closed)

Is duplicate of RADOS - Bug #2047: crush: with a rack->host->device hierarchy, several down devices are likely to cause bad mappings (Resolved, 02/08/2012)

Actions #1

Updated by Josh Durgin about 12 years ago

  • Category set to OSD
  • Source changed from Development to Community (user)

#2173 has some osd logs and related info for the same problem on a less clean cluster. Thanks for the detailed steps to reproduce!

Actions #2

Updated by Sage Weil about 12 years ago

  • Status changed from New to Duplicate

this is actually a crush problem, see #2047.
