Bug #10375

closed

OSD journal partition link can be broken after offlining the disk via storcli, and the link has to be created manually before the OSD can be started.

Added by Tupper Cole over 9 years ago. Updated over 9 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

1. offline the disk via storcli:

/opt/MegaRAID/storcli/storcli64 /c0 /e20 /s6 set offline

2. verify OSD goes down

3. online the disk again

/opt/MegaRAID/storcli/storcli64 /c0 /e20 /s6 set online

4. start the osd

/etc/init.d/ceph start osd.<id>

5. check /var/log/messages and verify that the osd cannot be started due to a mount failure:

Oct 21 21:55:24 svl-csl-b-ceph2-003 bash: 2014-10-21 21:55:24.030076 7f5dc44a87c0 -1 filestore(/var/lib/ceph/osd/ceph-129) mount failed to open journal /var/lib/ceph/osd/ceph-129/journal: (2) No such file or directory

6. check the soft link of the journal partition and confirm it is broken:

ls -l /var/lib/ceph/osd/ceph-129/journal

find out the disk for the osd:

lsblk | grep 129

look for the journal partition (e.g. sdi2) among the partuuid links and verify it is MISSING:

cd /dev/disk/by-partuuid

ls -l | grep <journal partition, e.g. sdi2>

7. create the partition link (from within /dev/disk/by-partuuid), e.g.:

ln -s ../../sdi2 84d73a48-5a26-4f95-80a2-09b5b0871ebf

8. restart OSD and verify it is running:

/etc/init.d/ceph start osd.129

tail -n50 /var/log/messages:

Oct 21 23:58:07 svl-csl-b-ceph2-003 systemd: Starting /usr/bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 129 --pid-file /var/run/ceph/osd.129.pid -c /etc/ceph/ceph.conf --cluster ceph -f...

Oct 21 23:58:07 svl-csl-b-ceph2-003 systemd: Started /usr/bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 129 --pid-file /var/run/ceph/osd.129.pid -c /etc/ceph/ceph.conf --cluster ceph -f.

Oct 21 23:58:07 svl-csl-b-ceph2-003 bash: starting osd.129 at :/0 osd_data /var/lib/ceph/osd/ceph-129 /var/lib/ceph/osd/ceph-129/journal
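
Steps 6 and 7 above can be sketched as two small shell helpers. This is only an illustration of the manual check and workaround described in this report: the function names check_journal_link and relink_partuuid and their argument order are hypothetical, not part of Ceph or storcli, and the sketch assumes the journal is a symlink into /dev/disk/by-partuuid as on the host above.

```shell
#!/bin/sh
# Sketch only: helper names and argument order are illustrative, not part of Ceph.

# Report whether <osd_dir>/journal is a symlink whose target exists.
# Returns 0 if it resolves, 1 if it dangles, 2 if it is not a symlink.
check_journal_link() {
    journal="$1/journal"
    if [ ! -L "$journal" ]; then
        echo "not-a-symlink $journal"
        return 2
    fi
    target=$(readlink "$journal")   # first hop of the link only
    if [ -e "$journal" ]; then      # -e follows the symlink chain
        echo "ok $journal -> $target"
        return 0
    fi
    echo "broken $journal -> $target"
    return 1
}

# Recreate <partuuid_dir>/<uuid> -> ../../<dev> unless it already resolves.
relink_partuuid() {
    link="$1/$2"
    if [ -e "$link" ]; then
        echo "exists $link"
        return 0
    fi
    # -f replaces a dangling link of the same name, if one is left over
    ln -sf "../../$3" "$link" && echo "created $link -> ../../$3"
}
```

With the values from this report, the calls would be check_journal_link /var/lib/ceph/osd/ceph-129 followed by relink_partuuid /dev/disk/by-partuuid 84d73a48-5a26-4f95-80a2-09b5b0871ebf sdi2 (run as root). The second helper does nothing when the link already resolves, so it is safe to run repeatedly.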

Actions #1

Updated by Sage Weil over 9 years ago

It sounds like udev is failing to create the link. This is a problem with either the megaraid driver or with udev.

Actions #2

Updated by Sage Weil over 9 years ago

  • Status changed from New to Rejected

Tupper, is this reproducible? Which kernel? It doesn't sound like a ceph problem.
