Bug #5445

random osd EPERM on journal

Added by Sage Weil about 10 years ago. Updated about 10 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/teuthology-2013-06-24_18:48:34-rados-cuttlefish-testing-basic/45071

2013-06-24 19:00:31.187572 7f9e55fa8780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2013-06-24 19:00:31.187586 7f9e55fa8780  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 27: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-06-24 19:00:31.187654 7f9e55fa8780  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 27: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-06-24 19:00:42.557275 7f9e55fa8780  1 journal close /var/lib/ceph/osd/ceph-1/journal
2013-06-24 19:00:42.557652 7f9e55fa8780 -1  ** ERROR: osd init failed: (1) Operation not permitted

Associated revisions

Revision 8a17f33b (diff)
Added by Sage Weil about 10 years ago

ceph-disk: do not mount over an osd directly in /var/lib/ceph/osd/$cluster-$id

If we see a 'ready' file in the target OSD dir, do not mount our device
on top of it.

Among other things, this prevents ceph-disk activate on stray disks from
stepping on teuthology osds.

Fixes: #5445
Signed-off-by: Sage Weil <>
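
The guard described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the actual ceph-disk code: the helper names `osd_dir_for`, `is_safe_to_mount`, and `demo` are hypothetical, and only the 'ready' marker file and the /var/lib/ceph/osd/$cluster-$id layout come from the commit message.

```python
import os
import tempfile


def osd_dir_for(cluster, osd_id, base="/var/lib/ceph/osd"):
    """Build the OSD data dir path, e.g. /var/lib/ceph/osd/ceph-1."""
    return os.path.join(base, "{}-{}".format(cluster, osd_id))


def is_safe_to_mount(target_dir):
    """Return False when target_dir already contains a 'ready' file.

    A 'ready' file suggests an OSD keeps its data directly in this
    directory, so mounting a device over it would hide that data.
    """
    return not os.path.exists(os.path.join(target_dir, "ready"))


def demo():
    """Exercise the guard against a throwaway directory tree."""
    base = tempfile.mkdtemp()

    # Simulates an OSD that stores data directly in its directory.
    in_use = osd_dir_for("ceph", 1, base)
    os.makedirs(in_use)
    open(os.path.join(in_use, "ready"), "w").close()

    # Simulates an empty mount point that is fine to mount over.
    empty = osd_dir_for("ceph", 2, base)
    os.makedirs(empty)

    return is_safe_to_mount(in_use), is_safe_to_mount(empty)
```

In this sketch a caller would skip the mount (and presumably log why) whenever `is_safe_to_mount` returns False, which is the behavior the commit describes for stray disks.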

Revision 26e7a6ff (diff)
Added by Sage Weil about 10 years ago

ceph-disk: do not mount over an osd directly in /var/lib/ceph/osd/$cluster-$id

If we see a 'ready' file in the target OSD dir, do not mount our device
on top of it.

Among other things, this prevents ceph-disk activate on stray disks from
stepping on teuthology osds.

Fixes: #5445
Signed-off-by: Sage Weil <>
(cherry picked from commit 8a17f33b14d858235dfeaa42be1f4842dcfd66d2)

History

#1 Updated by Sage Weil about 10 years ago

  • Status changed from New to In Progress

This happens on tasks that don't use all available disks. A previous job run with ceph-deploy leaves OSD disks behind, something (not sure what) triggers an activate, and they get mounted over teuthology's OSD.

#2 Updated by Sage Weil about 10 years ago

Oh, the test in question isn't mounting a drive; it stores the data directly in /var/lib/ceph/osd/ceph-$id. The ceph-disk check should bail out if a whoami (or similar) file is already present.

#3 Updated by Sage Weil about 10 years ago

  • Status changed from In Progress to Fix Under Review

pushed wip-5445.

This normally wouldn't happen, except that teuthology does not define fsid in its ceph.conf, so ceph-disk assumes the cluster name is 'ceph'. Adding fsid there would also block this particular failure.
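
To illustrate why a missing fsid matters, here is a rough sketch of how a disk's fsid might be mapped to a cluster name. It assumes one `<cluster>.conf` file per cluster in a config directory with fsid under `[global]`; the function name `find_cluster_name` is hypothetical, and the fallback rule is modeled only on the behavior described in this comment, so the real ceph-disk logic may differ.

```python
import configparser
import glob
import os


def find_cluster_name(disk_fsid, conf_dir):
    """Map a disk's fsid to a cluster name by scanning <cluster>.conf files.

    Returns the matching cluster name, or None when nothing matches.
    Assumed fallback: if the only conf without an fsid is ceph.conf,
    treat the disk as belonging to the cluster named 'ceph'.
    """
    no_fsid = []
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        cluster = os.path.splitext(os.path.basename(path))[0]
        parser = configparser.ConfigParser(strict=False, interpolation=None)
        parser.read(path)
        fsid = parser.get("global", "fsid", fallback=None)
        if fsid is None:
            no_fsid.append(cluster)
        elif fsid.strip().lower() == disk_fsid.strip().lower():
            return cluster
    if no_fsid == ["ceph"]:
        # No fsid to compare against, so any stray disk "matches".
        return "ceph"
    return None
```

Under these assumptions, a stray disk with any fsid resolves to 'ceph' while ceph.conf lacks an fsid, and stops resolving once ceph.conf declares a different fsid, which is why defining it would also block the failure.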

#4 Updated by Sage Weil about 10 years ago

  • Status changed from Fix Under Review to Pending Backport

#5 Updated by Sage Weil about 10 years ago

  • Status changed from Pending Backport to Resolved

#6 Updated by Sage Weil about 10 years ago

  • Status changed from Resolved to 12

teuthology-2013-07-05_01:00:13-rados-master-testing-basic 55351: and 55360:

#7 Updated by Sage Weil about 10 years ago

  • Priority changed from Urgent to High

#8 Updated by Sage Weil about 10 years ago

  • Status changed from 12 to Can't reproduce
