Bug #5445

closed

random osd EPERM on journal

Added by Sage Weil almost 11 years ago. Updated over 10 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/teuthology-2013-06-24_18:48:34-rados-cuttlefish-testing-basic/45071

2013-06-24 19:00:31.187572 7f9e55fa8780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2013-06-24 19:00:31.187586 7f9e55fa8780  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 27: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-06-24 19:00:31.187654 7f9e55fa8780  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 27: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-06-24 19:00:42.557275 7f9e55fa8780  1 journal close /var/lib/ceph/osd/ceph-1/journal
2013-06-24 19:00:42.557652 7f9e55fa8780 -1  ** ERROR: osd init failed: (1) Operation not permitted

Actions #1

Updated by Sage Weil almost 11 years ago

  • Status changed from New to In Progress

This happens on tasks that don't use all available disks: a previous ceph-deploy job leaves OSD disks behind, something (not sure what) triggers an activate, and they get mounted over teuthology's OSD.

Actions #2

Updated by Sage Weil almost 11 years ago

Oh, the test in question isn't mounting a drive; it stores the data directly in /var/lib/ceph/osd/ceph-$id. The ceph-disk test should bail out if a whoami (or similar) file is already present.
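
A rough sketch of the kind of guard described above, not the actual ceph-disk change. The function names and the marker list are hypothetical; only the whoami file name comes from the comment.

```python
import os

# Marker files assumed to indicate an already-initialized OSD data directory.
# Only "whoami" is taken from the comment above; extend as needed.
MARKER_FILES = ("whoami",)


def is_osd_dir_in_use(path, markers=MARKER_FILES):
    """Return True if `path` already contains any of the marker files."""
    return any(os.path.exists(os.path.join(path, name)) for name in markers)


def check_mount_target(path):
    """Refuse to use `path` as a mount point if it already holds OSD data."""
    if is_osd_dir_in_use(path):
        raise RuntimeError(
            "%s already contains OSD data; refusing to mount over it" % path
        )
    return path
```

Calling check_mount_target("/var/lib/ceph/osd/ceph-1") before mounting would raise if that directory already holds a whoami file, which is the case for a teuthology OSD that stores its data there directly.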

Actions #3

Updated by Sage Weil almost 11 years ago

  • Status changed from In Progress to Fix Under Review

pushed wip-5445.

This normally wouldn't happen, except that teuthology does not define fsid in the ceph.conf, so ceph-disk assumes the cluster name is 'ceph'. Adding fsid there would also block this particular failure.
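
For illustration only, a small pre-flight check for whether a ceph.conf defines fsid. The helper name is hypothetical, and looking only in the [global] section is an assumption; this is not how ceph-disk or teuthology does it.

```python
import configparser


def defines_fsid(conf_text):
    """Return True if the [global] section of the given ceph.conf text sets fsid."""
    # Strip leading whitespace so indented keys are not parsed as
    # continuation lines of the previous value.
    cleaned = "\n".join(line.strip() for line in conf_text.splitlines())
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(cleaned)
    # has_option returns False when the section itself is missing.
    return parser.has_option("global", "fsid")
```

A test harness could call this on the generated ceph.conf and warn when it returns False, since that is the situation in which ceph-disk falls back to the 'ceph' cluster name.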

Actions #4

Updated by Sage Weil almost 11 years ago

  • Status changed from Fix Under Review to Pending Backport

Actions #5

Updated by Sage Weil almost 11 years ago

  • Status changed from Pending Backport to Resolved

Actions #6

Updated by Sage Weil almost 11 years ago

  • Status changed from Resolved to 12

teuthology-2013-07-05_01:00:13-rados-master-testing-basic: 55351 and 55360

Actions #7

Updated by Sage Weil almost 11 years ago

  • Priority changed from Urgent to High

Actions #8

Updated by Sage Weil over 10 years ago

  • Status changed from 12 to Can't reproduce