Bug #10920

ceph-osd fails on startup: ERROR: osd init failed: (1) Operation not permitted

Added by Sage Weil about 9 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The teuthology failure trace looks like:

764068/summary.yaml:failure_reason: 'Command failed on plana12 with status 1: ''sync && sudo umount -f
764068/summary.yaml-  /var/lib/ceph/osd/ceph-0'''

the mkfs succeeds:

2015-02-18T19:50:23.967 INFO:teuthology.orchestra.run.plana12:Running: 'sudo MALLOC_CHECK_=3 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph-osd --mkfs --mkkey -i 0 --monmap /home/ubuntu/cephtest/monmap'
2015-02-18T19:50:24.230 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.229139 7f0055b74780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-02-18T19:50:24.688 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.687402 7f0055b74780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-02-18T19:50:24.696 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.696016 7f0055b74780 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2015-02-18T19:50:24.914 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.913537 7f0055b74780 -1 created object store /var/lib/ceph/osd/ceph-0 journal /var/lib/ceph/osd/ceph-0/journal for osd.0 fsid 76b60f12-f734-48c6-be64-1a8e72ec3f3a
2015-02-18T19:50:24.915 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.913595 7f0055b74780 -1 auth: error reading file: /var/lib/ceph/osd/ceph-0/keyring: can't open /var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2015-02-18T19:50:24.915 INFO:teuthology.orchestra.run.plana12.stderr:2015-02-18 19:50:24.913749 7f0055b74780 -1 created new key in keyring /var/lib/ceph/osd/ceph-0/keyring

but startup fails:
2015-02-18T19:50:32.504 INFO:tasks.ceph.osd.1:Started
2015-02-18T19:50:32.572 INFO:tasks.ceph.osd.1.plana34.stdout:starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2015-02-18T19:50:32.886 INFO:tasks.ceph.osd.1.plana34.stderr:2015-02-18 19:50:32.884957 7f6698ad3780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-02-18T19:50:32.893 INFO:tasks.ceph.osd.1.plana34.stderr:2015-02-18 19:50:32.892711 7f6698ad3780 -1 osd.1 0 log_to_monitors {default=true}
2015-02-18T19:50:32.994 INFO:tasks.ceph.osd.1.plana34.stderr:2015-02-18 19:50:32.993355 7f6698ad3780 -1  ** ERROR: osd init failed: (1) Operation not permitted
2015-02-18T19:50:33.161 INFO:tasks.ceph.osd.1.plana34.stderr:daemon-helper: command failed with exit status 1

...and because ceph.py hasn't started yet (I think?), it doesn't gather up any useful logs.

ubuntu@teuthology:/a/sage-2015-02-17_22:30:59-rados-wip-sage-testing-distro-basic-multi/764068
ubuntu@teuthology:/a/sage-2015-02-17_22:30:59-rados-wip-sage-testing-distro-basic-multi/764162
ubuntu@teuthology:/a/sage-2015-02-17_22:30:59-rados-wip-sage-testing-distro-basic-multi/764256

History

#1 Updated by Tyler Bishop about 9 years ago

Looks like a sudo permission issue more than a bug?

#2 Updated by Sage Weil about 9 years ago

  • Status changed from New to Resolved

This was a bad job yaml that started the same osd twice.
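
A rough sketch of how such a job yaml could be checked, assuming the duplication shows up in the `roles` section (one list of role names per host). The function name `duplicate_roles` and the sample roles are hypothetical and not taken from the failing job:

```python
from collections import Counter


def duplicate_roles(roles):
    """Return the sorted role names that appear more than once across all hosts."""
    counts = Counter(role for host_roles in roles for role in host_roles)
    return sorted(role for role, n in counts.items() if n > 1)


# Hypothetical roles section, already parsed from a job yaml.
# Each inner list holds the roles assigned to one host.
roles = [
    ["mon.a", "osd.0", "osd.1"],
    ["mon.b", "osd.1", "client.0"],
]

# Any role listed here is assigned more than once.
print(duplicate_roles(roles))
```

If the same osd were instead started by a task listed twice, this check would not catch it.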

#3 Updated by Daniel O'Brien about 8 years ago

I'm seeing the exact same error after I added an osd using ceph-deploy:

2016-03-27 01:53:43.714811 7fa932422800  0 ceph version 10.1.0 (96ae8bd25f31862dbd5302f304ebf8bf1166aba6), process ceph-osd, pid 8269
2016-03-27 01:53:43.717123 7fa932422800  0 pidfile_write: ignore empty --pid-file
2016-03-27 01:53:43.747834 7fa932422800  0 filestore(/var/lib/ceph/osd/ceph-4) backend xfs (magic 0x58465342)
2016-03-27 01:53:43.748330 7fa932422800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-03-27 01:53:43.748342 7fa932422800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2016-03-27 01:53:43.748365 7fa932422800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: splice is supported
2016-03-27 01:53:43.749348 7fa932422800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-03-27 01:53:43.749401 7fa932422800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: extsize is disabled by conf
2016-03-27 01:53:43.750263 7fa932422800  1 leveldb: Recovering log #37
2016-03-27 01:53:43.751088 7fa932422800  1 leveldb: Delete type=0 #37
2016-03-27 01:53:43.751126 7fa932422800  1 leveldb: Delete type=3 #36
2016-03-27 01:53:43.751459 7fa932422800  0 filestore(/var/lib/ceph/osd/ceph-4) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-03-27 01:53:43.753746 7fa932422800  1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
2016-03-27 01:53:43.755160 7fa932422800  1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
2016-03-27 01:53:43.756199 7fa932422800  1 filestore(/var/lib/ceph/osd/ceph-4) upgrade
2016-03-27 01:53:43.756566 7fa932422800  0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
2016-03-27 01:53:43.756881 7fa932422800  0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
2016-03-27 01:53:43.763752 7fa932422800  0 osd.4 101 crush map has features 2268850290688, adjusting msgr requires for clients
2016-03-27 01:53:43.763766 7fa932422800  0 osd.4 101 crush map has features 2543728197632 was 8705, adjusting msgr requires for mons
2016-03-27 01:53:43.763774 7fa932422800  0 osd.4 101 crush map has features 2543728197632, adjusting msgr requires for osds
2016-03-27 01:53:43.767592 7fa932422800  0 osd.4 101 load_pgs
2016-03-27 01:53:43.828350 7fa932422800  0 osd.4 101 load_pgs opened 128 pgs
2016-03-27 01:53:43.828394 7fa932422800  0 osd.4 101 using 0 op queue with priority op cut off at 64.
2016-03-27 01:53:43.829241 7fa932422800 -1 osd.4 101 log_to_monitors {default=true}
2016-03-27 01:53:43.832998 7fa932422800  1 journal close /var/lib/ceph/osd/ceph-4/journal
2016-03-27 01:53:43.834806 7fa932422800 -1  ** ERROR: osd init failed: (1) Operation not permitted

I can see it's doing something twice:

2016-03-27 01:53:43.753746 7fa932422800 1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
2016-03-27 01:53:43.755160 7fa932422800 1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
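
A minimal sketch for spotting repeats like this in a longer log: it strips the timestamp, thread id, and log level, then counts identical messages. The helper name `repeated_messages` and the prefix pattern are assumptions based only on the log lines shown here. Both lines above carry the same thread id, so a repeat found this way only shows a message logged twice within one process, not necessarily a second osd start.

```python
import re
from collections import Counter

# Assumed line prefix: "<date> <time> <thread id> <level> ", e.g.
# "2016-03-27 01:53:43.753746 7fa932422800  1 "
PREFIX = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ [0-9a-f]+\s+-?\d+ "
)


def repeated_messages(lines):
    """Map each message seen more than once (prefix stripped) to its count."""
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line)
        if match:
            counts[line[match.end():].strip()] += 1
    return {msg: n for msg, n in counts.items() if n > 1}


sample = [
    "2016-03-27 01:53:43.753746 7fa932422800  1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1",
    "2016-03-27 01:53:43.755160 7fa932422800  1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1",
    "2016-03-27 01:53:43.756199 7fa932422800  1 filestore(/var/lib/ceph/osd/ceph-4) upgrade",
]

for message, count in repeated_messages(sample).items():
    print(count, message)
```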

How/where do we edit the yaml to stop the osd being started twice?

I'm running ceph v10.1.0

#4 Updated by Daniel O'Brien about 8 years ago

rebooting the node resolved the issue...

#5 Updated by Daniel O'Brien about 8 years ago

Daniel O'Brien wrote:

rebooting the node resolved the issue...

Sorry, I lied; that happened to work once, but now the error is back again.
