Project

General

Profile

Bug #2115

OSD failed to start: Operation not permitted

Added by soft crack almost 12 years ago. Updated almost 12 years ago.

Status:
Rejected
Priority:
High
Assignee:
-
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm setting up a new ceph cluster on ubuntu 11.10 with kernel version 3.0.0-16-server x86_64. The osd server failed to start with no core dump.

When running from command line, ceph-osd outputs:

2012-02-28 17:32:20.923661 7f41db5157a0 -- 0.0.0.0:6801/22874 accepter.bind ms_addr is 0.0.0.0:6801/22874 need_addr=1
2012-02-28 17:32:20.923772 7f41db5157a0 -- 0.0.0.0:6802/22874 accepter.bind ms_addr is 0.0.0.0:6802/22874 need_addr=1
2012-02-28 17:32:20.923805 7f41db5157a0 -- 0.0.0.0:6803/22874 accepter.bind ms_addr is 0.0.0.0:6803/22874 need_addr=1
 ** WARNING: Ceph is still under development.  Any feedback can be directed  **
 **          at ceph-devel@vger.kernel.org or http://ceph.newdream.net/.     **
2012-02-28 17:32:20.923856 7f41db5157a0 ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322), process ceph-osd, pid 22874
starting osd.0 at 0.0.0.0:6801/22874 osd_data /mnt/osd.0 /dev/server02/osd.0.journal
2012-02-28 17:32:20.924252 7f41db5157a0 filestore(/mnt/osd.0) basedir /mnt/osd.0 journal /dev/server02/osd.0.journal
2012-02-28 17:32:20.924359 7f41db5157a0 filestore(/mnt/osd.0) mount FIEMAP ioctl is NOT supported
2012-02-28 17:32:20.924403 7f41db5157a0 filestore(/mnt/osd.0) mount detected btrfs
2012-02-28 17:32:20.924428 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs CLONE_RANGE ioctl is supported
2012-02-28 17:32:20.976031 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_CREATE is supported
2012-02-28 17:32:21.045548 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_DESTROY is supported
2012-02-28 17:32:21.045739 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs START_SYNC is supported (transid 481)
2012-02-28 17:32:21.075824 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs WAIT_SYNC is supported
2012-02-28 17:32:21.077557 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_CREATE_V2 is supported
2012-02-28 17:32:21.129621 7f41db5157a0 filestore(/mnt/osd.0) mount fsid is 79fdd9c1-e23b-4654-afe2-9378a477145a
2012-02-28 17:32:21.129661 7f41db5157a0 filestore(/mnt/osd.0) mount found snaps <1>
2012-02-28 17:32:21.129684 7f41db5157a0 filestore(/mnt/osd.0)  current/ seq was 1
2012-02-28 17:32:21.129690 7f41db5157a0 filestore(/mnt/osd.0)  most recent snap from <1> is 1
2012-02-28 17:32:21.129695 7f41db5157a0 filestore(/mnt/osd.0) mount rolling back to consistent snap 1
2012-02-28 17:32:21.173036 7f41db5157a0 filestore(/mnt/osd.0) mount op_seq is 1
2012-02-28 17:32:21.173109 7f41db5157a0 filestore(/mnt/osd.0) open_journal at /dev/server02/osd.0.journal
2012-02-28 17:32:21.173144 7f41db5157a0 filestore(/mnt/osd.0) mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled
2012-02-28 17:32:21.173159 7f41db5157a0 filestore(/mnt/osd.0) list_collections
2012-02-28 17:32:21.173276 7f41db5157a0 journal journal_replay fs op_seq 1
2012-02-28 17:32:21.173365 7f41db5157a0 journal open /dev/server02/osd.0.journal fsid 79fdd9c1-e23b-4654-afe2-9378a477145a fs_op_seq 1
2012-02-28 17:32:21.173389 7f41db5157a0 journal _open_block_device: no journal size specified in configuration. We'll use the entire block device (size: 1073741824)
2012-02-28 17:32:21.175505 7f41db5157a0 journal  kernel version is 3.0.0
2012-02-28 17:32:21.175839 7f41db5157a0 journal _open /dev/server02/osd.0.journal fd 15: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2012-02-28 17:32:21.175915 7f41db5157a0 journal read_header
2012-02-28 17:32:21.193537 7f41db5157a0 journal header: block_size 4096 alignment 4096 max_size 1073741824
2012-02-28 17:32:21.193594 7f41db5157a0 journal header: start 4096
2012-02-28 17:32:21.193612 7f41db5157a0 journal  write_pos 4096
2012-02-28 17:32:21.193625 7f41db5157a0 journal open header.fsid = 79fdd9c1-e23b-4654-afe2-9378a477145a
2012-02-28 17:32:21.193678 7f41db5157a0 journal read_entry 4096 : seq 1 244 bytes
2012-02-28 17:32:21.193710 7f41db5157a0 journal read_entry 12288 : bad header magic, end of journal
2012-02-28 17:32:21.193719 7f41db5157a0 journal open reached end of journal.
2012-02-28 17:32:21.193729 7f41db5157a0 journal read_entry 12288 : bad header magic, end of journal
2012-02-28 17:32:21.193746 7f41db5157a0 journal journal_replay: end of journal, done.
2012-02-28 17:32:21.194203 7f41db5157a0 journal _open_block_device: no journal size specified in configuration. We'll use the entire block device (size: 1073741824)
2012-02-28 17:32:21.197128 7f41db5157a0 journal  kernel version is 3.0.0
2012-02-28 17:32:21.197425 7f41db5157a0 journal _open /dev/server02/osd.0.journal fd 15: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2012-02-28 17:32:21.197520 7f41db5157a0 journal journal_start
2012-02-28 17:32:21.197580 7f41d7a1f700 journal write_thread_entry start
2012-02-28 17:32:21.197660 7f41d721e700 journal write_finish_thread_entry enter
2012-02-28 17:32:21.197851 7f41db5157a0 timer(0x2051688).init
2012-02-28 17:32:21.197994 7f41d3a17700 timer(0x2051688).timer_thread starting
2012-02-28 17:32:21.198090 7f41d3a17700 timer(0x2051688).timer_thread going to sleep
2012-02-28 17:32:21.198158 7f41db5157a0 filestore(/mnt/osd.0) umount /mnt/osd.0
2012-02-28 17:32:21.198329 7f41d8220700 journal commit_start op_seq 1, applied_seq 1, committed_seq 1
2012-02-28 17:32:21.198388 7f41d8220700 journal commit_start blocked, all open_ops have completed
2012-02-28 17:32:21.198406 7f41d8220700 journal commit_start nothing to do
2012-02-28 17:32:21.198425 7f41d8220700 journal commit_start
2012-02-28 17:32:21.198635 7f41db5157a0 journal journal_stop
2012-02-28 17:32:21.198838 7f41db5157a0 journal close /dev/server02/osd.0.journal
2012-02-28 17:32:21.198927 7f41d7a1f700 journal write_thread_entry finish
2012-02-28 17:32:21.198975 7f41d721e700 journal write_finish_thread_entry exit
2012-02-28 17:32:21.199432 7f41db5157a0 timer(0x2051688).shutdown
2012-02-28 17:32:21.199496 7f41db5157a0 timer(0x2051688).cancel_all_events
2012-02-28 17:32:21.199583 7f41d3a17700 timer(0x2051688).timer_thread awake
2012-02-28 17:32:21.199602 7f41d3a17700 timer(0x2051688).timer_thread exiting
2012-02-28 17:32:21.199699 7f41db5157a0 monclient(hunting): build_initial_monmap
2012-02-28 17:32:21.199798 7f41db5157a0 filestore(/mnt/osd.0) test_mount basedir /mnt/osd.0 journal /dev/server02/osd.0.journal

If running from service, /var/log/ceph/osd.0.log shows text like above with last line replaced with:

2012-02-28 16:37:55.954759 7f5dc0d077a0  ** ERROR: initializing osd failed: (1) Operation not permitted

osd.0.log View (13.5 KB) soft crack, 02/28/2012 06:39 PM

History

#2 Updated by Sage Weil almost 12 years ago

  • Target version set to v0.44
  • Source changed from Development to Community (user)

Can you attach the actual log? I want to make sure there is no subtle difference in the output. Thanks!

#3 Updated by soft crack almost 12 years ago

See attachment please

#4 Updated by Sage Weil almost 12 years ago

it looks like you may be having trouble authenticating with the monitor. can you reproduce this with 'debug ms = 1'? and verify that the osd keyring file exists? (usually in osd data dir)

#5 Updated by soft crack almost 12 years ago

problem resolved. Thank you very much to your hint! I didn't ever think it is caused by communication.

I created a ceph.conf file from sample [[http://ceph.newdream.net/wiki/Cluster_configuration]] with something like:

[global]
    auth supported = cephx
        keyring = /etc/ceph/keyring.bin

But I made a mistake that I didn't specify a keyring for mds and osd.

Sorry for report a fake bug.

#6 Updated by Sage Weil almost 12 years ago

  • Status changed from New to Rejected

Also available in: Atom PDF