Bug #2115
closedOSD failed to start: Operation not permitted
0%
Description
I'm setting up a new ceph cluster on ubuntu 11.10 with kernel version 3.0.0-16-server x86_64. The osd server failed to start with no core dump.
When running from command line, ceph-osd outputs:
2012-02-28 17:32:20.923661 7f41db5157a0 -- 0.0.0.0:6801/22874 accepter.bind ms_addr is 0.0.0.0:6801/22874 need_addr=1 2012-02-28 17:32:20.923772 7f41db5157a0 -- 0.0.0.0:6802/22874 accepter.bind ms_addr is 0.0.0.0:6802/22874 need_addr=1 2012-02-28 17:32:20.923805 7f41db5157a0 -- 0.0.0.0:6803/22874 accepter.bind ms_addr is 0.0.0.0:6803/22874 need_addr=1 ** WARNING: Ceph is still under development. Any feedback can be directed ** ** at ceph-devel@vger.kernel.org or http://ceph.newdream.net/. ** 2012-02-28 17:32:20.923856 7f41db5157a0 ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322), process ceph-osd, pid 22874 starting osd.0 at 0.0.0.0:6801/22874 osd_data /mnt/osd.0 /dev/server02/osd.0.journal 2012-02-28 17:32:20.924252 7f41db5157a0 filestore(/mnt/osd.0) basedir /mnt/osd.0 journal /dev/server02/osd.0.journal 2012-02-28 17:32:20.924359 7f41db5157a0 filestore(/mnt/osd.0) mount FIEMAP ioctl is NOT supported 2012-02-28 17:32:20.924403 7f41db5157a0 filestore(/mnt/osd.0) mount detected btrfs 2012-02-28 17:32:20.924428 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs CLONE_RANGE ioctl is supported 2012-02-28 17:32:20.976031 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_CREATE is supported 2012-02-28 17:32:21.045548 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_DESTROY is supported 2012-02-28 17:32:21.045739 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs START_SYNC is supported (transid 481) 2012-02-28 17:32:21.075824 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs WAIT_SYNC is supported 2012-02-28 17:32:21.077557 7f41db5157a0 filestore(/mnt/osd.0) mount btrfs SNAP_CREATE_V2 is supported 2012-02-28 17:32:21.129621 7f41db5157a0 filestore(/mnt/osd.0) mount fsid is 79fdd9c1-e23b-4654-afe2-9378a477145a 2012-02-28 17:32:21.129661 7f41db5157a0 filestore(/mnt/osd.0) mount found snaps <1> 2012-02-28 17:32:21.129684 7f41db5157a0 filestore(/mnt/osd.0) current/ seq was 1 2012-02-28 17:32:21.129690 7f41db5157a0 filestore(/mnt/osd.0) most recent snap from <1> is 1 2012-02-28 17:32:21.129695 7f41db5157a0 filestore(/mnt/osd.0) mount rolling back to consistent snap 1 2012-02-28 17:32:21.173036 7f41db5157a0 filestore(/mnt/osd.0) mount op_seq is 1 2012-02-28 17:32:21.173109 7f41db5157a0 filestore(/mnt/osd.0) open_journal at /dev/server02/osd.0.journal 2012-02-28 17:32:21.173144 7f41db5157a0 filestore(/mnt/osd.0) mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled 2012-02-28 17:32:21.173159 7f41db5157a0 filestore(/mnt/osd.0) list_collections 2012-02-28 17:32:21.173276 7f41db5157a0 journal journal_replay fs op_seq 1 2012-02-28 17:32:21.173365 7f41db5157a0 journal open /dev/server02/osd.0.journal fsid 79fdd9c1-e23b-4654-afe2-9378a477145a fs_op_seq 1 2012-02-28 17:32:21.173389 7f41db5157a0 journal _open_block_device: no journal size specified in configuration. We'll use the entire block device (size: 1073741824) 2012-02-28 17:32:21.175505 7f41db5157a0 journal kernel version is 3.0.0 2012-02-28 17:32:21.175839 7f41db5157a0 journal _open /dev/server02/osd.0.journal fd 15: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0 2012-02-28 17:32:21.175915 7f41db5157a0 journal read_header 2012-02-28 17:32:21.193537 7f41db5157a0 journal header: block_size 4096 alignment 4096 max_size 1073741824 2012-02-28 17:32:21.193594 7f41db5157a0 journal header: start 4096 2012-02-28 17:32:21.193612 7f41db5157a0 journal write_pos 4096 2012-02-28 17:32:21.193625 7f41db5157a0 journal open header.fsid = 79fdd9c1-e23b-4654-afe2-9378a477145a 2012-02-28 17:32:21.193678 7f41db5157a0 journal read_entry 4096 : seq 1 244 bytes 2012-02-28 17:32:21.193710 7f41db5157a0 journal read_entry 12288 : bad header magic, end of journal 2012-02-28 17:32:21.193719 7f41db5157a0 journal open reached end of journal. 2012-02-28 17:32:21.193729 7f41db5157a0 journal read_entry 12288 : bad header magic, end of journal 2012-02-28 17:32:21.193746 7f41db5157a0 journal journal_replay: end of journal, done. 2012-02-28 17:32:21.194203 7f41db5157a0 journal _open_block_device: no journal size specified in configuration. We'll use the entire block device (size: 1073741824) 2012-02-28 17:32:21.197128 7f41db5157a0 journal kernel version is 3.0.0 2012-02-28 17:32:21.197425 7f41db5157a0 journal _open /dev/server02/osd.0.journal fd 15: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0 2012-02-28 17:32:21.197520 7f41db5157a0 journal journal_start 2012-02-28 17:32:21.197580 7f41d7a1f700 journal write_thread_entry start 2012-02-28 17:32:21.197660 7f41d721e700 journal write_finish_thread_entry enter 2012-02-28 17:32:21.197851 7f41db5157a0 timer(0x2051688).init 2012-02-28 17:32:21.197994 7f41d3a17700 timer(0x2051688).timer_thread starting 2012-02-28 17:32:21.198090 7f41d3a17700 timer(0x2051688).timer_thread going to sleep 2012-02-28 17:32:21.198158 7f41db5157a0 filestore(/mnt/osd.0) umount /mnt/osd.0 2012-02-28 17:32:21.198329 7f41d8220700 journal commit_start op_seq 1, applied_seq 1, committed_seq 1 2012-02-28 17:32:21.198388 7f41d8220700 journal commit_start blocked, all open_ops have completed 2012-02-28 17:32:21.198406 7f41d8220700 journal commit_start nothing to do 2012-02-28 17:32:21.198425 7f41d8220700 journal commit_start 2012-02-28 17:32:21.198635 7f41db5157a0 journal journal_stop 2012-02-28 17:32:21.198838 7f41db5157a0 journal close /dev/server02/osd.0.journal 2012-02-28 17:32:21.198927 7f41d7a1f700 journal write_thread_entry finish 2012-02-28 17:32:21.198975 7f41d721e700 journal write_finish_thread_entry exit 2012-02-28 17:32:21.199432 7f41db5157a0 timer(0x2051688).shutdown 2012-02-28 17:32:21.199496 7f41db5157a0 timer(0x2051688).cancel_all_events 2012-02-28 17:32:21.199583 7f41d3a17700 timer(0x2051688).timer_thread awake 2012-02-28 17:32:21.199602 7f41d3a17700 timer(0x2051688).timer_thread exiting 2012-02-28 17:32:21.199699 7f41db5157a0 monclient(hunting): build_initial_monmap 2012-02-28 17:32:21.199798 7f41db5157a0 filestore(/mnt/osd.0) test_mount basedir /mnt/osd.0 journal /dev/server02/osd.0.journal
If running from service, /var/log/ceph/osd.0.log shows text like above with last line replaced with:
2012-02-28 16:37:55.954759 7f5dc0d077a0 ** ERROR: initializing osd failed: (1) Operation not permitted
Files
Updated by soft crack about 12 years ago
ceph version 0.42.2 (732f3ec94e39d458230b7728b2a936d431e19322)
Updated by Sage Weil about 12 years ago
- Target version set to v0.44
- Source changed from Development to Community (user)
Can you attach the actual log? I want to make sure there is no subtle difference in the output. Thanks!
Updated by soft crack about 12 years ago
See attachment please
Updated by Sage Weil about 12 years ago
it looks like you may be having trouble authenticating with the monitor. can you reproduce this with 'debug ms = 1'? and verify that the osd keyring file exists? (usually in osd data dir)
Updated by soft crack about 12 years ago
problem resolved. Thank you very much to your hint! I didn't ever think it is caused by communication.
I created a ceph.conf file from sample [[http://ceph.newdream.net/wiki/Cluster_configuration]] with something like:
[global] auth supported = cephx keyring = /etc/ceph/keyring.bin
But I made a mistake that I didn't specify a keyring for mds and osd.
Sorry for report a fake bug.