Support #19747

Only the first 3 OSDs could be activated

Added by Ming Gao about 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Tags:
Reviewed:
Affected Versions:
Pull request ID:

Description

I have 4 storage servers, each with 24 SSDs used as Ceph storage.

I use SLES 12 SP2 and SES 4.0.

The prepare stage completed successfully:

ceph-deploy osd prepare ses01:/dev/sdX
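For reference, the prepare stage was run once per disk, roughly as sketched below. The host name and the device range /dev/sdb through /dev/sdy are assumptions based on my setup (24 data SSDs per node); the DRY_RUN guard is added here only so the sketch can run without ceph-deploy installed.

```shell
# Hypothetical sketch of the per-disk prepare loop (bash, for the {b..y} brace expansion).
# With DRY_RUN=1 the commands are only printed, not executed.
DRY_RUN=1
cmds=""
for dev in /dev/sd{b..y}; do                      # sdb..sdy = 24 data disks (assumed)
  cmd="ceph-deploy osd prepare ses01:${dev}"
  cmds="${cmds}${cmd}\n"
  if [ "${DRY_RUN}" = "1" ]; then
    echo "${cmd}"                                 # dry run: show what would be executed
  else
    ${cmd}
  fi
done
```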

As for activate, only the first 3 disks could be activated, no matter whether I select the disks sequentially or randomly.

The failed ones produce messages like this:

cephadm@mgt01:~/ceph> ceph-deploy osd activate ses03:sdb1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.34): /usr/bin/ceph-deploy osd activate ses03:sdb1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : activate
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f1faf1cb3b0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function osd at 0x7f1faf415b18>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : [('ses03', '/dev/sdb1', None)]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ses03:/dev/sdb1:
[ses03][DEBUG ] connection detected need for sudo
[ses03][DEBUG ] connected to host: ses03
[ses03][DEBUG ] detect platform information from remote host
[ses03][DEBUG ] detect machine type
[ses03][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: SUSE Linux Enterprise Server 12 x86_64
[ceph_deploy.osd][DEBUG ] activating host ses03 disk /dev/sdb1
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[ses03][DEBUG ] find the location of an executable
[ses03][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb1
[ses03][WARNIN] main_activate: path = /dev/sdb1
[ses03][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
[ses03][WARNIN] command: Running command: /usr/sbin/blkid -o udev -p /dev/sdb1
[ses03][WARNIN] command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb1
[ses03][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[ses03][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[ses03][WARNIN] mount: Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.Qib3jB with options noatime,inode64
[ses03][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.Qib3jB
[ses03][WARNIN] activate: Cluster uuid is 8b693493-8921-4640-9474-bb51a61c3ed4
[ses03][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[ses03][WARNIN] activate: Cluster name is ceph
[ses03][WARNIN] activate: OSD uuid is e9ddf344-c088-4754-aec8-c90f4de8c082
[ses03][WARNIN] activate: OSD id is 75
[ses03][WARNIN] activate: Initializing OSD...
[ses03][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.Qib3jB/activate.monmap
[ses03][WARNIN] got monmap epoch 1
[ses03][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 75 --monmap /var/lib/ceph/tmp/mnt.Qib3jB/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.Qib3jB --osd-journal /var/lib/ceph/tmp/mnt.Qib3jB/journal --osd-uuid e9ddf344-c088-4754-aec8-c90f4de8c082 --keyring /var/lib/ceph/tmp/mnt.Qib3jB/keyring --setuser ceph --setgroup ceph
[ses03][WARNIN] mount_activate: Failed to activate
[ses03][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.Qib3jB
[ses03][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.Qib3jB
[ses03][WARNIN] Traceback (most recent call last):
[ses03][WARNIN] File "/usr/sbin/ceph-disk", line 9, in <module>
[ses03][WARNIN] load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5009, in run
[ses03][WARNIN] main(sys.argv[1:])
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4960, in main
[ses03][WARNIN] args.func(args)
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3321, in main_activate
[ses03][WARNIN] reactivate=args.reactivate,
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3078, in mount_activate
[ses03][WARNIN] (osd_id, cluster) = activate(path, activate_key_template, init)
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3254, in activate
[ses03][WARNIN] keyring=keyring,
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2747, in mkfs
[ses03][WARNIN] '--setgroup', get_ceph_group(),
[ses03][WARNIN] File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2694, in ceph_osd_mkfs
[ses03][WARNIN] raise Error('%s failed : %s' % (str(arguments), error))
[ses03][WARNIN] ceph_disk.main.Error: Error: ['ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', u'75', '--monmap', '/var/lib/ceph/tmp/mnt.Qib3jB/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.Qib3jB', '--osd-journal', '/var/lib/ceph/tmp/mnt.Qib3jB/journal', '--osd-uuid', u'e9ddf344-c088-4754-aec8-c90f4de8c082', '--keyring', '/var/lib/ceph/tmp/mnt.Qib3jB/keyring', '--setuser', 'ceph', '--setgroup', 'ceph'] failed : tcmalloc: large alloc 17481375744 bytes == (nil) @ 0x7f08355428ba 0x7f0835563a74 0x560eb8d5bfa5 0x560eb8da5d18 0x560eb8da609d 0x560eb8d88234 0x560eb8a32b09 0x560eb89c86c0 0x7f08320d86e5 0x560eb8a12fa9 (nil)
[ses03][WARNIN] terminate called after throwing an instance of 'std::bad_alloc'
[ses03][WARNIN] what(): std::bad_alloc
[ses03][WARNIN] *** Caught signal (Aborted) **
[ses03][WARNIN] in thread 7f0835993800 thread_name:ceph-osd
[ses03][WARNIN] ceph version 10.2.3-560-g4429782 (4429782e6c7b1bf08516e2ee2f6b2c47f3bf62d7)
[ses03][WARNIN] 1: (()+0x91e6a2) [0x560eb8fae6a2]
[ses03][WARNIN] 2: (()+0x10b00) [0x7f0834658b00]
[ses03][WARNIN] 3: (gsignal()+0x37) [0x7f08320ec8d7]
[ses03][WARNIN] 4: (abort()+0x13a) [0x7f08320edcaa]
[ses03][WARNIN] 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f0832a0772d]
[ses03][WARNIN] 6: (()+0x96706) [0x7f0832a05706]
[ses03][WARNIN] 7: (()+0x96751) [0x7f0832a05751]
[ses03][WARNIN] 8: (__cxa_rethrow()+0x46) [0x7f0832a059b6]
[ses03][WARNIN] 9: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > >, std::allocator<std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::rehash(unsigned long)+0x15e) [0x560eb8da5dce]
[ses03][WARNIN] 10: (DBObjectMap::DBObjectMap(KeyValueDB*)+0x2bd) [0x560eb8da609d]
[ses03][WARNIN] 11: (FileStore::mount()+0x2514) [0x560eb8d88234]
[ses03][WARNIN] 12: (OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int)+0x399) [0x560eb8a32b09]
[ses03][WARNIN] 13: (main()+0x1030) [0x560eb89c86c0]
[ses03][WARNIN] 14: (__libc_start_main()+0xf5) [0x7f08320d86e5]
[ses03][WARNIN] 15: (_start()+0x29) [0x560eb8a12fa9]
[ses03][WARNIN] 2017-04-22 09:34:47.377352 7f0835993800 -1 *** Caught signal (Aborted) **
[ses03][WARNIN] in thread 7f0835993800 thread_name:ceph-osd
[ses03][WARNIN]
[ses03][WARNIN] ceph version 10.2.3-560-g4429782 (4429782e6c7b1bf08516e2ee2f6b2c47f3bf62d7)
[ses03][WARNIN] 1: (()+0x91e6a2) [0x560eb8fae6a2]
[ses03][WARNIN] 2: (()+0x10b00) [0x7f0834658b00]
[ses03][WARNIN] 3: (gsignal()+0x37) [0x7f08320ec8d7]
[ses03][WARNIN] 4: (abort()+0x13a) [0x7f08320edcaa]
[ses03][WARNIN] 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f0832a0772d]
[ses03][WARNIN] 6: (()+0x96706) [0x7f0832a05706]
[ses03][WARNIN] 7: (()+0x96751) [0x7f0832a05751]
[ses03][WARNIN] 8: (__cxa_rethrow()+0x46) [0x7f0832a059b6]
[ses03][WARNIN] 9: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > >, std::allocator<std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::rehash(unsigned long)+0x15e) [0x560eb8da5dce]
[ses03][WARNIN] 10: (DBObjectMap::DBObjectMap(KeyValueDB*)+0x2bd) [0x560eb8da609d]
[ses03][WARNIN] 11: (FileStore::mount()+0x2514) [0x560eb8d88234]
[ses03][WARNIN] 12: (OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int)+0x399) [0x560eb8a32b09]
[ses03][WARNIN] 13: (main()+0x1030) [0x560eb89c86c0]
[ses03][WARNIN] 14: (__libc_start_main()+0xf5) [0x7f08320d86e5]
[ses03][WARNIN] 15: (_start()+0x29) [0x560eb8a12fa9]
[ses03][WARNIN] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[ses03][WARNIN]
[ses03][WARNIN] 0> 2017-04-22 09:34:47.377352 7f0835993800 -1 *** Caught signal (Aborted) **
[ses03][WARNIN] in thread 7f0835993800 thread_name:ceph-osd
[ses03][WARNIN]
[ses03][WARNIN] ceph version 10.2.3-560-g4429782 (4429782e6c7b1bf08516e2ee2f6b2c47f3bf62d7)
[ses03][WARNIN] 1: (()+0x91e6a2) [0x560eb8fae6a2]
[ses03][WARNIN] 2: (()+0x10b00) [0x7f0834658b00]
[ses03][WARNIN] 3: (gsignal()+0x37) [0x7f08320ec8d7]
[ses03][WARNIN] 4: (abort()+0x13a) [0x7f08320edcaa]
[ses03][WARNIN] 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f0832a0772d]
[ses03][WARNIN] 6: (()+0x96706) [0x7f0832a05706]
[ses03][WARNIN] 7: (()+0x96751) [0x7f0832a05751]
[ses03][WARNIN] 8: (__cxa_rethrow()+0x46) [0x7f0832a059b6]
[ses03][WARNIN] 9: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > >, std::allocator<std::pair<ghobject_t const, std::_List_iterator<std::pair<ghobject_t, DBObjectMap::_Header> > > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::rehash(unsigned long)+0x15e) [0x560eb8da5dce]
[ses03][WARNIN] 10: (DBObjectMap::DBObjectMap(KeyValueDB*)+0x2bd) [0x560eb8da609d]
[ses03][WARNIN] 11: (FileStore::mount()+0x2514) [0x560eb8d88234]
[ses03][WARNIN] 12: (OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int)+0x399) [0x560eb8a32b09]
[ses03][WARNIN] 13: (main()+0x1030) [0x560eb89c86c0]
[ses03][WARNIN] 14: (__libc_start_main()+0xf5) [0x7f08320d86e5]
[ses03][WARNIN] 15: (_start()+0x29) [0x560eb8a12fa9]
[ses03][WARNIN] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[ses03][WARNIN]
[ses03][WARNIN]
[ses03][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb1
