Bug #22354

v12.2.2 unable to create bluestore osd using ceph-disk

Added by Nokia ceph-users almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Administration/Usability
Target version: -
Start date: 12/08/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-disk
Component(RADOS):
Pull request ID:

Description

Hello,

We are aware that ceph-disk is deprecated in 12.2.2. As part of my testing, I am still using the ceph-disk utility to create OSDs in 12.2.2.

I get an activation error from the second attempt onwards.

On the first attempt the OSD is created without any issue.


# ceph-disk prepare --bluestore --cluster ceph --cluster-uuid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121 /dev/sde; ceph-disk activate /dev/sde1
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/sde1              isize=2048   agcount=4, agsize=6336 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=25344, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=1728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5685: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
got monmap epoch 3
2017-12-08 16:07:57.262854 7fe58f6e6d00 -1 key
2017-12-08 16:07:57.769048 7fe58f6e6d00 -1 created object store /var/lib/ceph/tmp/mnt.dTiXMX for osd.16 fsid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
Removed symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@16.service.
Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@16.service to /usr/lib/systemd/system/ceph-osd@.service.

On the second attempt, I get the issue below:

# ceph-disk prepare --bluestore --cluster ceph --cluster-uuid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121 /dev/sde; ceph-disk activate /dev/sde1
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/sde1              isize=2048   agcount=4, agsize=6336 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=25344, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=1728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5685: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:
*******************************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating

*******************************************************************************

  warnings.warn(DEPRECATION_WARNING)
got monmap epoch 3
2017-12-08 16:09:07.518454 7fa64e12fd00 -1 bluestore(/var/lib/ceph/tmp/mnt.7x0kCL/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.7x0kCL/block fsid 54954cfd-b7f3-4f74-9b2e-2ef57c5143cc does not match our fsid 29262e99-12ff-4c45-9113-8f69830a1a5e
2017-12-08 16:09:07.772688 7fa64e12fd00 -1 bluestore(/var/lib/ceph/tmp/mnt.7x0kCL) mkfs fsck found fatal error: (5) Input/output error
2017-12-08 16:09:07.772723 7fa64e12fd00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2017-12-08 16:09:07.772823 7fa64e12fd00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.7x0kCL: (5) Input/output error
mount_activate: Failed to activate
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5682, in main
    main_catch(args.func, args)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5710, in main_catch
    func(args)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3761, in main_activate
    reactivate=args.reactivate,
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3524, in mount_activate
    (osd_id, cluster) = activate(path, activate_key_template, init)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3701, in activate
    keyring=keyring,
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3153, in mkfs
    '--setgroup', get_ceph_group(),
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 570, in command_check_call
    return subprocess.check_call(arguments)
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'16', '--monmap', '/var/lib/ceph/tmp/mnt.7x0kCL/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.7x0kCL', '--osd-uuid', u'29262e99-12ff-4c45-9113-8f69830a1a5e', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1

Do you think this is a bug? I can reproduce it every time, on any OSD. The first time, I can recreate the OSD without any problem; from the second time onwards, the issue occurs.

Note: this is not reproducible in 12.2.0 and 12.2.1.

We hit the issue in 12.2.2 and 12.2.1-632-ga5899a5 (test).

Thanks


Related issues

Copied to RADOS - Backport #23103: luminous: v12.2.2 unable to create bluestore osd using ceph-disk Resolved

History

#1 Updated by Nokia ceph-users almost 2 years ago

Steps to reproduce:

Stopping osd.112
  # systemctl stop ceph-osd@112
Removing 112 from cluster
  # for x in {112..112}; do ceph osd down $x ; ceph osd out $x;ceph osd crush remove osd.$x;ceph auth del osd.$x;ceph osd rm osd.$x ;done
    osd.112 is already down.
    marked out osd.112.
    removed item id 112 name 'osd.112' from crush map
    updated
    removed osd.112
  # umount /var/lib/ceph/osd/ceph-112
Formatting the device.
  # sgdisk -Z /dev/sdx
    GPT data structures destroyed! You may now partition the disk using fdisk or
    other utilities.
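
For repeat runs, the teardown steps above can be collected into one helper. This is only a sketch: `teardown_commands` is a hypothetical name, and it prints the commands from this report instead of running them, so they can be reviewed before being piped to a shell.

```shell
# teardown_commands OSD_ID DEVICE
# Prints (does not run) the OSD removal steps used in this report.
# Assumes the default cluster name "ceph", as in the paths above.
teardown_commands() {
    local id="$1" dev="$2"
    cat <<EOF
systemctl stop ceph-osd@$id
ceph osd down $id
ceph osd out $id
ceph osd crush remove osd.$id
ceph auth del osd.$id
ceph osd rm osd.$id
umount /var/lib/ceph/osd/ceph-$id
sgdisk -Z $dev
EOF
}
```

For example, `teardown_commands 112 /dev/sdx` lists the eight commands for osd.112.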

On the first attempt:

  # ceph-disk prepare --bluestore --cluster ceph --cluster-uuid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121 /dev/sdx; ceph-disk --verbose activate /dev/sdx1
    /usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning: ***********************************************************************
    This tool is now deprecated in favor of ceph-volume.
    It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/sdx1              isize=2048   agcount=4, agsize=6336 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=25344, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=1728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5685: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
main_activate: path = /dev/sdx1
get_dm_uuid: get_dm_uuid /dev/sdx1 uuid path is /sys/dev/block/65:113/dm/uuid
command: Running command: /usr/sbin/blkid -o udev -p /dev/sdx1
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdx1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
mount: Mounting /dev/sdx1 on /var/lib/ceph/tmp/mnt.iAVB07 with options noatime,inode64
command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdx1 /var/lib/ceph/tmp/mnt.iAVB07
command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.iAVB07
activate: Cluster uuid is b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
activate: Cluster name is ceph
activate: OSD uuid is cd79bd79-a830-4552-87b5-a6f08fcefb18
activate: OSD id is 112
activate: Initializing OSD...
command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.iAVB07/activate.monmap
got monmap epoch 3
command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 112 --monmap /var/lib/ceph/tmp/mnt.iAVB07/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.iAVB07 --osd-uuid cd79bd79-a830-4552-87b5-a6f08fcefb18 --setuser ceph --setgroup ceph
2017-12-08 19:24:11.417557 7f4b2676ad00 -1 key
2017-12-08 19:24:11.929132 7f4b2676ad00 -1 created object store /var/lib/ceph/tmp/mnt.iAVB07 for osd.112 fsid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
command: Running command: /usr/bin/ceph-detect-init --default sysvinit
activate: Marking with init system systemd
command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.iAVB07/systemd
command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.iAVB07/systemd
command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.iAVB07/active.165078.tmp
command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.iAVB07/active.165078.tmp
activate: ceph osd.112 data dir is ready at /var/lib/ceph/tmp/mnt.iAVB07
move_mount: Moving mount to final location...
command_check_call: Running command: /bin/mount -o noatime,inode64 -- /dev/sdx1 /var/lib/ceph/osd/ceph-112
command_check_call: Running command: /bin/umount -l -- /var/lib/ceph/tmp/mnt.iAVB07
start_daemon: Starting ceph osd.112...
command_check_call: Running command: /usr/bin/systemctl disable ceph-osd@112
command_check_call: Running command: /usr/bin/systemctl disable ceph-osd@112 --runtime
Removed symlink .
command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@112 --runtime
Created symlink from to /usr/lib/systemd/system/ceph-osd@.service.
command_check_call: Running command: /usr/bin/systemctl start ceph-osd@112
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5685: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)

  • On the first attempt it is created without any issue.

Doing the same steps again:

  # systemctl stop ceph-osd@112
  # for x in {112..112}; do ceph osd down $x ; ceph osd out $x;ceph osd crush remove osd.$x;ceph auth del osd.$x;ceph osd rm osd.$x ;done
    osd.112 is already down.
    marked out osd.112.
    removed item id 112 name 'osd.112' from crush map
    updated
    removed osd.112
  # umount /var/lib/ceph/osd/ceph-112
  # sgdisk -Z /dev/sdx
    GPT data structures destroyed! You may now partition the disk using fdisk or
    other utilities.
  # ceph osd tree | grep 112
  # ceph-disk prepare --bluestore --cluster ceph --cluster-uuid b2f1b9b9-eecc-4c17-8b92-cfa60b31c121 /dev/sdx; ceph-disk --verbose activate /dev/sdx1
    /usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning: ***********************************************************************
    This tool is now deprecated in favor of ceph-volume.
    It is recommended to use ceph-volume for OSD deployments. For details see:

    http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/sdx1              isize=2048   agcount=4, agsize=6336 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=25344, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=1728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5685: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
main_activate: path = /dev/sdx1
get_dm_uuid: get_dm_uuid /dev/sdx1 uuid path is /sys/dev/block/65:113/dm/uuid
command: Running command: /usr/sbin/blkid -o udev -p /dev/sdx1
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdx1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
mount: Mounting /dev/sdx1 on /var/lib/ceph/tmp/mnt.h2RFcX with options noatime,inode64
command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdx1 /var/lib/ceph/tmp/mnt.h2RFcX
command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.h2RFcX
activate: Cluster uuid is b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
activate: Cluster name is ceph
activate: OSD uuid is 297b1392-f900-494f-9b14-b7879eefd3ec
activate: OSD id is 112
activate: Initializing OSD...
command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.h2RFcX/activate.monmap
got monmap epoch 3
command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 112 --monmap /var/lib/ceph/tmp/mnt.h2RFcX/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.h2RFcX --osd-uuid 297b1392-f900-494f-9b14-b7879eefd3ec --setuser ceph --setgroup ceph
2017-12-08 19:25:14.183921 7fba96bd4d00 -1 bluestore(/var/lib/ceph/tmp/mnt.h2RFcX/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.h2RFcX/block fsid cd79bd79-a830-4552-87b5-a6f08fcefb18 does not match our fsid 297b1392-f900-494f-9b14-b7879eefd3ec
2017-12-08 19:25:14.440659 7fba96bd4d00 -1 bluestore(/var/lib/ceph/tmp/mnt.h2RFcX) mkfs fsck found fatal error: (5) Input/output error
2017-12-08 19:25:14.440698 7fba96bd4d00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2017-12-08 19:25:14.440773 7fba96bd4d00 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.h2RFcX: (5) Input/output error -->>>>
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.h2RFcX
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.h2RFcX
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5677: UserWarning:

***********************************************************************
This tool is now deprecated in favor of ceph-volume.
It is recommended to use ceph-volume for OSD deployments. For details see:

http://docs.ceph.com/docs/master/ceph-volume/#migrating


warnings.warn(DEPRECATION_WARNING)
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5674, in main
    args.func(args)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3761, in main_activate
    reactivate=args.reactivate,
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3524, in mount_activate
    (osd_id, cluster) = activate(path, activate_key_template, init)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3701, in activate
    keyring=keyring,
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3153, in mkfs
    '--setgroup', get_ceph_group(),
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 570, in command_check_call
    return subprocess.check_call(arguments)
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'112', '--monmap', '/var/lib/ceph/tmp/mnt.h2RFcX/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.h2RFcX', '--osd-uuid', u'297b1392-f900-494f-9b14-b7879eefd3ec', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1
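
Comparing the two runs above, the fsid that BlueStore finds on the block device in the second run (cd79bd79-a830-4552-87b5-a6f08fcefb18) is the OSD uuid from the first run, so the label written by the first `--mkfs` appears to have survived `sgdisk -Z`. Below is a small sketch for pulling both fsids out of such a log line. `label_fsids` and `is_stale_label` are hypothetical helper names, and the pattern assumes the exact message format shown above.

```shell
# label_fsids: read ceph-osd log lines on stdin and print
# "<fsid found on disk> <fsid expected>" for each
# _check_or_set_bdev_label mismatch line.
label_fsids() {
    sed -n 's/.*_check_or_set_bdev_label bdev .* fsid \([0-9a-f-]*\) does not match our fsid \([0-9a-f-]*\).*/\1 \2/p'
}

# is_stale_label PREVIOUS_OSD_UUID
# Reads one line of label_fsids output on stdin; succeeds when the fsid
# found on disk equals the uuid of an earlier OSD on the same device.
is_stale_label() {
    local previous_uuid="$1" on_disk expected
    read -r on_disk expected
    [ "$on_disk" = "$previous_uuid" ]
}
```

Usage would be along the lines of `grep _check_or_set_bdev_label activate.log | label_fsids | is_stale_label cd79bd79-a830-4552-87b5-a6f08fcefb18`, where `activate.log` stands for wherever the activate output was saved.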

#2 Updated by Kefu Chai almost 2 years ago

  • Description updated (diff)

#3 Updated by Kefu Chai almost 2 years ago

  • Assignee set to Kefu Chai

#4 Updated by Jon Heese almost 2 years ago

I am getting the exact same behavior during ceph-deploy osd activate (which uses ceph-disk activate) on a newly-deployed Luminous cluster:

[root@r24obj-osd01 ~]# rpm -qa | grep ceph-common
ceph-common-12.2.2-0.el7.x86_64

These disks were used in a previous Luminous cluster that was used for testing and torn down for redeployment, but I have thoroughly `zap`ped and otherwise cleared out (via `sgdisk -Z` and `dd`ing zeros) the disks, so I'm fairly confident that leftover data on the disks is not the cause.

I've attempted this on 3 disks on different OSD nodes and they all result in the same behavior -- unlike the OP, the first disk does not succeed.

Here's the command I'm running and the output:

[root@r24obj-admin01 r24-cluster01]# ceph-deploy osd activate r24obj-osd01:sda1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /usr/bin/ceph-deploy osd activate r24obj-osd01:sda1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : activate
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x26b84d0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x26abed8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : [('r24obj-osd01', '/dev/sda1', None)]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks r24obj-osd01:/dev/sda1:
[r24obj-osd01][DEBUG ] connected to host: r24obj-osd01 
[r24obj-osd01][DEBUG ] detect platform information from remote host
[r24obj-osd01][DEBUG ] detect machine type
[r24obj-osd01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] activating host r24obj-osd01 disk /dev/sda1
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[r24obj-osd01][DEBUG ] find the location of an executable
[r24obj-osd01][INFO  ] Running command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sda1
[r24obj-osd01][WARNING] /usr/lib/python2.7/site-packages/ceph_disk/main.py:5653: UserWarning: 
[r24obj-osd01][WARNING] *******************************************************************************
[r24obj-osd01][WARNING] This tool is now deprecated in favor of ceph-volume.
[r24obj-osd01][WARNING] It is recommended to use ceph-volume for OSD deployments. For details see:
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING]     http://docs.ceph.com/docs/master/ceph-volume/#migrating
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING] *******************************************************************************
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING]   warnings.warn(DEPRECATION_WARNING)
[r24obj-osd01][WARNING] main_activate: path = /dev/sda1
[r24obj-osd01][WARNING] get_dm_uuid: get_dm_uuid /dev/sda1 uuid path is /sys/dev/block/8:1/dm/uuid
[r24obj-osd01][WARNING] command: Running command: /usr/sbin/blkid -o udev -p /dev/sda1
[r24obj-osd01][WARNING] command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sda1
[r24obj-osd01][WARNING] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[r24obj-osd01][WARNING] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[r24obj-osd01][WARNING] mount: Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.yDnSJ5 with options noatime,inode64
[r24obj-osd01][WARNING] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sda1 /var/lib/ceph/tmp/mnt.yDnSJ5
[r24obj-osd01][WARNING] command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.yDnSJ5
[r24obj-osd01][WARNING] activate: Cluster uuid is aab42ef9-2909-4b7a-961e-ef9bf692ec32
[r24obj-osd01][WARNING] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[r24obj-osd01][WARNING] activate: Cluster name is ceph
[r24obj-osd01][WARNING] activate: OSD uuid is fb60ca95-f59e-4602-a498-aab1b24bb27b
[r24obj-osd01][WARNING] activate: OSD id is 0
[r24obj-osd01][WARNING] activate: Initializing OSD...
[r24obj-osd01][WARNING] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.yDnSJ5/activate.monmap
[r24obj-osd01][WARNING] got monmap epoch 1
[r24obj-osd01][WARNING] command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 0 --monmap /var/lib/ceph/tmp/mnt.yDnSJ5/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.yDnSJ5 --osd-uuid fb60ca95-f59e-4602-a498-aab1b24bb27b --setuser ceph --setgroup ceph
[r24obj-osd01][WARNING] 2017-12-14 15:12:01.557735 7f043e895d00 -1 bluestore(/var/lib/ceph/tmp/mnt.yDnSJ5/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.yDnSJ5/block fsid a2b247c8-3201-429f-b908-262c321b7b10 does not match our fsid fb60ca95-f59e-4602-a498-aab1b24bb27b
[r24obj-osd01][WARNING] 2017-12-14 15:12:01.813055 7f043e895d00 -1 bluestore(/var/lib/ceph/tmp/mnt.yDnSJ5) mkfs fsck found fatal error: (5) Input/output error
[r24obj-osd01][WARNING] 2017-12-14 15:12:01.813085 7f043e895d00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
[r24obj-osd01][WARNING] 2017-12-14 15:12:01.813199 7f043e895d00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.yDnSJ5: (5) Input/output error
[r24obj-osd01][WARNING] mount_activate: Failed to activate
[r24obj-osd01][WARNING] unmount: Unmounting /var/lib/ceph/tmp/mnt.yDnSJ5
[r24obj-osd01][WARNING] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.yDnSJ5
[r24obj-osd01][WARNING] /usr/lib/python2.7/site-packages/ceph_disk/main.py:5677: UserWarning: 
[r24obj-osd01][WARNING] *******************************************************************************
[r24obj-osd01][WARNING] This tool is now deprecated in favor of ceph-volume.
[r24obj-osd01][WARNING] It is recommended to use ceph-volume for OSD deployments. For details see:
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING]     http://docs.ceph.com/docs/master/ceph-volume/#migrating
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING] *******************************************************************************
[r24obj-osd01][WARNING] 
[r24obj-osd01][WARNING]   warnings.warn(DEPRECATION_WARNING)
[r24obj-osd01][WARNING] Traceback (most recent call last):
[r24obj-osd01][WARNING]   File "/usr/sbin/ceph-disk", line 9, in <module>
[r24obj-osd01][WARNING]     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
[r24obj-osd01][WARNING]     main(sys.argv[1:])
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5674, in main
[r24obj-osd01][WARNING]     args.func(args)
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3761, in main_activate
[r24obj-osd01][WARNING]     reactivate=args.reactivate,
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3524, in mount_activate
[r24obj-osd01][WARNING]     (osd_id, cluster) = activate(path, activate_key_template, init)
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3701, in activate
[r24obj-osd01][WARNING]     keyring=keyring,
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3153, in mkfs
[r24obj-osd01][WARNING]     '--setgroup', get_ceph_group(),
[r24obj-osd01][WARNING]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 570, in command_check_call
[r24obj-osd01][WARNING]     return subprocess.check_call(arguments)
[r24obj-osd01][WARNING]   File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
[r24obj-osd01][WARNING]     raise CalledProcessError(retcode, cmd)
[r24obj-osd01][WARNING] subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'0', '--monmap', '/var/lib/ceph/tmp/mnt.yDnSJ5/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.yDnSJ5', '--osd-uuid', u'fb60ca95-f59e-4602-a498-aab1b24bb27b', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1
[r24obj-osd01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sda1
[root@r24obj-admin01 r24-cluster01]#

I can post more details if needed.

Regards,
Jon Heese

#5 Updated by Jon Heese almost 2 years ago

So I dug a little deeper on this, and followed this gentleman's efforts to manually set up bluestore OSDs (although he has separate devices for wal/db and I'm just collocating for now):
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-July/019103.html
https://github.com/MartinEmrich/kb/blob/master/ceph/Manual-Bluestore.md

That let me see exactly where it was failing: running `ceph-osd --setuser ceph -i 0 --mkkey --mkfs` gave this error:

-1 provided osd id 0 != superblock's 96

So that led me to suspect that the zap commands are not properly zeroing the disks. So I tried:

dd if=/dev/zero of=/dev/sda bs=4096k count=100

Which writes out 419MB vs. the 10MB that the standard ceph-volume lvm zap does, followed by a:
sgdisk --zap-all --clear --mbrtogpt -g -- /dev/sda

And that allowed the ceph-deploy create and activate commands to complete successfully.

So, I think this bug may actually be with the zap code -- it would appear that they don't completely erase the vestiges of previous bluestore OSDs from the disks.
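
Put together, the workaround looks roughly like the sketch below. `wipe_disk`, `run` and `DRY_RUN` are names made up for this example. By default it only prints the two commands; with DRY_RUN=0 it runs them and destroys all data on the device.

```shell
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# wipe_disk DEVICE
# Zero the start of the disk, then rebuild an empty GPT.
wipe_disk() {
    local dev="$1"
    # 100 blocks of 4096k = 400 MiB, the amount used in the comment above
    run dd if=/dev/zero of="$dev" bs=4096k count=100
    run sgdisk --zap-all --clear --mbrtogpt -g -- "$dev"
}
```

Calling `wipe_disk /dev/sda` in the default mode is a safe way to check the device name before running it for real.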

Regards,
Jon Heese

#6 Updated by Hua Liu over 1 year ago

The "ceph-disk activation issue in 12.2.2" problem has been identified. It can be solved as follows:

1. delete osd
2. ceph-disk zap /dev/sde
3. prepare osd && activate osd

#7 Updated by Jon Heese over 1 year ago

Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As I mentioned above, I have to wipe out at least the first 100MB of the disk, and do a zap (to properly destroy the partition boundaries) in order for the `activate` to succeed. I just confirmed this right now on a test box.

#8 Updated by Curt Bruns over 1 year ago

Jon Heese wrote:

Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As I mentioned above, I have to wipe out at least the first 100MB of the disk, and do a zap (to properly destroy the partition boundaries) in order for the `activate` to succeed. I just confirmed this right now on a test box.

Adding +1 to this: I have been observing the same issue with my all NVMe test setup. I have to do the workaround that Jon outlined - using dd to wipe the drives before ceph-deploy will work.

Thanks for the workaround/debug Jon!

- Curt

#9 Updated by Hua Liu over 1 year ago

Jon Heese wrote:

Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As I mentioned above, I have to wipe out at least the first 100MB of the disk, and do a zap (to properly destroy the partition boundaries) in order for the `activate` to succeed. I just confirmed this right now on a test box.

Actually, "ceph-disk zap /dev/sde" consists of these subcommands:
(1) wipefs --all /dev/sde
(2) dd if=/dev/zero of=/dev/sde bs=1M count=10
(3) /usr/sbin/sgdisk --zap-all -- /dev/sde
(4) /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/sde

So, "ceph-disk zap /dev/sde" could be replaced by:

(1) wipefs --all /dev/sde
(2) dd if=/dev/zero of=/dev/sde bs=1M count=100
(3) /usr/sbin/sgdisk --zap-all -- /dev/sde
(4) /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/sde
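
The four steps from comment #9 can be generated with an adjustable dd size. The sketch below only prints the commands so they can be reviewed before anything is run. `zap_commands` is a hypothetical name, and it assumes the wipefs step is the standard `wipefs --all` invocation and that both sgdisk steps end in `-- DEVICE`.

```shell
#!/bin/sh
# zap_commands DEV [MIB]: print (do not run) the zap sequence for DEV,
# zeroing the first MIB MiB instead of the stock 10.
zap_commands() {
    dev=$1
    mib=${2:-100}
    echo "wipefs --all $dev"
    echo "dd if=/dev/zero of=$dev bs=1M count=$mib"
    echo "/usr/sbin/sgdisk --zap-all -- $dev"
    echo "/usr/sbin/sgdisk --clear --mbrtogpt -- $dev"
}

# Example: review the commands for a 100 MiB wipe of /dev/sde.
# zap_commands /dev/sde 100
```

Once the output looks right, it could be piped to `sh` on the OSD host.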

#10 Updated by Jon Heese over 1 year ago

So my OSDs had the default Bluestore layout the first time around, i.e. a 100MB DB/WAL (xfs) partition followed by the raw Bluestore partition comprising the rest of the disk. If others have a larger (or smaller) xfs partition, the `count` value may need to cover more (or less) than 100MB.

Also, I didn't test exactly a 100MB dd; I actually did something like 400MB just to be sure that it was wiped enough -- I'm assuming that it will need to be slightly more than 100MB (maybe 101MB would be enough?) to ensure that the ID (and/or whatever else on the raw Bluestore partition is messing up the activation) is wiped.

More testing is obviously necessary before changing the `zap` command.

Thanks!

Regards,
Jon Heese
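
The sizing argument in comment #10 is simple arithmetic: the wipe has to reach past the end of the first partition and into the start of the raw Bluestore partition. A rough sketch follows. `wipe_mib` is a hypothetical helper, and the start sector of 2048 in the example is an assumption (a common 1 MiB alignment), not something reported in this thread.

```shell
#!/bin/sh
# wipe_mib START_SECTOR SIZE_SECTORS [MARGIN_MIB] [SECTOR_BYTES]
# Print how many MiB to zero so the wipe covers the whole first partition
# plus MARGIN_MIB of whatever follows it.
wipe_mib() {
    start=$1
    size=$2
    margin=${3:-10}
    sector=${4:-512}
    end_bytes=$(( (start + size) * sector ))
    # Round the partition end up to a whole MiB, then add the margin.
    echo $(( (end_bytes + 1048575) / 1048576 + margin ))
}

# Example: a 100 MiB partition (204800 sectors) starting at sector 2048,
# with a 10 MiB margin.
wipe_mib 2048 204800 10   # prints 111
```

On a live system the two sector values can be read from `/sys/class/block/<partition>/start` and `/sys/class/block/<partition>/size` (both in 512-byte units) before the partition table is destroyed.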

#11 Updated by Greg Farnum over 1 year ago

  • Project changed from Ceph to ceph-volume

#12 Updated by Alfredo Deza over 1 year ago

  • Project changed from ceph-volume to Ceph

#13 Updated by Kefu Chai over 1 year ago

  • Status changed from New to Need Review
  • Backport set to luminous

#14 Updated by Kefu Chai over 1 year ago

  • Project changed from Ceph to RADOS
  • Category set to Administration/Usability
  • Status changed from Need Review to Pending Backport

#15 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #23103: luminous: v12.2.2 unable to create bluestore osd using ceph-disk added

#16 Updated by Nathan Cutler over 1 year ago

  • Status changed from Pending Backport to Resolved

#17 Updated by Geert Kloosterman over 1 year ago

The problem of left-over OSD data still persists when the partition table has been removed before "ceph-disk zap" is called. This happens because "ceph-disk zap" iterates over the partitions in the partition table to do its wiping. I ran into the same symptoms as described in this issue.

Could ceph-disk be adjusted to do the 110MB wipe even when there is no partition table?

#18 Updated by kobi ginon over 1 year ago

Hi all,
I'm using the following version: ceph-12.2.2-0.el7.x86_64.
It seems that even with a dd of 100MB or 110MB
I still get the error shown below.
I tried numerous possible options but with no success.
Note: I cannot use a different version at this stage.
Is there a suggested patch?

got monmap epoch 1
command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 2 --monmap /var/lib/ceph/tmp/mnt.UnObMR/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.UnObMR --osd-uuid 9abc69a8-784d-4137-9285-c585cc5fcceb --setuser ceph --setgroup disk
2018-05-14 02:22:41.939196 7fdc19dd7d00 -1 bluestore(/var/lib/ceph/tmp/mnt.UnObMR/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.UnObMR/block fsid 99bafd55-eba7-49bc-8bbc-6dd0be410e41 does not match our fsid 9abc69a8-784d-4137-9285-c585cc5fcceb
2018-05-14 02:22:42.194005 7fdc19dd7d00 -1 bluestore(/var/lib/ceph/tmp/mnt.UnObMR) mkfs fsck found fatal error: (5) Input/output error
2018-05-14 02:22:42.194045 7fdc19dd7d00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2018-05-14 02:22:42.194152 7fdc19dd7d00 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.UnObMR: (5) Input/output error
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.UnObMR
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.UnObMR
/usr/lib/python2.7/site-packages/ceph_disk/main.py:5694: UserWarning:

#19 Updated by Jon Heese over 1 year ago

I think I ran into the same thing last week reusing an OSD disk. I did a dd of /dev/zero to the disk for ~10-15 minutes and then I was able to reuse the disk successfully. I'm not sure the exact amount of zeros that I had to write nor why the typical ~100MB didn't work, but you might give that a shot and report back.

#20 Updated by kobi ginon over 1 year ago

Hi Jon, thanks a lot for the reply.

I've been fighting with this issue for a day now, and I have a very strange observation.
Can you confirm it on your side as well?

The observation is:
when an OSD is running correctly, for example on /dev/sdb (with bluestore):

#lsblk
sdb 8:16 0 1.1T 0 disk
├─sdb1 8:17 0 100M 0 part
├─sdb2 8:18 0 1.1T 0 part
├─sdb3 8:19 0 1G 0 part
└─sdb4 8:20 0 576M 0 part

Then I check the label of sdb2, which is the block partition:
[root@overcloud-ovscompute-0 ~]# ceph-bluestore-tool show-label --dev /dev/sdb2
{
    "/dev/sdb2": {
        "osd_uuid": "b7775a76-036e-4b4c-a711-f6a5c4af9f79",
        "size": 1198426492928,
        "btime": "2018-05-10 20:46:47.573155",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "7d42501d-1cd0-4db3-8d57-2f69008d8f43",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "2"
    }
}

The osd_uuid here is "b7775a76-036e-4b4c-a711-f6a5c4af9f79",
which is the partition UUID of sdb1 (the partition that holds the metadata and the links to the block device and journals):

[root@overcloud-ovscompute-0 ~]# blkid | grep sdb1
/dev/sdb1: UUID="58d3edb5-c7dd-460d-abce-97748d9199cb" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="b7775a76-036e-4b4c-a711-f6a5c4af9f79"

Then I stop the OSD and zap the disk (I tried many options),
let's say:
#sgdisk -Z /dev/sdb
#sgdisk -g /dev/sdb
#partprobe /dev/sdb

Then sdb is cleared (or so I think):

[root@overcloud-ovscompute-0 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.1T 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 100M 0 part
├─sda3 8:3 0 776.7G 0 part
├─sda4 8:4 0 1G 0 part
├─sda5 8:5 0 576M 0 part
├─sda6 8:6 0 3M 0 part
└─sda7 8:7 0 339.5G 0 part /
sdb 8:16 0 1.1T 0 disk

Now I recreate the OSD.

I'm running the OSD in a container, but the process is the same as ceph-prepare ....

When I then check the label of sdb2,
I see an osd_uuid that I can't find anywhere:

[root@overcloud-ovscompute-0 ~]# ceph-bluestore-tool show-label --dev /dev/sdb2
{
    "/dev/sdb2": {
        "osd_uuid": "b7775a76-036e-4b4c-a711-f6a5c4af9f79",
        "size": 1198426492928,
        "btime": "2018-05-10 20:46:47.573155",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "7d42501d-1cd0-4db3-8d57-2f69008d8f43",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "2"
    }
}
Looking for it like this returns nothing:
[root@overcloud-ovscompute-0 ~]# blkid | grep b7775a76-036e-4b4c-a711-f6a5c4af9f79
[root@overcloud-ovscompute-0 ~]#

and as expected the OSD can not come up.
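
One way to check whether an old label is still sitting on the raw disk after a zap is to scan the head of the device for leftover label text. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that a BlueStore device label begins with the plain-text string `bluestore block device`. `find_bluestore_labels` is a hypothetical helper name, and the 2048 MiB default is taken from the 2 GB wipe reported later in the thread.

```shell
#!/bin/sh
# find_bluestore_labels DEV [MIB]: print the byte offset of every occurrence
# of the assumed label text within the first MIB MiB of DEV.
find_bluestore_labels() {
    dev=$1
    mib=${2:-2048}
    # tr turns NUL bytes into newlines so that long zeroed regions do not
    # become one huge "line" for grep; it does not change byte offsets.
    dd if="$dev" bs=1M count="$mib" 2>/dev/null \
        | tr '\0' '\n' \
        | grep -a -b -o 'bluestore block device' \
        | cut -d: -f1
}

# Example (read-only): find_bluestore_labels /dev/sdb 2048
```

If this prints an offset after the disk has been zapped, a wipe that stops short of that offset would leave the old label in place for the next prepare to find.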

At this stage there is another strange thing I observed.
If I run the command:
#ceph-disk zap /dev/sdb
then remove the OSD and recreate it,

it comes up normally:

[root@overcloud-ovscompute-0 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 3.28339 root default
-5 2.18179 host overcloud-ovscompute-0
1 hdd 1.09090 osd.1 up 1.00000 1.00000

and then you will observe a correct osd_uuid, which is equal to the sdb1 partition UUID:

[root@overcloud-ovscompute-0 ~]# ceph-bluestore-tool show-label --dev /dev/sdb2
{
    "/dev/sdb2": {
        "osd_uuid": "c1d037b8-c832-4c0c-a9ca-63cf0e39f531",
        "size": 1198426492928,
        "btime": "2018-05-14 20:38:40.232312",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "7d42501d-1cd0-4db3-8d57-2f69008d8f43",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "1"
    }
}

and it can be found in the blkid output:
[root@overcloud-ovscompute-0 ~]# blkid | grep "c1d037b8-c832-4c0c-a9ca-63cf0e39f531"
/dev/sdb1: UUID="68b8aadd-c60d-43f9-af8c-98d8c5f44238" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="c1d037b8-c832-4c0c-a9ca-63cf0e39f531"
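
The comparison done by hand above (label osd_uuid versus the data partition's PARTUUID) can be scripted. This is a minimal sketch: `label_osd_uuid` and `label_matches_partuuid` are hypothetical helper names, and the sed pattern assumes the `"osd_uuid": "..."` layout shown in the show-label output in this comment.

```shell
#!/bin/sh
# label_osd_uuid: read show-label JSON on stdin and print the first osd_uuid.
label_osd_uuid() {
    sed -n 's/.*"osd_uuid": *"\([^"]*\)".*/\1/p' | head -n 1
}

# label_matches_partuuid LABEL_JSON_FILE PARTUUID
# Succeeds when the osd_uuid in the saved label equals PARTUUID.
label_matches_partuuid() {
    [ "$(label_osd_uuid < "$1")" = "$2" ]
}

# Example, with the device names from this comment:
#   ceph-bluestore-tool show-label --dev /dev/sdb2 > /tmp/label.json
#   label_matches_partuuid /tmp/label.json "$(blkid -s PARTUUID -o value /dev/sdb1)" \
#       || echo "stale label on /dev/sdb2"
```

A mismatch would correspond to the failing case described earlier in this comment, where the label still carries the previous OSD's uuid.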

All of this makes me think the problem is not directly related to clearing the disk. I suspect that rocksdb, which ceph uses,
keeps a wrong value for osd_uuid during recreation of the OSD.

I believe that if there were a way to put the sdb1 partition UUID into rocksdb, the OSD would come up normally.

I might be way off, but this is what I have so far.

I'm also trying your suggestion now, but I'm afraid that for large disks like mine (5.5TB)
zeroing them for 10-15 minutes will take too long, especially since I have storage nodes with 24 OSDs per server.

Any additional ideas would be welcome.

thx

#21 Updated by kobi ginon over 1 year ago

Hi again,
indeed your method also works.
In my simple test I just zeroed the first 2 GB of the disk
before zapping, creating the GPT partition table, and running partprobe,
and the OSD starts up fine.
Note: I still believe there is a relation to rocksdb somehow, and that clearing the disk forces the database to recreate the data.

# $1 is the numeric id of the OSD being removed
systemctl stop ceph-osd@sdb
rm -Rf /var/lib/ceph/tmp/*
ceph auth del osd.$1
ceph osd crush remove osd.$1
ceph osd rm osd.$1
docker rm "ceph-osd-prepare-overcloud-ovscompute-0-sdb"
# zero the first 2 GB, then wipe and recreate the partition table
dd if=/dev/zero of=/dev/sdb bs=1M count=2048
sgdisk -Z /dev/sdb
sgdisk -g /dev/sdb
partprobe /dev/sdb

#...prepare ceph osd
#...start ceph osd

regards

#22 Updated by Jon Heese over 1 year ago

kobi ginon wrote:

Note: I still believe there is a relation to rocksdb somehow, and that clearing the disk forces the database to recreate the data.

Yes, I believe that's probably pretty close to what's going on here, but I can't say for sure one way or the other.

FYI, the OSD machines that I'm using have 24x 4TB SATA disks, so 10-15 minutes of dd is usually 120-180GB, so probably overkill. I'm glad that you found that 2GB was enough to clear out whatever was persisting the OSD ID on the disks. I'll use 2G for future wipe operations.

Regards,
Jon Heese
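
The figures in comment #22 (120-180GB written in 10-15 minutes) work out to roughly 200 MB/s per disk. A rough sketch of the resulting wipe time follows. `wipe_seconds` is a hypothetical helper, and the 200 MB/s rate is only what those figures imply, not a measured value.

```shell
#!/bin/sh
# wipe_seconds MIB MB_PER_SEC
# Print the time in whole seconds (rounded up) to write MIB MiB of zeros
# at a sustained rate of MB_PER_SEC (decimal megabytes per second).
wipe_seconds() {
    bytes=$(( $1 * 1048576 ))
    rate=$(( $2 * 1000000 ))
    echo $(( (bytes + rate - 1) / rate ))
}

# Example: a 2048 MiB wipe at 200 MB/s.
wipe_seconds 2048 200   # prints 11
```

At that rate a 2 GB wipe takes seconds per disk rather than minutes, which addresses the concern about nodes with 24 OSDs.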
