Bug #14632

ceph-deploy osd create does not activate the disks sometimes

Added by Tamilarasi muthamizhan over 3 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Urgent
Assignee:
Target version:
-
Start date:
02/03/2016
Due date:
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

OS: RHEL 7.1
ceph version: firefly [RH Ceph 1.2.3]

This is an intermittent issue that we saw on RHEL 7.1: ceph-deploy osd create sometimes does not activate the disks, and we had to manually activate them again to bring the OSDs up and active.

We have also been seeing the opposite with the osd prepare command: it sometimes activates the disk, which it is not supposed to do.
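For reference, a rough sketch of the manual activation step we have to run (host and device names here are placeholders, not taken from a specific run):

# From the admin node, activate the partition that was left only prepared
ceph-deploy osd activate <hostname>:/dev/sd<X>1

# Or directly on the OSD node
sudo ceph-disk activate /dev/sd<X>1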

attachment.txt (40.3 KB) - Russell Islam, 05/27/2016 06:07 PM

History

#1 Updated by Nathan Cutler over 3 years ago

Actually, in hammer it's expected for prepare alone to also activate, because the activate step is triggered by udev events.
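If the udev-triggered activation did not fire, the add event can be re-sent by hand, roughly like this (a sketch; sdX1 stands for the prepared data partition):

sudo udevadm trigger --action=add --sysname-match sdX1
sudo udevadm settle --timeout=600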

#2 Updated by Russell Islam over 3 years ago

This is a bug on the RHEL 7.2 distribution. Every time we call ceph-deploy osd create with the zap-disk option,
the OSD does not start automatically; you either have to manually activate the disk or reboot the machine.
I believe this bug is caused by the newer version of udevadm: the udev event should trigger the activation, but
unfortunately it is not happening.
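One way to check whether the udev events are being delivered at all is to watch them while re-reading the partition table (a sketch; /dev/sdX is a placeholder for the OSD disk):

# Terminal 1: watch udev events for block devices
sudo udevadm monitor --udev --subsystem-match=block

# Terminal 2: re-read the partition table so new events are generated
sudo partprobe /dev/sdX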

#3 Updated by Stefan Eriksson about 3 years ago

Hi, we see this as well on hammer 0.94.7 and CentOS 7.2.1511.

I have to run the activation manually:

"ceph-disk-activate /dev/sdX1" on the OSD node that physically has the disk.

The OSD is not coming up after:

ceph-deploy osd prepare ceph01-osd02:sdg:/journals/osd.11

[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephcluster/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.33): /bin/ceph-deploy osd prepare ceph01-osd02:sdg:/journals/osd.11
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] disk : [('ceph01-osd02', '/dev/sdg', '/journals/osd.11')]
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : prepare
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0xbce320>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] func : <function osd at 0xbc0c08>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph01-osd02:/dev/sdg:/journals/osd.11
[ceph01-osd02][DEBUG ] connection detected need for sudo
[ceph01-osd02][DEBUG ] connected to host: ceph01-osd02
[ceph01-osd02][DEBUG ] detect platform information from remote host
[ceph01-osd02][DEBUG ] detect machine type
[ceph01-osd02][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph01-osd02
[ceph01-osd02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host ceph01-osd02 disk /dev/sdg journal /journals/osd.11 activate False
[ceph01-osd02][DEBUG ] find the location of an executable
[ceph01-osd02][INFO ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/sdg /journals/osd.11
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[ceph01-osd02][WARNIN] prepare_file: Creating journal file /journals/osd.11 with size 0 (ceph-osd will resize and allocate)
[ceph01-osd02][WARNIN] prepare_file: Journal is file /journals/osd.11
[ceph01-osd02][WARNIN] prepare_file: OSD will not be hot-swappable if journal is not the same device as the osd data
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] set_data_partition: Creating osd partition on /dev/sdg
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] ptype_tobe_for_name: name = data
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] create_partition: Creating data partition num 1 size 0 on /dev/sdg
[ceph01-osd02][WARNIN] command_check_call: Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:0f41ea64-cb7c-4fe1-8d29-ba349824629f --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/sdg
[ceph01-osd02][DEBUG ] The operation has completed successfully.
[ceph01-osd02][WARNIN] update_partition: Calling partprobe on created device /dev/sdg
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[ceph01-osd02][WARNIN] command: Running command: /sbin/partprobe /dev/sdg
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg1 uuid path is /sys/dev/block/8:97/dm/uuid
[ceph01-osd02][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sdg1
[ceph01-osd02][WARNIN] command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdg1
[ceph01-osd02][DEBUG ] meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=121896255 blks
[ceph01-osd02][DEBUG ] = sectsz=512 attr=2, projid32bit=1
[ceph01-osd02][DEBUG ] = crc=0 finobt=0
[ceph01-osd02][DEBUG ] data = bsize=4096 blocks=487585019, imaxpct=5
[ceph01-osd02][DEBUG ] = sunit=0 swidth=0 blks
[ceph01-osd02][DEBUG ] naming =version 2 bsize=4096 ascii-ci=0 ftype=0
[ceph01-osd02][DEBUG ] log =internal log bsize=4096 blocks=238078, version=2
[ceph01-osd02][DEBUG ] = sectsz=512 sunit=0 blks, lazy-count=1
[ceph01-osd02][DEBUG ] realtime =none extsz=4096 blocks=0, rtextents=0
[ceph01-osd02][WARNIN] mount: Mounting /dev/sdg1 on /var/lib/ceph/tmp/mnt.2m5zsG with options noatime,inode64
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdg1 /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] populate_data_path: Preparing osd data dir /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.2m5zsG/ceph_fsid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.2m5zsG/ceph_fsid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.2m5zsG/fsid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.2m5zsG/fsid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.2m5zsG/magic.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.2m5zsG/magic.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.2m5zsG/journal_uuid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.2m5zsG/journal_uuid.5801.tmp
[ceph01-osd02][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.2m5zsG
[ceph01-osd02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdg uuid path is /sys/dev/block/8:96/dm/uuid
[ceph01-osd02][WARNIN] command_check_call: Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdg
[ceph01-osd02][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph01-osd02][DEBUG ] The new table will be used at the next reboot.
[ceph01-osd02][DEBUG ] The operation has completed successfully.
[ceph01-osd02][WARNIN] update_partition: Calling partprobe on prepared device /dev/sdg
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[ceph01-osd02][WARNIN] command: Running command: /sbin/partprobe /dev/sdg
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[ceph01-osd02][WARNIN] command_check_call: Running command: /usr/bin/udevadm trigger --action=add --sysname-match sdg1
[ceph01-osd02][INFO ] checking OSD status...
[ceph01-osd02][DEBUG ] find the location of an executable
[ceph01-osd02][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph01-osd02 is now ready for osd use
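
For reference, a sketch of checking and activating the prepared partition by hand on the OSD node (device name as in the log above):

# Show whether the data partition is still only "prepared" or already "active"
sudo ceph-disk list

# Manually activate the prepared data partition
sudo ceph-disk activate /dev/sdg1

# Confirm the OSD is up and in
ceph osd tree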

#4 Updated by Russell Islam about 3 years ago

I have been seeing this problem for a long time on the Red Hat 7.2 based distribution. After creating the OSD with the zap-disk option, the OSD daemon does not start automatically, so all the OSDs are down and out. See attached.
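
For reference, a rough sketch of bringing the down OSDs back without a reboot, assuming the partitions themselves were prepared correctly:

# Activate every partition tagged as a Ceph OSD data partition
sudo ceph-disk activate-all

# Verify the OSDs come back up and in
ceph osd tree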

#5 Updated by Ian Colle almost 3 years ago

  • Assignee set to Alfredo Deza

#6 Updated by Alfredo Deza almost 3 years ago

  • Status changed from New to Need More Info

If you could try to reproduce this by running ceph-disk directly, it would help determine whether ceph-deploy is doing anything that is at fault.

For example, taking the output above that uses:

ceph-deploy osd prepare ceph01-osd02:sdg:/journals/osd.11

That would mean running this on the remote server:

sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/sdg /journals/osd.11
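
Afterwards, checking whether the OSD actually came up would help narrow this down; a sketch, using the device from the example above:

sudo ceph-disk list    # the data partition should show as "active", not just "prepared"
ceph osd tree          # the new OSD should be up and in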

#7 Updated by Alfredo Deza over 2 years ago

  • Status changed from Need More Info to Closed

Closed as no updates were received. Feel free to re-open with more information (as requested).
