Bug #9665
Updated by Loïc Dachary about 9 years ago
h3. User description
How to reproduce the problem:
* A disk is used by an OSD
* The OSD is no longer useful and the disk is cleared
* The disk is prepared for a new OSD
* The new OSD is prepared but does not activate
Problem description:
The OSD is not activated because the /dev/disk/by-partuuid symbolic link is not updated by udev.
Workaround:
* Reboot the machine
Fix:
* When the disk is cleared via ceph-disk zap (which is called indirectly by ceph-deploy zap), it must notify the kernel via partprobe or partx.
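
The proposed fix can be sketched as follows. This is a minimal illustration, not the actual ceph-disk patch: the function names @zap_commands@ and @zap@ are hypothetical, and the two sgdisk invocations are an assumption about what a zap runs. The point is only that partprobe and udevadm settle follow the destructive step, as they already do in the prepare logs.

```python
import subprocess


def zap_commands(dev):
    """Return the commands for a zap that also notifies the kernel.

    Assumption: the sgdisk calls stand in for whatever zap really runs.
    The partprobe and udevadm settle steps are the proposed addition.
    """
    return [
        ["sgdisk", "--zap-all", "--", dev],
        ["sgdisk", "--clear", "--mbrtogpt", "--", dev],
        # Ask the kernel to re-read the (now empty) partition table.
        ["partprobe", dev],
        # Wait until udev has processed the resulting events.
        ["udevadm", "settle"],
    ]


def zap(dev, run=subprocess.check_call):
    """Run each command in order; check_call raises on the first failure."""
    for cmd in zap_commands(dev):
        run(cmd)
```

Passing a different @run@ callable (for instance @print@) shows the command sequence without touching a disk.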
h3. Description
Not calling partprobe after zap can create situations that confuse udev:
* ceph-disk prepare /dev/loop2
* links are created in /dev/disk/by-partuuid
* ceph-disk zap /dev/loop2
* links are *not* removed from /dev/disk/by-partuuid
* ceph-disk prepare /dev/loop2
* some links are not created in /dev/disk/by-partuuid
In the following log, note that the /dev/loop2p2 link in /dev/disk/by-partuuid does not have the current UUID. Running *udevadm monitor -e* further shows that /dev/loop2p2 is not removed as it should be, although /dev/loop2p1 is.
<pre>
# ./ceph-disk $CEPH_DISK_ARGS prepare /dev/loop2
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Will colocate journal with data on /dev/loop2
DEBUG:ceph-disk:Creating journal partition num 2 size 100 on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:100M --change-name=2:ceph journal --partition-guid=2:f5189072-ce9d-4522-8336-b4b96c25c023 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/f5189072-ce9d-4522-8336-b4b96c25c023
DEBUG:ceph-disk:Creating osd partition on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:14ae949b-75f1-4842-b48a-746be713b7e0 --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1 isize=2048 agcount=4, agsize=6335 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=25339, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=1232, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on test-ceph-disk/tmp/mnt.83v37u with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 test-ceph-disk/tmp/mnt.83v37u
DEBUG:ceph-disk:Preparing osd data dir test-ceph-disk/tmp/mnt.83v37u
DEBUG:ceph-disk:Creating symlink test-ceph-disk/tmp/mnt.83v37u/journal -> /dev/disk/by-partuuid/f5189072-ce9d-4522-8336-b4b96c25c023
DEBUG:ceph-disk:Unmounting test-ceph-disk/tmp/mnt.83v37u
INFO:ceph-disk:Running command: /bin/umount -- test-ceph-disk/tmp/mnt.83v37u
INFO:ceph-disk:Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct 6 14:40 14ae949b-75f1-4842-b48a-746be713b7e0 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct 6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
# ceph-disk zap /dev/loop2
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.
Warning! One or more CRCs don't match. You should repair the disk!
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct 6 14:40 14ae949b-75f1-4842-b48a-746be713b7e0 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct 6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
# ./ceph-disk $CEPH_DISK_ARGS prepare /dev/loop2
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Will colocate journal with data on /dev/loop2
DEBUG:ceph-disk:Creating journal partition num 2 size 100 on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:100M --change-name=2:ceph journal --partition-guid=2:046fe72b-5ee2-494d-950c-54a4995f6f4e --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/046fe72b-5ee2-494d-950c-54a4995f6f4e
DEBUG:ceph-disk:Creating osd partition on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:3fdcaea2-c113-480e-8bdf-90534c5e2a67 --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1 isize=2048 agcount=4, agsize=6335 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=25339, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=1232, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on test-ceph-disk/tmp/mnt.gxIGef with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 test-ceph-disk/tmp/mnt.gxIGef
DEBUG:ceph-disk:Preparing osd data dir test-ceph-disk/tmp/mnt.gxIGef
DEBUG:ceph-disk:Creating symlink test-ceph-disk/tmp/mnt.gxIGef/journal -> /dev/disk/by-partuuid/046fe72b-5ee2-494d-950c-54a4995f6f4e
DEBUG:ceph-disk:Unmounting test-ceph-disk/tmp/mnt.gxIGef
INFO:ceph-disk:Running command: /bin/umount -- test-ceph-disk/tmp/mnt.gxIGef
INFO:ceph-disk:Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct 6 14:41 3fdcaea2-c113-480e-8bdf-90534c5e2a67 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct 6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
#
</pre>
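
The stale link can also be detected programmatically rather than by eye. The sketch below is a hypothetical helper, not part of ceph-disk: it compares the links found in /dev/disk/by-partuuid with the GUIDs passed to sgdisk via --partition-guid and reports partitions whose expected link is missing or points elsewhere.

```python
import os


def read_links(directory="/dev/disk/by-partuuid"):
    """Map each symlink name (a partition GUID) to the device basename it targets."""
    links = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            links[name] = os.path.basename(os.readlink(path))
    return links


def stale_partitions(links, expected):
    """Return the partitions whose expected GUID has no matching link.

    links:    {guid: device basename}, for example from read_links()
    expected: {device basename: guid}, taken from the --partition-guid arguments
    """
    return sorted(
        dev for dev, guid in expected.items() if links.get(guid) != dev
    )
```

With the values from the second prepare run above, @links@ holds 3fdcaea2-c113-480e-8bdf-90534c5e2a67 for loop2p1 and the old f5189072-ce9d-4522-8336-b4b96c25c023 for loop2p2, while @expected@ maps loop2p2 to 046fe72b-5ee2-494d-950c-54a4995f6f4e, so only loop2p2 is reported.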