Bug #9665
closed
ceph-disk zap should call partprobe
% Done: 90%
Source: other
Backport: giant, firefly, dumpling
Severity: 3 - minor
Description
User description
Symptoms:
- A disk is used by an OSD
- The OSD is no longer useful and the disk is cleared
- The disk is prepared for a new OSD
- The new OSD is prepared but does not activate
Diagnostic:
- The OSD is not activated because the /dev/disk/by-partuuid symbolic link is not updated by udev.
Workaround:
- Reboot the machine
Fix:
- When the disk is cleared via ceph-disk zap (which is called indirectly by ceph-deploy zap), it must notify the kernel via partprobe or partx.
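The proposed fix can be sketched as follows. This is a minimal illustration, not the patch that was merged: the function name `zap_and_reread` and the injectable `run` parameter are hypothetical, and the two `sgdisk` invocations are an assumption about what zap does to wipe the table. The `partprobe` and `udevadm settle` calls mirror the ones ceph-disk prepare already logs in the transcript further down.

```python
import subprocess


def zap_and_reread(dev, run=subprocess.check_call):
    """Wipe the partition table of `dev`, then notify the kernel and udev.

    `run` is injectable so the command sequence can be inspected without
    touching a real disk (hypothetical helper, for illustration only).
    """
    commands = [
        # Assumed wipe step: destroy the GPT and MBR data structures.
        ["sgdisk", "--zap-all", "--", dev],
        # Assumed wipe step: write a fresh, empty GPT.
        ["sgdisk", "--clear", "--mbrtogpt", "--", dev],
        # The step this bug asks for: make the kernel re-read the table.
        ["partprobe", dev],
        # Wait until udev has processed the resulting events.
        ["udevadm", "settle"],
    ]
    for command in commands:
        run(command)
    return commands
```

Running this for real requires root and destroys the partition table of `dev`. Passing `run=print` shows the commands without executing them.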
Description
Not calling partprobe after zap may create situations that confuse udev:
- ceph-disk prepare /dev/loop2
- links are created in /dev/disk/by-partuuid
- ceph-disk zap /dev/loop2
- links are not removed from /dev/disk/by-partuuid
- ceph-disk prepare /dev/loop2
- some links are not created in /dev/disk/by-partuuid
In the following transcript, note that the /dev/loop2p2 link in /dev/disk/by-partuuid does not carry the current UUID. Running udevadm monitor -e further shows that /dev/loop2p2 is not removed as it should be, although /dev/loop2p1 is.
# ./ceph-disk $CEPH_DISK_ARGS prepare /dev/loop2
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Will colocate journal with data on /dev/loop2
DEBUG:ceph-disk:Creating journal partition num 2 size 100 on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:100M --change-name=2:ceph journal --partition-guid=2:f5189072-ce9d-4522-8336-b4b96c25c023 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/f5189072-ce9d-4522-8336-b4b96c25c023
DEBUG:ceph-disk:Creating osd partition on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:14ae949b-75f1-4842-b48a-746be713b7e0 --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1           isize=2048   agcount=4, agsize=6335 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=25339, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=1232, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on test-ceph-disk/tmp/mnt.83v37u with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 test-ceph-disk/tmp/mnt.83v37u
DEBUG:ceph-disk:Preparing osd data dir test-ceph-disk/tmp/mnt.83v37u
DEBUG:ceph-disk:Creating symlink test-ceph-disk/tmp/mnt.83v37u/journal -> /dev/disk/by-partuuid/f5189072-ce9d-4522-8336-b4b96c25c023
DEBUG:ceph-disk:Unmounting test-ceph-disk/tmp/mnt.83v37u
INFO:ceph-disk:Running command: /bin/umount -- test-ceph-disk/tmp/mnt.83v37u
INFO:ceph-disk:Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct  6 14:40 14ae949b-75f1-4842-b48a-746be713b7e0 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct  6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
# ceph-disk zap /dev/loop2
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.
Warning! One or more CRCs don't match. You should repair the disk!
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct  6 14:40 14ae949b-75f1-4842-b48a-746be713b7e0 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct  6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
# ./ceph-disk $CEPH_DISK_ARGS prepare /dev/loop2
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Will colocate journal with data on /dev/loop2
DEBUG:ceph-disk:Creating journal partition num 2 size 100 on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:100M --change-name=2:ceph journal --partition-guid=2:046fe72b-5ee2-494d-950c-54a4995f6f4e --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/046fe72b-5ee2-494d-950c-54a4995f6f4e
DEBUG:ceph-disk:Creating osd partition on /dev/loop2
INFO:ceph-disk:Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:3fdcaea2-c113-480e-8bdf-90534c5e2a67 --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
INFO:ceph-disk:Running command: /bin/udevadm settle
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1           isize=2048   agcount=4, agsize=6335 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=25339, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=1232, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on test-ceph-disk/tmp/mnt.gxIGef with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 test-ceph-disk/tmp/mnt.gxIGef
DEBUG:ceph-disk:Preparing osd data dir test-ceph-disk/tmp/mnt.gxIGef
DEBUG:ceph-disk:Creating symlink test-ceph-disk/tmp/mnt.gxIGef/journal -> /dev/disk/by-partuuid/046fe72b-5ee2-494d-950c-54a4995f6f4e
DEBUG:ceph-disk:Unmounting test-ceph-disk/tmp/mnt.gxIGef
INFO:ceph-disk:Running command: /bin/umount -- test-ceph-disk/tmp/mnt.gxIGef
INFO:ceph-disk:Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2
# ls -l /dev/disk/by-partuuid
total 0
lrwxrwxrwx 1 root root 13 Oct  6 14:41 3fdcaea2-c113-480e-8bdf-90534c5e2a67 -> ../../loop2p1
lrwxrwxrwx 1 root root 13 Oct  6 14:40 f5189072-ce9d-4522-8336-b4b96c25c023 -> ../../loop2p2
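The mismatch in the last listing (loop2p2 still linked under the old GUID f5189072-..., with no link for the new journal GUID 046fe72b-...) can be detected mechanically. The sketch below is only an illustration: the function name `compare_partuuid_links` and the shape of its `expected` argument are hypothetical, and the caller is assumed to know which GUIDs were just passed to sgdisk.

```python
import os


def compare_partuuid_links(link_dir, expected):
    """Return (stale, missing) link names for the partitions in `expected`.

    `expected` maps partition GUID -> device node name, e.g. "loop2p1".
    A link is stale when it points at one of those device nodes under a
    GUID that is no longer expected; a GUID is missing when it has no link.
    """
    present = {}
    for name in os.listdir(link_dir):
        path = os.path.join(link_dir, name)
        if os.path.islink(path):
            # Links look like "../../loop2p1"; keep only the node name.
            present[name] = os.path.basename(os.readlink(path))
    devices = set(expected.values())
    stale = sorted(
        name for name, dev in present.items()
        if dev in devices and name not in expected
    )
    missing = sorted(guid for guid in expected if guid not in present)
    return stale, missing
```

Called on /dev/disk/by-partuuid with the two GUIDs from the second prepare run, it would report the f5189072-... link as stale and the 046fe72b-... link as missing.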
Updated by Loïc Dachary over 9 years ago
- Status changed from 12 to 7
- % Done changed from 0 to 60
Updated by Loïc Dachary over 9 years ago
- Backport set to giant, firefly, emperor, dumpling
Updated by Loïc Dachary over 9 years ago
<dvanders> loicd: I saw that change. you factorized the "update partitions" part, but the issue i observe is that partx/partprobe is not triggering udev correctly on a loaded server
<loicd> oh really ?
<loicd> dam
<dvanders> i never observed this on our (idle) test cluster
<loicd> dvanders: did you trace this back to a known bug ?
<dvanders> but now that i tried to reuse a journal partition on our busy prod cluster, the new journal symlink isn't appearing
<dvanders> i didn't find a known bug about this
In the context of https://github.com/ceph/ceph/pull/2955
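For the loaded-server case described in the exchange above, a caller could at least detect that the journal symlink never showed up instead of failing later at activation. The following is a hedged sketch, not something taken from the pull request: `wait_for_link` and its default timeout and interval are hypothetical, and polling only reports the problem, it does not make udev process a missed event.

```python
import os
import time


def wait_for_link(path, timeout=10.0, interval=0.2):
    """Poll until `path` exists (even as a dangling symlink) or time runs out.

    Returns True if the link appeared within `timeout` seconds, else False.
    """
    deadline = time.monotonic() + timeout
    while True:
        # lexists() is true for a symlink even when its target is absent.
        if os.path.lexists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass the /dev/disk/by-partuuid/ path of the partition GUID it has just created and treat a False result as the condition dvanders reports.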
Updated by Loïc Dachary over 9 years ago
- Status changed from 7 to Resolved
- % Done changed from 60 to 100
Updated by Loïc Dachary over 9 years ago
- Status changed from Resolved to Pending Backport
- % Done changed from 100 to 90
let's wait a week or two before backporting
Updated by Loïc Dachary over 9 years ago
- giant backport https://github.com/ceph/ceph/pull/3005
Updated by Loïc Dachary over 9 years ago
- Status changed from Pending Backport to Fix Under Review
Updated by Loïc Dachary over 9 years ago
- Backport changed from giant, firefly, emperor, dumpling to giant, firefly, dumpling
- firefly backport https://github.com/ceph/ceph/pull/3014
- dumpling backport https://github.com/ceph/ceph/pull/3015
Updated by Loïc Dachary over 9 years ago
- Status changed from Fix Under Review to Resolved