Bug #6955
closed
ceph-disk should set the guid correctly when re-using a partition
Description
If an existing ceph data partition is re-used, ceph-disk does not reset the partition type GUID (typecode) when it prepares the partition. As a consequence, the ceph udev rules do not identify it as a ceph data partition, and the OSD won't start when the machine boots.
Steps to reproduce:
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --typecode=1:99999999-9d25-41b8-afd0-062c0ceff05d /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
loic@fold:~/software/ceph/ceph/src$ sudo partprobe /dev/loop2
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'
loic@fold:~/software/ceph/ceph/src$ #sudo env PATH=$PATH ceph-disk --verbose prepare /dev/loop2 /tmp/journal
loic@fold:~/software/ceph/ceph/src$ touch /tmp/journal
loic@fold:~/software/ceph/ceph/src$ sudo env PATH=$PATH ceph-disk --verbose prepare /dev/loop2p1 /tmp/journal
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_type
DEBUG:ceph-disk:Journal is file /tmp/journal
WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
DEBUG:ceph-disk:OSD data device /dev/loop2p1 is a partition
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1 isize=2048 agcount=4, agsize=25535 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=102139, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=1232, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on /var/lib/ceph/tmp/mnt.IUm2Xm with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.IUm2Xm/journal -> /tmp/journal
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.IUm2Xm
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.IUm2Xm
umount: /var/lib/ceph/tmp/mnt.IUm2Xm: device is busy.
(In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.IUm2Xm
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2p1
Error: Partition(s) 1 on /dev/loop2p1 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'
loic@fold:~/software/ceph/ceph/src$ sudo partprobe /dev/loop2
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'
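The manual check in the session above (reading the "Partition GUID code" line from sgdisk --info before and after the prepare) could be automated along these lines. This is only a sketch, not ceph-disk code: the helper names are hypothetical, and the constant assumes that the type GUID expected for a ceph OSD data partition is 4fbd7e29-9d25-41b8-afd0-062c0ceff05d, which this report does not state.

```python
import re

# Assumed type GUID for a ceph OSD data partition (not stated in this report).
CEPH_OSD_DATA_TYPE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"

# Matches the type GUID line only, not the "Partition unique GUID" line.
_GUID_CODE = re.compile(r"Partition GUID code:\s*([0-9A-Fa-f-]{36})")


def partition_type_guid(sgdisk_info):
    """Return the lowercase type GUID from `sgdisk --info` output, or None."""
    match = _GUID_CODE.search(sgdisk_info)
    return match.group(1).lower() if match else None


def is_ceph_data(sgdisk_info):
    """True if the partition carries the assumed ceph OSD data type GUID."""
    return partition_type_guid(sgdisk_info) == CEPH_OSD_DATA_TYPE


if __name__ == "__main__":
    # Two lines copied from the session in this report.
    sample = (
        "Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)\n"
        "Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB\n"
    )
    print(partition_type_guid(sample), is_ceph_data(sample))
```

Fed the output captured after the prepare, the check reports that the partition still carries the 99999999-... typecode, which is the symptom described here.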
See mailing list thread: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-December/006650.html
Updated by Loïc Dachary about 9 years ago
- Subject changed from ceph-disk should set the guid correctly (it currently does not) to ceph-disk should set the guid correctly when re-using a partition
- Description updated (diff)
- Status changed from New to 12
- Priority changed from Normal to High
Updated by Sage Weil about 9 years ago
- Assignee set to Loïc Dachary
Loic, can you look? I thought we fixed this?
Updated by Loïc Dachary almost 9 years ago
- Status changed from 12 to In Progress
- Regression set to No
Updated by Loïc Dachary almost 9 years ago
- Status changed from In Progress to Need More Info
Sage,
If we set the typecode / uuid of an existing partition (something like https://github.com/dachary/ceph/commit/5da9d01720da2e94f6e29f12d29429869c38655e#diff-788c3cea6213c27f5fdb22f8337096d5R1403 and figuring out the partition number instead of the hardcoded 1), an existing partition given to ceph-disk will be reset. As of now, preparing an existing partition does not change the typecode / uuid and the partition will not be reset.
Do we really want to change this behavior? I think changing it would make ceph-disk more predictable: people are confused that re-using an existing partition for data requires additional manual steps to reset the typecode. But it may also wipe out an existing partition if the wrong partition is given as an argument.
Maybe we could add --yes-the-data-partition-will-wiped-out-and-all-data-it-currently-contains-will-be-lost to confirm?
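To make the "figuring out the partition number" part concrete, here is a rough sketch of how the number could be derived from the device path and turned into the same sgdisk --typecode call used in the steps to reproduce. It is not the code from the linked commit. The function names are hypothetical, the type GUID constant is an assumption, and the name-based split is a heuristic that only holds when the path is already known to be a partition.

```python
import re

# Assumed type GUID for a ceph OSD data partition (not stated in this thread).
CEPH_OSD_DATA_TYPE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"


def split_partition(path):
    """Split a partition path into (disk, partition number) by name only.

    Heuristic: assumes `path` is already known to be a partition. A whole
    disk such as /dev/loop2 would be misread, so real code should ask the
    kernel (for example through /sys/block) instead of parsing names.
    """
    # Disks whose name ends in a digit use a "p" separator: loop2p1, nvme0n1p2.
    match = re.match(r"^(.*\d)p(\d+)$", path)
    if match is None:
        # Otherwise the number follows the disk name directly: sda1.
        match = re.match(r"^(.*\D)(\d+)$", path)
    if match is None:
        raise ValueError("cannot derive a partition number from %r" % path)
    return match.group(1), int(match.group(2))


def typecode_command(path, type_guid=CEPH_OSD_DATA_TYPE):
    """Build (without running) the sgdisk command that resets the typecode."""
    disk, number = split_partition(path)
    return ["sgdisk", "--typecode=%d:%s" % (number, type_guid), disk]


if __name__ == "__main__":
    print(" ".join(typecode_command("/dev/loop2p1")))
```

The command is only built, not executed, so the sketch can be tried without touching a partition table.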
Updated by Sage Weil almost 9 years ago
- Status changed from Need More Info to 12
I think we should change it. Adding a --yes-i-really-mean-it type flag sounds good.
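A confirmation flag of that kind might look like the following sketch. It is not ceph-disk's actual argument parser: the program name, the `confirmed` attribute and the `may_reset` helper are all hypothetical, and only the guard itself is modelled.

```python
import argparse


def build_parser():
    """Hypothetical parser: only the confirmation flag is modelled here."""
    parser = argparse.ArgumentParser(prog="prepare-sketch")
    parser.add_argument("data", help="disk or partition for OSD data")
    parser.add_argument(
        "--yes-i-really-mean-it",
        action="store_true",
        dest="confirmed",
        help="allow an existing partition to be reset",
    )
    return parser


def may_reset(is_existing_partition, confirmed):
    """Resetting an existing partition is allowed only when confirmed."""
    return confirmed or not is_existing_partition


if __name__ == "__main__":
    args = build_parser().parse_args(["/dev/loop2p1"])
    # Without the flag, an existing partition is left alone.
    print(may_reset(True, args.confirmed))
```

Defaulting to refusal keeps the current behavior for anyone who passes the wrong partition by mistake, while the flag opts in to the reset discussed above.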