Bug #6955

closed

ceph-disk should set the guid correctly when re-using a partition

Added by Alfredo Deza over 10 years ago. Updated over 8 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
ceph cli
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If an existing ceph data partition is re-used, ceph-disk does not reset its partition type GUID (the "Partition GUID code" reported by sgdisk --info) when preparing it. As a consequence, the ceph udev rules will not identify it as a ceph data partition and the OSD won't start when the machine boots.

Steps to reproduce:

loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --typecode=1:99999999-9d25-41b8-afd0-062c0ceff05d /dev/loop2
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
loic@fold:~/software/ceph/ceph/src$ sudo partprobe /dev/loop2
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'
loic@fold:~/software/ceph/ceph/src$ #sudo env PATH=$PATH ceph-disk --verbose prepare /dev/loop2 /tmp/journal
loic@fold:~/software/ceph/ceph/src$ touch /tmp/journal
loic@fold:~/software/ceph/ceph/src$ sudo env PATH=$PATH ceph-disk --verbose prepare /dev/loop2p1 /tmp/journal
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_type
DEBUG:ceph-disk:Journal is file /tmp/journal
WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
DEBUG:ceph-disk:OSD data device /dev/loop2p1 is a partition
DEBUG:ceph-disk:Creating xfs fs on /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/loop2p1
meta-data=/dev/loop2p1           isize=2048   agcount=4, agsize=25535 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=102139, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=1232, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
DEBUG:ceph-disk:Mounting /dev/loop2p1 on /var/lib/ceph/tmp/mnt.IUm2Xm with options noatime,inode64
INFO:ceph-disk:Running command: mount -t xfs -o noatime,inode64 -- /dev/loop2p1 /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.IUm2Xm/journal -> /tmp/journal
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.IUm2Xm
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.IUm2Xm
umount: /var/lib/ceph/tmp/mnt.IUm2Xm: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.IUm2Xm
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.IUm2Xm
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/loop2p1
INFO:ceph-disk:Running command: /sbin/partprobe /dev/loop2p1
Error: Partition(s) 1 on /dev/loop2p1 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  You should reboot now before making further changes.
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'
loic@fold:~/software/ceph/ceph/src$ sudo partprobe /dev/loop2
loic@fold:~/software/ceph/ceph/src$ sudo sgdisk --info 1 /dev/loop2
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
First sector: 2048 (at 1024.0 KiB)
Last sector: 819166 (at 400.0 MiB)
Partition size: 817119 sectors (399.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph data'

See mailing list thread: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-December/006650.html
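
The condition demonstrated in the steps above can be checked from the sgdisk --info output alone. The following is a minimal Python sketch, not code from ceph-disk: the helper names are hypothetical, and the constant is assumed to be the type GUID that the ceph udev rules match for OSD data partitions, so verify it against the rules shipped with your release.

```python
import re

# Assumption: type GUID matched by the ceph udev rules for OSD data partitions.
CEPH_OSD_DATA_TYPECODE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"


def parse_typecode(sgdisk_info):
    """Return the lowercased 'Partition GUID code' from `sgdisk --info` output."""
    match = re.search(
        r"^Partition GUID code:\s*([0-9A-Fa-f-]{36})", sgdisk_info, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Partition GUID code' line found")
    return match.group(1).lower()


def is_ceph_data_partition(sgdisk_info):
    """True if the reported type GUID is the assumed ceph OSD data typecode."""
    return parse_typecode(sgdisk_info) == CEPH_OSD_DATA_TYPECODE


# Lines copied from the report: the state after `ceph-disk prepare` has run.
SAMPLE = """\
Partition GUID code: 99999999-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 66D87896-493A-48B4-A803-6ED6DE5D49AB
Partition name: 'ceph data'
"""

print(is_ceph_data_partition(SAMPLE))  # False
```

The partition name is still 'ceph data', but only the type GUID is compared here, which is why the re-used partition fails the check.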

Actions #1

Updated by Loïc Dachary about 9 years ago

  • Subject changed from ceph-disk should set the guid correctly (it currently does not) to ceph-disk should set the guid correctly when re-using a partition
  • Description updated (diff)
  • Status changed from New to 12
  • Priority changed from Normal to High

Actions #2

Updated by Sage Weil about 9 years ago

  • Assignee set to Loïc Dachary

Loic, can you look? I thought we fixed this?

Actions #3

Updated by Loïc Dachary almost 9 years ago

  • Status changed from 12 to In Progress
  • Regression set to No

Actions #4

Updated by Loïc Dachary almost 9 years ago

  • Status changed from In Progress to Need More Info

Sage,

If we set the typecode / uuid of an existing partition (something like https://github.com/dachary/ceph/commit/5da9d01720da2e94f6e29f12d29429869c38655e#diff-788c3cea6213c27f5fdb22f8337096d5R1403 and figuring out the partition number instead of the hardcoded 1), an existing partition given to ceph-disk will be reset. As of now, preparing an existing partition does not change the typecode / uuid and the partition will not be reset.

Do we really want to change this behavior? I think it makes the ceph-disk behavior more predictable: people are confused by the fact that re-using an existing partition for data requires additional manual steps to reset the typecode. But it may also wipe out an existing partition if the wrong partition is given as an argument.

Maybe we could add --yes-the-data-partition-will-wiped-out-and-all-data-it-currently-contains-will-be-lost to confirm?
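
To illustrate "figuring out the partition number instead of the hardcoded 1", here is a rough Python sketch that derives the disk and partition number from a partition path and builds the matching sgdisk --typecode command. It is not the code from the linked commit: the function names are hypothetical, the typecode constant is assumed to be the ceph OSD data GUID, and the path splitting is a string heuristic that assumes the argument is already known to be a partition (a whole disk such as /dev/loop2 would be misread).

```python
import re

# Assumption: type GUID matched by the ceph udev rules for OSD data partitions.
CEPH_OSD_DATA_TYPECODE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"


def split_partition(path):
    """Split a partition path into (disk, partition number).

    Disks whose name ends in a digit (loop2, nvme0n1) use a 'p' separator;
    others (sda) append the number directly.
    """
    match = re.match(r"^(?P<disk>.*\d)p(?P<num>\d+)$", path)
    if match is None:
        match = re.match(r"^(?P<disk>.*\D)(?P<num>\d+)$", path)
    if match is None:
        raise ValueError("%s does not look like a partition" % path)
    return match.group("disk"), int(match.group("num"))


def typecode_command(partition, typecode=CEPH_OSD_DATA_TYPECODE):
    """Build (but do not run) the sgdisk command that sets the type GUID."""
    disk, num = split_partition(partition)
    return ["sgdisk", "--typecode=%d:%s" % (num, typecode), disk]


print(typecode_command("/dev/loop2p1"))
# ['sgdisk', '--typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d', '/dev/loop2']
```

The command is only constructed, never executed, so the sketch can be tried without root access; a real implementation would more likely query the kernel for the partition number than parse the path.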

Actions #5

Updated by Sage Weil almost 9 years ago

  • Status changed from Need More Info to 12

I think we should change it. Adding a --yes-i-really-mean-it type flag sounds good.
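
A confirmation flag of that kind could be wired up roughly as follows. This is a hypothetical Python sketch only: ceph-disk did not gain this option (the issue was closed as Won't Fix), and the parser, the "confirmed" destination name and the may_reset_partition helper are invented for illustration.

```python
import argparse


def build_parser():
    """Hypothetical argument parser for a prepare-like command."""
    parser = argparse.ArgumentParser(prog="prepare-sketch")
    parser.add_argument("data", help="disk or partition to prepare")
    parser.add_argument(
        "--yes-i-really-mean-it",
        action="store_true",
        dest="confirmed",
        help="allow the typecode of an existing partition to be reset",
    )
    return parser


def may_reset_partition(argv, is_existing_partition):
    """Allow the reset unless an existing partition is given without the flag."""
    args = build_parser().parse_args(argv)
    return (not is_existing_partition) or args.confirmed


print(may_reset_partition(["/dev/loop2p1"], True))                            # False
print(may_reset_partition(["/dev/loop2p1", "--yes-i-really-mean-it"], True))  # True
```

Defaulting to refusal keeps today's behavior for anyone who passes the wrong partition by mistake, which addresses the data-loss concern raised above.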

Actions #6

Updated by Loïc Dachary over 8 years ago

  • Assignee deleted (Loïc Dachary)

Actions #7

Updated by Sage Weil over 8 years ago

  • Status changed from 12 to Won't Fix