ceph-disk: /sys/block/<device>/queue/physical_block_size is not obeyed
ceph-disk creates the data partition starting at sector 1, ignoring the start sector that sgdisk recommends (256 on my disk). The partition should be aligned to the physical sector size reported in /sys/block/<device>/queue/physical_block_size. In my case that is 16K physical and 4K logical, so 256 is perfectly fine, and it is what sgdisk/fdisk decide on internally.
Partitioning this way from ceph-disk will severely impact disk performance.
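For anyone who wants to check the arithmetic, here is a minimal sketch of the alignment test. It assumes start sectors are counted in logical sectors; the helper names queue_attr and is_aligned are made up for illustration and are not part of ceph-disk.

```python
from pathlib import Path


def queue_attr(device, name):
    """Read an integer queue attribute (e.g. physical_block_size) from sysfs."""
    return int(Path("/sys/block", device, "queue", name).read_text())


def is_aligned(start_sector, logical_size, physical_size):
    """True if the partition's byte offset is a multiple of the physical block size.

    Assumption: start_sector is expressed in logical sectors.
    """
    return (start_sector * logical_size) % physical_size == 0


# Values from the report above: 4K logical, 16K physical.
print(is_aligned(1, 4096, 16384))    # start sector 1: False
print(is_aligned(256, 4096, 16384))  # start sector 256: True
```

On a live system the two sizes could be read with queue_attr("sdX", "logical_block_size") and queue_attr("sdX", "physical_block_size"), where sdX is a placeholder device name.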
#8 Updated by Hans Boot over 3 years ago
Please, people, I do not understand why this is classified as "minor".
On many physical disks the alignment must be respected, otherwise performance will suffer greatly.
This situation means that one CANNOT use ceph-disk or "ceph-deploy disk prepare" if one seeks performance, which should apply to the majority of users.
On top of that, having a partition that is not aligned also breaks "ceph-deploy disk activate" due to parsing issues.
So people with certain disks simply cannot use this tool. A bug that disqualifies an entire tool is worth at least "major" to me.
#10 Updated by Hans Boot over 3 years ago
To continue on this: as most disks now have 4096 alignment and sgdisk uses 2048, a quick and dirty solution would be to add --set-alignment=4096 before the --largest-new=... via some minor adaptations in ceph-disk.
I personally solved my case like this, but I am not saying this is the definitive solution. I mention it just in case someone else stumbles on this and needs a quick fix.
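As a rough illustration of that workaround, the sketch below builds an sgdisk argument list with --set-alignment placed before --largest-new. The function name build_sgdisk_args and its defaults are hypothetical; this is not the actual ceph-disk code, and the real invocation passes more options than shown here.

```python
def build_sgdisk_args(device, partition_number=1, alignment=4096):
    """Return an sgdisk argv that sets the alignment before creating the partition.

    Hypothetical helper: only the two options mentioned in the comment above
    are included. The alignment value is in sectors, as sgdisk expects.
    """
    return [
        "sgdisk",
        "--set-alignment={}".format(alignment),
        "--largest-new={}".format(partition_number),
        device,
    ]


# /dev/sdX is a placeholder device path; nothing is executed here.
print(" ".join(build_sgdisk_args("/dev/sdX")))
```

The list is only printed, not executed, so it can be inspected before being handed to something like subprocess.check_call.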
#12 Updated by Kefu Chai over 3 years ago
- Status changed from 12 to Need More Info
sgdisk always moves the start sector to a multiple of the sector alignment. See https://sourceforge.net/p/gptfdisk/code/ci/master/tree/gpt.cc#l1862
A typical output of ceph-disk looks like:
# ceph-disk prepare --osd-uuid "$osd_uuid" \
    --fs-type xfs --cluster ceph -- \
    /dev/sdc3 /dev/sda
WARNING:ceph-disk:OSD will not be hot-swappable if ...
Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/sdc3    isize=2048   agcount=4, agsize=61083136 blks
         =             sectsz=512   attr=2, projid32bit=0
data     =             bsize=4096   blocks=244332544, imaxpct=25
         =             sunit=0      swidth=0 blks
naming   =version 2    bsize=4096   ascii-ci=0
log      =internal log bsize=4096   blocks=119303, version=2
         =             sectsz=512   sunit=0 blks, lazy-count=1
realtime =none         extsz=4096   blocks=0, rtextents=0
In the above example, sgdisk moves the start sector to 2048 so that it is a multiple of the sector alignment. could you post the output of
#13 Updated by Hans Boot over 3 years ago
Terribly sorry, but I am no longer involved with ceph at this moment, and I no longer have an operational ceph cluster at my disposal. From a backup I got the following, but that is without access to the disks that were used as storage, so I do not know if this is useful.
$ sgdisk --version
GPT fdisk (sgdisk) version 1.0.1
The disks I used at the time were relatively old but fairly standard disks: WD4000FYYZ
#14 Updated by Kefu Chai over 3 years ago
Strangely enough, I checked commit 846a9e30cda88f75369d175f2f549cad3ea15db2 of gptfdisk: https://sourceforge.net/p/gptfdisk/code/ci/846a9e30cda88f75369d175f2f549cad3ea15db2/tree/gpt.cc#l1790
It also aligns the start sector when creating a new partition.
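The alignment behaviour discussed in #12 and #14 comes down to rounding the requested start sector up to the next multiple of the alignment. The sketch below only reproduces that arithmetic; it is not the gptfdisk code, and the function name align_start is made up for illustration.

```python
def align_start(requested_sector, alignment=2048):
    """Round requested_sector up to the next multiple of alignment (in sectors)."""
    remainder = requested_sector % alignment
    if remainder == 0:
        return requested_sector  # already on an alignment boundary
    return requested_sector + alignment - remainder


# Matches the "Moved requested sector from 34 to 2048" message in #12.
print(align_start(34))  # 2048
```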