Bug #19428

Updated by Loic Dachary over 2 years ago

On the ceph-docker CI, I have a trace showing a race between the sgdisk actions that create partitions and the availability of the device node in /sys/block.

I instrumented the code in ceph-disk to make this more explicit.
The instrumentation prints the content of /sys/block/<device> to see what is inside.
At the first iteration, sdb2 is missing, while at the second one (1 second later) sdb2 has shown up.
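
For reference, the kind of instrumentation described above can be sketched roughly as follows. This is not the actual patch: the function name `list_sys_block`, the `sys_block` parameter and the "entry name starts with the device name" heuristic for spotting partitions are all assumptions made for illustration.

```python
import os


def list_sys_block(dev_name, sys_block="/sys/block"):
    """Print every entry under <sys_block>/<dev_name> and return the
    entries that look like partitions of that device (e.g. sdb2).

    Hypothetical helper: the partition check is a simple name-prefix
    heuristic, not the logic used by ceph-disk itself.
    """
    dev_dir = os.path.join(sys_block, dev_name)
    partitions = []
    for entry in sorted(os.listdir(dev_dir)):
        print("-> %s" % entry)
        # Partition directories are named after the parent device
        # (sdb -> sdb2), which is the assumption relied on here.
        if entry.startswith(dev_name):
            partitions.append(entry)
    return partitions
```

Calling `list_sys_block("sdb")` right after the partition is created and again a second later should show whether sdb2 is missing on the first pass, as in the trace below.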

I investigated and tried to reproduce it locally. I have to admit I failed, even when running that code inside a nested VM.
I suspected 'udevadm settle' of returning too soon but was not able to prove it.

So that could mean update_partition() does not block long enough.

The current master code of ceph-disk is affected.

My current thinking is just to add a polling loop inside get_partition_dev(), which isn't that satisfying.
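
To make the idea concrete, here is a minimal sketch of such a polling loop. It is only an illustration under assumptions, not the real get_partition_dev(): the name `wait_for_partition`, its parameters (`retries`, `delay`, `sys_block`) and the way the partition number is matched against the entry name are all hypothetical.

```python
import os
import time


def wait_for_partition(dev_name, partnum, sys_block="/sys/block",
                       retries=10, delay=1.0):
    """Poll <sys_block>/<dev_name> until partition number `partnum`
    shows up, and return its entry name (e.g. "sdb2").

    Hypothetical sketch: raises RuntimeError if the partition is still
    missing after `retries` attempts.
    """
    dev_dir = os.path.join(sys_block, dev_name)
    for attempt in range(retries):
        for entry in os.listdir(dev_dir):
            if not entry.startswith(dev_name):
                continue
            # Accept both "sdb2" and "nvme0n1p2" style names (assumption).
            suffix = entry[len(dev_name):].lstrip("p")
            if suffix == str(partnum):
                return entry
        # Do not sleep after the last attempt.
        if attempt < retries - 1:
            time.sleep(delay)
    raise RuntimeError("partition %d for %s does not appear to exist in %s"
                       % (partnum, dev_name, dev_dir))
```

This only hides the race instead of explaining why 'udevadm settle' returned before the partition was visible, which is why it does not feel like a satisfying fix.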

<pre>
+ ceph-disk -v prepare --cluster test --journal-uuid ab5a4232-ab11-5ae7-ac16-6e8d08eab818 /dev/sdb
command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=fsid
command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster test --setuser ceph --setgroup ceph
command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster test --setuser ceph --setgroup ceph
command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster test --setuser ceph --setgroup ceph
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
set_type: Will colocate journal with data on /dev/sdb
command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=osd_journal_size
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_type
command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mount_options_xfs
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
ptype_tobe_for_name: name = journal
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
create_partition: Creating journal partition num 2 size 100 on /dev/sdb
command_check_call: Running command: /usr/sbin/sgdisk --new=2:0:+100M --change-name=2:ceph journal --partition-guid=2:ab5a4232-ab11-5ae7-ac16-6e8d08eab818 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb
update_partition: Calling partprobe on created device /dev/sdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
command: Running command: /usr/bin/flock -s /dev/sdb /usr/sbin/partprobe /dev/sdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_partition_dev: Listing /sys/block/sdb
get_partition_dev: -> ro
get_partition_dev: -> bdi
get_partition_dev: -> dev
get_partition_dev: -> size
get_partition_dev: -> stat
get_partition_dev: -> power
get_partition_dev: -> range
get_partition_dev: -> queue
get_partition_dev: -> trace
get_partition_dev: -> discard_alignment
get_partition_dev: -> device
get_partition_dev: -> events
get_partition_dev: -> subsystem
get_partition_dev: -> ext_range
get_partition_dev: -> slaves
get_partition_dev: -> uevent
get_partition_dev: -> events_poll_msecs
get_partition_dev: -> alignment_offset
get_partition_dev: -> holders
get_partition_dev: -> badblocks
get_partition_dev: -> inflight
get_partition_dev: -> removable
get_partition_dev: -> capability
get_partition_dev: -> events_async
get_partition_dev: ERREUR PARTITION !
get_partition_dev: Listing no 1 /sys/block/sdb
get_partition_dev: -> ro
get_partition_dev: -> bdi
get_partition_dev: -> dev
get_partition_dev: -> sdb2
get_partition_dev: -> size
get_partition_dev: -> stat
get_partition_dev: -> power
get_partition_dev: -> range
get_partition_dev: -> queue
get_partition_dev: -> trace
get_partition_dev: -> discard_alignment
get_partition_dev: -> device
get_partition_dev: -> events
get_partition_dev: -> subsystem
get_partition_dev: -> ext_range
get_partition_dev: -> slaves
get_partition_dev: -> uevent
get_partition_dev: -> events_poll_msecs
get_partition_dev: -> alignment_offset
get_partition_dev: -> holders
get_partition_dev: -> badblocks
get_partition_dev: -> inflight
get_partition_dev: -> removable
get_partition_dev: -> capability
get_partition_dev: -> events_async
</pre>
