Bug #19428

Updated by Loïc Dachary about 7 years ago

On the ceph-docker CI, I have a trace that shows a race between the sgdisk actions that create partitions and the availability of the device node in /sys/block.

 I've been instrumenting the code in ceph-disk to make this more explicit.
 My instrumentation consists of printing the content of /sys/block/<device> to see what's inside.
 At the first iteration, sdb2 is missing, while at the second one (1 second later) sdb2 has shown up.
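
For illustration, the check described above amounts to something like the following Python sketch. The helper names `list_block_entries` and `partition_visible` are hypothetical and are not taken from ceph-disk; the sketch only lists a sysfs block directory and tests whether the partition entry is present.

```python
import os


def list_block_entries(sysfs_dir):
    """Return the sorted entry names found under a sysfs block directory."""
    return sorted(os.listdir(sysfs_dir))


def partition_visible(sysfs_dir, partition_name):
    """Report whether the partition's entry is present in the listing."""
    return partition_name in list_block_entries(sysfs_dir)
```

With the device from the trace, this would be called as `partition_visible("/sys/block/sdb", "sdb2")`.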


 I investigated and tried to reproduce it locally. I have to admit I failed, even when running that code inside a nested VM.
 I suspected 'udevadm settle' of returning too soon but was not able to prove that.

 So that could mean update_partition() isn't blocking long enough.

 The current master code of ceph-disk is affected.

 My current thinking is simply to add a polling loop inside get_partition_dev(), which isn't that satisfying.
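
A minimal sketch of what such a polling loop could look like, assuming Python 3 (for `time.monotonic`). The function name `wait_for_partition`, its parameters, and the default timeout and interval are all hypothetical; this is not the actual ceph-disk code.

```python
import os
import time


def wait_for_partition(sysfs_dir, partition_name, timeout=10.0, interval=0.2):
    """Poll sysfs_dir until partition_name appears or the timeout elapses.

    Returns True as soon as the entry exists, False if it never showed up.
    """
    deadline = time.monotonic() + timeout
    while True:
        # The partition is considered available once its sysfs entry exists.
        if os.path.exists(os.path.join(sysfs_dir, partition_name)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For the device in the trace, the call would be `wait_for_partition("/sys/block/sdb", "sdb2")` after update_partition() returns. A loop like this only works around the race; it does not explain why 'udevadm settle' appears to return before the partition is visible.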

 <pre> 
 + ceph-disk -v prepare --cluster test --journal-uuid ab5a4232-ab11-5ae7-ac16-6e8d08eab818 /dev/sdb 
 command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=fsid 
 command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster test --setuser ceph --setgroup ceph 
 command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster test --setuser ceph --setgroup ceph 
 command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster test --setuser ceph --setgroup ceph 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 set_type: Will colocate journal with data on /dev/sdb 
 command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=osd_journal_size 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_type 
 command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_options_xfs 
 command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mount_options_xfs 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 ptype_tobe_for_name: name = journal 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 create_partition: Creating journal partition num 2 size 100 on /dev/sdb 
 command_check_call: Running command: /usr/sbin/sgdisk --new=2:0:+100M --change-name=2:ceph journal --partition-guid=2:ab5a4232-ab11-5ae7-ac16-6e8d08eab818 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb 
 update_partition: Calling partprobe on created device /dev/sdb 
 command_check_call: Running command: /usr/bin/udevadm settle --timeout=600 
 command: Running command: /usr/bin/flock -s /dev/sdb /usr/sbin/partprobe /dev/sdb 
 command_check_call: Running command: /usr/bin/udevadm settle --timeout=600 
 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid 
 get_partition_dev: Listing /sys/block/sdb 
 get_partition_dev: -> ro 
 get_partition_dev: -> bdi 
 get_partition_dev: -> dev 
 get_partition_dev: -> size 
 get_partition_dev: -> stat 
 get_partition_dev: -> power 
 get_partition_dev: -> range 
 get_partition_dev: -> queue 
 get_partition_dev: -> trace 
 get_partition_dev: -> discard_alignment 
 get_partition_dev: -> device 
 get_partition_dev: -> events 
 get_partition_dev: -> subsystem 
 get_partition_dev: -> ext_range 
 get_partition_dev: -> slaves 
 get_partition_dev: -> uevent 
 get_partition_dev: -> events_poll_msecs 
 get_partition_dev: -> alignment_offset 
 get_partition_dev: -> holders 
 get_partition_dev: -> badblocks 
 get_partition_dev: -> inflight 
 get_partition_dev: -> removable 
 get_partition_dev: -> capability 
 get_partition_dev: -> events_async 
 get_partition_dev: ERREUR PARTITION ! 
 get_partition_dev: Listing no 1 /sys/block/sdb 
 get_partition_dev: -> ro 
 get_partition_dev: -> bdi 
 get_partition_dev: -> dev 
 get_partition_dev: -> sdb2 
 get_partition_dev: -> size 
 get_partition_dev: -> stat 
 get_partition_dev: -> power 
 get_partition_dev: -> range 
 get_partition_dev: -> queue 
 get_partition_dev: -> trace 
 get_partition_dev: -> discard_alignment 
 get_partition_dev: -> device 
 get_partition_dev: -> events 
 get_partition_dev: -> subsystem 
 get_partition_dev: -> ext_range 
 get_partition_dev: -> slaves 
 get_partition_dev: -> uevent 
 get_partition_dev: -> events_poll_msecs 
 get_partition_dev: -> alignment_offset 
 get_partition_dev: -> holders 
 get_partition_dev: -> badblocks 
 get_partition_dev: -> inflight 
 get_partition_dev: -> removable 
 get_partition_dev: -> capability 
 get_partition_dev: -> events_async 
 </pre>
