Bug #19428
Updated by Loïc Dachary about 7 years ago
On the ceph-docker CI, I do have a trace that shows up a racing between the sgdisk actions to create partitions and the device node availability in /sys/block. I've been instrumenting the code in ceph-disk, to be more explicit. My consist in printing the content of /sys/block/<device> to see what's inside. At the first iteration, the sdb2 is missing while at the second one (1 sec after) the sdb2 showed up. I investigated & try to reproduce it locally. I have to admit I failed even by running that code inside a nasted VM. I suspected 'udevadm settle' returning too soon but didn't got able to proove that. So that could mean update_partition() isn't blocking enough. The actual master code of ceph-disk is affected. My current thought are just about adding a testing loop inside get_partition_dev() which isn't that satisfying. <pre> + ceph-disk -v prepare --cluster test --journal-uuid ab5a4232-ab11-5ae7-ac16-6e8d08eab818 /dev/sdb command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=fsid command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster test --setuser ceph --setgroup ceph command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster test --setuser ceph --setgroup ceph command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster test --setuser ceph --setgroup ceph get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid set_type: Will colocate journal with data on /dev/sdb command: Running command: /usr/bin/ceph-osd --cluster=test --show-config-value=osd_journal_size get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_type command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mkfs_options_xfs command: Running command: /usr/bin/ceph-conf --cluster=test --name=osd. --lookup osd_mount_options_xfs get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid ptype_tobe_for_name: name = journal get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid create_partition: Creating journal partition num 2 size 100 on /dev/sdb command_check_call: Running command: /usr/sbin/sgdisk --new=2:0:+100M --change-name=2:ceph journal --partition-guid=2:ab5a4232-ab11-5ae7-ac16-6e8d08eab818 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb update_partition: Calling partprobe on created device /dev/sdb command_check_call: Running command: /usr/bin/udevadm settle --timeout=600 command: Running command: /usr/bin/flock -s /dev/sdb /usr/sbin/partprobe /dev/sdb command_check_call: Running command: /usr/bin/udevadm settle --timeout=600 get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid get_partition_dev: Listing /sys/block/sdb get_partition_dev: -> ro get_partition_dev: -> bdi get_partition_dev: -> dev get_partition_dev: -> size get_partition_dev: -> stat get_partition_dev: -> power get_partition_dev: -> range get_partition_dev: -> queue get_partition_dev: -> trace get_partition_dev: -> discard_alignment get_partition_dev: -> device get_partition_dev: -> events get_partition_dev: -> subsystem get_partition_dev: -> ext_range get_partition_dev: -> slaves get_partition_dev: -> uevent get_partition_dev: -> events_poll_msecs get_partition_dev: -> alignment_offset get_partition_dev: -> holders get_partition_dev: -> badblocks get_partition_dev: -> inflight get_partition_dev: -> removable get_partition_dev: -> capability get_partition_dev: -> events_async get_partition_dev: ERREUR PARTITION ! get_partition_dev: Listing no 1 /sys/block/sdb get_partition_dev: -> ro get_partition_dev: -> bdi get_partition_dev: -> dev get_partition_dev: -> sdb2 get_partition_dev: -> size get_partition_dev: -> stat get_partition_dev: -> power get_partition_dev: -> range get_partition_dev: -> queue get_partition_dev: -> trace get_partition_dev: -> discard_alignment get_partition_dev: -> device get_partition_dev: -> events get_partition_dev: -> subsystem get_partition_dev: -> ext_range get_partition_dev: -> slaves get_partition_dev: -> uevent get_partition_dev: -> events_poll_msecs get_partition_dev: -> alignment_offset get_partition_dev: -> holders get_partition_dev: -> badblocks get_partition_dev: -> inflight get_partition_dev: -> removable get_partition_dev: -> capability get_partition_dev: -> events_async </pre>