Bug #15398
OSD: udev rule for osd device is broken on CentOS 7.2 (Status: Closed)
Description
Workaround
- rm /lib/udev/rules.d/95-ceph-osd.rules
- add "ceph-disk activate /dev/sda2" (and so on, for each OSD partition) to /etc/rc.local
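As a sketch, the rc.local part of the workaround could look like this (the device names below are examples; substitute your actual OSD data partitions):

```shell
# /etc/rc.local (fragment) -- activate OSD partitions at boot,
# replacing the removed 95-ceph-osd.rules udev rule.
# Device names are examples; list your own OSD data partitions here.
/usr/sbin/ceph-disk activate /dev/sda2
/usr/sbin/ceph-disk activate /dev/sdb1
```

Note that on CentOS 7, /etc/rc.local is only executed at boot if it is marked executable (chmod +x /etc/rc.local).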
Description
OS:
[runsisi@hust ~]$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.2.1511 (Core)
Release:        7.2.1511
Codename:       Core
systemd version:
[runsisi@hust ~]$ rpm -qa | grep systemd
systemd-219-19.el7.x86_64
systemd-devel-219-19.el7.x86_64
systemd-libs-219-19.el7.x86_64
systemd-python-219-19.el7.x86_64
systemd-sysv-219-19.el7.x86_64
udev version, also 219 (udev ships with systemd):
[runsisi@hust ~]$ udevadm --version
219
OSD fs type:
[runsisi@hust ~]$ sudo blkid /dev/sdb1
/dev/sdb1: UUID="5a3804e8-275c-4dbb-8481-37235e6fb7fa" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="35be26f8-dae9-421a-9584-5b7132c1089a"
Details:
1. During system boot, the udev rule for OSD devices, i.e. /lib/udev/rules.d/95-ceph-osd.rules, invokes /sbin/ceph-disk to activate each newly added OSD device. /sbin/ceph-disk mounts the device to a temporary location, re-mounts it to the correct location, and then starts the ceph-osd daemon.
2. Unfortunately, the mount command executed in the udev (version 219) context always fails to mount the OSD device. One thing needs to be clarified here: the mount command exits with 0, but the device ends up not mounted.
3. Worse, the mount command has incremented the s_active reference counter of XFS's in-memory superblock by one, which prevents the device from being re-partitioned or otherwise modified. For example:
mount info, nothing for /dev/sdb1:
[runsisi@hust ~]$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,size=98998320k,nr_inodes=24749580,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/mapper/hust-root on / type xfs (rw,relatime,attr2,inode64,noquota)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
/dev/sda1 on /boot type xfs (rw,relatime,attr2,inode64,noquota)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=19804112k,mode=700,uid=1000,gid=1000)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
and the XFS kernel threads for /dev/sdb1 are still there:
[runsisi@hust ~]$ ps aux | grep xfs
root      572 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfsalloc]
root      573 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs_mru_cache]
root      574 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfslogd]
root      575 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-data/dm-0]
root      576 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-conv/dm-0]
root      577 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-cil/dm-0]
root      579 0.0 0.0      0    0 ?     S   14:11 0:02 [xfsaild/dm-0]
root      850 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-data/sda1]
root      851 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-conv/sda1]
root      852 0.0 0.0      0    0 ?     S<  14:11 0:00 [xfs-cil/sda1]
root      854 0.0 0.0      0    0 ?     S   14:11 0:00 [xfsaild/sda1]
root     1764 0.0 0.0      0    0 ?     S<  14:12 0:00 [xfs-data/sdc1]
root     1765 0.0 0.0      0    0 ?     S<  14:12 0:00 [xfs-conv/sdc1]
root     1766 0.0 0.0      0    0 ?     S<  14:12 0:00 [xfs-cil/sdc1]
root     1767 0.0 0.0      0    0 ?     S   14:12 0:00 [xfsaild/sdc1]
root     2356 0.0 0.0      0    0 ?     S<  15:58 0:00 [xfs-data/sdb1]
root     2357 0.0 0.0      0    0 ?     S<  15:58 0:00 [xfs-conv/sdb1]
root     2358 0.0 0.0      0    0 ?     S<  15:58 0:00 [xfs-cil/sdb1]
root     2359 0.0 0.0      0    0 ?     S   15:58 0:00 [xfsaild/sdb1]
runsisi  2362 0.0 0.0 112648  956 pts/3 S+  15:58 0:00 grep --color=auto xfs
4. This issue can be easily reproduced by creating a custom udev rule that mounts the device.
5. I have verified that the udev shipped with CentOS 7.0.x and 7.1.x (udev version 208) works fine; this issue seems to exist only with the newer udev.
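For reference, a minimal reproducer along the lines of point 4 could look like the rule below (the rule file name, matched device, and mount point are made up for illustration):

```
# /etc/udev/rules.d/99-repro-mount.rules -- hypothetical reproduction rule.
# On partition add, run mount from the udev context, the same way
# 95-ceph-osd.rules ends up doing via ceph-disk.
ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
  KERNEL=="sdb1", RUN+="/usr/bin/mount -t xfs /dev/sdb1 /mnt/repro"
```

After triggering the rule (e.g. with udevadm trigger), the mount table shows nothing for /dev/sdb1 even though the RUN command exited 0. This behaviour is consistent with newer systemd-udevd running its RUN handlers in a private mount namespace (MountFlags=slave in systemd-udevd.service), so a mount performed there does not propagate to the host namespace; that would also explain why udev 208 behaved differently.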
Updated by Loïc Dachary about 8 years ago
Which version of Ceph is it? I guess it's hammer, but a confirmation would be good :-)
Updated by runsisi hust about 8 years ago
Loic Dachary wrote:
Which version of Ceph is it? I guess it's hammer, but a confirmation would be good :-)
Hi @Loïc Dachary, yes, it's ceph version 0.94.5; sorry, I forgot to mention this :)
Updated by Loïc Dachary about 8 years ago
- Description updated (diff)
I updated the description with a simple workaround that you can use.
Unfortunately there is no backport for this problem because it would involve changes that are too risky for hammer. In a nutshell, there are various race conditions that cause problems when booting, but also when preparing disks. They have been solved in infernalis. Upgrading to v0.94.7 when it is published later this month may also help, as it removes one race condition (maybe that is enough in your context); see http://tracker.ceph.com/issues/14602.
Updated by runsisi hust about 8 years ago
OK, @Loïc Dachary, many thanks for the info. I skimmed through the latest code on the master branch, and I think the delegation to a systemd service may ultimately solve my problem.
I will test the PR, or maybe the jewel release, tomorrow, and will give my feedback :)
Updated by Loïc Dachary about 8 years ago
- Status changed from New to Won't Fix
- Release set to hammer
Thanks for creating this issue; it will make it easier for other people in the same situation to find the workaround :-)