Bug #16234
Status: Closed
Ceph osd daemon keeps running even after disk has been pulled
Description
During our system test at Cisco, one of the things we do is simulate disk failure. We do this by taking the disk offline from the CIMC by marking it as unconfigured, and also by physically pulling the disk out. In our test we have 4 data disks and one SSD set aside for journaling.
These are the mount points:
/dev/sdb1 838G 86M 838G 1% /var/lib/ceph/osd/ceph-22
/dev/sdd1 838G 46M 838G 1% /var/lib/ceph/osd/ceph-24
/dev/sda1 838G 46M 838G 1% /var/lib/ceph/osd/ceph-21
/dev/sdc1 838G 78M 838G 1% /var/lib/ceph/osd/ceph-23
After pulling a disk, ceph-disk no longer shows the disk that was taken offline; in my case it was sdc1:
[root@david_server-4 ~]# ceph-disk list
/dev/sda :
/dev/sda1 ceph data, active, cluster ceph, osd.21, journal /dev/sde1
/dev/sdb :
/dev/sdb1 ceph data, active, cluster ceph, osd.22, journal /dev/sde2
/dev/sdd :
/dev/sdd1 ceph data, active, cluster ceph, osd.24, journal /dev/sde4
/dev/sde :
/dev/sde1 ceph journal, for /dev/sda1
/dev/sde2 ceph journal, for /dev/sdb1
/dev/sde3 ceph journal
/dev/sde4 ceph journal, for /dev/sdd1
/dev/sdf :
/dev/sdf1 other, ext4, mounted on /boot
/dev/sdf2 other, LVM2_member
The mountpoint still looks OK:
[root@david_server-4 ~]# ls /var/lib/ceph/osd/ceph-23
activate.monmap active ceph_fsid current fsid journal journal_uuid keyring magic ready store_version superblock sysvinit whoami
But sdc1 is not there:
[root@david_server-4 ~]# ls /dev/sdc1
ls: cannot access /dev/sdc1: No such file or directory
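The condition above — an OSD data directory that still looks mounted while its backing device node has disappeared — can be detected from userspace. A minimal sketch, assuming the standard /var/lib/ceph/osd/ceph-N layout shown in this report and the findmnt tool from util-linux:

```shell
#!/bin/sh
# Flag any ceph OSD data directory whose backing block device node
# is gone (e.g. /dev/sdc1 after the disk was pulled).
for dir in /var/lib/ceph/osd/ceph-*; do
    [ -d "$dir" ] || continue
    # Resolve the source device of the filesystem holding this directory.
    dev=$(findmnt -n -o SOURCE --target "$dir" 2>/dev/null)
    if [ -n "$dev" ] && [ ! -b "$dev" ]; then
        echo "WARNING: $dir is mounted from $dev, but that device node is gone"
    fi
done
```

Run from cron or a monitoring agent, a check like this would surface the pulled disk long before the next i/o hits the dead OSD; it is only a workaround sketch, not something Ceph provides.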
Obviously the ceph osd daemon for ceph-23 still thinks everything is OK:
[root@david_server-4 ~]# service ceph status osd.23
=== osd.23 ===
osd.23: running {"version":"0.94.5-9.el7cp"}
And ceph monitors never get notified that something is wrong:
[ceph@david_server-2 /]$ ceph osd stat
osdmap e166: 25 osds: 25 up, 25 in
This could stay like this for days, as long as there is no i/o on that particular OSD.
I looked around to check whether there is a config option that would allow the osd daemon to check the status of its disk, but I can't find anything.
We would like to know that a disk is bad within a reasonable amount of time, and let things rebalance even before we try to do i/o on the bad disk.
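Until the daemon notices on its own, an operator who spots the dead disk can force the cluster to react with standard ceph CLI commands. A sketch using osd.23 from this report (substitute your own id; the wrapper only echoes the commands when the ceph CLI is not on the PATH, so it can be dry-run anywhere):

```shell
#!/bin/sh
# Manual workaround: make the cluster touch or evict the dead OSD
# instead of waiting for client i/o to trip over it.
OSD=osd.23
run() {
    if command -v ceph >/dev/null 2>&1; then
        "$@"                      # real cluster: execute the command
    else
        echo "would run: $*"      # no ceph CLI here: dry-run only
    fi
}
run ceph osd deep-scrub "$OSD"    # force reads of all objects; errors out on the dead disk
run ceph osd down "$OSD"          # or mark the OSD down immediately
run ceph osd out "$OSD"           # take it out so data rebalances off it
```

Deep-scrubbing generates the i/o that exposes the failure; marking the OSD down/out skips straight to rebalancing. Neither is automatic detection, which is what this report is asking for.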
Here is my ceph info
[ceph@david_server-2 /]$ ceph version
ceph version 0.94.5-9.el7cp (deef183a81111fa5e128ec88c90a32c9587c615d)
[root@david_server-4 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
[ceph@david_server-2 /]$ ceph status
cluster 1af8284d-d1e1-42a0-af36-c1076115c853
health HEALTH_OK
monmap e1: 3 mons at {ceph-david_server-1=20.0.0.6:6789/0,ceph-david_server-2=20.0.0.5:6789/0,ceph-david_server-3=20.0.0.7:6789/0}
election epoch 16, quorum 0,1,2 ceph-david_server-2,ceph-david_server-1,ceph-david_server-3
osdmap e166: 25 osds: 25 up, 25 in
pgmap v1864: 1024 pgs, 5 pools, 406 MB data, 54 objects
2249 MB used, 61525 GB / 61527 GB avail
1024 active+clean
Updated by Josh Durgin almost 7 years ago
- Status changed from New to Rejected
No cluster should be entirely idle. At the very least scrubbing should be running, and the osd will get an error from the disk at that point.