Bug #12763

rbd: unmap failed: (16) Device or resource busy

Added by ceph zte over 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
rbd
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
krbd
Crash signature (v1):
Crash signature (v2):

Description

My ceph version is 0.87.

The Linux system info:
[root@client27 ~]# uname -a
Linux client27 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@song my-cluster]# rbd ls -p song
im1

[root@song my-cluster]# rbd info -p song -i im1
rbd image 'im1':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.101f.2ae8944a
format: 1

[root@song ~]# rbd showmapped
id pool image snap device
1 song im1 - /dev/rbd1

There is no clone or snapshot of the image im1, and I am not performing any operation on it.

When I unmap im1, the error below occurs. I do not know how the watchers are generated, or

how to remove the watcher so that im1 can be unmapped.

[root@song ~]# rados -p song listwatchers im1.rbd
watcher=10.118.202.189:0/748084886 client.4137 cookie=1

[root@song my-cluster]# rbd unmap /dev/rbd1
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
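
When rbd unmap returns EBUSY like this, a few quick checks can reveal what is holding the device. This is a minimal shell sketch (not part of the original report); /dev/rbd1 is the device mapped above, adjust as needed:

```shell
# Which process or subsystem is holding the mapped rbd device open?
DEV=/dev/rbd1
NAME=${DEV#/dev/}                      # strip the /dev/ prefix -> "rbd1"

lsof "$DEV"                            # processes with the device open
fuser -v "$DEV"                        # same check, via fuser
grep "$DEV" /proc/mounts               # a filesystem still mounted on it?
ls "/sys/class/block/$NAME/holders"    # device-mapper/multipath stacked on top?
```

If the holders directory is non-empty, something like LVM or multipath has claimed the device even though no process shows up in lsof.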

History

#1 Updated by Loïc Dachary over 5 years ago

  • Project changed from Ceph to rbd

#2 Updated by Josh Durgin over 5 years ago

  • Project changed from rbd to Linux kernel client
  • Category set to rbd

#3 Updated by Ilya Dryomov over 5 years ago

  • Status changed from New to Need More Info

Sorry for a late reply - this was lingering in the Ceph project.
Are you sure there wasn't a filesystem mounted on top of that rbd device?

#4 Updated by ceph zte over 5 years ago

The image im1 is on the OSDs, and the OSDs are not installed on disks.
They are installed under /var/local.

[root@song /var/local]# pwd
/var/local
[root@song /var/local]# ls
osd0 osd1 osd2
[root@song /var/local]#

[root@song /var/local]# mount
/dev/mapper/vg_song-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/sda2 on /boot type ext4 (rw)
/dev/sda1 on /boot/efi type vfat (rw,umask=0077,shortname=winnt)
/dev/mapper/vg_song-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
vmware-vmblock on /var/run/vmblock-fuse type fuse.vmware-vmblock (rw,nosuid,nodev,default_permissions,allow_other)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
gvfs-fuse-daemon on /root/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev)

#5 Updated by Ilya Dryomov over 5 years ago

  • Assignee set to Ilya Dryomov

The location of osd dirs shouldn't matter here. Typically, you get -EBUSY on rbd unmap if your block device is still being used. You said "I have no operation on the image im1." but I just wanted to make sure.
Is it reproducible? Can you come up with a script that would reproduce it?

#6 Updated by ceph zte over 5 years ago

Thank you for your reply!

I think this is related to VMware. I installed the same system and Ceph on a real machine,

and it does not have the problem. Only my VMware system has the problem.

#7 Updated by Ilya Dryomov over 5 years ago

Well, that shouldn't be the case. What are the steps to reproduce on your vmware VM?

#8 Updated by ceph zte over 5 years ago

Below are the operations on my VMware VM. The same operations work fine on the real machine.

[root@song ~]# rbd ls -p song
im1
im2
im4
im3
im5
im6
[root@song ~]# rbd info song/im6
rbd image 'im6':
size 200 MB in 50 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.1a5f2ae8944a
format: 2
features: layering
[root@song ~]# rbd map song/im6
/dev/rbd2
[root@song ~]# rbd unmap /dev/rbd2
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
[root@song ~]#

#9 Updated by runsisi hust over 5 years ago

I think the device may be in use by multipathd or something else.

lsof /dev/rbdx 

should tell us the truth.

#10 Updated by Kongming Wu over 5 years ago

When doing rbd map while multipathd is running, this issue will come up without a doubt. The weird thing is that neither the multipathd command nor the fuser command shows any sign of something holding the rbd device.

#11 Updated by Ilya Dryomov over 5 years ago

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

#12 Updated by ceph zte over 5 years ago

Thank you for your answer. It has been too long. The problem above only showed up on that VMware machine;
it does not show on our other machines. But that VMware VM no longer exists.

#13 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

Surely nothing. This issue has bothered me for a long time. But you can work around it by modifying the multipath configuration file to add rbd devices to the blacklist.

#14 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

It does not hang, but returns:

rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy

#15 Updated by Kongming Wu over 5 years ago

ceph zte wrote:

Thank you for your answer. It has been too long. The problem above only showed up on that VMware machine;
it does not show on our other machines. But that VMware VM no longer exists.

multipath can produce the same phenomenon regardless of whether it is a VM or a physical machine, really.

#16 Updated by Ilya Dryomov over 5 years ago

Yeah, sorry - rbd unmap returns -EBUSY, not hangs. So there is nothing in /sys/kernel/debug/ceph/<id>/osdc after it fails?

#17 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

Yeah, sorry - rbd unmap returns -EBUSY, not hangs. So there is nothing in /sys/kernel/debug/ceph/<id>/osdc after it fails?

root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# rbd unmap /dev/rbd1
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# cat osdc
root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# vi osdc

nothing

#18 Updated by Ilya Dryomov over 5 years ago

There was a similar issue reported on the mailing list - http://www.spinics.net/lists/ceph-devel/msg27435.html. The thread is split between ceph-users and ceph-devel, so it's a bit hard to follow, but there are a couple of links in the linked message that suggest that, at least with ext4, this issue occurs without rbd or multipath being involved. I'm afraid I can't do much more without a reproducer.

#19 Updated by Ivan Koldyazhny almost 5 years ago

Here is related bug in Ubuntu: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1581764

I can reproduce it 100% of the time with multipathd running.

#20 Updated by Ilya Dryomov almost 5 years ago

Can you paste your reproducer here, including how you set up multipath and map the rbd image?

#21 Updated by Ivan Koldyazhny almost 5 years ago

Here is the output from the console:

Ubuntu 14.04.4 LTS; I'm using OpenStack with a Ceph backend via the Ceph DevStack plugin.

#22 Updated by Ivan Koldyazhny almost 5 years ago

A bit more details about my environment:
$ ceph -v
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

Steps to reproduce:
  1. start multipathd: sudo /etc/init.d/multipath-tools start
  2. map image: sudo rbd map -p volumes volume-2e4c8f0f-c088-44c7-bb20-76fce705f58b
  3. try to unmap image: sudo rbd unmap -p volumes /dev/rbd2

#23 Updated by Ilya Dryomov almost 5 years ago

Well, if multipathd is holding the device, as one of your pastes shows, you won't be able to unmap it - that's expected, at least from the rbd point of view.
What is the output of "multipath -ll" right before the unmap?

#24 Updated by Ivan Koldyazhny almost 5 years ago

'sudo multipath -ll' doesn't show anything at all

#25 Updated by Ilya Dryomov almost 5 years ago

What about lsof right before the unmap?

#26 Updated by Ivan Koldyazhny almost 5 years ago

$ sudo lsof /dev/rbd1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
multipath 31388 root 10u BLK 251,0 0t0 773422 /dev/rbd1

#27 Updated by Ilya Dryomov almost 5 years ago

This is likely a bug in either multipathd configuration or udev rules. Either way, not an rbd kernel client issue - multipathd shouldn't be holding random devices.
As a workaround, I'd try blacklisting rbd devices with something like

blacklist {
       devnode "^(rbd)[0-9]*" 
}
in /etc/multipath.conf.
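
The devnode pattern above is an extended regex matched against device node names. As a quick sanity check (a sketch, not from the original thread), one can confirm the pattern matches rbd device nodes but not ordinary disks:

```shell
# Sanity-check the blacklist regex from the suggested /etc/multipath.conf:
# it should match rbd device nodes (rbd0, rbd1, ...) and not sda/vda.
pattern='^(rbd)[0-9]*'
for name in rbd0 rbd12 sda vda; do
    if echo "$name" | grep -Eq "$pattern"; then
        echo "$name: blacklisted"
    else
        echo "$name: not blacklisted"
    fi
done
# -> rbd0: blacklisted
#    rbd12: blacklisted
#    sda: not blacklisted
#    vda: not blacklisted
```

After editing /etc/multipath.conf, restart multipathd so the blacklist takes effect (e.g. sudo /etc/init.d/multipath-tools restart, matching the start command used in the steps above).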

#28 Updated by Ivan Koldyazhny almost 5 years ago

Thanks, Ilya,

This workaround works for me. So, there is no support for multipath from the rbd side, right?

#29 Updated by Ilya Dryomov almost 5 years ago

Depends on what you mean by "support" - rbd CLI tool won't attempt to mess with your multipath settings ;)

#30 Updated by Ilya Dryomov over 4 years ago

  • Status changed from Need More Info to Closed