Bug #12763

rbd: unmap failed: (16) Device or resource busy

Added by ceph zte over 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
rbd
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
krbd
Crash signature (v1):
Crash signature (v2):

Description

My ceph version is 0.87.

The Linux system info:
[root@client27 ~]# uname -a
Linux client27 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@song my-cluster]# rbd ls -p song
im1

[root@song my-cluster]# rbd info -p song -i im1
rbd image 'im1':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.101f.2ae8944a
format: 1

[root@song ~]# rbd showmapped
id pool image snap device
1 song im1 - /dev/rbd1

There is no clone or snapshot of the image im1, and I am not performing any operation on it.

When I unmap im1, the error below occurs. I do not know how the watchers are generated, or

how to remove the watcher so that im1 can be unmapped.

[root@song ~]# rados -p song listwatchers im1.rbd
watcher=10.118.202.189:0/748084886 client.4137 cookie=1

[root@song my-cluster]# rbd unmap /dev/rbd1
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
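
When rbd unmap returns EBUSY like this, a few quick checks can reveal what is holding the device. This is a minimal shell sketch (not part of the original report); /dev/rbd1 is the device mapped above, adjust as needed:

```shell
# Which process or subsystem is holding the mapped rbd device open?
DEV=/dev/rbd1
NAME=${DEV#/dev/}                      # strip the /dev/ prefix -> "rbd1"

lsof "$DEV"                            # processes with the device open
fuser -v "$DEV"                        # same check, via fuser
grep "$DEV" /proc/mounts               # a filesystem still mounted on it?
ls "/sys/class/block/$NAME/holders"    # device-mapper/multipath stacked on top?
```

If the holders directory is non-empty, something like LVM or multipath has claimed the device even though no process shows up in lsof.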

History

#1 Updated by Loïc Dachary over 5 years ago

  • Project changed from Ceph to rbd

#2 Updated by Josh Durgin over 5 years ago

  • Project changed from rbd to Linux kernel client
  • Category set to rbd

#3 Updated by Ilya Dryomov over 5 years ago

  • Status changed from New to Need More Info

Sorry for a late reply - this was lingering in the Ceph project.
Are you sure there wasn't a filesystem mounted on top of that rbd device?

#4 Updated by ceph zte over 5 years ago

The image im1 is on the OSDs, and the OSDs are not installed on disks.
They are installed under /var/local.

[root@song /var/local]# pwd
/var/local
[root@song /var/local]# ls
osd0 osd1 osd2
[root@song /var/local]#

[root@song /var/local]# mount
/dev/mapper/vg_song-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/sda2 on /boot type ext4 (rw)
/dev/sda1 on /boot/efi type vfat (rw,umask=0077,shortname=winnt)
/dev/mapper/vg_song-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
vmware-vmblock on /var/run/vmblock-fuse type fuse.vmware-vmblock (rw,nosuid,nodev,default_permissions,allow_other)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
gvfs-fuse-daemon on /root/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev)

#5 Updated by Ilya Dryomov over 5 years ago

  • Assignee set to Ilya Dryomov

The location of osd dirs shouldn't matter here. Typically, you get -EBUSY on rbd unmap if your block device is still being used. You said "I have no operation on the image im1." but I just wanted to make sure.
Is it reproducible? Can you come up with a script that would reproduce it?

#6 Updated by ceph zte over 5 years ago

Thank you for your reply!

I think this is related to VMware. I installed the same system and Ceph on a real machine,

and it does not have the problem. Only my VMware system has the problem.

#7 Updated by Ilya Dryomov over 5 years ago

Well, that shouldn't be the case. What are the steps to reproduce on your vmware VM?

#8 Updated by ceph zte over 5 years ago

Below are the operations on my VMware VM. The same operations work fine on the real machine.

[root@song ~]# rbd ls -p song
im1
im2
im4
im3
im5
im6
[root@song ~]# rbd info song/im6
rbd image 'im6':
size 200 MB in 50 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.1a5f2ae8944a
format: 2
features: layering
[root@song ~]# rbd map song/im6
/dev/rbd2
[root@song ~]# rbd unmap /dev/rbd2
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
[root@song ~]#

#9 Updated by runsisi hust over 5 years ago

I think the device may be in use by multipathd or something else.

lsof /dev/rbdx 

should tell us the truth.

#10 Updated by Kongming Wu over 5 years ago

When doing rbd map while multipathd is running, this issue will come up without a doubt. The weird thing is that neither the multipathd command nor the fuser command shows any sign of something holding the rbd device.

#11 Updated by Ilya Dryomov over 5 years ago

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

#12 Updated by ceph zte over 5 years ago

Thank you for your answer. It has been too long. The problem above only showed up on that VMware machine;
it does not show on our other machines. But that VMware VM no longer exists.

#13 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

Surely nothing. This issue has bothered me for a long time. But you can work around it by modifying the multipath configuration file to add rbd devices to the blacklist.

#14 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

If you post a reproducer - one that starts with the installation of multipathd package, then sets up multipath and hits this issue, I'll investigate. When rbd unmap hangs, is there anything in /sys/kernel/debug/ceph/<id>/osdc?

It does not hang, but returns:

rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy

#15 Updated by Kongming Wu over 5 years ago

ceph zte wrote:

Thank you for your answer. It has been too long. The problem above only showed up on that VMware machine;
it does not show on our other machines. But that VMware VM no longer exists.

multipath can produce the same phenomenon regardless of whether it is a VM or a physical machine, really.

#16 Updated by Ilya Dryomov over 5 years ago

Yeah, sorry - rbd unmap returns -EBUSY, not hangs. So there is nothing in /sys/kernel/debug/ceph/<id>/osdc after it fails?

#17 Updated by Kongming Wu over 5 years ago

Ilya Dryomov wrote:

Yeah, sorry - rbd unmap returns -EBUSY, not hangs. So there is nothing in /sys/kernel/debug/ceph/<id>/osdc after it fails?

root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# rbd unmap /dev/rbd1
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# cat osdc
root@cvknode40:/sys/kernel/debug/ceph/1a016e2d-01d2-4cd3-b83d-872112e35098.client117727# vi osdc

nothing

#18 Updated by Ilya Dryomov over 5 years ago

There was a similar issue reported on the mailing list - http://www.spinics.net/lists/ceph-devel/msg27435.html. The thread is split between ceph-users and ceph-devel, so it's a bit hard to follow, but there are a couple of links in the linked message that suggest that, at least with ext4, this issue occurs without rbd or multipath being involved. I'm afraid I can't do much more without a reproducer.

#19 Updated by Ivan Koldyazhny almost 5 years ago

Here is related bug in Ubuntu: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1581764

I can reproduce it 100% of the time with multipathd running.

#20 Updated by Ilya Dryomov almost 5 years ago

Can you paste your reproducer here, including how you set up multipath and map the rbd image?

#21 Updated by Ivan Koldyazhny almost 5 years ago

Here is the output from the console:

Ubuntu 14.04.4 LTS; I'm using OpenStack with a Ceph backend via the Ceph DevStack plugin.

#22 Updated by Ivan Koldyazhny almost 5 years ago

A bit more details about my environment:
$ ceph -v
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

Steps to reproduce:
  1. start multipathd: sudo /etc/init.d/multipath-tools start
  2. map image: sudo rbd map -p volumes volume-2e4c8f0f-c088-44c7-bb20-76fce705f58b
  3. try to unmap image: sudo rbd unmap -p volumes /dev/rbd2

#23 Updated by Ilya Dryomov almost 5 years ago

Well, if multipathd is holding the device, as one of your pastes shows, you won't be able to unmap it - that's expected, at least from the rbd point of view.
What is the output of "multipath -ll" right before the unmap?

#24 Updated by Ivan Koldyazhny almost 5 years ago

'sudo multipath -ll' doesn't show anything at all

#25 Updated by Ilya Dryomov almost 5 years ago

What about lsof right before the unmap?

#26 Updated by Ivan Koldyazhny almost 5 years ago

$ sudo lsof /dev/rbd1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
multipath 31388 root 10u BLK 251,0 0t0 773422 /dev/rbd1

#27 Updated by Ilya Dryomov almost 5 years ago

This is likely a bug in either multipathd configuration or udev rules. Either way, not an rbd kernel client issue - multipathd shouldn't be holding random devices.
As a workaround, I'd try blacklisting rbd devices with something like

blacklist {
       devnode "^(rbd)[0-9]*" 
}
in /etc/multipath.conf.
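
The devnode pattern above is an extended regex matched against device node names. As a quick sanity check (a sketch, not from the original thread), one can confirm the pattern matches rbd device nodes but not ordinary disks:

```shell
# Sanity-check the blacklist regex from the suggested /etc/multipath.conf:
# it should match rbd device nodes (rbd0, rbd1, ...) and not sda/vda.
pattern='^(rbd)[0-9]*'
for name in rbd0 rbd12 sda vda; do
    if echo "$name" | grep -Eq "$pattern"; then
        echo "$name: blacklisted"
    else
        echo "$name: not blacklisted"
    fi
done
# -> rbd0: blacklisted
#    rbd12: blacklisted
#    sda: not blacklisted
#    vda: not blacklisted
```

After editing /etc/multipath.conf, restart multipathd so the blacklist takes effect (e.g. sudo /etc/init.d/multipath-tools restart, matching the start command used in the steps above).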

#28 Updated by Ivan Koldyazhny almost 5 years ago

Thanks, Ilya,

This workaround works for me. So, there is no support for multipath from the rbd side, right?

#29 Updated by Ilya Dryomov almost 5 years ago

Depends on what you mean by "support" - rbd CLI tool won't attempt to mess with your multipath settings ;)

#30 Updated by Ilya Dryomov over 4 years ago

  • Status changed from Need More Info to Closed