Bug #4522

closed

RBD utility "showmapped" bug

Added by Ivan Kudryavtsev about 11 years ago. Updated almost 11 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello, the command "rbd showmapped" does not list
one rbd volume that is in use:

/usr/bin/kvm -daemonize -name vm-bw-web01-00:F4:E4:38:4C:5E -drive file=/dev/rbd/rbd/lun-legacy-vm-bw-web01,cache=writeback,if=ide,index=0,media=disk,boot=on -cdrom /images/public/debian-6.0.0-amd64-i386-netinst.iso -m 4096 -smp 4,maxcpus=4,cores=4,threads=1,sockets=1 -boot order=cd,menu=on -net nic,vlan=1,macaddr=00:F4:E4:38:4C:5E,model=virtio -net tap,vlan=1,script=/usr/share/openqrm/plugins/kvm-storage/bin/openqrm-kvm-ifup-net1,downscript=/usr/share/openqrm/plugins/kvm-storage/bin/openqrm-kvm-ifdown-net1 -net nic,vlan=254,macaddr=00:76:6D:F9:52:3A,model=virtio -net tap,vlan=254,script=/usr/share/openqrm/plugins/kvm-storage/bin/z_private-cloud-254 -vnc 91.226.13.110:200,password -balloon virtio -monitor unix:/var/run/openqrm/kvm-storage/kvm.vm-bw-web01.mon,server,nowait
# ls -la /dev/rbd/rbd/lun-legacy-vm-bw-web01
lrwxrwxrwx 1 root root 11 Mar  5 06:56 /dev/rbd/rbd/lun-legacy-vm-bw-web01 -> ../../rbd62
# rbd showmapped | grep web01
# rbd showmapped | grep rbd62

No results, as you can see. This can cause failures if someone uses "rbd showmapped" in scripts to determine whether a volume is mapped.

# ceph -v
ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)
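
For scripts that need to know whether an image is mapped, one possible cross-check is to resolve the udev symlink directly instead of parsing "rbd showmapped" output. The sketch below is an illustration only: the function name `resolve_mapping` is hypothetical, and the `/dev/rbd/<pool>/<image>` layout is assumed from the path shown in this report. In the situation reported here the symlink existed while "rbd showmapped" printed nothing, so this check and "rbd showmapped" would have disagreed; it does not address the underlying problem.

```python
import os


def resolve_mapping(pool, image, dev_root="/dev/rbd"):
    """Return the device node behind the udev symlink, or None.

    Assumes the /dev/rbd/<pool>/<image> symlink layout seen in this report.
    """
    link = os.path.join(dev_root, pool, image)
    if not os.path.islink(link):
        # No symlink at all: treat the image as not mapped.
        return None
    target = os.path.realpath(link)
    # A dangling symlink means the device node itself is gone.
    return target if os.path.exists(target) else None


if __name__ == "__main__":
    device = resolve_mapping("rbd", "lun-legacy-vm-bw-web01")
    print(device if device else "not mapped (no usable symlink)")
```

A script could compare this result with the "rbd showmapped" output and treat any disagreement as an error rather than trusting either source alone.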

Related issues 1 (0 open, 1 closed)

Related to rbd - Bug #2654: Stale rbd volume cannot be unmaped (Won't Fix, Sage Weil, 06/26/2012)

Actions #1

Updated by Ivan Kudryavtsev about 11 years ago

Some new info.

I stopped the VM, and the entries for /dev/rbd62 and /dev/rbd/rbd/lun-legacy-vm-bw-web01 were removed from the filesystem automatically, without any action on my part; I don't know how. Next, I tried to map the volume again and got the following in dmesg:

[2136858.614275] ------------[ cut here ]------------
[2136858.614285] WARNING: at include/linux/kref.h:42 kobject_get+0x21/0x2a()
[2136858.614287] Hardware name: X9DRW
[2136858.614288] Modules linked in: flashcache(OF) isofs ebt_ip ebtable_filter ebtables x_tables tun cn fuse cbc ceph nfsv3 nfs_acl nfs fscache dns_resolver lockd sunrpc bridge bonding rbd libceph libcrc32c 8021q garp stp llc ib_ipath ib_uverbs ib_addr ib_umad ib_ipoib ib_cm ib_sa loop coretemp kvm_intel acpi_cpufreq tpm_tis tpm kvm snd_pcm sg snd_timer snd soundcore snd_page_alloc joydev crc32c_intel evdev hid_generic tpm_bios mperf sr_mod pcspkr cdrom processor thermal_sys psmouse lpc_ich mfd_core ib_mthca microcode wmi i2c_i801 i2c_core ib_mad serio_raw shpchp pci_hotplug ib_core ioatdma container button ext4 mbcache jbd2 crc16 dm_mod usb_storage raid1 md_mod usbhid hid sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas ehci_hcd usbcore scsi_mod usb_common igb dca
[2136858.614364] Pid: 4618, comm: blkid Tainted: GF          O 3.7.2-ceph #1
[2136858.614366] Call Trace:
[2136858.614372]  [<ffffffff8103efdc>] ? warn_slowpath_common+0x78/0x8c
[2136858.614375]  [<ffffffff811bd8ef>] ? kobject_get+0x21/0x2a
[2136858.614379]  [<ffffffff8126096b>] ? get_device+0x14/0x1a
[2136858.614385]  [<ffffffffa02cc853>] ? rbd_open+0x37/0x4d [rbd]
[2136858.614390]  [<ffffffff81135adf>] ? __blkdev_get+0x10c/0x3d6
[2136858.614393]  [<ffffffff81135f84>] ? blkdev_get+0x1db/0x2df
[2136858.614396]  [<ffffffff81136088>] ? blkdev_get+0x2df/0x2df
[2136858.614401]  [<ffffffff8110a5b9>] ? do_dentry_open+0x171/0x217
[2136858.614404]  [<ffffffff8110a732>] ? finish_open+0x2c/0x35
[2136858.614408]  [<ffffffff81115ce6>] ? do_last+0x878/0xa1d
[2136858.614411]  [<ffffffff81113fd8>] ? __inode_permission+0x5b/0x9a
[2136858.614415]  [<ffffffff8111641f>] ? path_openat+0xc0/0x33b
[2136858.614418]  [<ffffffff8111678d>] ? do_filp_open+0x2c/0x72
[2136858.614421]  [<ffffffff81104e23>] ? kmem_cache_alloc+0x2a/0xed
[2136858.614424]  [<ffffffff81120a82>] ? __alloc_fd+0x4c/0x10e
[2136858.614427]  [<ffffffff8110a30e>] ? do_sys_open+0x60/0xe7
[2136858.614432]  [<ffffffff8136cbe9>] ? system_call_fastpath+0x16/0x1b
[2136858.614435] ---[ end trace c657e437db29227b ]---
[2136858.614523] BUG: unable to handle kernel NULL pointer dereference at           (null)
[2136858.614600] IP: [<ffffffffa02ccf8b>] rbd_dev_release+0x22/0x14a [rbd]
[2136858.614649] PGD 846604067 PUD b2b7df067 PMD 0 
[2136858.614700] Oops: 0000 [#1] SMP 
[2136858.614745] Modules linked in: flashcache(OF) isofs ebt_ip ebtable_filter ebtables x_tables tun cn fuse cbc ceph nfsv3 nfs_acl nfs fscache dns_resolver lockd sunrpc bridge bonding rbd libceph libcrc32c 8021q garp stp llc ib_ipath ib_uverbs ib_addr ib_umad ib_ipoib ib_cm ib_sa loop coretemp kvm_intel acpi_cpufreq tpm_tis tpm kvm snd_pcm sg snd_timer snd soundcore snd_page_alloc joydev crc32c_intel evdev hid_generic tpm_bios mperf sr_mod pcspkr cdrom processor thermal_sys psmouse lpc_ich mfd_core ib_mthca microcode wmi i2c_i801 i2c_core ib_mad serio_raw shpchp pci_hotplug ib_core ioatdma container button ext4 mbcache jbd2 crc16 dm_mod usb_storage raid1 md_mod usbhid hid sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas ehci_hcd usbcore scsi_mod usb_common igb dca
[2136858.615587] CPU 19 
[2136858.615598] Pid: 4618, comm: blkid Tainted: GF       W  O 3.7.2-ceph #1 Supermicro X9DRW/X9DRW
[2136858.615693] RIP: 0010:[<ffffffffa02ccf8b>]  [<ffffffffa02ccf8b>] rbd_dev_release+0x22/0x14a [rbd]
[2136858.615768] RSP: 0018:ffff8825b6bddde8  EFLAGS: 00010286
[2136858.615808] RAX: 0000000000000000 RBX: ffff883fc18d4938 RCX: 0000000000000286
[2136858.615874] RDX: ffff883fc18d4b70 RSI: ffff881ff2f5d400 RDI: ffff883fc18d4938
[2136858.615941] RBP: ffff883fc18d4938 R08: ffff883ff2fa7500 R09: ffff8825b6bdddd8
[2136858.616007] R10: ffff8825b6bdddd8 R11: ffff883ff2fa7500 R12: ffff883fc18d4800
[2136858.616073] R13: ffff882b1f0bafb0 R14: ffff881ff2f5b800 R15: ffff883ff2fa7518
[2136858.616140] FS:  00007f30ea896740(0000) GS:ffff881fffd60000(0000) knlGS:0000000000000000
[2136858.616208] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2136858.616250] CR2: 0000000000000000 CR3: 00000002bcda5000 CR4: 00000000000427e0
[2136858.616316] DR0: 0000000000000090 DR1: 00000000000000a4 DR2: 00000000000000ff
[2136858.616382] DR3: 000000000000000f DR6: 00000000ffff0ff0 DR7: 0000000000000400
[2136858.616449] Process blkid (pid: 4618, threadinfo ffff8825b6bdc000, task ffff883ff0f494a0)
[2136858.616517] Stack:
[2136858.616548]  ffff883ff0180f00 ffff883fc18d4938 ffff883ff0180f00 ffff883fc18d4938
[2136858.616633]  ffffffff8164e960 ffffffff81260c7a 0000000000000014 ffff883fc18d4948
[2136858.616739]  ffff883ff057de50 ffffffff811bd8b8 ffff883ff2fa7500 0000000000000000
[2136858.616826] Call Trace:
[2136858.616863]  [<ffffffff81260c7a>] ? device_release+0x4e/0x83
[2136858.616908]  [<ffffffff811bd8b8>] ? kobject_release+0x48/0x5e
[2136858.616954]  [<ffffffffa02cc817>] ? rbd_release+0x17/0x1c [rbd]
[2136858.616998]  [<ffffffff81135812>] ? __blkdev_put+0xa4/0x14f
[2136858.617046]  [<ffffffff81364d23>] ? mutex_lock+0xd/0x2c
[2136858.617089]  [<ffffffff8110c9c5>] ? __fput+0xe9/0x1ae
[2136858.617135]  [<ffffffff81057ed7>] ? task_work_run+0x7f/0x96
[2136858.617184]  [<ffffffff8100f563>] ? do_notify_resume+0x56/0x67
[2136858.617228]  [<ffffffff81057f95>] ? task_work_add+0x47/0x4e
[2136858.617274]  [<ffffffff8136cea2>] ? int_signal+0x12/0x17
[2136858.617315] Code: 41 5c 41 5d 41 5e 41 5f c3 41 54 4c 8d a7 c8 fe ff ff 55 53 48 89 fb 48 83 ec 10 48 8b 77 90 48 85 f6 74 16 48 8b 87 e0 fe ff ff <48> 8b 38 48 81 c7 48 07 00 00 e8 92 3d fe ff 48 83 7b 88 00 74 
[2136858.617781] RIP  [<ffffffffa02ccf8b>] rbd_dev_release+0x22/0x14a [rbd]
[2136858.617832]  RSP <ffff8825b6bddde8>
[2136858.617869] CR2: 0000000000000000
[2136858.618537] ---[ end trace c657e437db29227c ]---

Actions #2

Updated by Ivan Kudryavtsev about 11 years ago

# ps xa | grep 'rbd map'
 5501 ?        D      0:00 rbd map lun-legacy-vm-bw-web01
12844 pts/1    S+     0:00 grep rbd map
Actions #3

Updated by Ivan Kudryavtsev about 11 years ago

If I run rbd map lun-legacy-vm-bw-web01 one more time, it succeeds.

Actions #4

Updated by Sage Weil about 11 years ago

Do you still see this?

What 'showmapped' looks at is /sys/bus/rbd/devices/*, so an ls -al of that directory would be helpful. If the entry appears there, we should see something on stderr when it isn't included in the final output. If it is not present there, it is probably a kernel bug.
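
To gather the information requested above, a small script can walk /sys/bus/rbd/devices and print what the kernel reports for each mapped device. This is a rough sketch, not what the rbd tool itself does: the function name `list_sysfs_rbd` is hypothetical, and it assumes each device directory exposes `pool` and `name` attribute files and that the device node is `/dev/rbd<id>`. Missing attributes are reported as None rather than guessed.

```python
import os


def list_sysfs_rbd(sysfs_root="/sys/bus/rbd/devices"):
    """List rbd devices known to the kernel, read from sysfs.

    Assumes per-device 'pool' and 'name' attribute files (an assumption
    about the kernel rbd driver's sysfs layout).
    """
    entries = []
    if not os.path.isdir(sysfs_root):
        # rbd module not loaded, or nothing has ever been mapped.
        return entries
    # Note: ids are sorted as strings, not numerically.
    for dev_id in sorted(os.listdir(sysfs_root)):
        dev_dir = os.path.join(sysfs_root, dev_id)
        if not os.path.isdir(dev_dir):
            continue
        entry = {"id": dev_id, "device": "/dev/rbd" + dev_id}
        for attr in ("pool", "name"):
            try:
                with open(os.path.join(dev_dir, attr)) as f:
                    entry[attr] = f.read().strip()
            except OSError:
                # Attribute missing or unreadable: record that fact.
                entry[attr] = None
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for e in list_sysfs_rbd():
        print(e["id"], e["pool"], e["name"], e["device"])
```

If a device such as rbd62 is in use but has no entry in this listing, that would point toward the kernel side, as suggested above.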

Actions #5

Updated by Sage Weil about 11 years ago

  • Status changed from New to Need More Info
Actions #6

Updated by Sage Weil almost 11 years ago

  • Status changed from Need More Info to Can't reproduce