Bug #2754
cephfs show_location produces kernel "divide error: 0000 [#1]" when run against a directory that is not the root of the mounted filesystem
Description
Originally reported in http://marc.info/?l=ceph-devel&m=134151028212170&w=2:
Really easy to reproduce on my 3.2.0 Debian squeeze-backports kernel:
mount a Ceph FS, create a directory in it. Then run "cephfs <dir>
show_location".
dmesg stacktrace:
[ 7153.714260] libceph: mon2 192.168.42.116:6789 session established
[ 7308.584193] divide error: 0000 [#1] SMP
[ 7308.584936] Modules linked in: cryptd aes_i586 aes_generic cbc ceph libceph nfsd lockd nfs_acl auth_rpcgss sunrpc fuse joydev usbhid hid snd_pcm snd_timer snd processor soundcore snd_page_alloc thermal_sys button tpm_tis tpm tpm_bios psmouse i2c_piix4 evdev serio_raw i2c_core virtio_balloon pcspkr ext3 jbd mbcache btrfs zlib_deflate crc32c libcrc32c sg sr_mod cdrom ata_generic virtio_net virtio_blk ata_piix uhci_hcd ehci_hcd libata usbcore floppy scsi_mod virtio_pci usb_common [last unloaded: scsi_wait_scan]
[ 7308.588013]
[ 7308.588013] Pid: 1444, comm: cephfs Not tainted 3.2.0-0.bpo.2-686-pae #1 Bochs Bochs
[ 7308.588013] EIP: 0060:[<f848c6c2>] EFLAGS: 00010246 CPU: 0
[ 7308.588013] EIP is at ceph_calc_file_object_mapping+0x44/0xe8 [libceph]
[ 7308.588013] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
[ 7308.588013] ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: f7495ce4
[ 7308.588013] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 7308.588013] Process cephfs (pid: 1444, ti=f7494000 task=f7266a60 task.ti=f7494000)
[ 7308.588013] Stack:
[ 7308.588013] 00000000 00000000 00000000 0001b053 f5f20624 f5f203f0 f749a800 f5f20420
[ 7308.588013] f84ca6a7 f7495d40 f7495d58 f7495d50 f7495d38 00000001 00000246 f5f20420
[ 7308.588013] f749a90c bff6ff70 c14203a4 fffba978 000a0050 00000000 f79f0298 00000001
[ 7308.588013] Call Trace:
[ 7308.588013] [<f84ca6a7>] ? ceph_ioctl_get_dataloc+0x9e/0x213 [ceph]
[ 7308.588013] [<c10b6781>] ? __do_fault+0x3ee/0x42b
[ 7308.588013] [<c10b75f3>] ? handle_pte_fault+0x3aa/0xa67
[ 7308.588013] [<c10e0844>] ? path_openat+0x27f/0x294
[ 7308.588013] [<f84cac16>] ? ceph_ioctl+0x3fa/0x460 [ceph]
[ 7308.588013] [<c10d9fdb>] ? cp_new_stat64+0xee/0x100
[ 7308.588013] [<c10b7ebe>] ? handle_mm_fault+0x20e/0x224
[ 7308.588013] [<f84ca81c>] ? ceph_ioctl_get_dataloc+0x213/0x213 [ceph]
I unfortunately don't have a more recent kernel to test with, so if
this has been fixed upstream feel free to ignore me. Otherwise,
perhaps something that could go into the 3.5-rc cycle.
Doing show_location on a file, and on the root directory of the fs,
both work fine.
History
#1 Updated by Sage Weil about 11 years ago
- Priority changed from Normal to Urgent
#2 Updated by Sage Weil about 11 years ago
- Project changed from Ceph to Linux kernel client
- Category deleted (26)
#3 Updated by Sage Weil about 11 years ago
- Status changed from New to 12
- Assignee set to Alex Elder
ceph_calc_file_object_mapping() does no divide-by-zero checking.
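To illustrate the kind of check that is missing, here is a minimal userspace sketch, not the kernel code and not the patch that was eventually committed. The struct, the function names (sketch_*), the simplified striping arithmetic, and the choice of -EINVAL as the error code are all assumptions made for illustration. The idea is simply to reject a layout whose stripe unit, stripe count, or object size is zero before any of them is used as a divisor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layout; NOT the kernel's struct ceph_file_layout. */
struct sketch_layout {
    uint32_t stripe_unit;   /* bytes per stripe unit */
    uint32_t stripe_count;  /* objects striped across */
    uint32_t object_size;   /* bytes per object */
};

/* A layout is usable only if no field that is later used as a divisor is zero. */
static int sketch_layout_is_valid(const struct sketch_layout *l)
{
    if (l->stripe_unit == 0 || l->stripe_count == 0 || l->object_size == 0)
        return 0;
    /* object_size must hold a whole, nonzero number of stripe units */
    if (l->object_size < l->stripe_unit ||
        l->object_size % l->stripe_unit != 0)
        return 0;
    return 1;
}

/*
 * Map a file offset to (object number, offset within object).
 * Returns 0 on success, or -EINVAL (an assumed error code) for a bad
 * layout, instead of dividing by zero.
 */
static int sketch_calc_object_mapping(const struct sketch_layout *l,
                                      uint64_t off,
                                      uint64_t *objno, uint64_t *objoff)
{
    assert(l != NULL && objno != NULL && objoff != NULL);

    if (!sketch_layout_is_valid(l))
        return -EINVAL;

    uint32_t su_per_object = l->object_size / l->stripe_unit;
    uint64_t block     = off / l->stripe_unit;       /* stripe unit index */
    uint64_t stripeno  = block / l->stripe_count;    /* which stripe */
    uint64_t stripepos = block % l->stripe_count;    /* position in stripe */
    uint64_t objsetno  = stripeno / su_per_object;   /* which object set */

    *objno  = objsetno * l->stripe_count + stripepos;
    *objoff = (stripeno % su_per_object) * l->stripe_unit
              + off % l->stripe_unit;
    return 0;
}
```

With an all-zero layout the sketch returns an error to the caller rather than faulting; whether a zeroed layout is what the directory case actually passed in is an assumption here, not something the trace confirms.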
#4 Updated by Sage Weil about 11 years ago
- Status changed from 12 to Fix Under Review
#5 Updated by Alex Elder about 11 years ago
Sage posted four patches for review. I just reviewed them
and am working on some final testing before committing them
to the testing branch.
#6 Updated by Alex Elder about 11 years ago
- Status changed from Fix Under Review to 7
Reviewed, just waiting for a nightly test to complete before
marking this resolved. I did NOT explicitly test the fix; I
trust Sage did that (and it looks correct to me).
#7 Updated by Alex Elder almost 11 years ago
- Status changed from 7 to Resolved
A suite of tests including these fixes completed without
error. As I said, I did not specifically test this
problem but my review indicates it's fixed, and I trust
Sage tested them. In any case, marking this bug resolved.
457712a0b ceph: return EIO on invalid layout on GET_DATALOC ioctl
6cae3717c rbd: BUG on invalid layout
6816282da ceph: propagate layout error on osd request creation
d63b77f4c libceph: check for invalid mapping