Bug #2754

closed

cephfs show_location produces kernel "divide error: 0000 [#1]" when run against a directory that is not the root of the mounted filesystem

Added by Florian Haas almost 12 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development

Description

Originally reported in http://marc.info/?l=ceph-devel&m=134151028212170&w=2:

Really easy to reproduce on my 3.2.0 Debian squeeze-backports kernel:
mount a Ceph FS, create a directory in it. Then run
"cephfs <dir> show_location".

dmesg stacktrace:

[ 7153.714260] libceph: mon2 192.168.42.116:6789 session established
[ 7308.584193] divide error: 0000 [#1] SMP
[ 7308.584936] Modules linked in: cryptd aes_i586 aes_generic cbc ceph libceph nfsd lockd nfs_acl auth_rpcgss sunrpc fuse joydev usbhid hid snd_pcm snd_timer snd processor soundcore snd_page_alloc thermal_sys button tpm_tis tpm tpm_bios psmouse i2c_piix4 evdev serio_raw i2c_core virtio_balloon pcspkr ext3 jbd mbcache btrfs zlib_deflate crc32c libcrc32c sg sr_mod cdrom ata_generic virtio_net virtio_blk ata_piix uhci_hcd ehci_hcd libata usbcore floppy scsi_mod virtio_pci usb_common [last unloaded: scsi_wait_scan]
[ 7308.588013]
[ 7308.588013] Pid: 1444, comm: cephfs Not tainted 3.2.0-0.bpo.2-686-pae #1 Bochs Bochs
[ 7308.588013] EIP: 0060:[<f848c6c2>] EFLAGS: 00010246 CPU: 0
[ 7308.588013] EIP is at ceph_calc_file_object_mapping+0x44/0xe8 [libceph]
[ 7308.588013] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
[ 7308.588013] ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: f7495ce4
[ 7308.588013]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 7308.588013] Process cephfs (pid: 1444, ti=f7494000 task=f7266a60 task.ti=f7494000)
[ 7308.588013] Stack:
[ 7308.588013]  00000000 00000000 00000000 0001b053 f5f20624 f5f203f0 f749a800 f5f20420
[ 7308.588013]  f84ca6a7 f7495d40 f7495d58 f7495d50 f7495d38 00000001 00000246 f5f20420
[ 7308.588013]  f749a90c bff6ff70 c14203a4 fffba978 000a0050 00000000 f79f0298 00000001
[ 7308.588013] Call Trace:
[ 7308.588013]  [<f84ca6a7>] ? ceph_ioctl_get_dataloc+0x9e/0x213 [ceph]
[ 7308.588013]  [<c10b6781>] ? __do_fault+0x3ee/0x42b
[ 7308.588013]  [<c10b75f3>] ? handle_pte_fault+0x3aa/0xa67
[ 7308.588013]  [<c10e0844>] ? path_openat+0x27f/0x294
[ 7308.588013]  [<f84cac16>] ? ceph_ioctl+0x3fa/0x460 [ceph]
[ 7308.588013]  [<c10d9fdb>] ? cp_new_stat64+0xee/0x100
[ 7308.588013]  [<c10b7ebe>] ? handle_mm_fault+0x20e/0x224
[ 7308.588013]  [<f84ca81c>] ? ceph_ioctl_get_dataloc+0x213/0x213 [ceph]


I unfortunately don't have a more recent kernel to test with, so if
this has been fixed upstream feel free to ignore me. Otherwise,
perhaps something that could go into the 3.5-rc cycle.

Doing show_location on a file, and on the root directory of the fs,
both work fine.

Actions #1

Updated by Sage Weil over 11 years ago

  • Priority changed from Normal to Urgent

Actions #2

Updated by Sage Weil over 11 years ago

  • Project changed from Ceph to Linux kernel client
  • Category deleted (26)

Actions #3

Updated by Sage Weil over 11 years ago

  • Status changed from New to 12
  • Assignee set to Alex Elder

ceph_calc_file_object_mapping() does no divide-by-zero checking.
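
To illustrate the kind of check that is missing, here is a minimal userspace sketch in C. It is not the kernel code and not the patch that was eventually committed; the struct `layout_sketch` and the function `calc_object_mapping_sketch` are hypothetical names, and the striping arithmetic is a simplified assumption about how a file offset maps to an object. The point is only that a layout whose stripe unit or stripe count is zero (which the all-zero registers in the trace above are consistent with) should be rejected with an error before any division takes place.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a striped file layout. */
struct layout_sketch {
    uint32_t stripe_unit;   /* bytes per stripe unit */
    uint32_t stripe_count;  /* objects per object set */
    uint32_t object_size;   /* bytes per object */
};

/*
 * Map a file extent (off, len) to an object number, an offset within
 * that object, and the length that fits in the current stripe unit.
 * Returns 0 on success, or -EINVAL if the layout would cause a
 * division by zero.
 */
static int calc_object_mapping_sketch(const struct layout_sketch *l,
                                      uint64_t off, uint64_t len,
                                      uint64_t *ono, uint64_t *oxoff,
                                      uint64_t *oxlen)
{
    uint32_t su = l->stripe_unit;
    uint32_t sc = l->stripe_count;
    uint32_t su_per_object;
    uint64_t bl, stripeno, stripepos, objsetno, su_offset;

    /* Reject layouts that would divide by zero below. */
    if (su == 0 || sc == 0)
        return -EINVAL;
    su_per_object = l->object_size / su;
    if (su_per_object == 0)
        return -EINVAL;

    bl = off / su;                       /* stripe-unit index in the file */
    stripeno = bl / sc;                  /* which stripe */
    stripepos = bl % sc;                 /* position within the stripe */
    objsetno = stripeno / su_per_object; /* which object set */

    *ono = objsetno * sc + stripepos;
    su_offset = off % su;
    *oxoff = su_offset + (stripeno % su_per_object) * su;
    *oxlen = len < su - su_offset ? len : su - su_offset;
    return 0;
}
```

A caller such as an ioctl handler could then turn a nonzero return value into an error for userspace instead of letting the kernel fault. Note that on a 32-bit kernel the 64-bit divisions would need the kernel's division helpers rather than the plain `/` and `%` operators used in this userspace sketch.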

Actions #4

Updated by Sage Weil over 11 years ago

  • Status changed from 12 to Fix Under Review

Actions #5

Updated by Alex Elder over 11 years ago

Sage posted four patches for review. I just reviewed them
and am working on some final testing before committing them
to the testing branch.

Actions #6

Updated by Alex Elder over 11 years ago

  • Status changed from Fix Under Review to 7

Reviewed, just waiting for a nightly test to complete before
marking this resolved. I did NOT explicitly test the fix; I
trust Sage did that (and it looks correct to me).

Actions #7

Updated by Alex Elder over 11 years ago

  • Status changed from 7 to Resolved

A suite of tests including these fixes completed without
error. As I said, I did not specifically test this
problem but my review indicates it's fixed, and I trust
Sage tested them. In any case, marking this bug resolved.

457712a0b ceph: return EIO on invalid layout on GET_DATALOC ioctl
6cae3717c rbd: BUG on invalid layout
6816282da ceph: propagate layout error on osd request creation
d63b77f4c libceph: check for invalid mapping