Bug #283

closed

ceph_add_cap: couldn't find snap realm, NULL ptr deref

Added by Sage Weil almost 14 years ago. Updated over 13 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version:
% Done: 0%


Description

Steps to reproduce:
1. Restart the MDS.
2. rmdir dir (fails with ENOTEMPTY)
3. ls dir/.snap
The kernel client then crashes:

[ 6635.285314] ceph: ceph_add_cap: couldn't find snap realm 1000017dc34
[ 6635.866656] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
[ 6635.868575] IP: [<ffffffffa00ebfc1>] __send_cap+0x22c/0x513 [ceph]
[ 6635.868575] PGD 11be17067 PUD 11bcb4067 PMD 0 
[ 6635.868575] Oops: 0000 [#1] PREEMPT SMP 
[ 6635.868575] last sysfs file: /sys/kernel/uevent_seqnum
[ 6635.868575] CPU 0 
[ 6635.868575] Modules linked in: aes_x86_64 aes_generic ceph fan ac battery container ehci_hcd uhci_hcd thermal processor button
[ 6635.868575] 
[ 6635.868575] Pid: 2629, comm: ceph-msgr/0 Not tainted 2.6.35-rc3+ #44 PDSMi+/PDSMi
[ 6635.868575] RIP: 0010:[<ffffffffa00ebfc1>]  [<ffffffffa00ebfc1>] __send_cap+0x22c/0x513 [ceph]
[ 6635.868575] RSP: 0018:ffff88011b47d6c0  EFLAGS: 00010246
[ 6635.868575] RAX: 0000000000000000 RBX: ffff88011d94cba0 RCX: 0000000000000001
[ 6635.868575] RDX: 0000000000000000 RSI: ffff88011c3430a0 RDI: 0000000000000200
[ 6635.868575] RBP: ffff88011b47d820 R08: 0000000000000001 R09: 0000000000003ffd
[ 6635.868575] R10: ffffffffa00ed48c R11: ffff88011b62b308 R12: 0000000000000000
[ 6635.868575] R13: 00000000ffffc202 R14: ffff880118ecc1c0 R15: 0000000000000000
[ 6635.868575] FS:  0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[ 6635.868575] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6635.868575] CR2: 00000000000000a0 CR3: 000000011cbdd000 CR4: 00000000000006f0
[ 6635.868575] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6635.868575] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 6635.868575] Process ceph-msgr/0 (pid: 2629, threadinfo ffff88011b47c000, task ffff88011b2e0910)
[ 6635.868575] Stack:
[ 6635.868575]  ffff88011b47d6e0 ffffffff8104fbee 0000000000000000 00000000001d2a80
[ 6635.868575] <0> ffff88011b47d710 ffffffff8104fd16 ffff88011b47d700 0000000000000000
[ 6635.868575] <0> ffff88011b47d710 2821418a2c52228d 2821418a2c52228d ffffffff81d0f9a0
[ 6635.868575] Call Trace:
[ 6635.868575]  [<ffffffff8104fbee>] ? sched_clock_local+0x11/0x73
[ 6635.868575]  [<ffffffff8104fd16>] ? sched_clock_cpu+0xc6/0xd4
[ 6635.868575]  [<ffffffff8105b70d>] ? __lock_acquire+0x81b/0x87e
[ 6635.868575]  [<ffffffffa00edbce>] ceph_check_caps+0x7c3/0xb66 [ceph]
[ 6635.868575]  [<ffffffff810098e7>] ? native_sched_clock+0x37/0x71
[ 6635.868575]  [<ffffffff8104fd16>] ? sched_clock_cpu+0xc6/0xd4
[ 6635.868575]  [<ffffffff8102ce5c>] ? sub_preempt_count+0x92/0x9e
[ 6635.868575]  [<ffffffffa00f08bf>] ceph_handle_caps+0xb11/0x15cd [ceph]
[ 6635.868575]  [<ffffffff8104fd16>] ? sched_clock_cpu+0xc6/0xd4
[ 6635.868575]  [<ffffffff8104fbee>] ? sched_clock_local+0x11/0x73
[ 6635.868575]  [<ffffffff8102bcea>] ? get_parent_ip+0x11/0x41
[ 6635.868575]  [<ffffffff81058793>] ? mark_held_locks+0x49/0x64
[ 6635.868575]  [<ffffffff8144d24d>] ? __mutex_unlock_slowpath+0x10d/0x130
[ 6635.868575]  [<ffffffffa00fedfb>] dispatch+0xe02/0x118a [ceph]
[ 6635.868575]  [<ffffffff8102ce5c>] ? sub_preempt_count+0x92/0x9e
[ 6635.868575]  [<ffffffff81058793>] ? mark_held_locks+0x49/0x64
[ 6635.868575]  [<ffffffff8144d24d>] ? __mutex_unlock_slowpath+0x10d/0x130
[ 6635.868575]  [<ffffffff8105892a>] ? trace_hardirqs_on+0xd/0xf
[ 6635.868575]  [<ffffffffa00f7ae8>] try_read+0xc44/0x129b [ceph]
[ 6635.868575]  [<ffffffffa00f9bad>] ? con_work+0xad/0x6b2 [ceph]
[ 6635.868575]  [<ffffffff8144d6c6>] ? mutex_lock_nested+0x2f7/0x314
[ 6635.868575]  [<ffffffffa00f9bad>] ? con_work+0xad/0x6b2 [ceph]
[ 6635.868575]  [<ffffffffa00f9c29>] con_work+0x129/0x6b2 [ceph]
[ 6635.868575]  [<ffffffff81048406>] worker_thread+0x1e8/0x2fa
[ 6635.868575]  [<ffffffff810483ad>] ? worker_thread+0x18f/0x2fa
[ 6635.868575]  [<ffffffff8102ce5c>] ? sub_preempt_count+0x92/0x9e
[ 6635.868575]  [<ffffffffa00f9b00>] ? con_work+0x0/0x6b2 [ceph]
[ 6635.8685]  [<ffffffff8104b4c8>] ? autoremove_wake_f? kernel_thread_helper+0x0/0x10
[ 6635.868575] Code: 48 89 45 c8 49 8b 86 d8 04 00 00 48 89 45 b0 49 8b 86 e0 04 00 00 48 89 45 b8 41 8b 56 18 89 95 5c ff ff ff 49 8b 86 60 03 00 00 <48> 8b 80 a0 00 00 00 4c 8b 78 08 41 8b 86 b8 04 00 00 89 45 88 
[ 6635.868575] RIP  [<ffffffffa00ebfc1>] __send_cap+0x22c/0x513 [ceph]
[ 6635.868575]  RSP <ffff88011b47d6c0>
[ 6635.868575] CR2: 00000000000000a0
[ 6636.295385] ---[ end trace ed926b2d4556264f ]---
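
The register dump and the Code: bytes are enough to see why the fault address is 0xa0. Below is a minimal Python sketch that decodes only the instruction form found at the `<48>` fault marker (REX.W `8B /r` with a 32-bit displacement). The helper names are hypothetical and this is not a general disassembler. The instruction just before the marker loads a pointer from 0x360(%r14). That this pointer is the inode's snap realm is an inference from the "couldn't find snap realm" message, not something the report states.

```python
import struct

# Tail of the "Code:" line from the oops above. <48> marks the first byte
# of the faulting instruction.
CODE_TAIL = "49 8b 86 60 03 00 00 <48> 8b 80 a0 00 00 00 4c 8b 78 08"
RAX_AT_FAULT = 0x0  # RAX value from the register dump
CR2 = 0xA0          # faulting address reported by the oops

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code):
    """Return the instruction bytes starting at the <xx> fault marker."""
    tokens = code.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_mov_load(insn):
    """Decode `mov disp32(%base), %dest` for REX.W 8B /r, mod=10, no SIB.

    Anything else raises ValueError; this covers just the one form seen here.
    """
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W mov r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:
        raise ValueError("only mod=10 without a SIB byte is handled")
    (disp,) = struct.unpack_from("<i", insn, 3)  # little-endian disp32
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_mov_load(faulting_bytes(CODE_TAIL))
fault_addr = RAX_AT_FAULT + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> fault address {fault_addr:#x}")
```

With RAX equal to 0, the computed address equals the CR2 value in the oops. This is consistent with a field at offset 0xa0 being read through a NULL pointer.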

#1

Updated by Sage Weil over 13 years ago

  • Target version changed from v2.6.35 to v2.6.36

#2

Updated by Sage Weil over 13 years ago

  • Status changed from New to Resolved

This is a server-side problem with CInode::encode_inodestat, fixed by commit:6573635ba48a9b6c4f364e8f8b7132c90ea2e8e9 (for v0.21.1).
