Bug #2447
prepare_write_connect NULL pointer dereference
Status: Resolved
Priority: Immediate
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Description
trivially reproducible with current testing branch

ubuntu@plana02:~$
[17575.980776] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[17576.053725] IP: [<ffffffffa03b417c>] prepare_write_connect+0x14c/0x250 [libceph]
[17576.125881] PGD 21a983067 PUD 21fb0b067 PMD 0
[17576.162234] Oops: 0000 [#1] SMP
[17576.196273] CPU 1
[17576.198533] Modules linked in: ceph libceph ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserf
[17576.375071]
[17576.404835] Pid: 640, comm: kworker/1:2 Not tainted 3.3.0-ceph-00110-g1d4a9bf #1 Dell Inc. PowerEdge R410/01V648
[17576.472797] RIP: 0010:[<ffffffffa03b417c>] [<ffffffffa03b417c>] prepare_write_connect+0x14c/0x250 [libceph]
[17576.542074] RSP: 0018:ffff8802218a9ad0 EFLAGS: 00010246
[17576.578023] RAX: 0000000000000000 RBX: ffff8802203a2800 RCX: 0000000000000000
[17576.616808] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88021fb20120
[17576.655985] RBP: ffff8802218a9b00 R08: ffffffff81f58390 R09: 000000000000000f
[17576.695459] R10: 0000000000000004 R11: 0000000000000000 R12: 000000000000000f
[17576.734855] R13: 0000000000000000 R14: ffff88022722e240 R15: ffff8802218a9dc0
[17576.774297] FS: 0000000000000000(0000) GS:ffff880227220000(0000) knlGS:0000000000000000
[17576.846308] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[17576.885141] CR2: 0000000000000010 CR3: 00000002218b5000 CR4: 00000000000006e0
[17576.926049] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17576.966732] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[17577.007199] Process kworker/1:2 (pid: 640, threadinfo ffff8802218a8000, task ffff88020e5bde80)
[17577.082688] Stack:
[17577.116681] ffff880200000000 000000000e5bde80 ffffffff816130d2 ffff8802203a2800
[17577.188551] ffff8802203a2c58 ffff8802203a2828 ffff8802218a9c10 ffffffffa03b5816
[17577.262528] ffff8802203a2978 0000000000000000 ffff8802218a9b30 ffff8802203a2978
[17577.339531] Call Trace:
[17577.375575] [<ffffffff816130d2>] ? __mutex_lock_common+0x282/0x3d0
[17577.415928] [<ffffffffa03b5816>] try_write+0xa46/0x1030 [libceph]
[17577.455677] [<ffffffffa03b6526>] ? con_work+0x46/0x1b30 [libceph]
[17577.494810] [<ffffffffa03b7120>] con_work+0xc40/0x1b30 [libceph]
[17577.533359] [<ffffffff810ab1cc>] ? __lock_acquire+0xa8c/0x15d0
[17577.571002] [<ffffffff81615dc0>] ? _raw_spin_unlock_irq+0x30/0x40
[17577.608155] [<ffffffff8106b256>] process_one_work+0x1a6/0x520
[17577.644210] [<ffffffff8106b1e7>] ? process_one_work+0x137/0x520
[17577.680172] [<ffffffffa03b64e0>] ? ceph_con_revoke_message+0x130/0x130 [libceph]
[17577.746965] [<ffffffff8106d593>] worker_thread+0x173/0x400
[17577.783177] [<ffffffff8106d420>] ? manage_workers+0x210/0x210
[17577.819464] [<ffffffff8107280e>] kthread+0xbe/0xd0
[17577.854588] [<ffffffff8161f5f4>] kernel_thread_helper+0x4/0x10
[17577.890749] [<ffffffff81616134>] ? retint_restore_args+0x13/0x13
[17577.926334] [<ffffffff81072750>] ? __init_kthread_worker+0x70/0x70
[17577.961824] [<ffffffff8161f5f0>] ? gs_change+0x13/0x13
[17577.995717] Code: 29 ff ff ff 48 8b 43 28 f6 c4 20 75 c5 49 8b 45 18 48 89 83 68 01 00 00 49 8b 45 20 89 83 70 01 0c
[17578.107356] RIP [<ffffffffa03b417c>] prepare_write_connect+0x14c/0x250 [libceph]
[17578.175891] RSP <ffff8802218a9ad0>
[17578.209743] CR2: 0000000000000010
[17578.288962] ---[ end trace e8de61daec71ea6e ]---
[17578.322716] BUG: unable to handle kernel paging request at fffffffffffffff8
[17578.359066] IP: [<ffffffff81072170>] kthread_data+0x10/0x20
[17578.393528] PGD 1c07067 PUD 1c08067 PMD 0
[17578.425822] Oops: 0000 [#2] SMP
[17578.457107] CPU 1
[17578.459375] Modules linked in: ceph libceph ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserf
[17578.632262]
[17578.663587] Pid: 640, comm: kworker/1:2 Tainted: G D 3.3.0-ceph-00110-g1d4a9bf #1 Dell Inc. PowerEdge R410
[17578.737318] RIP: 0010:[<ffffffff81072170>] [<ffffffff81072170>] kthread_data+0x10/0x20
[17578.808692] RSP: 0018:ffff8802218a96b8 EFLAGS: 00010092
[17578.845765] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[17578.885483] RDX: ffffffff81e35fc0 RSI: 0000000000000001 RDI: ffff88020e5bde80
[17578.925200] RBP: ffff8802218a96b8 R08: 0000000000989680 R09: 0000000000000001
[17578.964999] R10: 0000000000000400 R11: 0000000000000001 R12: 0000000000000001
[17579.004755] R13: ffff88020e5be220 R14: ffff880223cd0000 R15: ffff8802218a97e0
[17579.044703] FS: 0000000000000000(0000) GS:ffff880227220000(0000) knlGS:0000000000000000
[17579.118253] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[17579.156836] CR2: fffffffffffffff8 CR3: 00000002218b5000 CR4: 00000000000006e0
[17579.197399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17579.237192] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[17579.275979] Process kworker/1:2 (pid: 640, threadinfo ffff8802218a8000, task ffff88020e5bde80)
[17579.348411] Stack:
[17579.381924] ffff8802218a96d8 ffffffff8106a665 ffff8802218a96d8 ffff880227233b40
[17579.453584] ffff8802218a9768 ffffffff81614293 ffff88020e5bde80 0000000000013b40
[17579.527219] ffff8802218a9fd8 ffff8802218a8010 0000000000013b40 0000000000013b40
[17579.604141] Call Trace:
[17579.640106] [<ffffffff8106a665>] wq_worker_sleeping+0x15/0xa0
[17579.679774] [<ffffffff81614293>] __schedule+0x5c3/0x810
[17579.718192] [<ffffffff8161480f>] schedule+0x3f/0x60
[17579.755909] [<ffffffff8105206c>] do_exit+0x60c/0x8c0
[17579.793052] [<ffffffff8104ebe6>] ? kmsg_dump+0x116/0x160
[17579.830149] [<ffffffff81617020>] oops_end+0xb0/0xf0
[17579.865763] [<ffffffff8104053d>] no_context+0x11d/0x2d0
[17579.900958] [<ffffffff8104083d>] __bad_area_nosemaphore+0x14d/0x230
[17579.937168] [<ffffffff81619ae1>] ? do_page_fault+0x221/0x4b0
[17579.972351] [<ffffffff81040933>] bad_area_nosemaphore+0x13/0x20
[17580.007868] [<ffffffff81619c0e>] do_page_fault+0x34e/0x4b0
[17580.042859] [<ffffffff8131d8fd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[17580.079185] [<ffffffff816163b5>] page_fault+0x25/0x30
[17580.113764] [<ffffffffa03b417c>] ? prepare_write_connect+0x14c/0x250 [libceph]
[17580.179019] [<ffffffff816130d2>] ? __mutex_lock_common+0x282/0x3d0
[17580.215050] [<ffffffffa03b5816>] try_write+0xa46/0x1030 [libceph]
[17580.250608] [<ffffffffa03b6526>] ? con_work+0x46/0x1b30 [libceph]
[17580.285993] [<ffffffffa03b7120>] con_work+0xc40/0x1b30 [libceph]
[17580.321694] [<ffffffff810ab1cc>] ? __lock_acquire+0xa8c/0x15d0
[17580.356683] [<ffffffff81615dc0>] ? _raw_spin_unlock_irq+0x30/0x40
[17580.391259] [<ffffffff8106b256>] process_one_work+0x1a6/0x520
[17580.425779] [<ffffffff8106b1e7>] ? process_one_work+0x137/0x520
[17580.459958] [<ffffffffa03b64e0>] ? ceph_con_revoke_message+0x130/0x130 [libceph]
[17580.521872] [<ffffffff8106d593>] worker_thread+0x173/0x400
[17580.554747] [<ffffffff8106d420>] ? manage_workers+0x210/0x210
[17580.587176] [<ffffffff8107280e>] kthread+0xbe/0xd0
[17580.617706] [<ffffffff8161f5f4>] kernel_thread_helper+0x4/0x10
[17580.649757] [<ffffffff81616134>] ? retint_restore_args+0x13/0x13
[17580.681696] [<ffffffff81072750>] ? __init_kthread_worker+0x70/0x70
[17580.713767] [<ffffffff8161f5f0>] ? gs_change+0x13/0x13
[17580.744798] Code: 66 66 66 90 65 48 8b 04 25 00 c8 00 00 48 8b 80 48 03 00 00 8b 40 f0 c9 c3 66 90 55 48 89 e5 66 66
[17580.848756] RIP [<ffffffff81072170>] kthread_data+0x10/0x20
[17580.882708] RSP <ffff8802218a96b8>
[17580.914082] CR2: fffffffffffffff8
[17580.944880] ---[ end trace e8de61daec71ea6f ]---
[17580.977233] Fixing recursive fault but reboot is needed!
Updated by Sage Weil almost 12 years ago
- Status changed from New to Resolved
This was a bug in the new messenger refactor; pushed an updated commit:3da54776e2c0385c32d143fd497a7f40a88e29dd