Bug #1793

closed

NULL pointer dereference at try_write+0x627/0x1060

Added by Josh Durgin over 12 years ago. Updated about 12 years ago.

Status: Can't reproduce
Priority: High
Assignee:
Category: libceph
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Found in sepia50's console:

[ 8727.491802] Call Trace:
[ 8727.491802]  [<ffffffff8107f0c5>] wq_worker_sleeping+0x15/0xa0
[ 8727.491802]  [<ffffffff81600763>] __schedule+0x5c3/0x940
[ 8727.491802]  [<ffffffff81066e81>] ? do_exit+0x551/0x880
[ 8727.491802]  [<ffffffff81065d7d>] ? release_task+0x1d/0x470
[ 8727.491802]  [<ffffffff81600e0f>] schedule+0x3f/0x60
[ 8727.491802]  [<ffffffff81066ef7>] do_exit+0x5c7/0x880
[ 8727.491802]  [<ffffffff81063b05>] ? kmsg_dump+0x75/0x140
[ 8727.491802]  [<ffffffff81604ac0>] oops_end+0xb0/0xf0
[ 8727.491802]  [<ffffffff8103f85d>] no_context+0xfd/0x270
[ 8727.491802]  [<ffffffff814edfd5>] ? release_sock+0x35/0x190
[ 8727.491802]  [<ffffffff8103fb15>] __bad_area_nosemaphore+0x145/0x230
[ 8727.491802]  [<ffffffff810509a3>] ? get_parent_ip+0x33/0x50
[ 8727.491802]  [<ffffffff8103fc13>] bad_area_nosemaphore+0x13/0x20
[ 8727.491802]  [<ffffffff8160763e>] do_page_fault+0x34e/0x4b0
[ 8727.491802]  [<ffffffff814ee0e4>] ? release_sock+0x144/0x190
[ 8727.491802]  [<ffffffff815429e5>] ? tcp_sendpage+0xc5/0x590
[ 8727.491802]  [<ffffffff81313fbd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 8727.491802]  [<ffffffff81603ef5>] page_fault+0x25/0x30
[ 8727.491802]  [<ffffffffa00c6a87>] ? try_write+0x627/0x1060 [libceph]
[ 8727.491802]  [<ffffffff814e8db4>] ? kernel_recvmsg+0x44/0x60
[ 8727.491802]  [<ffffffffa00c5748>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
[ 8727.491802]  [<ffffffffa00c80d0>] con_work+0xc10/0x1b00 [libceph]
[ 8727.491802]  [<ffffffff81313f7e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 8727.491802]  [<ffffffff81053978>] ? finish_task_switch+0x48/0x120
[ 8727.491802]  [<ffffffff8107fcb6>] process_one_work+0x1a6/0x520
[ 8727.491802]  [<ffffffff8107fc47>] ? process_one_work+0x137/0x520
[ 8727.491802]  [<ffffffffa00c74c0>] ? try_write+0x1060/0x1060 [libceph]
[ 8727.491802]  [<ffffffff81081fe3>] worker_thread+0x173/0x400
[ 8727.491802]  [<ffffffff81081e70>] ? manage_workers+0x210/0x210
[ 8727.491802]  [<ffffffff81087026>] kthread+0xb6/0xc0
[ 8727.491802]  [<ffffffff8160dd44>] kernel_thread_helper+0x4/0x10
[ 8727.491802]  [<ffffffff81603c74>] ? retint_restore_args+0x13/0x13
[ 8727.491802]  [<ffffffff81086f70>] ? __init_kthread_worker+0x70/0x70
[ 8727.491802]  [<ffffffff8160dd40>] ? gs_change+0x13/0x13
[ 8727.491802] Code: 66 66 66 90 65 48 8b 04 25 40 c4 00 00 48 8b 80 48 03 00 00 8b 40 f0 c9 c3 66 90 55 48 89 e5 66 66 66 66 90 48 8b 87 48 03 00 00 
[ 8727.491802]  8b 40 f8 c9 c3 eb 08 90 90 90 90 90 90 90 90 55 48 89 e5 66 
[ 8727.491802] RIP  [<ffffffff81086a10>] kthread_data+0x10/0x20
[ 8727.491802]  RSP <ffff8800455b76f8>
[ 8727.491802] CR2: fffffffffffffff8
[ 8727.491802] ---[ end trace 9654ef6c74784bc5 ]---
[ 8727.491802] Fixing recursive fault but reboot is needed!
[ 8727.491802] BUG: spinlock lockup on CPU#0, kworker/0:2/5057
[ 8727.491802]  lock: ffff8800fbc13b80, .magic: dead4ead, .owner: kworker/0:2/5057, .owner_cpu: 0
[ 8727.491802] Pid: 5057, comm: kworker/0:2 Tainted: G      D     3.1.0-ceph-08936-gb643987 #1
[ 8727.491802] Call Trace:
[ 8727.491802]  [<ffffffff8131a178>] spin_dump+0x78/0xc0
[ 8727.491802]  [<ffffffff8131a39d>] do_raw_spin_lock+0xed/0x120
[ 8727.491802]  [<ffffffff816031e5>] _raw_spin_lock_irq+0x45/0x50
[ 8727.491802]  [<ffffffff81600277>] ? __schedule+0xd7/0x940
[ 8727.491802]  [<ffffffff81600277>] __schedule+0xd7/0x940
[ 8727.491802]  [<ffffffff81600e0f>] schedule+0x3f/0x60
[ 8727.491802]  [<ffffffff81067164>] do_exit+0x834/0x880
[ 8727.491802]  [<ffffffff81063b95>] ? kmsg_dump+0x105/0x140
[ 8727.491802]  [<ffffffff81063b05>] ? kmsg_dump+0x75/0x140
[ 8727.491802]  [<ffffffff81604ac0>] oops_end+0xb0/0xf0
[ 8727.491802]  [<ffffffff8103f85d>] no_context+0xfd/0x270
[ 8727.491802]  [<ffffffff8103fb15>] __bad_area_nosemaphore+0x145/0x230
[ 8727.491802]  [<ffffffff811b3af0>] ? fsnotify_clear_marks_by_inode+0x30/0xf0
[ 8727.491802]  [<ffffffff8103fc13>] bad_area_nosemaphore+0x13/0x20
[ 8727.491802]  [<ffffffff8160763e>] do_page_fault+0x34e/0x4b0
[ 8727.491802]  [<ffffffff81057bc4>] ? cpuacct_charge+0x24/0xb0
[ 8727.491802]  [<ffffffff81057bc4>] ? cpuacct_charge+0x24/0xb0
[ 8727.491802]  [<ffffffff81313fbd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 8727.491802]  [<ffffffff81603ef5>] page_fault+0x25/0x30
[ 8727.491802]  [<ffffffff81086a10>] ? kthread_data+0x10/0x20
[ 8727.491802]  [<ffffffff8107f0c5>] wq_worker_sleeping+0x15/0xa0
[ 8727.491802]  [<ffffffff81600763>] __schedule+0x5c3/0x940
[ 8727.491802]  [<ffffffff81066e81>] ? do_exit+0x551/0x880
[ 8727.491802]  [<ffffffff81065d7d>] ? release_task+0x1d/0x470
[ 8727.491802]  [<ffffffff81600e0f>] schedule+0x3f/0x60
[ 8727.491802]  [<ffffffff81066ef7>] do_exit+0x5c7/0x880
[ 8727.491802]  [<ffffffff81063b05>] ? kmsg_dump+0x75/0x140
[ 8727.491802]  [<ffffffff81604ac0>] oops_end+0xb0/0xf0
[ 8727.491802]  [<ffffffff8103f85d>] no_context+0xfd/0x270
[ 8727.491802]  [<ffffffff814edfd5>] ? release_sock+0x35/0x190
[ 8727.491802]  [<ffffffff8103fb15>] __bad_area_nosemaphore+0x145/0x230
[ 8727.491802]  [<ffffffff810509a3>] ? get_parent_ip+0x33/0x50
[ 8727.491802]  [<ffffffff8103fc13>] bad_area_nosemaphore+0x13/0x20
[ 8727.491802]  [<ffffffff8160763e>] do_page_fault+0x34e/0x4b0
[ 8727.491802]  [<ffffffff814ee0e4>] ? release_sock+0x144/0x190
[ 8727.491802]  [<ffffffff815429e5>] ? tcp_sendpage+0xc5/0x590
[ 8727.491802]  [<ffffffff81313fbd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 8727.491802]  [<ffffffff81603ef5>] page_fault+0x25/0x30
[ 8727.491802]  [<ffffffffa00c6a87>] ? try_write+0x627/0x1060 [libceph]
[ 8727.491802]  [<ffffffff814e8db4>] ? kernel_recvmsg+0x44/0x60
[ 8727.491802]  [<ffffffffa00c5748>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
[ 8727.491802]  [<ffffffffa00c80d0>] con_work+0xc10/0x1b00 [libceph]
[ 8727.491802]  [<ffffffff81313f7e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 8727.491802]  [<ffffffff81053978>] ? finish_task_switch+0x48/0x120
[ 8727.491802]  [<ffffffff8107fcb6>] process_one_work+0x1a6/0x520
[ 8727.491802]  [<ffffffff8107fc47>] ? process_one_work+0x137/0x520
[ 8727.491802]  [<ffffffffa00c74c0>] ? try_write+0x1060/0x1060 [libceph]
[ 8727.491802]  [<ffffffff81081fe3>] worker_thread+0x173/0x400
[ 8727.491802]  [<ffffffff81081e70>] ? manage_workers+0x210/0x210
[ 8727.491802]  [<ffffffff81087026>] kthread+0xb6/0xc0
[ 8727.491802]  [<ffffffff8160dd44>] kernel_thread_helper+0x4/0x10
[ 8727.491802]  [<ffffffff81603c74>] ? retint_restore_args+0x13/0x13
[ 8727.491802]  [<ffffffff81086f70>] ? __init_kthread_worker+0x70/0x70
[ 8727.491802]  [<ffffffff8160dd40>] ? gs_change+0x13/0x13
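
A note for reading the recursive fault: the CR2 value reported above, fffffffffffffff8, is the unsigned 64-bit form of -8, which is consistent with a read at offset -8 from a NULL pointer. The sketch below only illustrates that arithmetic; the helper name signed64 is hypothetical and is not part of the kernel or of libceph.

```python
def signed64(value: int) -> int:
    """Interpret an unsigned 64-bit value as a two's-complement signed integer."""
    value &= (1 << 64) - 1          # keep only the low 64 bits
    if value & (1 << 63):           # sign bit set: the value is negative
        return value - (1 << 64)
    return value


# CR2 as printed in the oops above.
cr2 = 0xFFFFFFFFFFFFFFF8
offset_from_null = signed64(cr2)
print(offset_from_null)  # -8
```

This does not identify which pointer was NULL; it only shows how the faulting address maps to a small negative offset.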

Related issues 1 (0 open, 1 closed)

Has duplicate: Linux kernel client - Bug #1866: null pointer dereference after osd went down (Duplicate, 12/29/2011)
