Bug #784
kclient crash
Description
I got this on a local rsync_better branch, which is built off of unstable commit 9c01177. As best I can tell it's not because of any of my changes.
Got this during my typical rsync test: mount Ceph, then rsync a copy of ceph-client onto the Ceph mount. The rsync was partway through when UML crashed.
Backtrace via gdb:
Program terminated with signal 11, Segmentation fault.
#0 *__GI_abort () at abort.c:128
128 abort.c: No such file or directory.
in abort.c
(gdb) bt
#0 *__GI_abort () at abort.c:128
#1 0x000000006002589a in os_dump_core () at arch/um/os-Linux/util.c:119
#2 0x0000000060017d5d in panic_exit (self=<value optimized out>, unused1=<value optimized out>, unused2=<value optimized out>) at arch/um/kernel/um_arch.c:233
#3 0x000000006004b172 in notifier_call_chain (nl=<value optimized out>, val=0, v=0x60413780, nr_to_call=-2, nr_calls=<value optimized out>) at kernel/notifier.c:93
#4 0x000000006004b1c0 in __atomic_notifier_call_chain (nh=<value optimized out>, val=18446744073709551615, v=0xffffffffffffffa8) at kernel/notifier.c:182
#5 atomic_notifier_call_chain (nh=<value optimized out>, val=18446744073709551615, v=0xffffffffffffffa8) at kernel/notifier.c:191
#6 0x000000006026ac73 in panic (fmt=0x602dcf2b "Kernel mode fault at addr 0x%lx, ip 0x%lx") at kernel/panic.c:99
#7 0x0000000060017b7b in segv (fi=..., ip=<value optimized out>, is_user=0, regs=0x60367600) at arch/um/kernel/trap.c:201
#8 0x0000000060017bda in segv_handler (sig=<value optimized out>, regs=0xffffffffffffffa8) at arch/um/kernel/trap.c:147
#9 0x0000000060024898 in sig_handler_common (sig=6, sc=0x603677d8) at arch/um/os-Linux/signal.c:49
#10 0x00000000600249de in sig_handler (sig=1, sc=0xffffffffffffffff) at arch/um/os-Linux/signal.c:226
#11 0x0000000060024c10 in handle_signal (sig=1, sc=0x603677d8) at arch/um/os-Linux/signal.c:158
#12 0x0000000060026684 in hard_handler (sig=1) at arch/um/os-Linux/sys-x86_64/signal.c:15
#13 <signal handler called>
#14 0x0000000000000000 in ?? ()
#15 0x000000006005b7fa in mask_ack_irq (irq=10, desc=0x6037d6c0) at kernel/irq/chip.c:387
#16 handle_edge_irq (irq=10, desc=0x6037d6c0) at kernel/irq/chip.c:622
#17 0x0000000060014ff9 in generic_handle_irq_desc (irq=10, regs=<value optimized out>) at include/linux/irqdesc.h:119
#18 generic_handle_irq (irq=10, regs=<value optimized out>) at include/linux/irqdesc.h:130
#19 do_IRQ (irq=10, regs=<value optimized out>) at arch/um/kernel/irq.c:337
#20 0x0000000060017614 in winch (sig=<value optimized out>, regs=0x6037d6c0) at arch/um/kernel/trap.c:246
#21 0x0000000060024898 in sig_handler_common (sig=1614192128, sc=0x60367c28) at arch/um/os-Linux/signal.c:49
#22 0x00000000600249de in sig_handler (sig=1614272192, sc=0x6037d6c0) at arch/um/os-Linux/signal.c:226
#23 0x0000000060024c10 in handle_signal (sig=1614272192, sc=0x60367c28) at arch/um/os-Linux/signal.c:158
#24 0x0000000060026684 in hard_handler (sig=1614272192) at arch/um/os-Linux/sys-x86_64/signal.c:15
#25 <signal handler called>
#26 0x00007fe1b8d4423c in ptrace (request=PTRACE_SETREGS) at ../sysdeps/unix/sysv/linux/ptrace.c:118
#27 0x0000000060027419 in userspace (regs=0xa020f428) at arch/um/os-Linux/skas/process.c:373
#28 0x0000000060015b9c in fork_handler () at arch/um/kernel/process.c:181
#29 0x0000000000000000 in ?? ()
The end of the kclient log:
[ 3268.080000] libceph: osds timeout
[ 3268.080000] libceph: __remove_old_osds 00000000a021a9f0
[ 3271.880000] libceph: monc delayed_work
[ 3271.880000] libceph: __send_subscribe sub_sent=0 exp=0 want_osd=2
[ 3271.880000] libceph: __schedule_delayed after 2000
[ 3272.280000]
[ 3272.280000] Modules linked in:
[ 3272.280000] Pid: 1214, comm: bash Not tainted 2.6.37-32890-g9651e88-dirty
[ 3272.280000] RIP: 0033:[<0000000000000000>]
[ 3272.280000] RSP: 00000000603679e8 EFLAGS: 00010246
[ 3272.280000] RAX: 0000000060369e00 RBX: 000000006037d6c0 RCX: 000000000000001c
[ 3272.280000] RDX: 0000000000000000 RSI: 000000006037d6c0 RDI: 000000006037d6c0
[ 3272.280000] RBP: 0000000060367a10 R08: 00000000a0254000 R09: 0000000010000000
[ 3272.280000] R10: 00000000a020f100 R11: 0000000000000206 R12: 000000006037d728
[ 3272.280000] R13: 000000000000000a R14: 0000000000000000 R15: 0000000010000000
[ 3272.280000] Call Trace:
[ 3272.280000] 603674e8: [<60017b5e>] segv+0x1f5/0x212
[ 3272.280000] 60367518: [<6026d0a6>] _raw_spin_unlock_irqrestore+0x18/0x1c
[ 3272.280000] 60367538: [<6002cfb7>] try_to_wake_up+0x86/0x98
[ 3272.280000] 603675c8: [<60017bda>] segv_handler+0x5f/0x65
[ 3272.280000] 603675f8: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 603676a8: [<601d76a9>] sk_reset_timer+0x17/0x23
[ 3272.280000] 603676c8: [<60211493>] tcp_send_delayed_ack+0xb4/0xb6
[ 3272.280000] 60367728: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367748: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367798: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000] 603678b8: [<601e2150>] process_backlog+0x116/0x128
[ 3272.280000] 603678e8: [<60024963>] set_signals+0x1c/0x2e
[ 3272.280000] 60367908: [<6005c43b>] rcu_qsctr_help+0x41/0x4a
[ 3272.280000] 60367918: [<60036682>] __local_bh_enable+0x48/0x83
[ 3272.280000] 60367948: [<600367cf>] __do_softirq+0x112/0x128
[ 3272.280000] 603679d8: [<6026d122>] _raw_spin_lock+0x9/0xb
[ 3272.280000] 603679e8: [<6005b7fa>] handle_edge_irq+0x5a/0x14c
[ 3272.280000] 60367a18: [<60014ff9>] do_IRQ+0x2b/0x41
[ 3272.280000] 60367a38: [<60017614>] winch+0xe/0x10
[ 3272.280000] 60367a48: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 60367a68: [<60024945>] real_alarm_handler+0x3c/0x3e
[ 3272.280000] 60367af0: [<60085e51>] fput+0x0/0x1ef
[ 3272.280000] 60367b38: [<600248f7>] unblock_signals+0x4b/0x5d
[ 3272.280000] 60367b78: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367b98: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367be8: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000]
[ 3272.280000] Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x0
[ 3272.280000] Call Trace:
[ 3272.280000] 603673e8: [<6026ac58>] panic+0xea/0x1dc
[ 3272.280000] 60367440: [<6005523e>] __module_text_address+0xd/0x53
[ 3272.280000] 60367458: [<6005528d>] is_module_text_address+0x9/0x11
[ 3272.280000] 60367468: [<600455c4>] __kernel_text_address+0x65/0x6b
[ 3272.280000] 60367470: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000] 60367488: [<6001694a>] show_trace+0x8e/0x95
[ 3272.280000] 603674b8: [<6002a140>] show_regs+0x2b/0x2f
[ 3272.280000] 603674e8: [<60017b7b>] segv_handler+0x0/0x65
[ 3272.280000] 60367518: [<6026d0a6>] _raw_spin_unlock_irqrestore+0x18/0x1c
[ 3272.280000] 60367538: [<6002cfb7>] try_to_wake_up+0x86/0x98
[ 3272.280000] 603675c8: [<60017bda>] segv_handler+0x5f/0x65
[ 3272.280000] 603675f8: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 603676a8: [<601d76a9>] sk_reset_timer+0x17/0x23
[ 3272.280000] 603676c8: [<60211493>] tcp_send_delayed_ack+0xb4/0xb6
[ 3272.280000] 60367728: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367748: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367798: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000] 603678b8: [<601e2150>] process_backlog+0x116/0x128
[ 3272.280000] 603678e8: [<60024963>] set_signals+0x1c/0x2e
[ 3272.280000] 60367908: [<6005c43b>] rcu_qsctr_help+0x41/0x4a
[ 3272.280000] 60367918: [<60036682>] __local_bh_enable+0x48/0x83
[ 3272.280000] 60367948: [<600367cf>] __do_softirq+0x112/0x128
[ 3272.280000] 603679d8: [<6026d122>] _raw_spin_lock+0x9/0xb
[ 3272.280000] 603679e8: [<6005b7fa>] handle_edge_irq+0x5a/0x14c
[ 3272.280000] 60367a18: [<60014ff9>] do_IRQ+0x2b/0x41
[ 3272.280000] 60367a38: [<60017614>] winch+0xe/0x10
[ 3272.280000] 60367a48: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 60367a68: [<60024945>] real_alarm_handler+0x3c/0x3e
[ 3272.280000] 60367af0: [<60085e51>] fput+0x0/0x1ef
[ 3272.280000] 60367b38: [<600248f7>] unblock_signals+0x4b/0x5d
[ 3272.280000] 60367b78: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367b98: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367be8: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000]
[ 3272.280000]
[ 3272.280000] Modules linked in:
[ 3272.280000] Pid: 1214, comm: bash Not tainted 2.6.37-32890-g9651e88-dirty
[ 3272.280000] RIP: 0033:[<000000000047ed50>]
[ 3272.280000] RSP: 0000007fbfb83f58 EFLAGS: 00000246
[ 3272.280000] RAX: 0000000000000000 RBX: 000000004099f6a0 RCX: ffffffffffffffff
[ 3272.280000] RDX: 0000007fbfb83f60 RSI: 0000007fbfb84090 RDI: 000000000000001c
[ 3272.280000] RBP: 0000007fbfb843b7 R08: 0000000000000000 R09: 0000000000000001
[ 3272.280000] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001
[ 3272.280000] R13: 00000000ffffffff R14: 0000007fbfb852e0 R15: 0000000000000000
[ 3272.280000] Call Trace:
[ 3272.280000] 60367378: [<60017d47>] panic_exit+0x2f/0x45
[ 3272.280000] 60367398: [<6004b172>] notifier_call_chain+0x32/0x5e
[ 3272.280000] 603673d8: [<6004b1c0>] atomic_notifier_call_chain+0x13/0x15
[ 3272.280000] 603673e8: [<6026ac73>] panic+0x105/0x1dc
[ 3272.280000] 60367440: [<6005523e>] __module_text_address+0xd/0x53
[ 3272.280000] 60367458: [<6005528d>] is_module_text_address+0x9/0x11
[ 3272.280000] 60367468: [<600455c4>] __kernel_text_address+0x65/0x6b
[ 3272.280000] 60367470: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000] 60367488: [<6001694a>] show_trace+0x8e/0x95
[ 3272.280000] 603674b8: [<6002a140>] show_regs+0x2b/0x2f
[ 3272.280000] 603674e8: [<60017b7b>] segv_handler+0x0/0x65
[ 3272.280000] 60367518: [<6026d0a6>] _raw_spin_unlock_irqrestore+0x18/0x1c
[ 3272.280000] 60367538: [<6002cfb7>] try_to_wake_up+0x86/0x98
[ 3272.280000] 603675c8: [<60017bda>] segv_handler+0x5f/0x65
[ 3272.280000] 603675f8: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 603676a8: [<601d76a9>] sk_reset_timer+0x17/0x23
[ 3272.280000] 603676c8: [<60211493>] tcp_send_delayed_ack+0xb4/0xb6
[ 3272.280000] 60367728: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367748: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367798: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000] 603678b8: [<601e2150>] process_backlog+0x116/0x128
[ 3272.280000] 603678e8: [<60024963>] set_signals+0x1c/0x2e
[ 3272.280000] 60367908: [<6005c43b>] rcu_qsctr_help+0x41/0x4a
[ 3272.280000] 60367918: [<60036682>] __local_bh_enable+0x48/0x83
[ 3272.280000] 60367948: [<600367cf>] __do_softirq+0x112/0x128
[ 3272.280000] 603679d8: [<6026d122>] _raw_spin_lock+0x9/0xb
[ 3272.280000] 603679e8: [<6005b7fa>] handle_edge_irq+0x5a/0x14c
[ 3272.280000] 60367a18: [<60014ff9>] do_IRQ+0x2b/0x41
[ 3272.280000] 60367a38: [<60017614>] winch+0xe/0x10
[ 3272.280000] 60367a48: [<60024898>] sig_handler_common+0x84/0x98
[ 3272.280000] 60367a68: [<60024945>] real_alarm_handler+0x3c/0x3e
[ 3272.280000] 60367af0: [<60085e51>] fput+0x0/0x1ef
[ 3272.280000] 60367b38: [<600248f7>] unblock_signals+0x4b/0x5d
[ 3272.280000] 60367b78: [<600249de>] sig_handler+0x30/0x3b
[ 3272.280000] 60367b98: [<60024c10>] handle_signal+0x6d/0xa3
[ 3272.280000] 60367be8: [<60026684>] hard_handler+0x10/0x14
[ 3272.280000]
History
#1 Updated by Greg Farnum about 13 years ago
- Project changed from Ceph to Linux kernel client
#2 Updated by Sage Weil about 13 years ago
if this was uml and you have a core file, gdb will give you a more useful backtrace...
#3 Updated by Greg Farnum about 13 years ago
That backtrace was via gdb. I kept the stuff around but I'm not much good at handling linux crashes so I just put it in here for now...
#4 Updated by Sage Weil about 13 years ago
Greg Farnum wrote:
That backtrace was via gdb. I kept the stuff around but I'm not much good at handling linux crashes so I just put it in here for now...
oh right.. nevermind :). frequently the console dump stops but the gdb one includes the real culprit. i don't see any ceph code at all in this case, though. let's see if it's reproducible?
#5 Updated by Sage Weil about 13 years ago
- Status changed from New to Can't reproduce