Bug #742

closed

ceph-client.git unstable, tiotest, printk "timeout, ceph: mds0 caps stale", then oops

Added by changping Wu about 13 years ago. Updated about 13 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
Category:
fs/ceph
Target version:
% Done:

0%


Description

Hi,
I cloned ceph-client.git and checked out the unstable branch,
built it on Ubuntu 10.04, ran make install, and rebooted
into the resulting linux-2.6.37+ kernel.
System info:
1. Ubuntu 10.04 server + linux-2.6.37+ (ceph-client.git unstable)
2. CPU: Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz
3. RAM: 4 GB
4. SATA disk: WDC WD3200AAKS-7 Rev: 02.0
5. One mon, one mds, two osds

6. Run the tiotest example shell script:
=======================================
#!/bin/bash

i=0
while [ $i -le 100 ]
do
    ./tiotest -f 10 -t 30 -d /mnt/ceph -T -D 20 -r 100
    let i+=1
done
exit 0
=======================================
7. Logs:

After running the tiotest loop for several minutes, the kernel prints
"timeout, will reset osd", then "ceph: mds0 caps stale", then an oops.
The problem may be an invalid spinlock taken in __cap_is_valid(struct ceph_cap *cap).

Detailed logs are attached.


[ 1475.050194] libceph: tid 57429 timed out on osd1, will reset osd
[ 1548.990035] ceph: mds0 caps stale
[ 1558.349986] INFO: rcu_sched_state detected stall on CPU 1 (t=6500 jiffies)


[ 1346.330300] libceph: tid 50475 timed out on osd0, will reset osd
[ 1366.400033] handle_timeout:timeout
[ 1366.403740] libceph: tid 49617 timed out on osd0, will reset osd
[ 1412.640037] handle_timeout:timeout
[ 1434.920041] handle_timeout:timeout
[ 1454.960029] handle_timeout:timeout
[ 1454.963664] libceph: tid 55571 timed out on osd1, will reset osd
[ 1454.970235] libceph: tid 55675 timed out on osd0, will reset osd
[ 1475.040045] handle_timeout:timeout
[ 1475.043719] libceph: tid 57408 timed out on osd0, will reset osd
[ 1475.050194] libceph: tid 57429 timed out on osd1, will reset osd
[ 1548.990035] ceph: mds0 caps stale
[ 1558.349986] INFO: rcu_sched_state detected stall on CPU 1 (t=6500 jiffies)
[ 1558.350008] sending NMI to all CPUs:
[ 1558.350008] NMI backtrace for cpu 1
[ 1558.350008] CPU 1
[ 1558.350008] Modules linked in: ceph libceph crc32c libcrc32c fbcon tileblit font bitblit softcursor i915 snd_hda_codec_analog drm_kms_helper drm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd i2c_algo_bit i2c_core ppdev psmouse intel_agp video soundcore lp intel_gtt snd_page_alloc parport_pc shpchp output dell_wmi parport dcdbas serio_raw sparse_keymap usbhid r8169 mii hid ata_piix [last unloaded: libceph]
[ 1558.350008]
[ 1558.350008] Pid: 2783, comm: cosd Not tainted 2.6.37+ #1 0V4W66/OptiPlex 780
[ 1558.350008] RIP: 0010:[<ffffffff812b76a7>] [<ffffffff812b76a7>] delay_tsc+0x27/0x80
[ 1558.350008] RSP: 0018:ffff8800dc403cd8 EFLAGS: 00000006
[ 1558.350008] RAX: 000003cf49baea28 RBX: 0000000000000001 RCX: 0000000049baea28
[ 1558.350008] RDX: 00000000000003cf RSI: 0000000000000000 RDI: 00000000002896a5
[ 1558.350008] RBP: ffff8800dc403cf8 R08: 0000000000000000 R09: 0000000000000001
[ 1558.350008] R10: ffffffff8163b960 R11: 0000000000000001 R12: ffffffff81a21b80
[ 1558.350008] R13: 0000000000000001 R14: 00000000002896a5 R15: ffffffff8109c120
[ 1558.350008] FS: 00007f28daace700(0000) GS:ffff8800dc400000(0000) knlGS:0000000000000000
[ 1558.350008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1558.350008] CR2: 000000000f153000 CR3: 000000010ba17000 CR4: 00000000000406e0
[ 1558.350008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1558.350008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1558.350008] Process cosd (pid: 2783, threadinfo ffff8801118de000, task ffff88010b598000)
[ 1558.350008] Stack:
[ 1558.350008] 0000000000000001 ffffffff81a21b80 ffffffff81a21b80 ffff8800dc403e88
[ 1558.350008] ffff8800dc403d08 ffffffff812b7677 ffff8800dc403d28 ffffffff8102e895
[ 1558.350008] 0000000000000000 ffff8800dc5d0360 ffff8800dc403d78 ffffffff810e2de9
[ 1558.350008] Call Trace:
[ 1558.350008] <IRQ>
[ 1558.350008] [<ffffffff812b7677>] __const_udelay+0x47/0x50
[ 1558.350008] [<ffffffff8102e895>] arch_trigger_all_cpu_backtrace+0x55/0x70
[ 1558.350008] [<ffffffff810e2de9>] __rcu_pending+0x169/0x3b0
[ 1558.350008] [<ffffffff8109c120>] ? tick_sched_timer+0x0/0xc0
[ 1558.350008] [<ffffffff810e309c>] rcu_check_callbacks+0x6c/0x120
[ 1558.350008] [<ffffffff81076998>] update_process_times+0x48/0x90
[ 1558.350008] [<ffffffff8109c186>] tick_sched_timer+0x66/0xc0
[ 1558.350008] [<ffffffff8108f5d3>] __run_hrtimer+0x93/0x200
[ 1558.350008] [<ffffffff8108f9de>] hrtimer_interrupt+0xde/0x250
[ 1558.350008] [<ffffffff81091ebf>] ? local_clock+0x6f/0x80
[ 1558.350008] [<ffffffff810351d7>] hpet_interrupt_handler+0x17/0x40
[ 1558.350008] [<ffffffff810dcf05>] handle_IRQ_event+0x55/0x190
[ 1558.350008] [<ffffffff812c94fe>] ? do_raw_spin_unlock+0x5e/0xb0
[ 1558.350008] [<ffffffff810df636>] handle_edge_irq+0xd6/0x180
[ 1558.350008] [<ffffffff8100eef9>] handle_irq+0x49/0xa0
[ 1558.350008] [<ffffffff81595948>] do_IRQ+0x68/0xf0
[ 1558.350008] [<ffffffff8158db53>] ret_from_intr+0x0/0x16
[ 1558.350008] <EOI>
[ 1558.350008] [<ffffffff810a4a9c>] ? lock_acquire+0xcc/0x150
[ 1558.350008] [<ffffffffa03ecfb0>] ? __cap_is_valid+0x30/0xf0 [ceph]
[ 1558.350008] [<ffffffff8158cca6>] _raw_spin_lock+0x36/0x70
[ 1558.350008] [<ffffffffa03ecfb0>] ? __cap_is_valid+0x30/0xf0 [ceph]
[ 1558.350008] [<ffffffffa03ecfb0>] __cap_is_valid+0x30/0xf0 [ceph]
[ 1558.350008] [<ffffffffa03edcdc>] __ceph_caps_issued+0x5c/0x100 [ceph]
[ 1558.350008] [<ffffffffa03f05e6>] ceph_check_caps+0x146/0xc60 [ceph]
[ 1558.350008] [<ffffffff81013ae3>] ? native_sched_clock+0x13/0x60
[ 1558.350008] [<ffffffff81013949>] ? sched_clock+0x9/0x10
[ 1558.350008] [<ffffffff81013ae3>] ? native_sched_clock+0x13/0x60
[ 1558.350008] [<ffffffff81013949>] ? sched_clock+0x9/0x10
[ 1558.350008] [<ffffffff81091cd5>] ? sched_clock_local+0x25/0x90
[ 1558.350008] [<ffffffffa03f12d2>] ceph_flush_dirty_caps+0xd2/0x240 [ceph]
[ 1558.350008] [<ffffffff81197dc0>] ? sync_one_sb+0x0/0x30
[ 1558.350008] [<ffffffffa03dbb6f>] ceph_sync_fs+0x2f/0x160 [ceph]
[ 1558.350008] [<ffffffff81197d83>] __sync_filesystem+0x63/0xa0
[ 1558.350008] [<ffffffff81197de0>] sync_one_sb+0x20/0x30
[ 1558.350008] [<ffffffff8116f4b7>] iterate_supers+0x77/0xe0
[ 1558.350008] [<ffffffff81197e1f>] sys_sync+0x2f/0x70
[ 1558.350008] [<ffffffff8100c082>] system_call_fastpath+0x16/0x1b
[ 1558.350008] Code: 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 65 44 8b 2c 25 40 dc 00 00 49 89 fe 66 66 90 0f ae e8 e8 c9 ce d5 ff <66> 90 4c 63 e0 eb 11 66 90 f3 90 65 8b 1c 25 40 dc 00 00 44 39
[ 1558.350008] Call Trace:


Files

ceph-client.git_tiotest_logs (90 KB) ceph-client.git_tiotest_logs changping Wu, 01/25/2011 11:47 PM
mds_caps_stale_logs (14.2 KB) mds_caps_stale_logs changping Wu, 01/28/2011 12:27 AM
ceph.conf (2.38 KB) ceph.conf changping Wu, 02/09/2011 07:55 PM
filetype (490 Bytes) filetype changping Wu, 02/09/2011 07:55 PM
tiotest_log_20110210 (113 KB) tiotest_log_20110210 changping Wu, 02/09/2011 07:55 PM
cephs (434 Bytes) cephs ceph -s changping Wu, 02/09/2011 07:55 PM
Actions #1

Updated by changping Wu about 13 years ago

ceph server : ceph-0.24.2

Actions #2

Updated by changping Wu about 13 years ago

The ceph server and ceph client run on the same machine.

Actions #3

Updated by changping Wu about 13 years ago

If the kernel prints "ceph: mds0 caps renewed", the system hangs.

[ 3495.283726] libceph: tid 139576 timed out on osd0, will reset osd
[ 3495.290633] libceph: tid 139013 timed out on osd1, will reset osd
[ 3537.200037] handle_timeout:timeout
[ 3565.738111] ceph: mds0 caps went stale, renewing
[ 3565.744525] ceph: mds0 caps renewed

Actions #4

Updated by Sage Weil about 13 years ago

  • Project changed from Ceph to Linux kernel client
Actions #5

Updated by Sage Weil about 13 years ago

  • Category set to fs/ceph
  • Assignee set to Samuel Just
  • Priority changed from Normal to High
  • Target version set to v2.6.38
Actions #6

Updated by changping Wu about 13 years ago

With ceph-client-standalone.git master-backport:

If the kernel prints "caps stale", cosd blocks.
The mon, osd, and mds are on the same machine.

Detailed logs attached.
----------------------------------------------
[ 1881.040047] ceph: mds0 caps stale
[ 1896.040160] ceph: mds0 caps stale
[ 2047.320039] INFO: task cosd:1200 blocked for more than 120 seconds.
[ 2047.327036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2047.335657] cosd D 0000000100025763 4688 1200 1 0x00000000
[ 2047.343325] ffff88010c403e08 0000000000000046 ffff88010c403d78 0000000000000246
[ 2047.351533] 00000000001d4140 ffff88010ca6c620 ffff88010ca6c9b8 ffff88010c403fd8
[ 2047.359726] ffff88010ca6c9c0 00000000001d4140 ffff88010c402010 00000000001d4140
[ 2047.367979] Call Trace:

Actions #7

Updated by changping Wu about 13 years ago

ceph-client unstable:
In make menuconfig, under "Kernel hacking", I selected:

[*] RCU debugging: sparse-based checks for pointer usage
<*> torture tests for RCU
[*] Check for stalled CPUs delaying RCU grace periods

Actions #8

Updated by Samuel Just about 13 years ago

Were the osds running on top of btrfs? Also, could you post the ceph.conf?

Actions #9

Updated by changping Wu about 13 years ago

Hi Samuel,

The osds are running on ext4.

The steps:
1.
git clone ceph-client, check out unstable (commit 9c01177349b435186025a088f612a6f5ce2f3de9)

make menuconfig
make -j8
make modules_install
make install
reboot

modprobe libcrc32c
modprobe libceph
modprobe ceph

init ceph server
mount.ceph 172.16.10.68:6789:/ /mnt/ceph -o mount_timeout=10
Run the tiotest example shell script:
=======================================
#!/bin/bash

i=0
while [ $i -le 100 ]
do
    ./tiotest -f 10 -t 30 -d /mnt/ceph -T -D 20 -r 100
    rm -rf /mnt/ceph/*
    let i+=1
done
exit 0
========================================

printk:

[ 733.850027] handle_timeout:timeout
[ 738.860033] handle_timeout:timeout
[ 743.870029] handle_timeout:timeout
[ 748.880028] handle_timeout:timeout
[ 814.960028] ceph: mds0 caps stale
[ 816.820013] BUG: soft lockup - CPU#2 stuck for 67s! [cosd:1292]
[ 816.826247] Modules linked in: ceph libceph crc32c libcrc32c fbcon tileblit font bitblit softcursor i915 snd_hda_codec_analog drm_kms_helper drm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd i2c_algo_bit i2c_core dell_wmi ppdev shpchp intel_agp lp video parport_pc sparse_keymap soundcore intel_gtt snd_page_alloc parport serio_raw output r8169 mii
[ 816.840033] CPU 2
[ 816.840033] Modules linked in: ceph libceph crc32c libcrc32c fbcon tileblit font bitblit softcursor i915 snd_hda_codec_analog drm_kms_helper drm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd i2c_algo_bit i2c_core dell_wmi ppdev shpchp intel_agp lp video parport_pc sparse_keymap soundcore intel_gtt snd_page_alloc parport serio_raw output r8169 mii
[ 816.840033]
[ 816.840033] Pid: 1292, comm: cosd Not tainted 2.6.37+ #8 0V4W66/OptiPlex 780
[ 816.840033] RIP: 0010:[<ffffffff812a55ad>] [<ffffffff812a55ad>] rb_first+0x1d/0x30
[ 816.840033] RSP: 0018:ffff88010b3b7cb0 EFLAGS: 00000246
[ 816.840033] RAX: ffff8800d4883328 RBX: ffff88010b3b7d08 RCX: 0000000000000000
[ 816.840033] RDX: 0000000000000000 RSI: ffff88010b3b7e34 RDI: ffff88010e0a4128
[ 816.840033] RBP: ffffffff815adf0e R08: 0000000000000000 R09: 0000000000000000
[ 816.840033] R10: 0000000000000000 R11: 0000000000000000 R12: 000004d28108fb15
[ 816.840033] R13: ffff880111a62348 R14: ffff880111a62a08 R15: 0000000000000000
[ 816.840033] FS: 00007f2e01d0d700(0000) GS:ffff8800dc600000(0000) knlGS:0000000000000000
[ 816.840033] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 816.840033] CR2: 00007f6d1be15000 CR3: 000000010b57e000 CR4: 00000000000406e0
[ 816.840033] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 816.840033] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 816.840033] Process cosd (pid: 1292, threadinfo ffff88010b3b6000, task ffff880111a62310)
[ 816.840033] Stack:
[ 816.840033] ffffffffa0256c41 ffff88010e0a4518 ffff88010e0a3fa8 ffff88010e0a4440
[ 816.840033] ffff88010c5b6fe8 ffff88010b3b7d08 0000000000000000 0000000000000000
[ 816.840033] ffff88010e0a3fa8 ffff88010e0a4440 00000000ffffffff ffff88010b3b7e68
[ 816.840033] Call Trace:
[ 816.840033] [<ffffffffa0256c41>] ? __ceph_caps_issued+0x41/0x100 [ceph]
[ 816.840033] [<ffffffffa0259566>] ? ceph_check_caps+0x146/0xc60 [ceph]
[ 816.840033] [<ffffffff81013a73>] ? native_sched_clock+0x13/0x60
[ 816.840033] [<ffffffff81013909>] ? sched_clock+0x9/0x10
[ 816.840033] [<ffffffff81013a73>] ? native_sched_clock+0x13/0x60
[ 816.840033] [<ffffffff81013909>] ? sched_clock+0x9/0x10
[ 816.840033] [<ffffffff8108fb15>] ? sched_clock_local+0x25/0x90
[ 816.840033] [<ffffffffa025a252>] ? ceph_flush_dirty_caps+0xd2/0x240 [ceph]
[ 816.840033] [<ffffffff8118d260>] ? sync_one_sb+0x0/0x30
[ 816.840033] [<ffffffffa0244b5f>] ? ceph_sync_fs+0x2f/0x160 [ceph]
[ 816.840033] [<ffffffff8118d223>] ? __sync_filesystem+0x63/0xa0
[ 816.840033] [<ffffffff8118d280>] ? sync_one_sb+0x20/0x30
[ 816.840033] [<ffffffff811656e7>] ? iterate_supers+0x77/0xe0
[ 816.840033] [<ffffffff8118d2bf>] ? sys_sync+0x2f/0x70
[ 816.840033] [<ffffffff8100c0c2>] ? system


Logs and ceph.conf are attached.

Actions #10

Updated by changping Wu about 13 years ago

ceph-client unstable:
In make menuconfig, under "Kernel hacking", I deselected the RCU debugging options:

[ ] RCU debugging: sparse-based checks for pointer usage
< > torture tests for RCU
[ ] Check for stalled CPUs delaying RCU grace periods

Actions #11

Updated by Greg Farnum about 13 years ago

  • Status changed from New to Won't Fix

This is an unfortunate result of running the kernel client on the same machine as the OSD:
The cosd process needs to run syncs to guarantee consistency. Unfortunately, there's no VFS interface to run a sync on a specific filesystem. On btrfs, it can use a special ioctl, but on other filesystems it needs to call "sync", which syncs every filesystem on the machine -- including the kernel client one!
This leads to deadlock, and can't be worked around. (Although we are trying on-and-off to get either an ioctl or a syscall merged into Linux.)
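The distinction described above can be sketched in C. Here `flush_data_dir` and its `is_btrfs` flag are hypothetical names for illustration only; `BTRFS_IOC_SYNC` is the real btrfs ioctl from linux/btrfs.h, and sync(2) is the machine-wide fallback:

```c
/* Sketch of the sync problem: on btrfs, the osd daemon can flush just
 * its own data filesystem with an ioctl, but on other filesystems it
 * must fall back to sync(2), which flushes every mounted filesystem --
 * including a ceph kernel-client mount on the same machine. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef BTRFS_IOC_SYNC
#define BTRFS_IOC_SYNC _IO(0x94, 8)   /* from linux/btrfs.h */
#endif

/* Hypothetical helper: flush the filesystem holding `path`. */
int flush_data_dir(const char *path, int is_btrfs)
{
    if (is_btrfs) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        int r = ioctl(fd, BTRFS_IOC_SYNC);  /* syncs only this filesystem */
        close(fd);
        return r;
    }
    sync();  /* syncs ALL mounted filesystems on the machine */
    return 0;
}
```

With the client and OSD on separate machines, the global sync() is merely wasteful; on the same machine it can block behind the ceph mount's own dirty data, producing the deadlock.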
