Bug #471

closed

NULL pointer dereference __list_add+0x42/0x89 kick_requests+0x24/0x9e

Added by Sage Weil over 13 years ago. Updated over 13 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

On commit:0d328c1

[94880.387538] ceph: osd15 10.3.14.142:6800 socket closed
[94880.392791] INFO: trying to register non-static key.
[94880.396785] the code is fine but needs lockdep annotation.
[94880.396785] turning off the locking correctness validator.
[94880.396785] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61
[94880.396785] Call Trace:
[94880.396785]  [<ffffffff8105dd50>] ? static_obj+0x43/0x53
[94880.396785]  [<ffffffff8106234c>] __lock_acquire+0x852/0x87a
[94880.396785]  [<ffffffff810623fc>] lock_acquire+0x88/0xa5
[94880.396785]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffff814b5382>] down_read+0x47/0x8d
[94880.396785]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffffa002bf1b>] osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.396785]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.396785]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.396785]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.396785]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.396785]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.396785]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.396785]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.396785]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.396785]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.396785]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.396785]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.396785]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.535663] BUG: unable to handle kernel NULL pointer dereference at (null)
[94880.539585] IP: [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] PGD 11d119067 PUD 11cbc0067 PMD 0 
[94880.539585] Oops: 0000 [#1] PREEMPT SMP 
[94880.539585] last sysfs file: /sys/kernel/uevent_seqnum
[94880.539585] CPU 0 
[94880.539585] Modules linked in: ceph
[94880.539585] 
[94880.539585] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61 PDSMi+/PDSMi
[94880.539585] RIP: 0010:[<ffffffff8126ffbd>]  [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] RSP: 0018:ffff88011faddc40  EFLAGS: 00010046
[94880.539585] RAX: 0000000000000000 RBX: ffff88011e1be958 RCX: 0000000000000000
[94880.539585] RDX: ffff88011e1be958 RSI: 0000000000000000 RDI: ffff88011faddc90
[94880.539585] RBP: ffff88011faddc60 R08: 0000000000000000 R09: ffff88011faddc90
[94880.539585] R10: ffffffff81056048 R11: ffffffff81055a87 R12: 0000000000000000
[94880.539585] R13: ffff88011faddc90 R14: ffff88011fada280 R15: ffffffffa002be61
[94880.539585] FS:  0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[94880.539585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94880.539585] CR2: 0000000000000000 CR3: 000000011dcd4000 CR4: 00000000000006f0
[94880.539585] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94880.539585] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94880.539585] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280)
[94880.539585] Stack:
[94880.539585]  ffff88011e1be910 00000000ffffffff ffff88011e1be910 0000000000000202
[94880.539585] <0> ffff88011faddce0 ffffffff814b479f ffffffffa002be61 0000000000000000
[94880.539585] <0> ffff88011faddce0 0000000000000246 ffff88011faddc90 ffff88011faddc90
[94880.539585] Call Trace:
[94880.539585]  [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e
[94880.539585]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.539585]  [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]
[94880.539585]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.539585]  [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]
[94880.539585]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.539585]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.539585]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.539585]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.539585]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.539585]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.539585]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.539585]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.539585]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.539585]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.539585]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.539585]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.539585]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.539585] Code: 8b 42 08 48 39 f0 74 23 49 89 d1 49 89 c0 48 89 f1 48 c7 c2 06 d6 64 81 be 1a 00 00 00 48 c7 c7 51 d5 64 81 31 c0 e8 fc a5 dc ff <49> 8b 04 24 48 39 d8 74 23 49 89 c0 4d 89 e1 48 89 d9 48 c7 c2 
[94880.539585] RIP  [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585]  RSP <ffff88011faddc40>
[94880.539585] CR2: 0000000000000000
[94880.539585] ---[ end trace 555371ce86832624 ]---

This was on ceph1.
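
For triage, the oops above can be reduced to its reliable frames. In kernel call traces of this vintage, a `?` before the symbol marks an address that was found on the stack but that the unwinder could not confirm as part of the call chain. The following is a minimal sketch, not part of the original report, that filters those entries out; the regex, the `reliable_frames` helper, and the `sample` list (copied from the trace above) are illustrative assumptions, not an existing tool.

```python
import re

# Matches one kernel call-trace frame, for example:
#   [94880.539585]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
# The regex is an assumption based on the trace format shown in this report.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(lines):
    """Return (symbol, offset, module) for frames not marked with '?'."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m and not m.group("unreliable"):
            frames.append((m.group("symbol"), m.group("offset"), m.group("module")))
    return frames


# First six frames of the second call trace in this report.
sample = [
    "[94880.539585]  [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e",
    "[94880.539585]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]",
    "[94880.539585]  [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]",
    "[94880.539585]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]",
    "[94880.539585]  [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]",
    "[94880.539585]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]",
]

frames = reliable_frames(sample)
for symbol, offset, module in frames:
    # Print each confirmed frame, with its module when one is present.
    print(f"{symbol}+{offset}" + (f" [{module}]" if module else ""))
```

On this sample the filter keeps `mutex_lock_nested`, `kick_requests`, `osd_reset`, and `con_work`, which matches the path named in the bug title (`kick_requests` taking a mutex, with the fault inside `__list_add`).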

#1

Updated by Sage Weil over 13 years ago

Here's the full dmesg, fwiw:


ceph1 login: [  383.024127] ceph: version magic '2.6.36-rc3+ SMP preempt mod_unload ' should be '2.6.36-rc7+ SMP preempt mod_unload '
[  425.208545] ceph: version magic '2.6.36-rc3+ SMP preempt mod_unload ' should be '2.6.36-rc7+ SMP preempt mod_unload '
[ 2280.788691] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
[ 2284.412794] ceph: mon0 10.3.14.136:6789 connection failed
[ 2294.439930] ceph: mon0 10.3.14.136:6789 connection failed
[ 2304.454969] ceph: mon0 10.3.14.136:6789 connection failed
[ 2314.470179] ceph: mon0 10.3.14.136:6789 connection failed
[ 2324.485309] ceph: mon0 10.3.14.136:6789 connection failed
[ 2334.500533] ceph: mon0 10.3.14.136:6789 connection failed
[ 2361.694189] ceph: mon0 10.3.14.136:6789 connection failed
[ 2371.713385] ceph: mon0 10.3.14.136:6789 connection failed
[ 2381.728586] ceph: mon0 10.3.14.136:6789 connection failed
[ 2391.743728] ceph: mon0 10.3.14.136:6789 connection failed
[ 2401.758922] ceph: mon0 10.3.14.136:6789 connection failed
[ 2402.321279] ceph: client4225 fsid e67f3214-bb55-4590-5bd2-92b0ea895f7e
[ 2402.328651] ceph: mon0 10.3.14.154:6789 session established
[ 2411.778001] ceph: mon0 10.3.14.136:6789 connection failed
[ 2448.375277] ceph: osd20 10.3.14.147:6800 socket closed
[ 2448.380917] ceph: osd20 10.3.14.147:6800 connection failed
[ 2449.070804] ceph: osd20 10.3.14.147:6800 connection failed
[ 2450.070688] ceph: osd20 10.3.14.147:6800 connection failed
[ 2452.074453] ceph: osd20 10.3.14.147:6800 connection failed
[ 2456.078101] ceph: osd20 10.3.14.147:6800 connection failed
[ 2464.093460] ceph: osd20 10.3.14.147:6800 connection failed
[ 2469.908987] ceph: osd20 down
[ 2472.203746] ceph: get_reply unknown tid 328 from osd5
[ 2472.209032] ceph: get_reply unknown tid 497 from osd5
[ 2472.220570] ceph: get_reply unknown tid 1118 from osd15
[ 2472.226126] ceph: get_reply unknown tid 742 from osd15
[ 2472.231558] ceph: get_reply unknown tid 502 from osd16
[ 2472.237009] ceph: get_reply unknown tid 246 from osd3
[ 2472.363685] ceph: get_reply unknown tid 554 from osd13
[ 2472.369091] ceph: get_reply unknown tid 966 from osd13
[ 2472.374525] ceph: get_reply unknown tid 572 from osd13
[ 2472.380024] ceph: get_reply unknown tid 692 from osd18
[ 2472.385746] ceph: get_reply unknown tid 112 from osd8
[ 2472.391192] ceph: get_reply unknown tid 122 from osd8
[ 2472.396548] ceph: get_reply unknown tid 258 from osd8
[ 2472.402153] ceph: get_reply unknown tid 981 from osd2
[ 2472.407479] ceph: get_reply unknown tid 93 from osd2
[ 2472.412676] ceph: get_reply unknown tid 845 from osd2
[ 2472.418146] ceph: get_reply unknown tid 172 from osd17
[ 2472.423475] ceph: get_reply unknown tid 754 from osd13
[ 2472.428845] ceph: get_reply unknown tid 576 from osd22
[ 2472.434282] ceph: get_reply unknown tid 954 from osd6
[ 2472.440099] ceph: get_reply unknown tid 207 from osd24
[ 2472.445565] ceph: get_reply unknown tid 414 from osd24
[ 2472.450937] ceph: get_reply unknown tid 338 from osd24
[ 2472.456534] ceph: get_reply unknown tid 174 from osd24
[ 2472.461801] ceph: get_reply unknown tid 230 from osd24
[ 2472.467114] ceph: get_reply unknown tid 557 from osd24
[ 2472.472553] ceph: get_reply unknown tid 135 from osd14
[ 2472.477921] ceph: get_reply unknown tid 472 from osd14
[ 2472.483250] ceph: get_reply unknown tid 614 from osd24
[ 2472.492541] ceph: get_reply unknown tid 237 from osd24
[ 2472.539801] ceph: get_reply unknown tid 1095 from osd21
[ 2472.545333] ceph: get_reply unknown tid 400 from osd21
[ 2472.555431] ceph: get_reply unknown tid 684 from osd21
[ 2472.560810] ceph: get_reply unknown tid 87 from osd21
[ 2472.666698] ceph: get_reply unknown tid 77 from osd7
[ 2472.671852] ceph: get_reply unknown tid 403 from osd7
[ 2472.677085] ceph: get_reply unknown tid 810 from osd7
[ 2472.694503] ceph: get_reply unknown tid 1033 from osd7
[ 2472.702954] ceph: get_reply unknown tid 386 from osd7
[ 2472.946497] ceph: get_reply unknown tid 210 from osd11
[ 2472.952189] ceph: get_reply unknown tid 423 from osd11
[ 2472.957507] ceph: get_reply unknown tid 617 from osd11
[ 2473.054117] ceph: get_reply unknown tid 842 from osd7
[ 2473.059867] ceph: get_reply unknown tid 264 from osd17
[ 2473.065514] ceph: get_reply unknown tid 488 from osd14
[ 2473.071475] ceph: get_reply unknown tid 420 from osd24
[ 2473.078460] ceph: get_reply unknown tid 837 from osd15
[ 2473.085107] ceph: get_reply unknown tid 973 from osd22
[ 2473.095756] ceph: get_reply unknown tid 726 from osd17
[ 2473.102913] ceph: get_reply unknown tid 545 from osd24
[ 2473.108961] ceph: get_reply unknown tid 713 from osd22
[ 2473.114821] ceph: get_reply unknown tid 870 from osd15
[ 2473.121726] ceph: get_reply unknown tid 206 from osd17
[ 2473.128683] ceph: get_reply unknown tid 1023 from osd22
[ 2473.135645] ceph: get_reply unknown tid 368 from osd17
[ 2473.184317] ceph: get_reply unknown tid 885 from osd4
[ 2473.197399] ceph: get_reply unknown tid 805 from osd4
[ 2473.208782] ceph: get_reply unknown tid 907 from osd4
[ 2878.162458] ceph: osd20 up
[ 2878.165250] ceph: osd20 weight 0x10000 (in)
[ 3139.762356] ceph: osd2 10.3.14.129:6800 socket closed
[ 3143.441306] ceph: osd2 10.3.14.129:6800 connection failed
[ 3144.075370] ceph: wrong peer, want 10.3.14.129:6800/9093, got 0.0.0.0:6800/9845
[ 3144.082858] ceph: osd2 10.3.14.129:6800 wrong peer at address
[ 3144.693520] ceph: osd3 10.3.14.130:6800 socket closed
[ 3144.699061] ceph: osd3 10.3.14.130:6800 connection failed
[ 3145.079102] ceph: osd3 10.3.14.130:6800 connection failed
[ 3145.084848] ceph: wrong peer, want 10.3.14.129:6800/9093, got 0.0.0.0:6800/9845
[ 3145.092329] ceph: osd2 10.3.14.129:6800 wrong peer at address
[ 3146.082907] ceph: osd3 10.3.14.130:6800 connection failed
[ 3147.087104] ceph: wrong peer, want 10.3.14.129:6800/9093, got 10.3.14.129:6800/9845
[ 3147.094947] ceph: osd2 10.3.14.129:6800 wrong peer at address
[ 3147.974977] ceph: osd2 down
[ 3148.090653] ceph: osd3 10.3.14.130:6800 connection failed
[ 3149.459133] ceph: osd4 10.3.14.131:6800 socket closed
[ 3149.464967] ceph: osd4 10.3.14.131:6800 connection failed
[ 3149.539980] ceph: osd2 up
[ 3149.542668] ceph: osd2 weight 0x10000 (in)
[ 3150.090476] ceph: osd4 10.3.14.131:6800 connection failed
[ 3151.090384] ceph: osd4 10.3.14.131:6800 connection failed
[ 3151.465000] ceph: mon0 10.3.14.154:6789 socket closed
[ 3151.470174] ceph: mon0 10.3.14.154:6789 session lost, hunting for new mon
[ 3151.477376] ceph: mon0 10.3.14.154:6789 connection failed
[ 3152.106561] ceph: wrong peer, want 10.3.14.130:6800/22139, got 10.3.14.130:6800/22544
[ 3152.114597] ceph: osd3 10.3.14.130:6800 wrong peer at address
[ 3153.106147] ceph: osd4 10.3.14.131:6800 connection failed
[ 3153.662235] ceph: mon0 10.3.14.154:6789 connection failed
[ 3154.591724] ceph: osd5 10.3.14.132:6800 socket closed
[ 3154.597512] ceph: osd5 10.3.14.132:6800 connection failed
[ 3155.106006] ceph: osd5 10.3.14.132:6800 connection failed
[ 3156.105964] ceph: osd5 10.3.14.132:6800 connection failed
[ 3156.132791] ceph: mds0 10.3.14.152:6800 socket closed
[ 3157.109868] ceph: mds0 10.3.14.152:6800 connection failed
[ 3157.118140] ceph: wrong peer, want 10.3.14.131:6800/22273, got 10.3.14.131:6800/22946
[ 3157.126149] ceph: osd4 10.3.14.131:6800 wrong peer at address
[ 3158.113916] ceph: mds0 10.3.14.152:6800 connection failed
[ 3158.119656] ceph: osd5 10.3.14.132:6800 connection failed
[ 3159.395778] ceph: osd4 down
[ 3159.446358] ceph: osd6 10.3.14.133:6800 socket closed
[ 3159.452172] ceph: osd6 10.3.14.133:6800 connection failed
[ 3160.117661] ceph: osd6 10.3.14.133:6800 connection failed
[ 3160.123761] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373
[ 3160.131770] ceph: mds0 10.3.14.152:6800 wrong peer at address
[ 3160.149945] ceph: wrong peer, want 10.3.14.130:6800/22139, got 10.3.14.130:6800/22544
[ 3160.157956] ceph: osd3 10.3.14.130:6800 wrong peer at address
[ 3160.252671] ceph: osd4 up
[ 3160.255365] ceph: osd4 weight 0x10000 (in)
[ 3161.129633] ceph: osd6 10.3.14.133:6800 connection failed
[ 3161.741410] ceph: osd5 down
[ 3162.137790] ceph: wrong peer, want 10.3.14.132:6800/22380, got 10.3.14.132:6800/22987
[ 3162.145802] ceph: osd5 10.3.14.132:6800 wrong peer at address
[ 3163.137461] ceph: osd6 10.3.14.133:6800 connection failed
[ 3163.243539] ceph: osd5 up
[ 3163.246245] ceph: osd5 weight 0x10000 (in)
[ 3163.711783] ceph: mon0 10.3.14.154:6789 session established
[ 3164.145660] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373
[ 3164.153669] ceph: mds0 10.3.14.152:6800 wrong peer at address
[ 3164.269168] ceph: osd7 10.3.14.134:6800 socket closed
[ 3164.275381] ceph: osd7 10.3.14.134:6800 connection failed
[ 3165.145329] ceph: osd7 10.3.14.134:6800 connection failed
[ 3165.760649] ceph: get_reply unknown tid 18813 from osd17
[ 3165.766128] ceph: get_reply unknown tid 18813 from osd17
[ 3165.771665] ceph: get_reply unknown tid 18806 from osd12
[ 3165.777137] ceph: get_reply unknown tid 18806 from osd12
[ 3166.149228] ceph: osd7 10.3.14.134:6800 connection failed
[ 3167.159620] ceph: wrong peer, want 10.3.14.133:6800/22342, got 10.3.14.133:6800/23019
[ 3167.167645] ceph: osd6 10.3.14.133:6800 wrong peer at address
[ 3167.204915] ceph: osd3 down
[ 3167.207795] ceph: osd6 down
[ 3168.161133] ceph: osd7 10.3.14.134:6800 connection failed
[ 3168.683339] ceph: osd6 up
[ 3168.686037] ceph: osd6 weight 0x10000 (in)
[ 3169.059210] ceph: get_reply unknown tid 18680 from osd5
[ 3169.147320] ceph: get_reply unknown tid 18936 from osd5
[ 3169.152699] ceph: get_reply unknown tid 18943 from osd5
[ 3169.311656] ceph: get_reply unknown tid 18948 from osd16
[ 3169.317334] ceph: get_reply unknown tid 18762 from osd16
[ 3169.322816] ceph: get_reply unknown tid 18771 from osd16
[ 3169.328338] ceph: get_reply unknown tid 18929 from osd13
[ 3169.334035] ceph: get_reply unknown tid 18715 from osd18
[ 3169.353842] ceph: get_reply unknown tid 18655 from osd12
[ 3169.464760] ceph: get_reply unknown tid 18704 from osd23
[ 3169.470336] ceph: get_reply unknown tid 18747 from osd23
[ 3169.475884] ceph: get_reply unknown tid 18731 from osd23
[ 3169.486778] ceph: get_reply unknown tid 18883 from osd24
[ 3169.492252] ceph: get_reply unknown tid 18950 from osd24
[ 3170.934659] ceph: osd8 10.3.14.135:6800 socket closed
[ 3170.940851] ceph: osd8 10.3.14.135:6800 connection failed
[ 3172.126115] ceph: get_reply unknown tid 18755 from osd15
[ 3172.131637] ceph: get_reply unknown tid 18755 from osd15
[ 3172.137092] ceph: get_reply unknown tid 18852 from osd17
[ 3172.142629] ceph: get_reply unknown tid 18784 from osd21
[ 3172.188909] ceph: osd8 10.3.14.135:6800 connection failed
[ 3172.194959] ceph: wrong peer, want 10.3.14.134:6800/22217, got 10.3.14.134:6800/22857
[ 3172.202998] ceph: osd7 10.3.14.134:6800 wrong peer at address
[ 3172.217173] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373
[ 3172.225192] ceph: mds0 10.3.14.152:6800 wrong peer at address
[ 3172.388796] ceph: get_reply unknown tid 18854 from osd4
[ 3172.403122] ceph: get_reply unknown tid 18826 from osd20
[ 3172.408576] ceph: get_reply unknown tid 18826 from osd20
[ 3172.442808] ceph: get_reply unknown tid 18632 from osd11
[ 3172.481150] ceph: get_reply unknown tid 18836 from osd24
[ 3172.486602] ceph: get_reply unknown tid 18836 from osd24
[ 3172.594118] ceph: get_reply unknown tid 18820 from osd16
[ 3172.599570] ceph: get_reply unknown tid 18820 from osd16
[ 3173.204788] ceph: osd8 10.3.14.135:6800 connection failed
[ 3173.579857] ceph: get_reply unknown tid 18852 from osd17
[ 3173.587667] ceph: get_reply unknown tid 18784 from osd21
[ 3173.711231] ceph: get_reply unknown tid 18913 from osd4
[ 3173.814078] ceph: get_reply unknown tid 18960 from osd4
[ 3173.960048] ceph: osd7 down
[ 3174.892639] ceph: get_reply unknown tid 19009 from osd20
[ 3175.034468] ceph: osd7 up
[ 3175.037171] ceph: osd7 weight 0x10000 (in)
[ 3175.220630] ceph: osd8 10.3.14.135:6800 connection failed
[ 3176.042523] ceph: get_reply unknown tid 19061 from osd5
[ 3176.047996] ceph: get_reply unknown tid 18971 from osd19
[ 3176.870798] ceph: get_reply unknown tid 18645 from osd15
[ 3176.876274] ceph: get_reply unknown tid 18971 from osd19
[ 3177.052611] ceph: get_reply unknown tid 18481 from osd12
[ 3177.058111] ceph: get_reply unknown tid 18481 from osd12
[ 3177.063658] ceph: get_reply unknown tid 18891 from osd13
[ 3177.069208] ceph: get_reply unknown tid 18934 from osd12
[ 3177.103823] ceph: get_reply unknown tid 18508 from osd18
[ 3177.109293] ceph: get_reply unknown tid 18508 from osd18
[ 3177.121698] ceph: get_reply unknown tid 18973 from osd18
[ 3177.127164] ceph: get_reply unknown tid 18973 from osd18
[ 3177.170419] ceph: get_reply unknown tid 18505 from osd16
[ 3177.175902] ceph: get_reply unknown tid 18505 from osd16
[ 3177.181428] ceph: get_reply unknown tid 18523 from osd17
[ 3177.201721] ceph: get_reply unknown tid 18558 from osd21
[ 3177.207206] ceph: get_reply unknown tid 18558 from osd21
[ 3177.212716] ceph: get_reply unknown tid 18639 from osd18
[ 3177.218297] ceph: get_reply unknown tid 18593 from osd20
[ 3177.223753] ceph: get_reply unknown tid 18668 from osd21
[ 3177.288330] ceph: osd11 10.3.14.138:6800 socket closed
[ 3177.294879] ceph: osd11 10.3.14.138:6800 connection failed
[ 3177.374648] ceph: get_reply unknown tid 18738 from osd19
[ 3177.396960] ceph: get_reply unknown tid 18565 from osd23
[ 3177.402435] ceph: get_reply unknown tid 18565 from osd23
[ 3177.444838] ceph: get_reply unknown tid 18934 from osd12
[ 3177.810746] ceph: get_reply unknown tid 18909 from osd23
[ 3177.816276] ceph: get_reply unknown tid 18909 from osd23
[ 3177.821936] ceph: get_reply unknown tid 19071 from osd18
[ 3177.949432] ceph: get_reply unknown tid 18891 from osd13
[ 3177.955125] ceph: get_reply unknown tid 18988 from osd13
[ 3177.960637] ceph: get_reply unknown tid 19033 from osd13
[ 3178.256431] ceph: osd11 10.3.14.138:6800 connection failed
[ 3178.592003] ceph: get_reply unknown tid 18593 from osd20
[ 3178.628515] ceph: osd8 down
[ 3178.633779] ceph: get_reply unknown tid 18523 from osd17
[ 3178.668495] ceph: get_reply unknown tid 19034 from osd18
[ 3179.260371] ceph: osd11 10.3.14.138:6800 connection failed
[ 3179.268642] ceph: wrong peer, want 10.3.14.135:6800/22185, got 10.3.14.135:6800/22826
[ 3179.276651] ceph: osd8 10.3.14.135:6800 wrong peer at address
[ 3180.164656] ceph: osd8 up
[ 3180.167328] ceph: osd8 weight 0x10000 (in)
[ 3181.268554] ceph: osd11 10.3.14.138:6800 connection failed
[ 3182.527585] ceph: osd12 10.3.14.139:6800 socket closed
[ 3182.533880] ceph: osd12 10.3.14.139:6800 connection failed
[ 3183.268110] ceph: osd12 10.3.14.139:6800 connection failed
[ 3183.604580] ceph: get_reply unknown tid 18905 from osd13
[ 3183.610049] ceph: get_reply unknown tid 18905 from osd13
[ 3184.268039] ceph: osd12 10.3.14.139:6800 connection failed
[ 3185.276261] ceph: wrong peer, want 10.3.14.138:6800/22348, got 10.3.14.138:6800/22963
[ 3185.284264] ceph: osd11 10.3.14.138:6800 wrong peer at address
[ 3186.014700] ceph: get_reply unknown tid 19145 from osd16
[ 3186.020154] ceph: get_reply unknown tid 19145 from osd16
[ 3186.164913] ceph: get_reply unknown tid 19023 from osd20
[ 3186.170392] ceph: get_reply unknown tid 19023 from osd20
[ 3186.175872] ceph: get_reply unknown tid 19004 from osd17
[ 3186.233453] ceph: get_reply unknown tid 18878 from osd15
[ 3186.238924] ceph: get_reply unknown tid 18878 from osd15
[ 3186.248879] ceph: get_reply unknown tid 19035 from osd15
[ 3186.254333] ceph: get_reply unknown tid 19035 from osd15
[ 3186.266996] ceph: get_reply unknown tid 19120 from osd24
[ 3186.272472] ceph: get_reply unknown tid 19120 from osd24
[ 3186.295848] ceph: osd12 10.3.14.139:6800 connection failed
[ 3186.344478] ceph: get_reply unknown tid 18892 from osd24
[ 3186.349942] ceph: get_reply unknown tid 18892 from osd24
[ 3186.397219] ceph: osd11 down
[ 3187.818272] ceph: get_reply unknown tid 18552 from osd20
[ 3187.823743] ceph: get_reply unknown tid 18552 from osd20
[ 3187.853158] ceph: osd13 10.3.14.140:6800 socket closed
[ 3187.859582] ceph: osd13 10.3.14.140:6800 connection failed
[ 3187.907963] ceph: osd11 up
[ 3187.910733] ceph: osd11 weight 0x10000 (in)
[ 3188.235067] ceph: get_reply unknown tid 18682 from osd20
[ 3188.299712] ceph: osd13 10.3.14.140:6800 connection failed
[ 3188.680729] ceph: get_reply unknown tid 19004 from osd17
[ 3189.299687] ceph: osd13 10.3.14.140:6800 connection failed
[ 3189.633003] ceph: get_reply unknown tid 19085 from osd14
[ 3189.638462] ceph: get_reply unknown tid 19085 from osd14
[ 3189.857022] ceph: osd12 down
[ 3190.311906] ceph: wrong peer, want 10.3.14.139:6800/22593, got 10.3.14.139:6800/22982
[ 3190.319917] ceph: osd12 10.3.14.139:6800 wrong peer at address
[ 3190.650036] ceph: get_reply unknown tid 19253 from osd16
[ 3190.777185] ceph: get_reply unknown tid 19248 from osd20
[ 3191.163470] ceph: get_reply unknown tid 19242 from osd22
[ 3191.178384] ceph: get_reply unknown tid 19219 from osd15
[ 3191.183839] ceph: get_reply unknown tid 19219 from osd15
[ 3191.299295] ceph: get_reply unknown tid 19089 from osd24
[ 3191.304770] ceph: get_reply unknown tid 19089 from osd24
[ 3191.323505] ceph: osd13 10.3.14.140:6800 connection failed
[ 3191.334754] ceph: osd12 up
[ 3191.337543] ceph: osd12 weight 0x10000 (in)
[ 3192.935749] ceph: get_reply unknown tid 19358 from osd7
[ 3193.481872] ceph: get_reply unknown tid 19234 from osd6
[ 3193.487276] ceph: get_reply unknown tid 19234 from osd6
[ 3193.492692] ceph: get_reply unknown tid 19110 from osd21
[ 3193.498146] ceph: osd14 10.3.14.141:6800 socket closed
[ 3193.504446] ceph: osd14 10.3.14.141:6800 connection failed
[ 3193.718647] ceph: get_reply unknown tid 19110 from osd21
[ 3193.724231] ceph: get_reply unknown tid 19123 from osd21
[ 3193.763422] ceph: get_reply unknown tid 19123 from osd21
[ 3193.918224] ceph: get_reply unknown tid 18983 from osd6
[ 3193.923600] ceph: get_reply unknown tid 18983 from osd6
[ 3194.335305] ceph: osd14 10.3.14.141:6800 connection failed
[ 3195.335207] ceph: osd14 10.3.14.141:6800 connection failed
[ 3195.343528] ceph: wrong peer, want 10.3.14.140:6800/22385, got 10.3.14.140:6800/23252
[ 3195.351537] ceph: osd13 10.3.14.140:6800 wrong peer at address
[ 3195.379394] ceph: get_reply unknown tid 19129 from osd8
[ 3196.587484] ceph: osd13 down
[ 3197.345923] ceph: wrong peer, want 10.3.14.141:6800/22326, got 0.0.0.0:6800/22949
[ 3197.353589] ceph: osd14 10.3.14.141:6800 wrong peer at address
[ 3197.873330] ceph: get_reply unknown tid 19297 from osd20
[ 3197.878804] ceph: get_reply unknown tid 19297 from osd20
[ 3197.937427] ceph: get_reply unknown tid 19338 from osd6
[ 3197.942813] ceph: get_reply unknown tid 19338 from osd6
[ 3198.067979] ceph: osd13 up
[ 3198.070737] ceph: osd13 weight 0x10000 (in)
[ 3198.384875] ceph: osd15 10.3.14.142:6800 socket closed
[ 3198.391034] ceph: osd15 10.3.14.142:6800 connection failed
[ 3198.470045] ceph: get_reply unknown tid 19368 from osd20
[ 3198.475516] ceph: get_reply unknown tid 19368 from osd20
[ 3198.518417] ceph: get_reply unknown tid 19237 from osd7
[ 3198.523796] ceph: get_reply unknown tid 19237 from osd7
[ 3198.580715] ceph: get_reply unknown tid 19310 from osd24
[ 3198.586163] ceph: get_reply unknown tid 19310 from osd24
[ 3198.826275] ceph: get_reply unknown tid 19337 from osd6
[ 3198.831652] ceph: get_reply unknown tid 19337 from osd6
[ 3199.291090] ceph: get_reply unknown tid 19129 from osd8
[ 3199.374947] ceph: osd15 10.3.14.142:6800 connection failed
[ 3199.606810] ceph: osd14 down
[ 3200.374872] ceph: osd15 10.3.14.142:6800 connection failed
[ 3201.132985] ceph: osd14 up
[ 3201.135767] ceph: osd14 weight 0x10000 (in)
[ 3201.714212] ceph: get_reply unknown tid 19359 from osd12
[ 3202.027017] ceph: get_reply unknown tid 18793 from osd4
[ 3202.032391] ceph: get_reply unknown tid 18793 from osd4
[ 3202.496914] ceph: wrong peer, want 10.3.14.142:6800/22384, got 0.0.0.0:6800/22993
[ 3202.504581] ceph: osd15 10.3.14.142:6800 wrong peer at address
[ 3203.211860] ceph: osd16 10.3.14.143:6800 socket closed
[ 3203.217918] ceph: osd16 10.3.14.143:6800 connection failed
[ 3203.646578] ceph: get_reply unknown tid 19170 from osd4
[ 3203.651954] ceph: get_reply unknown tid 19170 from osd4
[ 3204.390681] ceph: osd16 10.3.14.143:6800 connection failed
[ 3205.386538] ceph: osd16 10.3.14.143:6800 connection failed
[ 3205.399425] ceph: get_reply unknown tid 19359 from osd12
[ 3205.692298] ceph: get_reply unknown tid 19152 from osd8
[ 3205.697666] ceph: get_reply unknown tid 19152 from osd8
[ 3205.734542] ceph: get_reply unknown tid 19274 from osd8
[ 3205.823650] ceph: get_reply unknown tid 19274 from osd8
[ 3205.829131] ceph: get_reply unknown tid 19430 from osd8
[ 3205.834514] ceph: osd15 down
[ 3206.205480] ceph: get_reply unknown tid 18954 from osd6
[ 3206.210859] ceph: get_reply unknown tid 18954 from osd6
[ 3206.301775] ceph: get_reply unknown tid 19350 from osd24
[ 3206.307248] ceph: get_reply unknown tid 19350 from osd24
[ 3206.318953] ceph: get_reply unknown tid 19100 from osd23
[ 3206.324417] ceph: get_reply unknown tid 19100 from osd23
[ 3206.410734] ceph: wrong peer, want 10.3.14.142:6800/22384, got 10.3.14.142:6800/22993
[ 3206.418755] ceph: osd15 10.3.14.142:6800 wrong peer at address
[ 3206.511693] ceph: get_reply unknown tid 19147 from osd22
[ 3206.517163] ceph: get_reply unknown tid 19147 from osd22
[ 3206.534153] ceph: get_reply unknown tid 19161 from osd22
[ 3206.539626] ceph: get_reply unknown tid 19161 from osd22
[ 3207.418339] ceph: osd16 10.3.14.143:6800 connection failed
[ 3207.460209] ceph: osd15 up
[ 3207.462987] ceph: osd15 weight 0x10000 (in)
[ 3208.327275] ceph: get_reply unknown tid 19326 from osd12
[ 3208.332745] ceph: get_reply unknown tid 19326 from osd12
[ 3208.634776] ceph: osd17 10.3.14.144:6800 socket closed
[ 3208.759689] ceph: get_reply unknown tid 19430 from osd8
[ 3209.751667] ceph: osd16 down
[ 3211.430451] ceph: wrong peer, want 10.3.14.143:6800/22467, got 10.3.14.143:6800/25642
[ 3211.438461] ceph: osd16 10.3.14.143:6800 wrong peer at address
[ 3212.112650] ceph: osd16 up
[ 3212.115438] ceph: osd16 weight 0x10000 (in)
[ 3212.263684] ceph: get_reply unknown tid 19250 from osd11
[ 3212.269138] ceph: get_reply unknown tid 19250 from osd11
[ 3212.280207] ceph: get_reply unknown tid 19007 from osd11
[ 3212.287600] ceph: get_reply unknown tid 19188 from osd11
[ 3212.295288] ceph: get_reply unknown tid 19294 from osd11
[ 3212.300745] ceph: get_reply unknown tid 19007 from osd11
[ 3212.306204] ceph: get_reply unknown tid 19188 from osd11
[ 3212.311654] ceph: get_reply unknown tid 19294 from osd11
[ 3212.484279] ceph: get_reply unknown tid 19197 from osd11
[ 3212.489742] ceph: get_reply unknown tid 19436 from osd11
[ 3212.987531] ceph: get_reply unknown tid 18556 from osd2
[ 3213.005805] ceph: get_reply unknown tid 18628 from osd2
[ 3213.036407] ceph: get_reply unknown tid 18823 from osd2
[ 3213.051125] ceph: get_reply unknown tid 18877 from osd2
[ 3213.056493] ceph: get_reply unknown tid 18877 from osd2
[ 3213.067449] ceph: get_reply unknown tid 18967 from osd2
[ 3213.073694] ceph: get_reply unknown tid 19243 from osd2
[ 3213.079883] ceph: get_reply unknown tid 18556 from osd2
[ 3213.088118] ceph: get_reply unknown tid 18628 from osd2
[ 3213.093493] ceph: get_reply unknown tid 18823 from osd2
[ 3213.098850] ceph: get_reply unknown tid 18967 from osd2
[ 3213.107295] ceph: get_reply unknown tid 19243 from osd2
[ 3213.112667] ceph: get_reply unknown tid 18701 from osd2
[ 3213.118044] ceph: get_reply unknown tid 19251 from osd2
[ 3213.124113] ceph: get_reply unknown tid 18701 from osd2
[ 3213.129489] ceph: get_reply unknown tid 19251 from osd2
[ 3213.134877] ceph: get_reply unknown tid 19241 from osd2
[ 3213.140260] ceph: get_reply unknown tid 19241 from osd2
[ 3213.474689] ceph: mds0 caps stale
[ 3213.654256] ceph: get_reply unknown tid 19436 from osd11
[ 3213.921952] ceph: get_reply unknown tid 18599 from osd2
[ 3214.186854] ceph: osd18 10.3.14.145:6800 socket closed
[ 3214.193756] ceph: get_reply unknown tid 19318 from osd11
[ 3214.473601] ceph: get_reply unknown tid 18727 from osd2
[ 3214.480941] ceph: get_reply unknown tid 18834 from osd2
[ 3214.486642] ceph: get_reply unknown tid 18910 from osd2
[ 3214.492000] ceph: get_reply unknown tid 18599 from osd2
[ 3214.544378] ceph: get_reply unknown tid 18727 from osd2
[ 3214.552980] ceph: get_reply unknown tid 18834 from osd2
[ 3214.558347] ceph: get_reply unknown tid 18910 from osd2
[ 3214.729093] ceph: get_reply unknown tid 18995 from osd2
[ 3214.734479] ceph: get_reply unknown tid 18995 from osd2
[ 3214.752705] ceph: get_reply unknown tid 19182 from osd11
[ 3214.758182] ceph: get_reply unknown tid 19182 from osd11
[ 3215.013843] ceph: get_reply unknown tid 18964 from osd7
[ 3215.139540] ceph: get_reply unknown tid 19197 from osd11
[ 3215.662903] ceph: get_reply unknown tid 19376 from osd4
[ 3215.672091] ceph: get_reply unknown tid 19376 from osd4
[ 3215.820548] ceph: get_reply unknown tid 19264 from osd4
[ 3215.825924] ceph: get_reply unknown tid 19264 from osd4
[ 3215.929423] ceph: get_reply unknown tid 19318 from osd11
[ 3216.482094] ceph: get_reply unknown tid 19288 from osd2
[ 3216.813458] ceph: get_reply unknown tid 18964 from osd7
[ 3216.820804] ceph: get_reply unknown tid 19358 from osd7
[ 3216.841540] ceph: get_reply unknown tid 19288 from osd2
[ 3217.921835] ceph: osd17 down
[ 3218.970746] ceph: osd17 up
[ 3218.973535] ceph: osd17 weight 0x10000 (in)
[ 3219.743914] ceph: osd19 10.3.14.146:6800 socket closed
[ 3223.978633] ceph: osd18 down
[ 3223.981590] ceph: osd18 up
[ 3223.984357] ceph: osd18 weight 0x10000 (in)
[ 3224.450136] ceph: osd20 10.3.14.147:6800 socket closed
[ 3225.195201] ceph: get_reply unknown tid 19180 from osd2
[ 3225.200585] ceph: get_reply unknown tid 19180 from osd2
[ 3230.212618] ceph: osd21 10.3.14.148:6800 socket closed
[ 3234.997108] ceph: osd22 10.3.14.149:6800 socket closed
[ 3240.179060] ceph: osd23 10.3.14.150:6800 socket closed
[ 3245.204815] ceph: osd24 10.3.14.151:6800 socket closed
[ 3259.240231] ceph: mds0 reconnect start
[ 3259.460371] ceph: mds0 reconnect success
[ 3297.133592] ceph: mds0 recovery completed
[ 3313.514341] ceph: mds0 caps stale
[ 3327.905442] ceph: mds0 caps went stale, renewing
[ 3398.532563] ceph: mds0 caps renewed
[ 3428.601883] ceph: wrong peer, want 10.3.14.149:6800/2179, got 10.3.14.149:6800/2993
[ 3428.609742] ceph: osd22 10.3.14.149:6800 wrong peer at address
[ 3428.818030] ceph: osd19 down
[ 3428.820990] ceph: osd19 up
[ 3428.823786] ceph: osd19 weight 0x10000 (in)
[ 3428.828085] ceph: osd20 down
[ 3428.831044] ceph: osd20 up
[ 3428.833836] ceph: osd20 weight 0x10000 (in)
[ 3428.838110] ceph: osd21 down
[ 3428.841049] ceph: osd21 up
[ 3428.843827] ceph: osd21 weight 0x10000 (in)
[ 3428.848103] ceph: osd22 down
[ 3428.851055] ceph: osd22 up
[ 3428.853838] ceph: osd22 weight 0x10000 (in)
[ 3428.858124] ceph: osd23 down
[ 3428.861069] ceph: osd23 up
[ 3428.863849] ceph: osd23 weight 0x10000 (in)
[ 3428.868131] ceph: osd24 down
[ 3428.871076] ceph: osd24 up
[ 3428.873865] ceph: osd24 weight 0x10000 (in)
[ 3428.878133] ceph: osd16 down
[ 3428.881093] ceph: osd15 down
[22462.944208] ceph: osd2 down
[22467.355585] ceph: osd2 up
[22467.358270] ceph: osd2 weight 0x10000 (in)
[22472.809066] ceph: osd3 up
[22472.811765] ceph: osd3 weight 0x10000 (in)
[22474.629180] ceph: osd3 10.3.14.130:6800 socket closed
[22474.635688] ceph: osd3 10.3.14.130:6800 connection failed
[22474.642481] ceph: osd4 down
[22475.825502] ceph: osd3 10.3.14.130:6800 connection failed
[22475.950340] ceph: osd4 up
[22475.953012] ceph: osd4 weight 0x10000 (in)
[22476.821392] ceph: osd3 10.3.14.130:6800 connection failed
[22478.825205] ceph: osd3 10.3.14.130:6800 connection failed
[22479.094782] ceph: osd5 down
[22482.828881] ceph: osd3 10.3.14.130:6800 connection failed
[22482.960255] ceph: osd5 up
[22482.962966] ceph: osd5 weight 0x10000 (in)
[22487.389774] ceph: osd6 down
[22490.844174] ceph: osd3 10.3.14.130:6800 connection failed
[22490.851103] ceph: osd6 up
[22490.853805] ceph: osd6 weight 0x10000 (in)
[22492.411907] ceph: osd7 down
[22497.409999] ceph: osd7 up
[22497.412694] ceph: osd7 weight 0x10000 (in)
[22497.416882] ceph: osd3 down
[22498.322295] ceph: osd8 down
[22502.537012] ceph: osd8 up
[22502.539709] ceph: osd8 weight 0x10000 (in)
[22512.427722] ceph: osd11 down
[22512.430681] ceph: osd11 up
[22512.433448] ceph: osd11 weight 0x10000 (in)
[22517.543154] ceph: osd12 down
[22517.546109] ceph: osd12 up
[22517.548883] ceph: osd12 weight 0x10000 (in)
[22522.781585] ceph: osd13 down
[22522.784519] ceph: osd13 up
[22522.787302] ceph: osd13 weight 0x10000 (in)
[22527.827674] ceph: osd14 down
[22527.830641] ceph: osd14 up
[22527.833441] ceph: osd14 weight 0x10000 (in)
[22532.908112] ceph: osd15 up
[22532.910899] ceph: osd15 weight 0x10000 (in)
[22532.915174] ceph: osd16 up
[22532.917952] ceph: osd16 weight 0x10000 (in)
[22537.864994] ceph: osd17 down
[22539.903889] ceph: osd17 up
[22539.906647] ceph: osd17 weight 0x10000 (in)
[22547.731344] ceph: osd18 down
[22547.734279] ceph: osd18 up
[22547.737061] ceph: osd18 weight 0x10000 (in)
[22552.488251] ceph: osd19 down
[22552.491200] ceph: osd19 up
[22552.493967] ceph: osd19 weight 0x10000 (in)
[22557.495706] ceph: osd20 down
[22562.503311] ceph: osd20 up
[22562.506094] ceph: osd20 weight 0x10000 (in)
[22567.510964] ceph: osd21 down
[22567.513926] ceph: osd21 up
[22567.516693] ceph: osd21 weight 0x10000 (in)
[22572.533545] ceph: osd22 down
[22572.536490] ceph: osd22 up
[22572.539273] ceph: osd22 weight 0x10000 (in)
[22577.590767] ceph: osd23 down
[22577.593722] ceph: osd23 up
[22577.596504] ceph: osd23 weight 0x10000 (in)
[22578.422528] ceph: mds0 10.3.14.152:6800 socket closed
[22581.828176] ceph: mds0 10.3.14.152:6800 connection failed
[22582.533526] ceph: osd24 down
[22585.000644] ceph: mon0 10.3.14.154:6789 socket closed
[22585.005794] ceph: mon0 10.3.14.154:6789 session lost, hunting for new mon
[22585.012926] ceph: mon0 10.3.14.154:6789 connection failed
[22587.602002] ceph: osd24 up
[22587.604807] ceph: osd24 weight 0x10000 (in)
[22587.609669] ceph: mon0 10.3.14.154:6789 session established
[22597.558647] ceph:  tid 19573 timed out on osd15, will reset osd
[22597.571475] ceph:  tid 19679 timed out on osd16, will reset osd
[22631.827495] ceph: mds0 caps stale
[22651.825717] ceph: mds0 caps stale
[22657.665302] ceph:  tid 19573 timed out on osd15, will reset osd
[22657.672042] ceph:  tid 19679 timed out on osd16, will reset osd
[22717.771955] ceph:  tid 19573 timed out on osd15, will reset osd
[22717.778698] ceph:  tid 19679 timed out on osd16, will reset osd
[22777.878601] ceph:  tid 19573 timed out on osd15, will reset osd
[22777.885334] ceph:  tid 19679 timed out on osd16, will reset osd
[22837.985243] ceph:  tid 19573 timed out on osd15, will reset osd
[22837.991975] ceph:  tid 19679 timed out on osd16, will reset osd
[22898.091879] ceph:  tid 19573 timed out on osd15, will reset osd
[22898.098617] ceph:  tid 19679 timed out on osd16, will reset osd
[22958.198662] ceph:  tid 19573 timed out on osd15, will reset osd
[22958.205408] ceph:  tid 19679 timed out on osd16, will reset osd
[23018.305431] ceph:  tid 19573 timed out on osd15, will reset osd
[23018.312162] ceph:  tid 19679 timed out on osd16, will reset osd
[23078.412188] ceph:  tid 19573 timed out on osd15, will reset osd
[23078.418931] ceph:  tid 19679 timed out on osd16, will reset osd
[23138.518932] ceph:  tid 19573 timed out on osd15, will reset osd
[23138.525663] ceph:  tid 19679 timed out on osd16, will reset osd
[23198.625666] ceph:  tid 19573 timed out on osd15, will reset osd
[23198.632409] ceph:  tid 19679 timed out on osd16, will reset osd
[23258.732388] ceph:  tid 19573 timed out on osd15, will reset osd
[23258.739127] ceph:  tid 19679 timed out on osd16, will reset osd
[23318.839100] ceph:  tid 19573 timed out on osd15, will reset osd
[23318.845842] ceph:  tid 19679 timed out on osd16, will reset osd
[23378.945802] ceph:  tid 19573 timed out on osd15, will reset osd
[23378.952551] ceph:  tid 19679 timed out on osd16, will reset osd
[23439.052496] ceph:  tid 19573 timed out on osd15, will reset osd
[23439.059231] ceph:  tid 19679 timed out on osd16, will reset osd
[23499.159182] ceph:  tid 19573 timed out on osd15, will reset osd
[23499.165945] ceph:  tid 19679 timed out on osd16, will reset osd
[23559.265858] ceph:  tid 19573 timed out on osd15, will reset osd
[23559.272615] ceph:  tid 19679 timed out on osd16, will reset osd
[23619.372528] ceph:  tid 19573 timed out on osd15, will reset osd
[23619.379278] ceph:  tid 19679 timed out on osd16, will reset osd
[23679.479191] ceph:  tid 19573 timed out on osd15, will reset osd
[23679.485937] ceph:  tid 19679 timed out on osd16, will reset osd
[23739.585846] ceph:  tid 19573 timed out on osd15, will reset osd
[23739.592580] ceph:  tid 19679 timed out on osd16, will reset osd
[23799.692495] ceph:  tid 19573 timed out on osd15, will reset osd
[23799.699243] ceph:  tid 19679 timed out on osd16, will reset osd
[23859.799138] ceph:  tid 19573 timed out on osd15, will reset osd
[23859.805879] ceph:  tid 19679 timed out on osd16, will reset osd
[23919.905776] ceph:  tid 19573 timed out on osd15, will reset osd
[23919.912520] ceph:  tid 19679 timed out on osd16, will reset osd
[23980.012408] ceph:  tid 19573 timed out on osd15, will reset osd
[23980.019149] ceph:  tid 19679 timed out on osd16, will reset osd
[24040.119035] ceph:  tid 19573 timed out on osd15, will reset osd
[24040.125787] ceph:  tid 19679 timed out on osd16, will reset osd
[24100.225658] ceph:  tid 19573 timed out on osd15, will reset osd
[24100.232396] ceph:  tid 19679 timed out on osd16, will reset osd
[24117.435968] ceph: osd3 up
[24117.438659] ceph: osd3 weight 0x10000 (in)
[24123.596590] ceph: osd16 10.3.14.143:6800 socket closed
[24123.602694] ceph: osd16 10.3.14.143:6800 connection failed
[24124.687577] ceph: osd16 10.3.14.143:6800 connection failed
[24125.511896] ceph: get_reply unknown tid 19573 from osd15
[24125.567312] ceph: get_reply unknown tid 20215 from osd15
[24125.572780] ceph: get_reply unknown tid 20525 from osd15
[24125.578255] ceph: get_reply unknown tid 21115 from osd15
[24125.583729] ceph: get_reply unknown tid 21202 from osd15
[24125.687410] ceph: osd16 10.3.14.143:6800 connection failed
[24126.061360] ceph: get_reply unknown tid 22277 from osd15
[24126.066825] ceph: get_reply unknown tid 25526 from osd15
[24126.103519] ceph: get_reply unknown tid 26033 from osd15
[24126.557363] ceph: get_reply unknown tid 29849 from osd15
[24126.562874] ceph: get_reply unknown tid 30430 from osd15
[24126.568895] ceph: get_reply unknown tid 19912 from osd15
[24126.574385] ceph: get_reply unknown tid 22174 from osd15
[24126.588263] ceph: get_reply unknown tid 23563 from osd15
[24126.593727] ceph: get_reply unknown tid 23869 from osd15
[24126.599190] ceph: get_reply unknown tid 24355 from osd15
[24126.604672] ceph: get_reply unknown tid 25629 from osd15
[24126.610123] ceph: get_reply unknown tid 26767 from osd15
[24126.615578] ceph: get_reply unknown tid 30189 from osd15
[24126.621042] ceph: get_reply unknown tid 30403 from osd15
[24126.880924] ceph: osd3 10.3.14.130:6800 socket closed
[24126.886867] ceph: osd3 10.3.14.130:6800 connection failed
[24127.707243] ceph: osd3 10.3.14.130:6800 connection failed
[24127.712982] ceph: osd16 10.3.14.143:6800 connection failed
[24128.707175] ceph: osd3 10.3.14.130:6800 connection failed
[24130.315479] ceph: get_reply unknown tid 22795 from osd15
[24130.710984] ceph: osd3 10.3.14.130:6800 connection failed
[24131.714910] ceph: osd16 10.3.14.143:6800 connection failed
[24134.714608] ceph: osd3 10.3.14.130:6800 connection failed
[24139.722207] ceph: osd16 10.3.14.143:6800 connection failed
[24142.729893] ceph: osd3 10.3.14.130:6800 connection failed
[24143.686577] ceph: osd16 down
[24149.557765] ceph: get_reply unknown tid 23425 from osd15
[24149.563225] ceph: get_reply unknown tid 26224 from osd15
[24149.571403] ceph: get_reply unknown tid 26698 from osd15
[24149.576903] ceph: get_reply unknown tid 27167 from osd15
[24149.582416] ceph: get_reply unknown tid 27252 from osd15
[24149.587908] ceph: get_reply unknown tid 28795 from osd15
[24149.593428] ceph: get_reply unknown tid 30270 from osd15
[24149.598936] ceph: get_reply unknown tid 30460 from osd15
[24149.604435] ceph: get_reply unknown tid 30765 from osd15
[24149.610041] ceph: osd3 down
[24210.447788] ceph:  tid 19910 timed out on osd15, will reset osd
[24270.538407] ceph:  tid 19910 timed out on osd15, will reset osd
[24330.629022] ceph:  tid 19910 timed out on osd15, will reset osd
[24390.719637] ceph:  tid 19910 timed out on osd15, will reset osd
[24450.810249] ceph:  tid 19910 timed out on osd15, will reset osd
[24510.900860] ceph:  tid 19910 timed out on osd15, will reset osd
[24570.991468] ceph:  tid 19910 timed out on osd15, will reset osd
[24631.082076] ceph:  tid 19910 timed out on osd15, will reset osd
[24691.172691] ceph:  tid 19910 timed out on osd15, will reset osd
[24751.263286] ceph:  tid 19910 timed out on osd15, will reset osd
[24811.353888] ceph:  tid 19910 timed out on osd15, will reset osd
[24871.444488] ceph:  tid 19910 timed out on osd15, will reset osd
[24931.535087] ceph:  tid 19910 timed out on osd15, will reset osd
[24991.625685] ceph:  tid 19910 timed out on osd15, will reset osd
[25051.716281] ceph:  tid 19910 timed out on osd15, will reset osd
[25111.806876] ceph:  tid 19910 timed out on osd15, will reset osd
[25171.897469] ceph:  tid 19910 timed out on osd15, will reset osd
[25231.988061] ceph:  tid 19910 timed out on osd15, will reset osd
[25292.078653] ceph:  tid 19910 timed out on osd15, will reset osd
[25352.169242] ceph:  tid 19910 timed out on osd15, will reset osd
[25412.259831] ceph:  tid 19910 timed out on osd15, will reset osd
[25472.350418] ceph:  tid 19910 timed out on osd15, will reset osd
[25532.441004] ceph:  tid 19910 timed out on osd15, will reset osd
[25592.531590] ceph:  tid 19910 timed out on osd15, will reset osd
[25652.622174] ceph:  tid 19910 timed out on osd15, will reset osd
[25712.712756] ceph:  tid 19910 timed out on osd15, will reset osd
[25772.803386] ceph:  tid 19910 timed out on osd15, will reset osd
[25832.894017] ceph:  tid 19910 timed out on osd15, will reset osd
[25892.984646] ceph:  tid 19910 timed out on osd15, will reset osd
[25953.075273] ceph:  tid 19910 timed out on osd15, will reset osd
[26013.165897] ceph:  tid 19910 timed out on osd15, will reset osd
[26073.256520] ceph:  tid 19910 timed out on osd15, will reset osd
[26133.347140] ceph:  tid 19910 timed out on osd15, will reset osd
[26193.437759] ceph:  tid 19910 timed out on osd15, will reset osd
[26253.528398] ceph:  tid 19910 timed out on osd15, will reset osd
[26313.619076] ceph:  tid 19910 timed out on osd15, will reset osd
[26373.709753] ceph:  tid 19910 timed out on osd15, will reset osd
[26433.800419] ceph:  tid 19910 timed out on osd15, will reset osd
[26493.891086] ceph:  tid 19910 timed out on osd15, will reset osd
[26553.981750] ceph:  tid 19910 timed out on osd15, will reset osd
[26614.072410] ceph:  tid 19910 timed out on osd15, will reset osd
[26674.163068] ceph:  tid 19910 timed out on osd15, will reset osd
[26734.253724] ceph:  tid 19910 timed out on osd15, will reset osd
[26794.344404] ceph:  tid 19910 timed out on osd15, will reset osd
[26854.435088] ceph:  tid 19910 timed out on osd15, will reset osd
[26914.525769] ceph:  tid 19910 timed out on osd15, will reset osd
[26974.616446] ceph:  tid 19910 timed out on osd15, will reset osd
[27034.707120] ceph:  tid 19910 timed out on osd15, will reset osd
[27094.797791] ceph:  tid 19910 timed out on osd15, will reset osd
[27154.888458] ceph:  tid 19910 timed out on osd15, will reset osd
[27214.979122] ceph:  tid 19910 timed out on osd15, will reset osd
[27275.069782] ceph:  tid 19910 timed out on osd15, will reset osd
[27335.160440] ceph:  tid 19910 timed out on osd15, will reset osd
[27395.251095] ceph:  tid 19910 timed out on osd15, will reset osd
[27455.341747] ceph:  tid 19910 timed out on osd15, will reset osd
[27515.432396] ceph:  tid 19910 timed out on osd15, will reset osd
[27575.523043] ceph:  tid 19910 timed out on osd15, will reset osd
[27635.613687] ceph:  tid 19910 timed out on osd15, will reset osd
[27695.704328] ceph:  tid 19910 timed out on osd15, will reset osd
[27755.794967] ceph:  tid 19910 timed out on osd15, will reset osd
[27815.885604] ceph:  tid 19910 timed out on osd15, will reset osd
[27875.976238] ceph:  tid 19910 timed out on osd15, will reset osd
[27936.066870] ceph:  tid 19910 timed out on osd15, will reset osd
[27945.170631] ceph: osd3 up
[27945.173304] ceph: osd3 weight 0x10000 (in)
[27968.364590] ceph: osd6 down
[27968.367442] ceph: osd19 down
[28006.172605] ceph:  tid 19679 timed out on osd3, will reset osd
[28026.070488] ceph: osd6 up
[28026.073187] ceph: osd6 weight 0x10000 (in)
[28031.342586] ceph: osd16 up
[28031.345350] ceph: osd16 weight 0x10000 (in)
[28034.788727] ceph: osd19 up
[28034.791512] ceph: osd19 weight 0x10000 (in)
[28090.668819] ceph: get_reply unknown tid 20776 from osd3
[28090.674214] ceph: get_reply unknown tid 22563 from osd3
[28090.679598] ceph: get_reply unknown tid 26876 from osd3
[28090.727881] ceph: get_reply unknown tid 28772 from osd3
[28090.760869] ceph: get_reply unknown tid 19679 from osd3
[28090.766250] ceph: get_reply unknown tid 20075 from osd3
[28090.777733] ceph: get_reply unknown tid 20362 from osd3
[28090.813099] ceph: get_reply unknown tid 20533 from osd3
[28091.321042] ceph: get_reply unknown tid 27556 from osd3
[28091.624473] ceph: get_reply unknown tid 28624 from osd3
[28091.629855] ceph: get_reply unknown tid 29669 from osd3
[28091.665150] ceph: get_reply unknown tid 25146 from osd3
[28096.332542] ceph:  tid 22188 timed out on osd3, will reset osd
[28180.715706] ceph: mds0 reconnect start
[28181.071238] ceph: mds0 reconnect success
[28231.392356] ceph: mds0 caps stale
[28251.390565] ceph: mds0 caps stale
[28280.062503] ceph: mds0 recovery completed
[28396.157325] ceph: mds0 caps went stale, renewing
[28452.975129] ceph: mds0 caps renewed
[34599.211043] ceph: client4107 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c
[34599.218033] ceph: mon0 10.3.14.10:6789 session established
[34709.689871] ceph: client4110 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c
[34709.696828] ceph: mon0 10.3.14.10:6789 session established
[35197.307321] ceph: client4113 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c
[35197.314305] ceph: mon0 10.3.14.10:6789 session established
  syslogd: /var/log/news/news.crit: No such file or directory
  syslogd: /var/log/news/news.err: No such file or directory
  syslogd: /var/log/news/news.notice: No such file or directory
[94880.387538] ceph: osd15 10.3.14.142:6800 socket closed
[94880.392791] INFO: trying to register non-static key.
[94880.396785] the code is fine but needs lockdep annotation.
[94880.396785] turning off the locking correctness validator.
[94880.396785] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61
[94880.396785] Call Trace:
[94880.396785]  [<ffffffff8105dd50>] ? static_obj+0x43/0x53
[94880.396785]  [<ffffffff8106234c>] __lock_acquire+0x852/0x87a
[94880.396785]  [<ffffffff810623fc>] lock_acquire+0x88/0xa5
[94880.396785]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffff814b5382>] down_read+0x47/0x8d
[94880.396785]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffffa002bf1b>] osd_reset+0x40/0x8d [ceph]
[94880.396785]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.396785]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.396785]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.396785]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.396785]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.396785]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.396785]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.396785]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.396785]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.396785]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.396785]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.396785]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.396785]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.535663] BUG: unable to handle kernel NULL pointer dereference at (null)
[94880.539585] IP: [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] PGD 11d119067 PUD 11cbc0067 PMD 0 
[94880.539585] Oops: 0000 [#1] PREEMPT SMP 
[94880.539585] last sysfs file: /sys/kernel/uevent_seqnum
[94880.539585] CPU 0 
[94880.539585] Modules linked in: ceph
[94880.539585] 
[94880.539585] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61 PDSMi+/PDSMi
[94880.539585] RIP: 0010:[<ffffffff8126ffbd>]  [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] RSP: 0018:ffff88011faddc40  EFLAGS: 00010046
[94880.539585] RAX: 0000000000000000 RBX: ffff88011e1be958 RCX: 0000000000000000
[94880.539585] RDX: ffff88011e1be958 RSI: 0000000000000000 RDI: ffff88011faddc90
[94880.539585] RBP: ffff88011faddc60 R08: 0000000000000000 R09: ffff88011faddc90
[94880.539585] R10: ffffffff81056048 R11: ffffffff81055a87 R12: 0000000000000000
[94880.539585] R13: ffff88011faddc90 R14: ffff88011fada280 R15: ffffffffa002be61
[94880.539585] FS:  0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[94880.539585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94880.539585] CR2: 0000000000000000 CR3: 000000011dcd4000 CR4: 00000000000006f0
[94880.539585] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94880.539585] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94880.539585] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280)
[94880.539585] Stack:
[94880.539585]  ffff88011e1be910 00000000ffffffff ffff88011e1be910 0000000000000202
[94880.539585] <0> ffff88011faddce0 ffffffff814b479f ffffffffa002be61 0000000000000000
[94880.539585] <0> ffff88011faddce0 0000000000000246 ffff88011faddc90 ffff88011faddc90
[94880.539585] Call Trace:
[94880.539585]  [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e
[94880.539585]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.539585]  [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]
[94880.539585]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.539585]  [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]
[94880.539585]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.539585]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.539585]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.539585]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.539585]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.539585]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.539585]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.539585]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.539585]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.539585]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.539585]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.539585]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.539585]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.539585] Code: 8b 42 08 48 39 f0 74 23 49 89 d1 49 89 c0 48 89 f1 48 c7 c2 06 d6 64 81 be 1a 00 00 00 48 c7 c7 51 d5 64 81 31 c0 e8 fc a5 dc ff <49> 8b 04 24 48 39 d8 74 23 49 89 c0 4d 89 e1 48 89 d9 48 c7 c2 
[94880.539585] RIP  [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585]  RSP <ffff88011faddc40>
[94880.539585] CR2: 0000000000000000
[94880.539585] ---[ end trace 555371ce86832624 ]---
[94880.539585] note: kworker/0:1[10] exited with preempt_count 1
[94880.853864] BUG: unable to handle kernel paging request at fffffffffffffff8
[94880.857795] IP: [<ffffffff810516f4>] kthread_data+0xb/0x11
[94880.857795] PGD 174d067 PUD 174e067 PMD 0 
[94880.857795] Oops: 0000 [#2] PREEMPT SMP 
[94880.857795] last sysfs file: /sys/kernel/uevent_seqnum
[94880.857795] CPU 0 
[94880.857795] Modules linked in: ceph
[94880.857795] 
[94880.857795] Pid: 10, comm: kworker/0:1 Tainted: G      D     2.6.36-rc7+ #61 PDSMi+/PDSMi
[94880.857795] RIP: 0010:[<ffffffff810516f4>]  [<ffffffff810516f4>] kthread_data+0xb/0x11
[94880.857795] RSP: 0018:ffff88011fadd828  EFLAGS: 00010092
[94880.857795] RAX: 0000000000000000 RBX: ffff88011fada760 RCX: ffff88011fada280
[94880.857795] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff88011fada280
[94880.857795] RBP: ffff88011fadd828 R08: 0000000000000002 R09: ffffffff814b374f
[94880.857795] R10: ffff88011fada280 R11: ffff88011fadd888 R12: ffff88011fada280
[94880.857795] R13: 0000000000000000 R14: ffff88011fada270 R15: 0000000000000000
[94880.857795] FS:  0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[94880.857795] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94880.857795] CR2: fffffffffffffff8 CR3: 000000011dcd4000 CR4: 00000000000006f0
[94880.857795] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94880.857795] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94880.857795] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280)
[94880.857795] Stack:
[94880.857795]  ffff88011fadd858 ffffffff8104deca ffffffff814b374f ffff88011fada760
[94880.857795] <0> ffff88011fada280 ffff8800027d2800 ffff88011fadd928 ffffffff814b37d6
[94880.857795] <0> ffff88011faddfd8 0000000000004000 00000000001d2800 00000000001d2800
[94880.857795] Call Trace:
[94880.857795]  [<ffffffff8104deca>] wq_worker_sleeping+0x15/0x82
[94880.857795]  [<ffffffff814b374f>] ? schedule+0x108/0x7d3
[94880.857795]  [<ffffffff814b37d6>] schedule+0x18f/0x7d3
[94880.857795]  [<ffffffff8103e14b>] do_exit+0x676/0x67a
[94880.857795]  [<ffffffff810068d5>] oops_end+0xb2/0xba
[94880.857795]  [<ffffffff81022e24>] no_context+0x1f5/0x204
[94880.857795]  [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9
[94880.857795]  [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10
[94880.857795]  [<ffffffff810234c6>] do_page_fault+0x177/0x35d
[94880.857795]  [<ffffffff814b6ad8>] ? _raw_spin_unlock_irq+0x36/0x53
[94880.857795]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.857795]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.857795]  [<ffffffff814b3df2>] ? schedule+0x7ab/0x7d3
[94880.857795]  [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[94880.857795]  [<ffffffff81055a87>] ? up+0xf/0x39
[94880.857795]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffff814b711f>] page_fault+0x1f/0x30
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffff81055a87>] ? up+0xf/0x39
[94880.857795]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94880.857795]  [<ffffffff8126ffbd>] ? __list_add+0x42/0x89
[94880.857795]  [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.857795]  [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]
[94880.857795]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.857795]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.857795]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.857795]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.857795]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.857795]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.857795]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.857795]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.857795]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.857795]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.857795]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.857795]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.857795]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.857795] Code: 41 5f c9 c3 90 90 90 55 65 48 8b 04 25 40 b5 00 00 48 8b 80 f0 01 00 00 48 89 e5 8b 40 f0 c9 c3 48 8b 87 f0 01 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 55 48 83 c7 78 48 89 e5 e8 af 1f fe ff c9 c3 
[94880.857795] RIP  [<ffffffff810516f4>] kthread_data+0xb/0x11
[94880.857795]  RSP <ffff88011fadd828>
[94880.857795] CR2: fffffffffffffff8
[94880.857795] ---[ end trace 555371ce86832625 ]---
[94880.857795] Fixing recursive fault but reboot is needed!
[94880.857795] BUG: spinlock lockup on CPU#0, kworker/0:1/10, ffff8800027d2800
[94880.857795] Pid: 10, comm: kworker/0:1 Tainted: G      D     2.6.36-rc7+ #61
[94880.857795] Call Trace:
[94880.857795]  [<ffffffff8126fba7>] do_raw_spin_lock+0x109/0x135
[94880.857795]  [<ffffffff814b6135>] _raw_spin_lock_irq+0x62/0x77
[94880.857795]  [<ffffffff814b374f>] ? schedule+0x108/0x7d3
[94880.857795]  [<ffffffff814b374f>] schedule+0x108/0x7d3
[94880.857795]  [<ffffffff8103dba1>] do_exit+0xcc/0x67a
[94880.857795]  [<ffffffff8103be4b>] ? kmsg_dump+0x137/0x160
[94880.857795]  [<ffffffff810068d5>] oops_end+0xb2/0xba
[94880.857795]  [<ffffffff81022e24>] no_context+0x1f5/0x204
[94880.857795]  [<ffffffff8105e952>] ? print_lock_contention_bug+0x1b/0xe0
[94880.857795]  [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9
[94880.857795]  [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10
[94880.857795]  [<ffffffff810234c6>] do_page_fault+0x177/0x35d
[94880.857795]  [<ffffffff810e2b45>] ? fsnotify_clear_marks_by_inode+0x2d/0xd5
[94880.857795]  [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[94880.857795]  [<ffffffff814b374f>] ? schedule+0x108/0x7d3
[94880.857795]  [<ffffffff814b711f>] page_fault+0x1f/0x30
[94880.857795]  [<ffffffff814b374f>] ? schedule+0x108/0x7d3
[94880.857795]  [<ffffffff810516f4>] ? kthread_data+0xb/0x11
[94880.857795]  [<ffffffff8104deca>] wq_worker_sleeping+0x15/0x82
[94880.857795]  [<ffffffff814b374f>] ? schedule+0x108/0x7d3
[94880.857795]  [<ffffffff814b37d6>] schedule+0x18f/0x7d3
[94880.857795]  [<ffffffff8103e14b>] do_exit+0x676/0x67a
[94880.857795]  [<ffffffff810068d5>] oops_end+0xb2/0xba
[94880.857795]  [<ffffffff81022e24>] no_context+0x1f5/0x204
[94880.857795]  [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9
[94880.857795]  [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10
[94880.857795]  [<ffffffff810234c6>] do_page_fault+0x177/0x35d
[94880.857795]  [<ffffffff814b6ad8>] ? _raw_spin_unlock_irq+0x36/0x53
[94880.857795]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.857795]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.857795]  [<ffffffff814b3df2>] ? schedule+0x7ab/0x7d3
[94880.857795]  [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[94880.857795]  [<ffffffff81055a87>] ? up+0xf/0x39
[94880.857795]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffff814b711f>] page_fault+0x1f/0x30
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffff81055a87>] ? up+0xf/0x39
[94880.857795]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94880.857795]  [<ffffffff8126ffbd>] ? __list_add+0x42/0x89
[94880.857795]  [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e
[94880.857795]  [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]
[94880.857795]  [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.857795]  [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]
[94880.857795]  [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.857795]  [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.857795]  [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.857795]  [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.857795]  [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.857795]  [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.857795]  [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.857795]  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.857795]  [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.857795]  [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.857795]  [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.857795]  [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.857795]  [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.857795] sending NMI to all CPUs:
[94889.441308] NMI backtrace for cpu 1
[94889.441308] CPU 1 
[94889.441308] Modules linked in: ceph
[94889.441308] 
[94889.441308] Pid: 0, comm: kworker/0:0 Tainted: G      D     2.6.36-rc7+ #61 PDSMi+/PDSMi
[94889.441308] RIP: 0010:[<ffffffff8100aa6d>]  [<ffffffff8100aa6d>] mwait_idle+0x89/0x99
[94889.441308] RSP: 0018:ffff88011fae1ed8  EFLAGS: 00000246
[94889.441308] RAX: 0000000000000000 RBX: ffff8800029d2700 RCX: 0000000000000000
[94889.441308] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64
[94889.441308] RBP: ffff88011fae1ef8 R08: ffffffff8175ea90 R09: 0000000000000001
[94889.441308] R10: ffffffff81056048 R11: ffffffff8102da71 R12: ffff88011fae1fd8
[94889.441308] R13: ffff88011fae0010 R14: 0000000000000000 R15: 0000000000000000
[94889.441308] FS:  0000000000000000(0000) GS:ffff880002800000(0000) knlGS:0000000000000000
[94889.441308] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94889.441308] CR2: 00007f2a37b36380 CR3: 000000000174b000 CR4: 00000000000006e0
[94889.441308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94889.441308] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94889.441308] Process kworker/0:0 (pid: 0, threadinfo ffff88011fae0000, task ffff88011fade2c0)
[94889.441308] Stack:
[94889.441308]  ffff88011fae1ee8 ffff88011fae1fd8 ffffffff817c90c0 0000000000000000
[94889.441308] <0> ffff88011fae1f18 ffffffff81001ac8 ffff88000280d408 0000000000000000
[94889.441308] <0> ffff88011fae1f48 ffffffff814aea64 0000000000000000 0000000000000000
[94889.441308] Call Trace:
[94889.441308]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441308]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441308] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 41 5c 41 5d c9 c3 55 48 89 e5 e8 
[94889.441308] Call Trace:
[94889.441308]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441308]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441308] Pid: 0, comm: kworker/0:0 Tainted: G      D     2.6.36-rc7+ #61
[94889.441308] Call Trace:
[94889.441308]  <NMI>  [<ffffffff8100a603>] ? show_regs+0x26/0x2a
[94889.441308]  [<ffffffff8101aaa0>] nmi_watchdog_tick+0xad/0x19e
[94889.441308]  [<ffffffff81004004>] do_nmi+0xea/0x298
[94889.441308]  [<ffffffff814b740a>] nmi+0x1a/0x2c
[94889.441308]  [<ffffffff8102da71>] ? __wake_up_sync_key+0x27/0x5a
[94889.441308]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94889.441308]  [<ffffffff8100aa64>] ? mwait_idle+0x80/0x99
[94889.441308]  [<ffffffff8100aa6d>] ? mwait_idle+0x89/0x99
[94889.441308]  <<EOE>>  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441308]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441266] NMI backtrace for cpu 3
[94889.441266] CPU 3 
[94889.441266] Modules linked in: ceph
[94889.441266] 
[94889.441266] Pid: 0, comm: kworker/0:1 Tainted: G      D     2.6.36-rc7+ #61 PDSMi+/PDSMi
[94889.441266] RIP: 0010:[<ffffffff8100aa6d>]  [<ffffffff8100aa6d>] mwait_idle+0x89/0x99
[94889.441266] RSP: 0018:ffff88011fb2ded8  EFLAGS: 00000246
[94889.441266] RAX: 0000000000000000 RBX: ffff880002dd2700 RCX: 0000000000000000
[94889.441266] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64
[94889.441266] RBP: ffff88011fb2def8 R08: ffffffff8175ea90 R09: 0000000000000001
[94889.441266] R10: ffffffff81056048 R11: ffffffff810521b7 R12: ffff88011fb2dfd8
[94889.441266] R13: ffff88011fb2c010 R14: 0000000000000000 R15: 0000000000000000
[94889.441266] FS:  0000000000000000(0000) GS:ffff880002c00000(0000) knlGS:0000000000000000
[94889.441266] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94889.441266] CR2: 0000000000612188 CR3: 000000011defa000 CR4: 00000000000006e0
[94889.441266] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94889.441266] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94889.441266] Process kworker/0:1 (pid: 0, threadinfo ffff88011fb2c000, task ffff88011fb2a4c0)
[94889.441266] Stack:
[94889.441266]  ffff88011fb2dee8 ffff88011fb2dfd8 ffffffff817c90c0 0000000000000000
[94889.441266] <0> ffff88011fb2df18 ffffffff81001ac8 ffff880002c0d408 0000000000000000
[94889.441266] <0> ffff88011fb2df48 ffffffff814aea64 0000000000000000 0000000000000000
[94889.441266] Call Trace:
[94889.441266]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441266]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441266] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 41 5c 41 5d c9 c3 55 48 89 e5 e8 
[94889.441266] Call Trace:
[94889.441266]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441266]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441266] Pid: 0, comm: kworker/0:1 Tainted: G      D     2.6.36-rc7+ #61
[94889.441266] Call Trace:
[94889.441266]  <NMI>  [<ffffffff8100a603>] ? show_regs+0x26/0x2a
[94889.441266]  [<ffffffff8101aaa0>] nmi_watchdog_tick+0xad/0x19e
[94889.441266]  [<ffffffff81004004>] do_nmi+0xea/0x298
[94889.441266]  [<ffffffff814b740a>] nmi+0x1a/0x2c
[94889.441266]  [<ffffffff810521b7>] ? add_wait_queue+0x1b/0x45
[94889.441266]  [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4
[94889.441266]  [<ffffffff8100aa64>] ? mwait_idle+0x80/0x99
[94889.441266]  [<ffffffff8100aa6d>] ? mwait_idle+0x89/0x99
[94889.441266]  <<EOE>>  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441266]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441339] NMI backtrace for cpu 2
[94889.441339] CPU 2 
[94889.441339] Modules linked in: ceph
[94889.441339] 
[94889.441339] Pid: 0, comm: kworker/0:1 Tainted: G      D     2.6.36-rc7+ #61 PDSMi+/PDSMi
[94889.441339] RIP: 0010:[<ffffffff8100aa6d>]  [<ffffffff8100aa6d>] mwait_idle+0x89/0x99
[94889.441339] RSP: 0018:ffff88011fb17ed8  EFLAGS: 00000246
[94889.441339] RAX: 0000000000000000 RBX: ffff880002bd2700 RCX: 0000000000000000
[94889.441339] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64
[94889.441339] RBP: ffff88011fb17ef8 R08: ffffffff8175ea90 R09: 0000000000000001
[94889.441339] R10: ffffffff81056048 R11: ffffffff8102da71 R12: ffff88011fb17fd8
[94889.441339] R13: ffff88011fb16010 R14: 0000000000000000 R15: 0000000000000000
[94889.441339] FS:  0000000000000000(0000) GS:ffff880002a00000(0000) knlGS:0000000000000000
[94889.441339] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94889.441339] CR2: 00007f2a37b36380 CR3: 000000000174b000 CR4: 00000000000006e0
[94889.441339] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94889.441339] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94889.441339] Process kworker/0:1 (pid: 0, threadinfo ffff88011fb16000, task ffff88011fb143c0)
[94889.441339] Stack:
[94889.441339]  ffff88011fb17ee8 ffff88011fb17fd8 ffffffff817c90c0 0000000000000000
[94889.441339] <0> ffff88011fb17f18 ffffffff81001ac8 ffff880002a0d408 0000000000000000
[94889.441339] <0> ffff88011fb17f48 ffffffff814aea64 0000000000000000 0000000000000000
[94889.441339] Call Trace:
[94889.441339]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441339]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441339] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 41 5c 41 5d c9 c3 55 48 89 e5 e8 
[94889.441339] Call Trace:
[94889.441339]  [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d
[94889.441339]  [<ffffffff814aea64>] start_secondary+0x22b/0x271
[94889.441339] Pid: 0, comm: kworker/0:1 Tainted: G 

#2

Updated by Sage Weil over 13 years ago

  • Priority changed from High to Normal

#3

Updated by Sage Weil over 13 years ago

  • Status changed from New to Can't reproduce