Bug #471 (closed)
NULL pointer dereference __list_add+0x42/0x89 kick_requests+0x24/0x9e
Added by Sage Weil over 13 years ago. Updated over 13 years ago.
Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Description
On commit:0d328c1
[94880.387538] ceph: osd15 10.3.14.142:6800 socket closed
[94880.392791] INFO: trying to register non-static key.
[94880.396785] the code is fine but needs lockdep annotation.
[94880.396785] turning off the locking correctness validator.
[94880.396785] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61
[94880.396785] Call Trace:
[94880.396785] [<ffffffff8105dd50>] ? static_obj+0x43/0x53
[94880.396785] [<ffffffff8106234c>] __lock_acquire+0x852/0x87a
[94880.396785] [<ffffffff810623fc>] lock_acquire+0x88/0xa5
[94880.396785] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785] [<ffffffff814b5382>] down_read+0x47/0x8d
[94880.396785] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.396785] [<ffffffffa002bf1b>] osd_reset+0x40/0x8d [ceph]
[94880.396785] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.396785] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.396785] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.396785] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.396785] [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.396785] [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.396785] [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.396785] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.396785] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.396785] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.396785] [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.396785] [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.396785] [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.535663] BUG: unable to handle kernel NULL pointer dereference at (null)
[94880.539585] IP: [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] PGD 11d119067 PUD 11cbc0067 PMD 0
[94880.539585] Oops: 0000 [#1] PREEMPT SMP
[94880.539585] last sysfs file: /sys/kernel/uevent_seqnum
[94880.539585] CPU 0
[94880.539585] Modules linked in: ceph
[94880.539585]
[94880.539585] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61 PDSMi+/PDSMi
[94880.539585] RIP: 0010:[<ffffffff8126ffbd>] [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] RSP: 0018:ffff88011faddc40 EFLAGS: 00010046
[94880.539585] RAX: 0000000000000000 RBX: ffff88011e1be958 RCX: 0000000000000000
[94880.539585] RDX: ffff88011e1be958 RSI: 0000000000000000 RDI: ffff88011faddc90
[94880.539585] RBP: ffff88011faddc60 R08: 0000000000000000 R09: ffff88011faddc90
[94880.539585] R10: ffffffff81056048 R11: ffffffff81055a87 R12: 0000000000000000
[94880.539585] R13: ffff88011faddc90 R14: ffff88011fada280 R15: ffffffffa002be61
[94880.539585] FS: 0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[94880.539585] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[94880.539585] CR2: 0000000000000000 CR3: 000000011dcd4000 CR4: 00000000000006f0
[94880.539585] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[94880.539585] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[94880.539585] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280)
[94880.539585] Stack:
[94880.539585] ffff88011e1be910 00000000ffffffff ffff88011e1be910 0000000000000202
[94880.539585] <0> ffff88011faddce0 ffffffff814b479f ffffffffa002be61 0000000000000000
[94880.539585] <0> ffff88011faddce0 0000000000000246 ffff88011faddc90 ffff88011faddc90
[94880.539585] Call Trace:
[94880.539585] [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e
[94880.539585] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
[94880.539585] [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph]
[94880.539585] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph]
[94880.539585] [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]
[94880.539585] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph]
[94880.539585] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f
[94880.539585] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f
[94880.539585] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph]
[94880.539585] [<ffffffff8104e269>] worker_thread+0x147/0x22b
[94880.539585] [<ffffffff8104e122>] ? worker_thread+0x0/0x22b
[94880.539585] [<ffffffff81051a6d>] kthread+0x8d/0x95
[94880.539585] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[94880.539585] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8
[94880.539585] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8
[94880.539585] [<ffffffff814b6f00>] ? restore_args+0x0/0x30
[94880.539585] [<ffffffff810519e0>] ? kthread+0x0/0x95
[94880.539585] [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[94880.539585] Code: 8b 42 08 48 39 f0 74 23 49 89 d1 49 89 c0 48 89 f1 48 c7 c2 06 d6 64 81 be 1a 00 00 00 48 c7 c7 51 d5 64 81 31 c0 e8 fc a5 dc ff <49> 8b 04 24 48 39 d8 74 23 49 89 c0 4d 89 e1 48 89 d9 48 c7 c2
[94880.539585] RIP [<ffffffff8126ffbd>] __list_add+0x42/0x89
[94880.539585] RSP <ffff88011faddc40>
[94880.539585] CR2: 0000000000000000
[94880.539585] ---[ end trace 555371ce86832624 ]---
This was on ceph1.
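For triage, the frames of a trace like the one above can be pulled out mechanically. What follows is only a minimal Python sketch and not part of the original report: `FRAME_RE` and `parse_frames` are hypothetical names, and the regular expression assumes the trace format seen in this log (`[<address>] symbol+0xoff/0xsize [module]`, where a `?` marks a frame the unwinder could not confirm).

```python
import re

# One call-trace frame, e.g.
#   [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph]
# Assumption: 16-hex-digit addresses, as in the x86-64 log above.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per stack frame found in the given log text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("sym"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),          # None for core-kernel symbols
            "reliable": m.group("unreliable") is None,
        })
    return frames


# Four frames copied from the oops in this report.
sample = (
    "[94880.539585] [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e "
    "[94880.539585] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] "
    "[94880.539585] [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph] "
    "[94880.539585] [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph]"
)
frames = parse_frames(sample)

# Keep only confirmed frames that belong to the ceph module.
reliable_ceph = [f["symbol"] for f in frames
                 if f["reliable"] and f["module"] == "ceph"]
print(reliable_ceph)
```

Filtering on the `reliable` flag and the module name narrows the trace to the ceph functions that were actually on the call path, which in the sample are `kick_requests` and `osd_reset`.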
Updated by Sage Weil over 13 years ago
Here's the full dmesg, fwiw:
ceph1 login: [ 383.024127] ceph: version magic '2.6.36-rc3+ SMP preempt mod_unload ' should be '2.6.36-rc7+ SMP preempt mod_unload ' [ 425.208545] ceph: version magic '2.6.36-rc3+ SMP preempt mod_unload ' should be '2.6.36-rc7+ SMP preempt mod_unload ' [ 2280.788691] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5) [ 2284.412794] ceph: mon0 10.3.14.136:6789 connection failed [ 2294.439930] ceph: mon0 10.3.14.136:6789 connection failed [ 2304.454969] ceph: mon0 10.3.14.136:6789 connection failed [ 2314.470179] ceph: mon0 10.3.14.136:6789 connection failed [ 2324.485309] ceph: mon0 10.3.14.136:6789 connection failed [ 2334.500533] ceph: mon0 10.3.14.136:6789 connection failed [ 2361.694189] ceph: mon0 10.3.14.136:6789 connection failed [ 2371.713385] ceph: mon0 10.3.14.136:6789 connection failed [ 2381.728586] ceph: mon0 10.3.14.136:6789 connection failed [ 2391.743728] ceph: mon0 10.3.14.136:6789 connection failed [ 2401.758922] ceph: mon0 10.3.14.136:6789 connection failed [ 2402.321279] ceph: client4225 fsid e67f3214-bb55-4590-5bd2-92b0ea895f7e [ 2402.328651] ceph: mon0 10.3.14.154:6789 session established [ 2411.778001] ceph: mon0 10.3.14.136:6789 connection failed [ 2448.375277] ceph: osd20 10.3.14.147:6800 socket closed [ 2448.380917] ceph: osd20 10.3.14.147:6800 connection failed [ 2449.070804] ceph: osd20 10.3.14.147:6800 connection failed [ 2450.070688] ceph: osd20 10.3.14.147:6800 connection failed [ 2452.074453] ceph: osd20 10.3.14.147:6800 connection failed [ 2456.078101] ceph: osd20 10.3.14.147:6800 connection failed [ 2464.093460] ceph: osd20 10.3.14.147:6800 connection failed [ 2469.908987] ceph: osd20 down [ 2472.203746] ceph: get_reply unknown tid 328 from osd5 [ 2472.209032] ceph: get_reply unknown tid 497 from osd5 [ 2472.220570] ceph: get_reply unknown tid 1118 from osd15 [ 2472.226126] ceph: get_reply unknown tid 742 from osd15 [ 2472.231558] ceph: get_reply unknown tid 502 from osd16 [ 2472.237009] ceph: get_reply unknown tid 246 from 
osd3 [ 2472.363685] ceph: get_reply unknown tid 554 from osd13 [ 2472.369091] ceph: get_reply unknown tid 966 from osd13 [ 2472.374525] ceph: get_reply unknown tid 572 from osd13 [ 2472.380024] ceph: get_reply unknown tid 692 from osd18 [ 2472.385746] ceph: get_reply unknown tid 112 from osd8 [ 2472.391192] ceph: get_reply unknown tid 122 from osd8 [ 2472.396548] ceph: get_reply unknown tid 258 from osd8 [ 2472.402153] ceph: get_reply unknown tid 981 from osd2 [ 2472.407479] ceph: get_reply unknown tid 93 from osd2 [ 2472.412676] ceph: get_reply unknown tid 845 from osd2 [ 2472.418146] ceph: get_reply unknown tid 172 from osd17 [ 2472.423475] ceph: get_reply unknown tid 754 from osd13 [ 2472.428845] ceph: get_reply unknown tid 576 from osd22 [ 2472.434282] ceph: get_reply unknown tid 954 from osd6 [ 2472.440099] ceph: get_reply unknown tid 207 from osd24 [ 2472.445565] ceph: get_reply unknown tid 414 from osd24 [ 2472.450937] ceph: get_reply unknown tid 338 from osd24 [ 2472.456534] ceph: get_reply unknown tid 174 from osd24 [ 2472.461801] ceph: get_reply unknown tid 230 from osd24 [ 2472.467114] ceph: get_reply unknown tid 557 from osd24 [ 2472.472553] ceph: get_reply unknown tid 135 from osd14 [ 2472.477921] ceph: get_reply unknown tid 472 from osd14 [ 2472.483250] ceph: get_reply unknown tid 614 from osd24 [ 2472.492541] ceph: get_reply unknown tid 237 from osd24 [ 2472.539801] ceph: get_reply unknown tid 1095 from osd21 [ 2472.545333] ceph: get_reply unknown tid 400 from osd21 [ 2472.555431] ceph: get_reply unknown tid 684 from osd21 [ 2472.560810] ceph: get_reply unknown tid 87 from osd21 [ 2472.666698] ceph: get_reply unknown tid 77 from osd7 [ 2472.671852] ceph: get_reply unknown tid 403 from osd7 [ 2472.677085] ceph: get_reply unknown tid 810 from osd7 [ 2472.694503] ceph: get_reply unknown tid 1033 from osd7 [ 2472.702954] ceph: get_reply unknown tid 386 from osd7 [ 2472.946497] ceph: get_reply unknown tid 210 from osd11 [ 2472.952189] ceph: get_reply 
unknown tid 423 from osd11 [ 2472.957507] ceph: get_reply unknown tid 617 from osd11 [ 2473.054117] ceph: get_reply unknown tid 842 from osd7 [ 2473.059867] ceph: get_reply unknown tid 264 from osd17 [ 2473.065514] ceph: get_reply unknown tid 488 from osd14 [ 2473.071475] ceph: get_reply unknown tid 420 from osd24 [ 2473.078460] ceph: get_reply unknown tid 837 from osd15 [ 2473.085107] ceph: get_reply unknown tid 973 from osd22 [ 2473.095756] ceph: get_reply unknown tid 726 from osd17 [ 2473.102913] ceph: get_reply unknown tid 545 from osd24 [ 2473.108961] ceph: get_reply unknown tid 713 from osd22 [ 2473.114821] ceph: get_reply unknown tid 870 from osd15 [ 2473.121726] ceph: get_reply unknown tid 206 from osd17 [ 2473.128683] ceph: get_reply unknown tid 1023 from osd22 [ 2473.135645] ceph: get_reply unknown tid 368 from osd17 [ 2473.184317] ceph: get_reply unknown tid 885 from osd4 [ 2473.197399] ceph: get_reply unknown tid 805 from osd4 [ 2473.208782] ceph: get_reply unknown tid 907 from osd4 [ 2878.162458] ceph: osd20 up [ 2878.165250] ceph: osd20 weight 0x10000 (in) [ 3139.762356] ceph: osd2 10.3.14.129:6800 socket closed [ 3143.441306] ceph: osd2 10.3.14.129:6800 connection failed [ 3144.075370] ceph: wrong peer, want 10.3.14.129:6800/9093, got 0.0.0.0:6800/9845 [ 3144.082858] ceph: osd2 10.3.14.129:6800 wrong peer at address [ 3144.693520] ceph: osd3 10.3.14.130:6800 socket closed [ 3144.699061] ceph: osd3 10.3.14.130:6800 connection failed [ 3145.079102] ceph: osd3 10.3.14.130:6800 connection failed [ 3145.084848] ceph: wrong peer, want 10.3.14.129:6800/9093, got 0.0.0.0:6800/9845 [ 3145.092329] ceph: osd2 10.3.14.129:6800 wrong peer at address [ 3146.082907] ceph: osd3 10.3.14.130:6800 connection failed [ 3147.087104] ceph: wrong peer, want 10.3.14.129:6800/9093, got 10.3.14.129:6800/9845 [ 3147.094947] ceph: osd2 10.3.14.129:6800 wrong peer at address [ 3147.974977] ceph: osd2 down [ 3148.090653] ceph: osd3 10.3.14.130:6800 connection failed [ 3149.459133] 
ceph: osd4 10.3.14.131:6800 socket closed [ 3149.464967] ceph: osd4 10.3.14.131:6800 connection failed [ 3149.539980] ceph: osd2 up [ 3149.542668] ceph: osd2 weight 0x10000 (in) [ 3150.090476] ceph: osd4 10.3.14.131:6800 connection failed [ 3151.090384] ceph: osd4 10.3.14.131:6800 connection failed [ 3151.465000] ceph: mon0 10.3.14.154:6789 socket closed [ 3151.470174] ceph: mon0 10.3.14.154:6789 session lost, hunting for new mon [ 3151.477376] ceph: mon0 10.3.14.154:6789 connection failed [ 3152.106561] ceph: wrong peer, want 10.3.14.130:6800/22139, got 10.3.14.130:6800/22544 [ 3152.114597] ceph: osd3 10.3.14.130:6800 wrong peer at address [ 3153.106147] ceph: osd4 10.3.14.131:6800 connection failed [ 3153.662235] ceph: mon0 10.3.14.154:6789 connection failed [ 3154.591724] ceph: osd5 10.3.14.132:6800 socket closed [ 3154.597512] ceph: osd5 10.3.14.132:6800 connection failed [ 3155.106006] ceph: osd5 10.3.14.132:6800 connection failed [ 3156.105964] ceph: osd5 10.3.14.132:6800 connection failed [ 3156.132791] ceph: mds0 10.3.14.152:6800 socket closed [ 3157.109868] ceph: mds0 10.3.14.152:6800 connection failed [ 3157.118140] ceph: wrong peer, want 10.3.14.131:6800/22273, got 10.3.14.131:6800/22946 [ 3157.126149] ceph: osd4 10.3.14.131:6800 wrong peer at address [ 3158.113916] ceph: mds0 10.3.14.152:6800 connection failed [ 3158.119656] ceph: osd5 10.3.14.132:6800 connection failed [ 3159.395778] ceph: osd4 down [ 3159.446358] ceph: osd6 10.3.14.133:6800 socket closed [ 3159.452172] ceph: osd6 10.3.14.133:6800 connection failed [ 3160.117661] ceph: osd6 10.3.14.133:6800 connection failed [ 3160.123761] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373 [ 3160.131770] ceph: mds0 10.3.14.152:6800 wrong peer at address [ 3160.149945] ceph: wrong peer, want 10.3.14.130:6800/22139, got 10.3.14.130:6800/22544 [ 3160.157956] ceph: osd3 10.3.14.130:6800 wrong peer at address [ 3160.252671] ceph: osd4 up [ 3160.255365] ceph: osd4 weight 0x10000 (in) [ 
3161.129633] ceph: osd6 10.3.14.133:6800 connection failed [ 3161.741410] ceph: osd5 down [ 3162.137790] ceph: wrong peer, want 10.3.14.132:6800/22380, got 10.3.14.132:6800/22987 [ 3162.145802] ceph: osd5 10.3.14.132:6800 wrong peer at address [ 3163.137461] ceph: osd6 10.3.14.133:6800 connection failed [ 3163.243539] ceph: osd5 up [ 3163.246245] ceph: osd5 weight 0x10000 (in) [ 3163.711783] ceph: mon0 10.3.14.154:6789 session established [ 3164.145660] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373 [ 3164.153669] ceph: mds0 10.3.14.152:6800 wrong peer at address [ 3164.269168] ceph: osd7 10.3.14.134:6800 socket closed [ 3164.275381] ceph: osd7 10.3.14.134:6800 connection failed [ 3165.145329] ceph: osd7 10.3.14.134:6800 connection failed [ 3165.760649] ceph: get_reply unknown tid 18813 from osd17 [ 3165.766128] ceph: get_reply unknown tid 18813 from osd17 [ 3165.771665] ceph: get_reply unknown tid 18806 from osd12 [ 3165.777137] ceph: get_reply unknown tid 18806 from osd12 [ 3166.149228] ceph: osd7 10.3.14.134:6800 connection failed [ 3167.159620] ceph: wrong peer, want 10.3.14.133:6800/22342, got 10.3.14.133:6800/23019 [ 3167.167645] ceph: osd6 10.3.14.133:6800 wrong peer at address [ 3167.204915] ceph: osd3 down [ 3167.207795] ceph: osd6 down [ 3168.161133] ceph: osd7 10.3.14.134:6800 connection failed [ 3168.683339] ceph: osd6 up [ 3168.686037] ceph: osd6 weight 0x10000 (in) [ 3169.059210] ceph: get_reply unknown tid 18680 from osd5 [ 3169.147320] ceph: get_reply unknown tid 18936 from osd5 [ 3169.152699] ceph: get_reply unknown tid 18943 from osd5 [ 3169.311656] ceph: get_reply unknown tid 18948 from osd16 [ 3169.317334] ceph: get_reply unknown tid 18762 from osd16 [ 3169.322816] ceph: get_reply unknown tid 18771 from osd16 [ 3169.328338] ceph: get_reply unknown tid 18929 from osd13 [ 3169.334035] ceph: get_reply unknown tid 18715 from osd18 [ 3169.353842] ceph: get_reply unknown tid 18655 from osd12 [ 3169.464760] ceph: get_reply 
unknown tid 18704 from osd23 [ 3169.470336] ceph: get_reply unknown tid 18747 from osd23 [ 3169.475884] ceph: get_reply unknown tid 18731 from osd23 [ 3169.486778] ceph: get_reply unknown tid 18883 from osd24 [ 3169.492252] ceph: get_reply unknown tid 18950 from osd24 [ 3170.934659] ceph: osd8 10.3.14.135:6800 socket closed [ 3170.940851] ceph: osd8 10.3.14.135:6800 connection failed [ 3172.126115] ceph: get_reply unknown tid 18755 from osd15 [ 3172.131637] ceph: get_reply unknown tid 18755 from osd15 [ 3172.137092] ceph: get_reply unknown tid 18852 from osd17 [ 3172.142629] ceph: get_reply unknown tid 18784 from osd21 [ 3172.188909] ceph: osd8 10.3.14.135:6800 connection failed [ 3172.194959] ceph: wrong peer, want 10.3.14.134:6800/22217, got 10.3.14.134:6800/22857 [ 3172.202998] ceph: osd7 10.3.14.134:6800 wrong peer at address [ 3172.217173] ceph: wrong peer, want 10.3.14.152:6800/22246, got 10.3.14.152:6800/22373 [ 3172.225192] ceph: mds0 10.3.14.152:6800 wrong peer at address [ 3172.388796] ceph: get_reply unknown tid 18854 from osd4 [ 3172.403122] ceph: get_reply unknown tid 18826 from osd20 [ 3172.408576] ceph: get_reply unknown tid 18826 from osd20 [ 3172.442808] ceph: get_reply unknown tid 18632 from osd11 [ 3172.481150] ceph: get_reply unknown tid 18836 from osd24 [ 3172.486602] ceph: get_reply unknown tid 18836 from osd24 [ 3172.594118] ceph: get_reply unknown tid 18820 from osd16 [ 3172.599570] ceph: get_reply unknown tid 18820 from osd16 [ 3173.204788] ceph: osd8 10.3.14.135:6800 connection failed [ 3173.579857] ceph: get_reply unknown tid 18852 from osd17 [ 3173.587667] ceph: get_reply unknown tid 18784 from osd21 [ 3173.711231] ceph: get_reply unknown tid 18913 from osd4 [ 3173.814078] ceph: get_reply unknown tid 18960 from osd4 [ 3173.960048] ceph: osd7 down [ 3174.892639] ceph: get_reply unknown tid 19009 from osd20 [ 3175.034468] ceph: osd7 up [ 3175.037171] ceph: osd7 weight 0x10000 (in) [ 3175.220630] ceph: osd8 10.3.14.135:6800 connection 
failed [ 3176.042523] ceph: get_reply unknown tid 19061 from osd5 [ 3176.047996] ceph: get_reply unknown tid 18971 from osd19 [ 3176.870798] ceph: get_reply unknown tid 18645 from osd15 [ 3176.876274] ceph: get_reply unknown tid 18971 from osd19 [ 3177.052611] ceph: get_reply unknown tid 18481 from osd12 [ 3177.058111] ceph: get_reply unknown tid 18481 from osd12 [ 3177.063658] ceph: get_reply unknown tid 18891 from osd13 [ 3177.069208] ceph: get_reply unknown tid 18934 from osd12 [ 3177.103823] ceph: get_reply unknown tid 18508 from osd18 [ 3177.109293] ceph: get_reply unknown tid 18508 from osd18 [ 3177.121698] ceph: get_reply unknown tid 18973 from osd18 [ 3177.127164] ceph: get_reply unknown tid 18973 from osd18 [ 3177.170419] ceph: get_reply unknown tid 18505 from osd16 [ 3177.175902] ceph: get_reply unknown tid 18505 from osd16 [ 3177.181428] ceph: get_reply unknown tid 18523 from osd17 [ 3177.201721] ceph: get_reply unknown tid 18558 from osd21 [ 3177.207206] ceph: get_reply unknown tid 18558 from osd21 [ 3177.212716] ceph: get_reply unknown tid 18639 from osd18 [ 3177.218297] ceph: get_reply unknown tid 18593 from osd20 [ 3177.223753] ceph: get_reply unknown tid 18668 from osd21 [ 3177.288330] ceph: osd11 10.3.14.138:6800 socket closed [ 3177.294879] ceph: osd11 10.3.14.138:6800 connection failed [ 3177.374648] ceph: get_reply unknown tid 18738 from osd19 [ 3177.396960] ceph: get_reply unknown tid 18565 from osd23 [ 3177.402435] ceph: get_reply unknown tid 18565 from osd23 [ 3177.444838] ceph: get_reply unknown tid 18934 from osd12 [ 3177.810746] ceph: get_reply unknown tid 18909 from osd23 [ 3177.816276] ceph: get_reply unknown tid 18909 from osd23 [ 3177.821936] ceph: get_reply unknown tid 19071 from osd18 [ 3177.949432] ceph: get_reply unknown tid 18891 from osd13 [ 3177.955125] ceph: get_reply unknown tid 18988 from osd13 [ 3177.960637] ceph: get_reply unknown tid 19033 from osd13 [ 3178.256431] ceph: osd11 10.3.14.138:6800 connection failed [ 
3178.592003] ceph: get_reply unknown tid 18593 from osd20 [ 3178.628515] ceph: osd8 down [ 3178.633779] ceph: get_reply unknown tid 18523 from osd17 [ 3178.668495] ceph: get_reply unknown tid 19034 from osd18 [ 3179.260371] ceph: osd11 10.3.14.138:6800 connection failed [ 3179.268642] ceph: wrong peer, want 10.3.14.135:6800/22185, got 10.3.14.135:6800/22826 [ 3179.276651] ceph: osd8 10.3.14.135:6800 wrong peer at address [ 3180.164656] ceph: osd8 up [ 3180.167328] ceph: osd8 weight 0x10000 (in) [ 3181.268554] ceph: osd11 10.3.14.138:6800 connection failed [ 3182.527585] ceph: osd12 10.3.14.139:6800 socket closed [ 3182.533880] ceph: osd12 10.3.14.139:6800 connection failed [ 3183.268110] ceph: osd12 10.3.14.139:6800 connection failed [ 3183.604580] ceph: get_reply unknown tid 18905 from osd13 [ 3183.610049] ceph: get_reply unknown tid 18905 from osd13 [ 3184.268039] ceph: osd12 10.3.14.139:6800 connection failed [ 3185.276261] ceph: wrong peer, want 10.3.14.138:6800/22348, got 10.3.14.138:6800/22963 [ 3185.284264] ceph: osd11 10.3.14.138:6800 wrong peer at address [ 3186.014700] ceph: get_reply unknown tid 19145 from osd16 [ 3186.020154] ceph: get_reply unknown tid 19145 from osd16 [ 3186.164913] ceph: get_reply unknown tid 19023 from osd20 [ 3186.170392] ceph: get_reply unknown tid 19023 from osd20 [ 3186.175872] ceph: get_reply unknown tid 19004 from osd17 [ 3186.233453] ceph: get_reply unknown tid 18878 from osd15 [ 3186.238924] ceph: get_reply unknown tid 18878 from osd15 [ 3186.248879] ceph: get_reply unknown tid 19035 from osd15 [ 3186.254333] ceph: get_reply unknown tid 19035 from osd15 [ 3186.266996] ceph: get_reply unknown tid 19120 from osd24 [ 3186.272472] ceph: get_reply unknown tid 19120 from osd24 [ 3186.295848] ceph: osd12 10.3.14.139:6800 connection failed [ 3186.344478] ceph: get_reply unknown tid 18892 from osd24 [ 3186.349942] ceph: get_reply unknown tid 18892 from osd24 [ 3186.397219] ceph: osd11 down [ 3187.818272] ceph: get_reply unknown tid 
18552 from osd20 [ 3187.823743] ceph: get_reply unknown tid 18552 from osd20 [ 3187.853158] ceph: osd13 10.3.14.140:6800 socket closed [ 3187.859582] ceph: osd13 10.3.14.140:6800 connection failed [ 3187.907963] ceph: osd11 up [ 3187.910733] ceph: osd11 weight 0x10000 (in) [ 3188.235067] ceph: get_reply unknown tid 18682 from osd20 [ 3188.299712] ceph: osd13 10.3.14.140:6800 connection failed [ 3188.680729] ceph: get_reply unknown tid 19004 from osd17 [ 3189.299687] ceph: osd13 10.3.14.140:6800 connection failed [ 3189.633003] ceph: get_reply unknown tid 19085 from osd14 [ 3189.638462] ceph: get_reply unknown tid 19085 from osd14 [ 3189.857022] ceph: osd12 down [ 3190.311906] ceph: wrong peer, want 10.3.14.139:6800/22593, got 10.3.14.139:6800/22982 [ 3190.319917] ceph: osd12 10.3.14.139:6800 wrong peer at address [ 3190.650036] ceph: get_reply unknown tid 19253 from osd16 [ 3190.777185] ceph: get_reply unknown tid 19248 from osd20 [ 3191.163470] ceph: get_reply unknown tid 19242 from osd22 [ 3191.178384] ceph: get_reply unknown tid 19219 from osd15 [ 3191.183839] ceph: get_reply unknown tid 19219 from osd15 [ 3191.299295] ceph: get_reply unknown tid 19089 from osd24 [ 3191.304770] ceph: get_reply unknown tid 19089 from osd24 [ 3191.323505] ceph: osd13 10.3.14.140:6800 connection failed [ 3191.334754] ceph: osd12 up [ 3191.337543] ceph: osd12 weight 0x10000 (in) [ 3192.935749] ceph: get_reply unknown tid 19358 from osd7 [ 3193.481872] ceph: get_reply unknown tid 19234 from osd6 [ 3193.487276] ceph: get_reply unknown tid 19234 from osd6 [ 3193.492692] ceph: get_reply unknown tid 19110 from osd21 [ 3193.498146] ceph: osd14 10.3.14.141:6800 socket closed [ 3193.504446] ceph: osd14 10.3.14.141:6800 connection failed [ 3193.718647] ceph: get_reply unknown tid 19110 from osd21 [ 3193.724231] ceph: get_reply unknown tid 19123 from osd21 [ 3193.763422] ceph: get_reply unknown tid 19123 from osd21 [ 3193.918224] ceph: get_reply unknown tid 18983 from osd6 [ 3193.923600] 
ceph: get_reply unknown tid 18983 from osd6 [ 3194.335305] ceph: osd14 10.3.14.141:6800 connection failed [ 3195.335207] ceph: osd14 10.3.14.141:6800 connection failed [ 3195.343528] ceph: wrong peer, want 10.3.14.140:6800/22385, got 10.3.14.140:6800/23252 [ 3195.351537] ceph: osd13 10.3.14.140:6800 wrong peer at address [ 3195.379394] ceph: get_reply unknown tid 19129 from osd8 [ 3196.587484] ceph: osd13 down [ 3197.345923] ceph: wrong peer, want 10.3.14.141:6800/22326, got 0.0.0.0:6800/22949 [ 3197.353589] ceph: osd14 10.3.14.141:6800 wrong peer at address [ 3197.873330] ceph: get_reply unknown tid 19297 from osd20 [ 3197.878804] ceph: get_reply unknown tid 19297 from osd20 [ 3197.937427] ceph: get_reply unknown tid 19338 from osd6 [ 3197.942813] ceph: get_reply unknown tid 19338 from osd6 [ 3198.067979] ceph: osd13 up [ 3198.070737] ceph: osd13 weight 0x10000 (in) [ 3198.384875] ceph: osd15 10.3.14.142:6800 socket closed [ 3198.391034] ceph: osd15 10.3.14.142:6800 connection failed [ 3198.470045] ceph: get_reply unknown tid 19368 from osd20 [ 3198.475516] ceph: get_reply unknown tid 19368 from osd20 [ 3198.518417] ceph: get_reply unknown tid 19237 from osd7 [ 3198.523796] ceph: get_reply unknown tid 19237 from osd7 [ 3198.580715] ceph: get_reply unknown tid 19310 from osd24 [ 3198.586163] ceph: get_reply unknown tid 19310 from osd24 [ 3198.826275] ceph: get_reply unknown tid 19337 from osd6 [ 3198.831652] ceph: get_reply unknown tid 19337 from osd6 [ 3199.291090] ceph: get_reply unknown tid 19129 from osd8 [ 3199.374947] ceph: osd15 10.3.14.142:6800 connection failed [ 3199.606810] ceph: osd14 down [ 3200.374872] ceph: osd15 10.3.14.142:6800 connection failed [ 3201.132985] ceph: osd14 up [ 3201.135767] ceph: osd14 weight 0x10000 (in) [ 3201.714212] ceph: get_reply unknown tid 19359 from osd12 [ 3202.027017] ceph: get_reply unknown tid 18793 from osd4 [ 3202.032391] ceph: get_reply unknown tid 18793 from osd4 [ 3202.496914] ceph: wrong peer, want 
10.3.14.142:6800/22384, got 0.0.0.0:6800/22993 [ 3202.504581] ceph: osd15 10.3.14.142:6800 wrong peer at address [ 3203.211860] ceph: osd16 10.3.14.143:6800 socket closed [ 3203.217918] ceph: osd16 10.3.14.143:6800 connection failed [ 3203.646578] ceph: get_reply unknown tid 19170 from osd4 [ 3203.651954] ceph: get_reply unknown tid 19170 from osd4 [ 3204.390681] ceph: osd16 10.3.14.143:6800 connection failed [ 3205.386538] ceph: osd16 10.3.14.143:6800 connection failed [ 3205.399425] ceph: get_reply unknown tid 19359 from osd12 [ 3205.692298] ceph: get_reply unknown tid 19152 from osd8 [ 3205.697666] ceph: get_reply unknown tid 19152 from osd8 [ 3205.734542] ceph: get_reply unknown tid 19274 from osd8 [ 3205.823650] ceph: get_reply unknown tid 19274 from osd8 [ 3205.829131] ceph: get_reply unknown tid 19430 from osd8 [ 3205.834514] ceph: osd15 down [ 3206.205480] ceph: get_reply unknown tid 18954 from osd6 [ 3206.210859] ceph: get_reply unknown tid 18954 from osd6 [ 3206.301775] ceph: get_reply unknown tid 19350 from osd24 [ 3206.307248] ceph: get_reply unknown tid 19350 from osd24 [ 3206.318953] ceph: get_reply unknown tid 19100 from osd23 [ 3206.324417] ceph: get_reply unknown tid 19100 from osd23 [ 3206.410734] ceph: wrong peer, want 10.3.14.142:6800/22384, got 10.3.14.142:6800/22993 [ 3206.418755] ceph: osd15 10.3.14.142:6800 wrong peer at address [ 3206.511693] ceph: get_reply unknown tid 19147 from osd22 [ 3206.517163] ceph: get_reply unknown tid 19147 from osd22 [ 3206.534153] ceph: get_reply unknown tid 19161 from osd22 [ 3206.539626] ceph: get_reply unknown tid 19161 from osd22 [ 3207.418339] ceph: osd16 10.3.14.143:6800 connection failed [ 3207.460209] ceph: osd15 up [ 3207.462987] ceph: osd15 weight 0x10000 (in) [ 3208.327275] ceph: get_reply unknown tid 19326 from osd12 [ 3208.332745] ceph: get_reply unknown tid 19326 from osd12 [ 3208.634776] ceph: osd17 10.3.14.144:6800 socket closed [ 3208.759689] ceph: get_reply unknown tid 19430 from osd8 [ 
3209.751667] ceph: osd16 down [ 3211.430451] ceph: wrong peer, want 10.3.14.143:6800/22467, got 10.3.14.143:6800/25642 [ 3211.438461] ceph: osd16 10.3.14.143:6800 wrong peer at address [ 3212.112650] ceph: osd16 up [ 3212.115438] ceph: osd16 weight 0x10000 (in) [ 3212.263684] ceph: get_reply unknown tid 19250 from osd11 [ 3212.269138] ceph: get_reply unknown tid 19250 from osd11 [ 3212.280207] ceph: get_reply unknown tid 19007 from osd11 [ 3212.287600] ceph: get_reply unknown tid 19188 from osd11 [ 3212.295288] ceph: get_reply unknown tid 19294 from osd11 [ 3212.300745] ceph: get_reply unknown tid 19007 from osd11 [ 3212.306204] ceph: get_reply unknown tid 19188 from osd11 [ 3212.311654] ceph: get_reply unknown tid 19294 from osd11 [ 3212.484279] ceph: get_reply unknown tid 19197 from osd11 [ 3212.489742] ceph: get_reply unknown tid 19436 from osd11 [ 3212.987531] ceph: get_reply unknown tid 18556 from osd2 [ 3213.005805] ceph: get_reply unknown tid 18628 from osd2 [ 3213.036407] ceph: get_reply unknown tid 18823 from osd2 [ 3213.051125] ceph: get_reply unknown tid 18877 from osd2 [ 3213.056493] ceph: get_reply unknown tid 18877 from osd2 [ 3213.067449] ceph: get_reply unknown tid 18967 from osd2 [ 3213.073694] ceph: get_reply unknown tid 19243 from osd2 [ 3213.079883] ceph: get_reply unknown tid 18556 from osd2 [ 3213.088118] ceph: get_reply unknown tid 18628 from osd2 [ 3213.093493] ceph: get_reply unknown tid 18823 from osd2 [ 3213.098850] ceph: get_reply unknown tid 18967 from osd2 [ 3213.107295] ceph: get_reply unknown tid 19243 from osd2 [ 3213.112667] ceph: get_reply unknown tid 18701 from osd2 [ 3213.118044] ceph: get_reply unknown tid 19251 from osd2 [ 3213.124113] ceph: get_reply unknown tid 18701 from osd2 [ 3213.129489] ceph: get_reply unknown tid 19251 from osd2 [ 3213.134877] ceph: get_reply unknown tid 19241 from osd2 [ 3213.140260] ceph: get_reply unknown tid 19241 from osd2 [ 3213.474689] ceph: mds0 caps stale [ 3213.654256] ceph: get_reply unknown 
tid 19436 from osd11 [ 3213.921952] ceph: get_reply unknown tid 18599 from osd2 [ 3214.186854] ceph: osd18 10.3.14.145:6800 socket closed [ 3214.193756] ceph: get_reply unknown tid 19318 from osd11 [ 3214.473601] ceph: get_reply unknown tid 18727 from osd2 [ 3214.480941] ceph: get_reply unknown tid 18834 from osd2 [ 3214.486642] ceph: get_reply unknown tid 18910 from osd2 [ 3214.492000] ceph: get_reply unknown tid 18599 from osd2 [ 3214.544378] ceph: get_reply unknown tid 18727 from osd2 [ 3214.552980] ceph: get_reply unknown tid 18834 from osd2 [ 3214.558347] ceph: get_reply unknown tid 18910 from osd2 [ 3214.729093] ceph: get_reply unknown tid 18995 from osd2 [ 3214.734479] ceph: get_reply unknown tid 18995 from osd2 [ 3214.752705] ceph: get_reply unknown tid 19182 from osd11 [ 3214.758182] ceph: get_reply unknown tid 19182 from osd11 [ 3215.013843] ceph: get_reply unknown tid 18964 from osd7 [ 3215.139540] ceph: get_reply unknown tid 19197 from osd11 [ 3215.662903] ceph: get_reply unknown tid 19376 from osd4 [ 3215.672091] ceph: get_reply unknown tid 19376 from osd4 [ 3215.820548] ceph: get_reply unknown tid 19264 from osd4 [ 3215.825924] ceph: get_reply unknown tid 19264 from osd4 [ 3215.929423] ceph: get_reply unknown tid 19318 from osd11 [ 3216.482094] ceph: get_reply unknown tid 19288 from osd2 [ 3216.813458] ceph: get_reply unknown tid 18964 from osd7 [ 3216.820804] ceph: get_reply unknown tid 19358 from osd7 [ 3216.841540] ceph: get_reply unknown tid 19288 from osd2 [ 3217.921835] ceph: osd17 down [ 3218.970746] ceph: osd17 up [ 3218.973535] ceph: osd17 weight 0x10000 (in) [ 3219.743914] ceph: osd19 10.3.14.146:6800 socket closed [ 3223.978633] ceph: osd18 down [ 3223.981590] ceph: osd18 up [ 3223.984357] ceph: osd18 weight 0x10000 (in) [ 3224.450136] ceph: osd20 10.3.14.147:6800 socket closed [ 3225.195201] ceph: get_reply unknown tid 19180 from osd2 [ 3225.200585] ceph: get_reply unknown tid 19180 from osd2 [ 3230.212618] ceph: osd21 10.3.14.148:6800 
socket closed [ 3234.997108] ceph: osd22 10.3.14.149:6800 socket closed [ 3240.179060] ceph: osd23 10.3.14.150:6800 socket closed [ 3245.204815] ceph: osd24 10.3.14.151:6800 socket closed [ 3259.240231] ceph: mds0 reconnect start [ 3259.460371] ceph: mds0 reconnect success [ 3297.133592] ceph: mds0 recovery completed [ 3313.514341] ceph: mds0 caps stale [ 3327.905442] ceph: mds0 caps went stale, renewing [ 3398.532563] ceph: mds0 caps renewed [ 3428.601883] ceph: wrong peer, want 10.3.14.149:6800/2179, got 10.3.14.149:6800/2993 [ 3428.609742] ceph: osd22 10.3.14.149:6800 wrong peer at address [ 3428.818030] ceph: osd19 down [ 3428.820990] ceph: osd19 up [ 3428.823786] ceph: osd19 weight 0x10000 (in) [ 3428.828085] ceph: osd20 down [ 3428.831044] ceph: osd20 up [ 3428.833836] ceph: osd20 weight 0x10000 (in) [ 3428.838110] ceph: osd21 down [ 3428.841049] ceph: osd21 up [ 3428.843827] ceph: osd21 weight 0x10000 (in) [ 3428.848103] ceph: osd22 down [ 3428.851055] ceph: osd22 up [ 3428.853838] ceph: osd22 weight 0x10000 (in) [ 3428.858124] ceph: osd23 down [ 3428.861069] ceph: osd23 up [ 3428.863849] ceph: osd23 weight 0x10000 (in) [ 3428.868131] ceph: osd24 down [ 3428.871076] ceph: osd24 up [ 3428.873865] ceph: osd24 weight 0x10000 (in) [ 3428.878133] ceph: osd16 down [ 3428.881093] ceph: osd15 down [22462.944208] ceph: osd2 down [22467.355585] ceph: osd2 up [22467.358270] ceph: osd2 weight 0x10000 (in) [22472.809066] ceph: osd3 up [22472.811765] ceph: osd3 weight 0x10000 (in) [22474.629180] ceph: osd3 10.3.14.130:6800 socket closed [22474.635688] ceph: osd3 10.3.14.130:6800 connection failed [22474.642481] ceph: osd4 down [22475.825502] ceph: osd3 10.3.14.130:6800 connection failed [22475.950340] ceph: osd4 up [22475.953012] ceph: osd4 weight 0x10000 (in) [22476.821392] ceph: osd3 10.3.14.130:6800 connection failed [22478.825205] ceph: osd3 10.3.14.130:6800 connection failed [22479.094782] ceph: osd5 down [22482.828881] ceph: osd3 10.3.14.130:6800 connection failed 
[22482.960255] ceph: osd5 up [22482.962966] ceph: osd5 weight 0x10000 (in) [22487.389774] ceph: osd6 down [22490.844174] ceph: osd3 10.3.14.130:6800 connection failed [22490.851103] ceph: osd6 up [22490.853805] ceph: osd6 weight 0x10000 (in) [22492.411907] ceph: osd7 down [22497.409999] ceph: osd7 up [22497.412694] ceph: osd7 weight 0x10000 (in) [22497.416882] ceph: osd3 down [22498.322295] ceph: osd8 down [22502.537012] ceph: osd8 up [22502.539709] ceph: osd8 weight 0x10000 (in) [22512.427722] ceph: osd11 down [22512.430681] ceph: osd11 up [22512.433448] ceph: osd11 weight 0x10000 (in) [22517.543154] ceph: osd12 down [22517.546109] ceph: osd12 up [22517.548883] ceph: osd12 weight 0x10000 (in) [22522.781585] ceph: osd13 down [22522.784519] ceph: osd13 up [22522.787302] ceph: osd13 weight 0x10000 (in) [22527.827674] ceph: osd14 down [22527.830641] ceph: osd14 up [22527.833441] ceph: osd14 weight 0x10000 (in) [22532.908112] ceph: osd15 up [22532.910899] ceph: osd15 weight 0x10000 (in) [22532.915174] ceph: osd16 up [22532.917952] ceph: osd16 weight 0x10000 (in) [22537.864994] ceph: osd17 down [22539.903889] ceph: osd17 up [22539.906647] ceph: osd17 weight 0x10000 (in) [22547.731344] ceph: osd18 down [22547.734279] ceph: osd18 up [22547.737061] ceph: osd18 weight 0x10000 (in) [22552.488251] ceph: osd19 down [22552.491200] ceph: osd19 up [22552.493967] ceph: osd19 weight 0x10000 (in) [22557.495706] ceph: osd20 down [22562.503311] ceph: osd20 up [22562.506094] ceph: osd20 weight 0x10000 (in) [22567.510964] ceph: osd21 down [22567.513926] ceph: osd21 up [22567.516693] ceph: osd21 weight 0x10000 (in) [22572.533545] ceph: osd22 down [22572.536490] ceph: osd22 up [22572.539273] ceph: osd22 weight 0x10000 (in) [22577.590767] ceph: osd23 down [22577.593722] ceph: osd23 up [22577.596504] ceph: osd23 weight 0x10000 (in) [22578.422528] ceph: mds0 10.3.14.152:6800 socket closed [22581.828176] ceph: mds0 10.3.14.152:6800 connection failed [22582.533526] ceph: osd24 down 
[22585.000644] ceph: mon0 10.3.14.154:6789 socket closed [22585.005794] ceph: mon0 10.3.14.154:6789 session lost, hunting for new mon [22585.012926] ceph: mon0 10.3.14.154:6789 connection failed [22587.602002] ceph: osd24 up [22587.604807] ceph: osd24 weight 0x10000 (in) [22587.609669] ceph: mon0 10.3.14.154:6789 session established [22597.558647] ceph: tid 19573 timed out on osd15, will reset osd [22597.571475] ceph: tid 19679 timed out on osd16, will reset osd [22631.827495] ceph: mds0 caps stale [22651.825717] ceph: mds0 caps stale [22657.665302] ceph: tid 19573 timed out on osd15, will reset osd [22657.672042] ceph: tid 19679 timed out on osd16, will reset osd [22717.771955] ceph: tid 19573 timed out on osd15, will reset osd [22717.778698] ceph: tid 19679 timed out on osd16, will reset osd [22777.878601] ceph: tid 19573 timed out on osd15, will reset osd [22777.885334] ceph: tid 19679 timed out on osd16, will reset osd [22837.985243] ceph: tid 19573 timed out on osd15, will reset osd [22837.991975] ceph: tid 19679 timed out on osd16, will reset osd [22898.091879] ceph: tid 19573 timed out on osd15, will reset osd [22898.098617] ceph: tid 19679 timed out on osd16, will reset osd [22958.198662] ceph: tid 19573 timed out on osd15, will reset osd [22958.205408] ceph: tid 19679 timed out on osd16, will reset osd [23018.305431] ceph: tid 19573 timed out on osd15, will reset osd [23018.312162] ceph: tid 19679 timed out on osd16, will reset osd [23078.412188] ceph: tid 19573 timed out on osd15, will reset osd [23078.418931] ceph: tid 19679 timed out on osd16, will reset osd [23138.518932] ceph: tid 19573 timed out on osd15, will reset osd [23138.525663] ceph: tid 19679 timed out on osd16, will reset osd [23198.625666] ceph: tid 19573 timed out on osd15, will reset osd [23198.632409] ceph: tid 19679 timed out on osd16, will reset osd [23258.732388] ceph: tid 19573 timed out on osd15, will reset osd [23258.739127] ceph: tid 19679 timed out on osd16, will reset osd 
[23318.839100] ceph: tid 19573 timed out on osd15, will reset osd [23318.845842] ceph: tid 19679 timed out on osd16, will reset osd [23378.945802] ceph: tid 19573 timed out on osd15, will reset osd [23378.952551] ceph: tid 19679 timed out on osd16, will reset osd [23439.052496] ceph: tid 19573 timed out on osd15, will reset osd [23439.059231] ceph: tid 19679 timed out on osd16, will reset osd [23499.159182] ceph: tid 19573 timed out on osd15, will reset osd [23499.165945] ceph: tid 19679 timed out on osd16, will reset osd [23559.265858] ceph: tid 19573 timed out on osd15, will reset osd [23559.272615] ceph: tid 19679 timed out on osd16, will reset osd [23619.372528] ceph: tid 19573 timed out on osd15, will reset osd [23619.379278] ceph: tid 19679 timed out on osd16, will reset osd [23679.479191] ceph: tid 19573 timed out on osd15, will reset osd [23679.485937] ceph: tid 19679 timed out on osd16, will reset osd [23739.585846] ceph: tid 19573 timed out on osd15, will reset osd [23739.592580] ceph: tid 19679 timed out on osd16, will reset osd [23799.692495] ceph: tid 19573 timed out on osd15, will reset osd [23799.699243] ceph: tid 19679 timed out on osd16, will reset osd [23859.799138] ceph: tid 19573 timed out on osd15, will reset osd [23859.805879] ceph: tid 19679 timed out on osd16, will reset osd [23919.905776] ceph: tid 19573 timed out on osd15, will reset osd [23919.912520] ceph: tid 19679 timed out on osd16, will reset osd [23980.012408] ceph: tid 19573 timed out on osd15, will reset osd [23980.019149] ceph: tid 19679 timed out on osd16, will reset osd [24040.119035] ceph: tid 19573 timed out on osd15, will reset osd [24040.125787] ceph: tid 19679 timed out on osd16, will reset osd [24100.225658] ceph: tid 19573 timed out on osd15, will reset osd [24100.232396] ceph: tid 19679 timed out on osd16, will reset osd [24117.435968] ceph: osd3 up [24117.438659] ceph: osd3 weight 0x10000 (in) [24123.596590] ceph: osd16 10.3.14.143:6800 socket closed [24123.602694] 
ceph: osd16 10.3.14.143:6800 connection failed [24124.687577] ceph: osd16 10.3.14.143:6800 connection failed [24125.511896] ceph: get_reply unknown tid 19573 from osd15 [24125.567312] ceph: get_reply unknown tid 20215 from osd15 [24125.572780] ceph: get_reply unknown tid 20525 from osd15 [24125.578255] ceph: get_reply unknown tid 21115 from osd15 [24125.583729] ceph: get_reply unknown tid 21202 from osd15 [24125.687410] ceph: osd16 10.3.14.143:6800 connection failed [24126.061360] ceph: get_reply unknown tid 22277 from osd15 [24126.066825] ceph: get_reply unknown tid 25526 from osd15 [24126.103519] ceph: get_reply unknown tid 26033 from osd15 [24126.557363] ceph: get_reply unknown tid 29849 from osd15 [24126.562874] ceph: get_reply unknown tid 30430 from osd15 [24126.568895] ceph: get_reply unknown tid 19912 from osd15 [24126.574385] ceph: get_reply unknown tid 22174 from osd15 [24126.588263] ceph: get_reply unknown tid 23563 from osd15 [24126.593727] ceph: get_reply unknown tid 23869 from osd15 [24126.599190] ceph: get_reply unknown tid 24355 from osd15 [24126.604672] ceph: get_reply unknown tid 25629 from osd15 [24126.610123] ceph: get_reply unknown tid 26767 from osd15 [24126.615578] ceph: get_reply unknown tid 30189 from osd15 [24126.621042] ceph: get_reply unknown tid 30403 from osd15 [24126.880924] ceph: osd3 10.3.14.130:6800 socket closed [24126.886867] ceph: osd3 10.3.14.130:6800 connection failed [24127.707243] ceph: osd3 10.3.14.130:6800 connection failed [24127.712982] ceph: osd16 10.3.14.143:6800 connection failed [24128.707175] ceph: osd3 10.3.14.130:6800 connection failed [24130.315479] ceph: get_reply unknown tid 22795 from osd15 [24130.710984] ceph: osd3 10.3.14.130:6800 connection failed [24131.714910] ceph: osd16 10.3.14.143:6800 connection failed [24134.714608] ceph: osd3 10.3.14.130:6800 connection failed [24139.722207] ceph: osd16 10.3.14.143:6800 connection failed [24142.729893] ceph: osd3 10.3.14.130:6800 connection failed [24143.686577] 
ceph: osd16 down [24149.557765] ceph: get_reply unknown tid 23425 from osd15 [24149.563225] ceph: get_reply unknown tid 26224 from osd15 [24149.571403] ceph: get_reply unknown tid 26698 from osd15 [24149.576903] ceph: get_reply unknown tid 27167 from osd15 [24149.582416] ceph: get_reply unknown tid 27252 from osd15 [24149.587908] ceph: get_reply unknown tid 28795 from osd15 [24149.593428] ceph: get_reply unknown tid 30270 from osd15 [24149.598936] ceph: get_reply unknown tid 30460 from osd15 [24149.604435] ceph: get_reply unknown tid 30765 from osd15 [24149.610041] ceph: osd3 down [24210.447788] ceph: tid 19910 timed out on osd15, will reset osd [24270.538407] ceph: tid 19910 timed out on osd15, will reset osd [24330.629022] ceph: tid 19910 timed out on osd15, will reset osd [24390.719637] ceph: tid 19910 timed out on osd15, will reset osd [24450.810249] ceph: tid 19910 timed out on osd15, will reset osd [24510.900860] ceph: tid 19910 timed out on osd15, will reset osd [24570.991468] ceph: tid 19910 timed out on osd15, will reset osd [24631.082076] ceph: tid 19910 timed out on osd15, will reset osd [24691.172691] ceph: tid 19910 timed out on osd15, will reset osd [24751.263286] ceph: tid 19910 timed out on osd15, will reset osd [24811.353888] ceph: tid 19910 timed out on osd15, will reset osd [24871.444488] ceph: tid 19910 timed out on osd15, will reset osd [24931.535087] ceph: tid 19910 timed out on osd15, will reset osd [24991.625685] ceph: tid 19910 timed out on osd15, will reset osd [25051.716281] ceph: tid 19910 timed out on osd15, will reset osd [25111.806876] ceph: tid 19910 timed out on osd15, will reset osd [25171.897469] ceph: tid 19910 timed out on osd15, will reset osd [25231.988061] ceph: tid 19910 timed out on osd15, will reset osd [25292.078653] ceph: tid 19910 timed out on osd15, will reset osd [25352.169242] ceph: tid 19910 timed out on osd15, will reset osd [25412.259831] ceph: tid 19910 timed out on osd15, will reset osd [25472.350418] ceph: tid 
19910 timed out on osd15, will reset osd [25532.441004] ceph: tid 19910 timed out on osd15, will reset osd [25592.531590] ceph: tid 19910 timed out on osd15, will reset osd [25652.622174] ceph: tid 19910 timed out on osd15, will reset osd [25712.712756] ceph: tid 19910 timed out on osd15, will reset osd [25772.803386] ceph: tid 19910 timed out on osd15, will reset osd [25832.894017] ceph: tid 19910 timed out on osd15, will reset osd [25892.984646] ceph: tid 19910 timed out on osd15, will reset osd [25953.075273] ceph: tid 19910 timed out on osd15, will reset osd [26013.165897] ceph: tid 19910 timed out on osd15, will reset osd [26073.256520] ceph: tid 19910 timed out on osd15, will reset osd [26133.347140] ceph: tid 19910 timed out on osd15, will reset osd [26193.437759] ceph: tid 19910 timed out on osd15, will reset osd [26253.528398] ceph: tid 19910 timed out on osd15, will reset osd [26313.619076] ceph: tid 19910 timed out on osd15, will reset osd [26373.709753] ceph: tid 19910 timed out on osd15, will reset osd [26433.800419] ceph: tid 19910 timed out on osd15, will reset osd [26493.891086] ceph: tid 19910 timed out on osd15, will reset osd [26553.981750] ceph: tid 19910 timed out on osd15, will reset osd [26614.072410] ceph: tid 19910 timed out on osd15, will reset osd [26674.163068] ceph: tid 19910 timed out on osd15, will reset osd [26734.253724] ceph: tid 19910 timed out on osd15, will reset osd [26794.344404] ceph: tid 19910 timed out on osd15, will reset osd [26854.435088] ceph: tid 19910 timed out on osd15, will reset osd [26914.525769] ceph: tid 19910 timed out on osd15, will reset osd [26974.616446] ceph: tid 19910 timed out on osd15, will reset osd [27034.707120] ceph: tid 19910 timed out on osd15, will reset osd [27094.797791] ceph: tid 19910 timed out on osd15, will reset osd [27154.888458] ceph: tid 19910 timed out on osd15, will reset osd [27214.979122] ceph: tid 19910 timed out on osd15, will reset osd [27275.069782] ceph: tid 19910 timed out on 
osd15, will reset osd [27335.160440] ceph: tid 19910 timed out on osd15, will reset osd [27395.251095] ceph: tid 19910 timed out on osd15, will reset osd [27455.341747] ceph: tid 19910 timed out on osd15, will reset osd [27515.432396] ceph: tid 19910 timed out on osd15, will reset osd [27575.523043] ceph: tid 19910 timed out on osd15, will reset osd [27635.613687] ceph: tid 19910 timed out on osd15, will reset osd [27695.704328] ceph: tid 19910 timed out on osd15, will reset osd [27755.794967] ceph: tid 19910 timed out on osd15, will reset osd [27815.885604] ceph: tid 19910 timed out on osd15, will reset osd [27875.976238] ceph: tid 19910 timed out on osd15, will reset osd [27936.066870] ceph: tid 19910 timed out on osd15, will reset osd [27945.170631] ceph: osd3 up [27945.173304] ceph: osd3 weight 0x10000 (in) [27968.364590] ceph: osd6 down [27968.367442] ceph: osd19 down [28006.172605] ceph: tid 19679 timed out on osd3, will reset osd [28026.070488] ceph: osd6 up [28026.073187] ceph: osd6 weight 0x10000 (in) [28031.342586] ceph: osd16 up [28031.345350] ceph: osd16 weight 0x10000 (in) [28034.788727] ceph: osd19 up [28034.791512] ceph: osd19 weight 0x10000 (in) [28090.668819] ceph: get_reply unknown tid 20776 from osd3 [28090.674214] ceph: get_reply unknown tid 22563 from osd3 [28090.679598] ceph: get_reply unknown tid 26876 from osd3 [28090.727881] ceph: get_reply unknown tid 28772 from osd3 [28090.760869] ceph: get_reply unknown tid 19679 from osd3 [28090.766250] ceph: get_reply unknown tid 20075 from osd3 [28090.777733] ceph: get_reply unknown tid 20362 from osd3 [28090.813099] ceph: get_reply unknown tid 20533 from osd3 [28091.321042] ceph: get_reply unknown tid 27556 from osd3 [28091.624473] ceph: get_reply unknown tid 28624 from osd3 [28091.629855] ceph: get_reply unknown tid 29669 from osd3 [28091.665150] ceph: get_reply unknown tid 25146 from osd3 [28096.332542] ceph: tid 22188 timed out on osd3, will reset osd [28180.715706] ceph: mds0 reconnect start 
[28181.071238] ceph: mds0 reconnect success [28231.392356] ceph: mds0 caps stale [28251.390565] ceph: mds0 caps stale [28280.062503] ceph: mds0 recovery completed [28396.157325] ceph: mds0 caps went stale, renewing [28452.975129] ceph: mds0 caps renewed [34599.211043] ceph: client4107 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c [34599.218033] ceph: mon0 10.3.14.10:6789 session established [34709.689871] ceph: client4110 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c [34709.696828] ceph: mon0 10.3.14.10:6789 session established [35197.307321] ceph: client4113 fsid 1d235fcf-9728-52a3-ba9e-1becb44d989c [35197.314305] ceph: mon0 10.3.14.10:6789 session established syslogd: /var/log/news/news.crit: No such file or directory syslogd: /var/log/news/news.err: No such file or directory syslogd: /var/log/news/news.notice: No such file or directory [94880.387538] ceph: osd15 10.3.14.142:6800 socket closed [94880.392791] INFO: trying to register non-static key. [94880.396785] the code is fine but needs lockdep annotation. [94880.396785] turning off the locking correctness validator. [94880.396785] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61 [94880.396785] Call Trace: [94880.396785] [<ffffffff8105dd50>] ? static_obj+0x43/0x53 [94880.396785] [<ffffffff8106234c>] __lock_acquire+0x852/0x87a [94880.396785] [<ffffffff810623fc>] lock_acquire+0x88/0xa5 [94880.396785] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph] [94880.396785] [<ffffffff814b5382>] down_read+0x47/0x8d [94880.396785] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph] [94880.396785] [<ffffffffa002bf1b>] osd_reset+0x40/0x8d [ceph] [94880.396785] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph] [94880.396785] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f [94880.396785] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f [94880.396785] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph] [94880.396785] [<ffffffff8104e269>] worker_thread+0x147/0x22b [94880.396785] [<ffffffff8104e122>] ? 
worker_thread+0x0/0x22b [94880.396785] [<ffffffff81051a6d>] kthread+0x8d/0x95 [94880.396785] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10 [94880.396785] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.396785] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.396785] [<ffffffff814b6f00>] ? restore_args+0x0/0x30 [94880.396785] [<ffffffff810519e0>] ? kthread+0x0/0x95 [94880.396785] [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10 [94880.535663] BUG: unable to handle kernel NULL pointer dereference at (null) [94880.539585] IP: [<ffffffff8126ffbd>] __list_add+0x42/0x89 [94880.539585] PGD 11d119067 PUD 11cbc0067 PMD 0 [94880.539585] Oops: 0000 [#1] PREEMPT SMP [94880.539585] last sysfs file: /sys/kernel/uevent_seqnum [94880.539585] CPU 0 [94880.539585] Modules linked in: ceph [94880.539585] [94880.539585] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc7+ #61 PDSMi+/PDSMi [94880.539585] RIP: 0010:[<ffffffff8126ffbd>] [<ffffffff8126ffbd>] __list_add+0x42/0x89 [94880.539585] RSP: 0018:ffff88011faddc40 EFLAGS: 00010046 [94880.539585] RAX: 0000000000000000 RBX: ffff88011e1be958 RCX: 0000000000000000 [94880.539585] RDX: ffff88011e1be958 RSI: 0000000000000000 RDI: ffff88011faddc90 [94880.539585] RBP: ffff88011faddc60 R08: 0000000000000000 R09: ffff88011faddc90 [94880.539585] R10: ffffffff81056048 R11: ffffffff81055a87 R12: 0000000000000000 [94880.539585] R13: ffff88011faddc90 R14: ffff88011fada280 R15: ffffffffa002be61 [94880.539585] FS: 0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000 [94880.539585] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [94880.539585] CR2: 0000000000000000 CR3: 000000011dcd4000 CR4: 00000000000006f0 [94880.539585] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [94880.539585] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [94880.539585] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280) [94880.539585] Stack: [94880.539585] 
ffff88011e1be910 00000000ffffffff ffff88011e1be910 0000000000000202 [94880.539585] <0> ffff88011faddce0 ffffffff814b479f ffffffffa002be61 0000000000000000 [94880.539585] <0> ffff88011faddce0 0000000000000246 ffff88011faddc90 ffff88011faddc90 [94880.539585] Call Trace: [94880.539585] [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e [94880.539585] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.539585] [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph] [94880.539585] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph] [94880.539585] [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph] [94880.539585] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph] [94880.539585] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f [94880.539585] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f [94880.539585] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph] [94880.539585] [<ffffffff8104e269>] worker_thread+0x147/0x22b [94880.539585] [<ffffffff8104e122>] ? worker_thread+0x0/0x22b [94880.539585] [<ffffffff81051a6d>] kthread+0x8d/0x95 [94880.539585] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10 [94880.539585] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.539585] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.539585] [<ffffffff814b6f00>] ? restore_args+0x0/0x30 [94880.539585] [<ffffffff810519e0>] ? kthread+0x0/0x95 [94880.539585] [<ffffffff81003790>] ? 
kernel_thread_helper+0x0/0x10 [94880.539585] Code: 8b 42 08 48 39 f0 74 23 49 89 d1 49 89 c0 48 89 f1 48 c7 c2 06 d6 64 81 be 1a 00 00 00 48 c7 c7 51 d5 64 81 31 c0 e8 fc a5 dc ff <49> 8b 04 24 48 39 d8 74 23 49 89 c0 4d 89 e1 48 89 d9 48 c7 c2 [94880.539585] RIP [<ffffffff8126ffbd>] __list_add+0x42/0x89 [94880.539585] RSP <ffff88011faddc40> [94880.539585] CR2: 0000000000000000 [94880.539585] ---[ end trace 555371ce86832624 ]--- [94880.539585] note: kworker/0:1[10] exited with preempt_count 1 [94880.853864] BUG: unable to handle kernel paging request at fffffffffffffff8 [94880.857795] IP: [<ffffffff810516f4>] kthread_data+0xb/0x11 [94880.857795] PGD 174d067 PUD 174e067 PMD 0 [94880.857795] Oops: 0000 [#2] PREEMPT SMP [94880.857795] last sysfs file: /sys/kernel/uevent_seqnum [94880.857795] CPU 0 [94880.857795] Modules linked in: ceph [94880.857795] [94880.857795] Pid: 10, comm: kworker/0:1 Tainted: G D 2.6.36-rc7+ #61 PDSMi+/PDSMi [94880.857795] RIP: 0010:[<ffffffff810516f4>] [<ffffffff810516f4>] kthread_data+0xb/0x11 [94880.857795] RSP: 0018:ffff88011fadd828 EFLAGS: 00010092 [94880.857795] RAX: 0000000000000000 RBX: ffff88011fada760 RCX: ffff88011fada280 [94880.857795] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff88011fada280 [94880.857795] RBP: ffff88011fadd828 R08: 0000000000000002 R09: ffffffff814b374f [94880.857795] R10: ffff88011fada280 R11: ffff88011fadd888 R12: ffff88011fada280 [94880.857795] R13: 0000000000000000 R14: ffff88011fada270 R15: 0000000000000000 [94880.857795] FS: 0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000 [94880.857795] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [94880.857795] CR2: fffffffffffffff8 CR3: 000000011dcd4000 CR4: 00000000000006f0 [94880.857795] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [94880.857795] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [94880.857795] Process kworker/0:1 (pid: 10, threadinfo ffff88011fadc000, task ffff88011fada280) 
[94880.857795] Stack: [94880.857795] ffff88011fadd858 ffffffff8104deca ffffffff814b374f ffff88011fada760 [94880.857795] <0> ffff88011fada280 ffff8800027d2800 ffff88011fadd928 ffffffff814b37d6 [94880.857795] <0> ffff88011faddfd8 0000000000004000 00000000001d2800 00000000001d2800 [94880.857795] Call Trace: [94880.857795] [<ffffffff8104deca>] wq_worker_sleeping+0x15/0x82 [94880.857795] [<ffffffff814b374f>] ? schedule+0x108/0x7d3 [94880.857795] [<ffffffff814b37d6>] schedule+0x18f/0x7d3 [94880.857795] [<ffffffff8103e14b>] do_exit+0x676/0x67a [94880.857795] [<ffffffff810068d5>] oops_end+0xb2/0xba [94880.857795] [<ffffffff81022e24>] no_context+0x1f5/0x204 [94880.857795] [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9 [94880.857795] [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10 [94880.857795] [<ffffffff810234c6>] do_page_fault+0x177/0x35d [94880.857795] [<ffffffff814b6ad8>] ? _raw_spin_unlock_irq+0x36/0x53 [94880.857795] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.857795] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.857795] [<ffffffff814b3df2>] ? schedule+0x7ab/0x7d3 [94880.857795] [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c [94880.857795] [<ffffffff81055a87>] ? up+0xf/0x39 [94880.857795] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94880.857795] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffff814b711f>] page_fault+0x1f/0x30 [94880.857795] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffff81055a87>] ? up+0xf/0x39 [94880.857795] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94880.857795] [<ffffffff8126ffbd>] ? __list_add+0x42/0x89 [94880.857795] [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e [94880.857795] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffffa002bf1b>] ? 
osd_reset+0x40/0x8d [ceph] [94880.857795] [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph] [94880.857795] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph] [94880.857795] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f [94880.857795] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f [94880.857795] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph] [94880.857795] [<ffffffff8104e269>] worker_thread+0x147/0x22b [94880.857795] [<ffffffff8104e122>] ? worker_thread+0x0/0x22b [94880.857795] [<ffffffff81051a6d>] kthread+0x8d/0x95 [94880.857795] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10 [94880.857795] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.857795] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.857795] [<ffffffff814b6f00>] ? restore_args+0x0/0x30 [94880.857795] [<ffffffff810519e0>] ? kthread+0x0/0x95 [94880.857795] [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10 [94880.857795] Code: 41 5f c9 c3 90 90 90 55 65 48 8b 04 25 40 b5 00 00 48 8b 80 f0 01 00 00 48 89 e5 8b 40 f0 c9 c3 48 8b 87 f0 01 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 55 48 83 c7 78 48 89 e5 e8 af 1f fe ff c9 c3 [94880.857795] RIP [<ffffffff810516f4>] kthread_data+0xb/0x11 [94880.857795] RSP <ffff88011fadd828> [94880.857795] CR2: fffffffffffffff8 [94880.857795] ---[ end trace 555371ce86832625 ]--- [94880.857795] Fixing recursive fault but reboot is needed! [94880.857795] BUG: spinlock lockup on CPU#0, kworker/0:1/10, ffff8800027d2800 [94880.857795] Pid: 10, comm: kworker/0:1 Tainted: G D 2.6.36-rc7+ #61 [94880.857795] Call Trace: [94880.857795] [<ffffffff8126fba7>] do_raw_spin_lock+0x109/0x135 [94880.857795] [<ffffffff814b6135>] _raw_spin_lock_irq+0x62/0x77 [94880.857795] [<ffffffff814b374f>] ? schedule+0x108/0x7d3 [94880.857795] [<ffffffff814b374f>] schedule+0x108/0x7d3 [94880.857795] [<ffffffff8103dba1>] do_exit+0xcc/0x67a [94880.857795] [<ffffffff8103be4b>] ? 
kmsg_dump+0x137/0x160 [94880.857795] [<ffffffff810068d5>] oops_end+0xb2/0xba [94880.857795] [<ffffffff81022e24>] no_context+0x1f5/0x204 [94880.857795] [<ffffffff8105e952>] ? print_lock_contention_bug+0x1b/0xe0 [94880.857795] [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9 [94880.857795] [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10 [94880.857795] [<ffffffff810234c6>] do_page_fault+0x177/0x35d [94880.857795] [<ffffffff810e2b45>] ? fsnotify_clear_marks_by_inode+0x2d/0xd5 [94880.857795] [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c [94880.857795] [<ffffffff814b374f>] ? schedule+0x108/0x7d3 [94880.857795] [<ffffffff814b711f>] page_fault+0x1f/0x30 [94880.857795] [<ffffffff814b374f>] ? schedule+0x108/0x7d3 [94880.857795] [<ffffffff810516f4>] ? kthread_data+0xb/0x11 [94880.857795] [<ffffffff8104deca>] wq_worker_sleeping+0x15/0x82 [94880.857795] [<ffffffff814b374f>] ? schedule+0x108/0x7d3 [94880.857795] [<ffffffff814b37d6>] schedule+0x18f/0x7d3 [94880.857795] [<ffffffff8103e14b>] do_exit+0x676/0x67a [94880.857795] [<ffffffff810068d5>] oops_end+0xb2/0xba [94880.857795] [<ffffffff81022e24>] no_context+0x1f5/0x204 [94880.857795] [<ffffffff81023075>] __bad_area_nosemaphore+0x186/0x1a9 [94880.857795] [<ffffffff8102310e>] bad_area_nosemaphore+0xe/0x10 [94880.857795] [<ffffffff810234c6>] do_page_fault+0x177/0x35d [94880.857795] [<ffffffff814b6ad8>] ? _raw_spin_unlock_irq+0x36/0x53 [94880.857795] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.857795] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.857795] [<ffffffff814b3df2>] ? schedule+0x7ab/0x7d3 [94880.857795] [<ffffffff814b5dfe>] ? trace_hardirqs_off_thunk+0x3a/0x3c [94880.857795] [<ffffffff81055a87>] ? up+0xf/0x39 [94880.857795] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94880.857795] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffff814b711f>] page_fault+0x1f/0x30 [94880.857795] [<ffffffffa002be61>] ? 
kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffff81055a87>] ? up+0xf/0x39 [94880.857795] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94880.857795] [<ffffffff8126ffbd>] ? __list_add+0x42/0x89 [94880.857795] [<ffffffff814b479f>] mutex_lock_nested+0x130/0x31e [94880.857795] [<ffffffffa002be61>] ? kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffffa002be61>] kick_requests+0x24/0x9e [ceph] [94880.857795] [<ffffffffa002bf1b>] ? osd_reset+0x40/0x8d [ceph] [94880.857795] [<ffffffffa002bf26>] osd_reset+0x4b/0x8d [ceph] [94880.857795] [<ffffffffa001fef5>] con_work+0x37b/0x6bb [ceph] [94880.857795] [<ffffffff8104c89f>] process_one_work+0x1fd/0x38f [94880.857795] [<ffffffff8104c83d>] ? process_one_work+0x19b/0x38f [94880.857795] [<ffffffffa001fb7a>] ? con_work+0x0/0x6bb [ceph] [94880.857795] [<ffffffff8104e269>] worker_thread+0x147/0x22b [94880.857795] [<ffffffff8104e122>] ? worker_thread+0x0/0x22b [94880.857795] [<ffffffff81051a6d>] kthread+0x8d/0x95 [94880.857795] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10 [94880.857795] [<ffffffff81030fe9>] ? finish_task_switch+0x0/0xa8 [94880.857795] [<ffffffff81031052>] ? finish_task_switch+0x69/0xa8 [94880.857795] [<ffffffff814b6f00>] ? restore_args+0x0/0x30 [94880.857795] [<ffffffff810519e0>] ? kthread+0x0/0x95 [94880.857795] [<ffffffff81003790>] ? 
kernel_thread_helper+0x0/0x10 [94880.857795] sending NMI to all CPUs: [94889.441308] NMI backtrace for cpu 1 [94889.441308] CPU 1 [94889.441308] Modules linked in: ceph [94889.441308] [94889.441308] Pid: 0, comm: kworker/0:0 Tainted: G D 2.6.36-rc7+ #61 PDSMi+/PDSMi [94889.441308] RIP: 0010:[<ffffffff8100aa6d>] [<ffffffff8100aa6d>] mwait_idle+0x89/0x99 [94889.441308] RSP: 0018:ffff88011fae1ed8 EFLAGS: 00000246 [94889.441308] RAX: 0000000000000000 RBX: ffff8800029d2700 RCX: 0000000000000000 [94889.441308] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64 [94889.441308] RBP: ffff88011fae1ef8 R08: ffffffff8175ea90 R09: 0000000000000001 [94889.441308] R10: ffffffff81056048 R11: ffffffff8102da71 R12: ffff88011fae1fd8 [94889.441308] R13: ffff88011fae0010 R14: 0000000000000000 R15: 0000000000000000 [94889.441308] FS: 0000000000000000(0000) GS:ffff880002800000(0000) knlGS:0000000000000000 [94889.441308] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [94889.441308] CR2: 00007f2a37b36380 CR3: 000000000174b000 CR4: 00000000000006e0 [94889.441308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [94889.441308] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [94889.441308] Process kworker/0:0 (pid: 0, threadinfo ffff88011fae0000, task ffff88011fade2c0) [94889.441308] Stack: [94889.441308] ffff88011fae1ee8 ffff88011fae1fd8 ffffffff817c90c0 0000000000000000 [94889.441308] <0> ffff88011fae1f18 ffffffff81001ac8 ffff88000280d408 0000000000000000 [94889.441308] <0> ffff88011fae1f48 ffffffff814aea64 0000000000000000 0000000000000000 [94889.441308] Call Trace: [94889.441308] [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441308] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441308] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 41 5c 41 5d c9 c3 55 48 89 e5 e8 [94889.441308] Call Trace: [94889.441308] 
[<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441308] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441308] Pid: 0, comm: kworker/0:0 Tainted: G D 2.6.36-rc7+ #61 [94889.441308] Call Trace: [94889.441308] <NMI> [<ffffffff8100a603>] ? show_regs+0x26/0x2a [94889.441308] [<ffffffff8101aaa0>] nmi_watchdog_tick+0xad/0x19e [94889.441308] [<ffffffff81004004>] do_nmi+0xea/0x298 [94889.441308] [<ffffffff814b740a>] nmi+0x1a/0x2c [94889.441308] [<ffffffff8102da71>] ? __wake_up_sync_key+0x27/0x5a [94889.441308] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94889.441308] [<ffffffff8100aa64>] ? mwait_idle+0x80/0x99 [94889.441308] [<ffffffff8100aa6d>] ? mwait_idle+0x89/0x99 [94889.441308] <<EOE>> [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441308] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441266] NMI backtrace for cpu 3 [94889.441266] CPU 3 [94889.441266] Modules linked in: ceph [94889.441266] [94889.441266] Pid: 0, comm: kworker/0:1 Tainted: G D 2.6.36-rc7+ #61 PDSMi+/PDSMi [94889.441266] RIP: 0010:[<ffffffff8100aa6d>] [<ffffffff8100aa6d>] mwait_idle+0x89/0x99 [94889.441266] RSP: 0018:ffff88011fb2ded8 EFLAGS: 00000246 [94889.441266] RAX: 0000000000000000 RBX: ffff880002dd2700 RCX: 0000000000000000 [94889.441266] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64 [94889.441266] RBP: ffff88011fb2def8 R08: ffffffff8175ea90 R09: 0000000000000001 [94889.441266] R10: ffffffff81056048 R11: ffffffff810521b7 R12: ffff88011fb2dfd8 [94889.441266] R13: ffff88011fb2c010 R14: 0000000000000000 R15: 0000000000000000 [94889.441266] FS: 0000000000000000(0000) GS:ffff880002c00000(0000) knlGS:0000000000000000 [94889.441266] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [94889.441266] CR2: 0000000000612188 CR3: 000000011defa000 CR4: 00000000000006e0 [94889.441266] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [94889.441266] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [94889.441266] Process 
kworker/0:1 (pid: 0, threadinfo ffff88011fb2c000, task ffff88011fb2a4c0) [94889.441266] Stack: [94889.441266] ffff88011fb2dee8 ffff88011fb2dfd8 ffffffff817c90c0 0000000000000000 [94889.441266] <0> ffff88011fb2df18 ffffffff81001ac8 ffff880002c0d408 0000000000000000 [94889.441266] <0> ffff88011fb2df48 ffffffff814aea64 0000000000000000 0000000000000000 [94889.441266] Call Trace: [94889.441266] [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441266] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441266] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 41 5c 41 5d c9 c3 55 48 89 e5 e8 [94889.441266] Call Trace: [94889.441266] [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441266] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441266] Pid: 0, comm: kworker/0:1 Tainted: G D 2.6.36-rc7+ #61 [94889.441266] Call Trace: [94889.441266] <NMI> [<ffffffff8100a603>] ? show_regs+0x26/0x2a [94889.441266] [<ffffffff8101aaa0>] nmi_watchdog_tick+0xad/0x19e [94889.441266] [<ffffffff81004004>] do_nmi+0xea/0x298 [94889.441266] [<ffffffff814b740a>] nmi+0x1a/0x2c [94889.441266] [<ffffffff810521b7>] ? add_wait_queue+0x1b/0x45 [94889.441266] [<ffffffff81056048>] ? __atomic_notifier_call_chain+0x0/0xb4 [94889.441266] [<ffffffff8100aa64>] ? mwait_idle+0x80/0x99 [94889.441266] [<ffffffff8100aa6d>] ? 
mwait_idle+0x89/0x99 [94889.441266] <<EOE>> [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441266] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441339] NMI backtrace for cpu 2 [94889.441339] CPU 2 [94889.441339] Modules linked in: ceph [94889.441339] [94889.441339] Pid: 0, comm: kworker/0:1 Tainted: G D 2.6.36-rc7+ #61 PDSMi+/PDSMi [94889.441339] RIP: 0010:[<ffffffff8100aa6d>] [<ffffffff8100aa6d>] mwait_idle+0x89/0x99 [94889.441339] RSP: 0018:ffff88011fb17ed8 EFLAGS: 00000246 [94889.441339] RAX: 0000000000000000 RBX: ffff880002bd2700 RCX: 0000000000000000 [94889.441339] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8100aa64 [94889.441339] RBP: ffff88011fb17ef8 R08: ffffffff8175ea90 R09: 0000000000000001 [94889.441339] R10: ffffffff81056048 R11: ffffffff8102da71 R12: ffff88011fb17fd8 [94889.441339] R13: ffff88011fb16010 R14: 0000000000000000 R15: 0000000000000000 [94889.441339] FS: 0000000000000000(0000) GS:ffff880002a00000(0000) knlGS:0000000000000000 [94889.441339] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [94889.441339] CR2: 00007f2a37b36380 CR3: 000000000174b000 CR4: 00000000000006e0 [94889.441339] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [94889.441339] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [94889.441339] Process kworker/0:1 (pid: 0, threadinfo ffff88011fb16000, task ffff88011fb143c0) [94889.441339] Stack: [94889.441339] ffff88011fb17ee8 ffff88011fb17fd8 ffffffff817c90c0 0000000000000000 [94889.441339] <0> ffff88011fb17f18 ffffffff81001ac8 ffff880002a0d408 0000000000000000 [94889.441339] <0> ffff88011fb17f48 ffffffff814aea64 0000000000000000 0000000000000000 [94889.441339] Call Trace: [94889.441339] [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441339] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441339] Code: e0 ff ff 31 c9 4c 89 e8 48 89 ca 0f 01 c8 0f ae f0 49 8b 84 24 38 e0 ff ff a8 08 75 10 e8 cf 4a 05 00 31 c9 48 89 c8 fb 0f 01 c9 <eb> 06 e8 bf 4a 05 00 fb 5e 5b 
41 5c 41 5d c9 c3 55 48 89 e5 e8 [94889.441339] Call Trace: [94889.441339] [<ffffffff81001ac8>] cpu_idle+0x4d/0x7d [94889.441339] [<ffffffff814aea64>] start_secondary+0x22b/0x271 [94889.441339] Pid: 0, comm: kworker/0:1 Tainted: G
Updated by Sage Weil over 13 years ago
- Status changed from New to Can't reproduce