Bug #11122
kceph: crash in ceph_update_writeable_page after failure to reconnect to mds (closed)
Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
Mar 16 09:43:10 teuthology kernel: [3307568.154466] libceph: osd52 down
Mar 16 09:43:18 teuthology kernel: [3307576.262229] libceph: osd52 up
Mar 16 09:43:50 teuthology kernel: [3307608.808578] ceph: mds0 reconnect denied
Mar 16 09:43:50 teuthology kernel: [3307608.825439] ceph: dropping dirty+flushing Fw state for ffff880003cc4bb0 1099676077574
Mar 16 09:43:50 teuthology kernel: [3307608.833684] ceph: dropping dirty data for ffff880003cc4bb0 1099676077574
Mar 16 09:43:50 teuthology kernel: [3307608.847329] ceph: dropping dirty+flushing Fw state for ffff8800394ccbb0 1099676117445
Mar 16 09:43:50 teuthology kernel: [3307608.855594] ceph: dropping dirty+flushing Fw state for ffff8800394ce620 1099676117446
Mar 16 09:43:50 teuthology kernel: [3307608.863877] ceph: dropping dirty+flushing Fw state for ffff8800394ceef0 1099676117447
Mar 16 09:43:50 teuthology kernel: [3307608.872090] ceph: dropping dirty+flushing Fw state for ffff8800394cc2e0 1099676117448
Mar 16 09:43:50 teuthology kernel: [3307608.880511] ceph: dropping dirty+flushing Fw state for ffff8800395642e0 1099676117449
Mar 16 09:43:50 teuthology kernel: [3307608.888723] ceph: dropping dirty+flushing Fw state for ffff880039560530 1099676117450
Mar 16 09:43:50 teuthology kernel: [3307608.896940] ceph: dropping dirty+flushing Fw state for ffff8800395616d0 1099676117451
Mar 16 09:43:50 teuthology kernel: [3307608.905128] ceph: dropping dirty+flushing Fw state for ffff880039566ef0 1099676117452
Mar 16 09:43:50 teuthology kernel: [3307608.913388] ceph: dropping dirty+flushing Fw state for ffff880039560e00 1099676117453
...
Mar 16 09:43:59 teuthology kernel: [3307617.106704] ceph: dropping dirty+flushing Fw state for ffff8801fc33f7c0 1099676111598
Mar 16 09:43:59 teuthology kernel: [3307617.114993] ceph: dropping dirty Fw state for ffff88018baa5480 1099676092139
Mar 16 09:43:59 teuthology kernel: [3307617.122511] ceph: dropping dirty+flushing Fw state for ffff88018baa5480 1099676092139
Mar 16 09:43:59 teuthology kernel: [3307617.130864] ceph: dropping dirty Fw state for ffff880123006ef0 1099675965969
Mar 16 09:43:59 teuthology kernel: [3307617.138296] ceph: dropping dirty+flushing Fw state for ffff880123006ef0 1099675965969
Mar 16 09:43:59 teuthology kernel: [3307617.146587] ceph: dropping dirty Fw state for ffff880210b50530 1099676033790
Mar 16 09:43:59 teuthology kernel: [3307617.154144] ceph: dropping dirty+flushing Fw state for ffff880210b50530 1099676033790
Mar 16 09:43:59 teuthology kernel: [3307617.162366] ceph: dropping dirty Fw state for ffff880175025d50 1099676087571
Mar 16 09:43:59 teuthology kernel: [3307617.169779] ceph: dropping dirty+flushing Fw state for ffff880175025d50 1099676087571
Mar 16 09:43:59 teuthology kernel: [3307617.178035] ceph: dropping dirty+flushing Fw state for ffff88023dfb8530 1099676051224
Mar 16 09:43:59 teuthology kernel: [3307617.187023] ceph: dropping dirty+flushing Fw state for ffff8801bbdc8e00 1099676095004
Mar 16 09:43:59 teuthology kernel: [3307617.195250] ceph: dropping dirty+flushing Fw state for ffff880296cc8530 1099676074407
Mar 16 09:43:59 teuthology kernel: [3307617.203515] ceph: dropping dirty+flushing Fw state for ffff88014ce796d0 1099676112000
Mar 16 09:43:59 teuthology kernel: [3307617.211863] ceph: dropping dirty+flushing Fw state for ffff8802979fc2e0 1099676115764
Mar 16 09:43:59 teuthology kernel: [3307617.220234] ceph: dropping dirty+flushing Fw state for ffff880297b62870 1099676094931
Mar 16 09:43:59 teuthology kernel: [3307617.228569] ceph: dropping dirty+flushing Fw state for ffff8801ce6add50 1099676087465
Mar 16 09:43:59 teuthology kernel: [3307617.236811] ceph: dropping dirty+flushing Fw state for ffff8800341bd480 1099676070642
Mar 16 09:43:59 teuthology kernel: [3307617.245116] ceph: dropping dirty+flushing Fw state for ffff880020c842e0 1099676080523
Mar 16 09:43:59 teuthology kernel: [3307617.253313] ceph: dropping dirty+flushing Fw state for ffff880020d70e00 1099676086396
Mar 16 09:43:59 teuthology kernel: [3307617.261511] ceph: dropping dirty+flushing Fw state for ffff88025d5ca870 1099676114877
Mar 16 09:43:59 teuthology kernel: [3307617.269902] ceph: dropping dirty+flushing Fw state for ffff88013e2a8530 1099676114901
Mar 16 09:43:59 teuthology kernel: [3307617.278239] ceph: dropping dirty+flushing Fw state for ffff88016c8f2870 1099676113436
Mar 16 09:43:59 teuthology kernel: [3307617.286501] ceph: dropping dirty+flushing Fw state for ffff8801eb25e620 1099676075091
Mar 16 09:43:59 teuthology kernel: [3307617.294703] ceph: dropping dirty+flushing Fw state for ffff8802726a4bb0 1099676112540
Mar 16 09:43:59 teuthology kernel: [3307617.302889] ceph: dropping dirty+flushing Fw state for ffff8801d4a39fa0 1099676084375
Mar 16 09:43:59 teuthology kernel: [3307617.311198] ceph: dropping dirty+flushing Fw state for ffff8801d275a870 1099675847146
Mar 16 09:43:59 teuthology kernel: [3307617.319540] ceph: dropping dirty+flushing Fw state for ffff880280465480 1099676095542
Mar 16 09:43:59 teuthology kernel: [3307617.327793] ceph: dropping dirty+flushing Fw state for ffff88000537dd50 1099676111500
Mar 16 09:43:59 teuthology kernel: [3307617.336054] ceph: dropping dirty+flushing Fw state for ffff880263423a10 1099676112633
Mar 16 09:43:59 teuthology kernel: [3307617.344335] ceph: dropping dirty+flushing Fw state for ffff8800179c96d0 1099676084597
Mar 16 09:43:59 teuthology kernel: [3307617.352672] ceph: dropping dirty+flushing Fw state for ffff8801efa65480 1099676045528
Mar 16 09:43:59 teuthology kernel: [3307617.360881] ceph: dropping dirty+flushing Fw state for ffff88017ea1a870 1099676111315
Mar 16 09:43:59 teuthology kernel: [3307617.369135] ceph: dropping dirty+flushing Fw state for ffff88001ba54bb0 1099676049451
Mar 16 09:43:59 teuthology kernel: [3307617.377467] [sched_delayed] sched: RT throttling activated
Mar 16 09:44:00 teuthology kernel: [3307617.728291] ------------[ cut here ]------------
Mar 16 09:44:00 teuthology kernel: [3307617.733107] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/fs/ceph/addr.c:1024!
Mar 16 09:44:00 teuthology kernel: [3307617.741397] invalid opcode: 0000 [#1] SMP
Mar 16 09:44:00 teuthology kernel: [3307617.745751] Modules linked in: ipmi_devintf(E) ipmi_si(E) ipmi_msghandler(E) ip6table_filter(E) ip6_tables(E) ebtable_nat(E) ebtables(E) ipt_MASQUERADE(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_state(E) nf_conntrack(E) ipt_REJECT(E) xt_CHECKSUM(E) iptable_mangle(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E) x_tables(E) bridge(E) stp(E) llc(E) ceph(E) libceph(E) fscache(E) gpio_ich(E) psmouse(E) serio_raw(E) i7core_edac(E) edac_core(E) joydev(E) tpm_infineon(E) tpm_tis(E) lpc_ich(E) xfs(E) lp(E) parport(E) hid_generic(E) usbhid(E) hid(E) btrfs(E) e1000e(E) ahci(E) ptp(E) raid6_pq(E) libahci(E) pps_core(E) arcmsr(E) xor(E) libcrc32c(E)
Mar 16 09:44:00 teuthology kernel: [3307617.807403] CPU: 3 PID: 14523 Comm: python Tainted: G E 3.16.3-ceph-00306-g76c7fd1 #1
Mar 16 09:44:00 teuthology kernel: [3307617.816499] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.1 05/27/2010
Mar 16 09:44:00 teuthology kernel: [3307617.823237] task: ffff8803353da1f0 ti: ffff88036c174000 task.ti: ffff88036c174000
Mar 16 09:44:00 teuthology kernel: [3307617.830955] RIP: 0010:[<ffffffffa0397e0a>] [<ffffffffa0397e0a>] ceph_update_writeable_page+0x40a/0x480 [ceph]
Mar 16 09:44:00 teuthology kernel: [3307617.841190] RSP: 0018:ffff88036c177b18 EFLAGS: 00010246
Mar 16 09:44:00 teuthology kernel: [3307617.846745] RAX: 02ffff0000000005 RBX: ffffea000d764840 RCX: ffffea000d764840
Mar 16 09:44:00 teuthology kernel: [3307617.854194] RDX: 0000000000000b23 RSI: 0000000000220b23 RDI: ffff8801269e0540
Mar 16 09:44:00 teuthology kernel: [3307617.861581] RBP: ffff88036c177ba8 R08: 0000000000000001 R09: 0000000000000000
Mar 16 09:44:00 teuthology kernel: [3307617.868928] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801c6488e00
Mar 16 09:44:00 teuthology kernel: [3307617.876273] R13: ffff8801269e0540 R14: ffff880425859950 R15: 000000000000006c
Mar 16 09:44:00 teuthology kernel: [3307617.883654] FS: 00007fafc644f700(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
Mar 16 09:44:00 teuthology kernel: [3307617.891997] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 16 09:44:00 teuthology kernel: [3307617.897930] CR2: 00007f1256380000 CR3: 0000000169569000 CR4: 00000000000007e0
Mar 16 09:44:00 teuthology kernel: [3307617.905307] Stack:
Mar 16 09:44:00 teuthology kernel: [3307617.907511] ffff8801c6488fe0 0000006c00000000 0000000000220b23 00000ab700000b23
Mar 16 09:44:00 teuthology kernel: [3307617.915199] 0000000000220000 ffff8801269e0540 0000000000001000 0000000000000220
Mar 16 09:44:00 teuthology kernel: [3307617.922891] ffff88036c177b78 0000000000220ab7 ffff88036c177b88 ffffffff81166536
Mar 16 09:44:00 teuthology kernel: [3307617.930601] Call Trace:
Mar 16 09:44:00 teuthology kernel: [3307617.933372] [<ffffffff81166536>] ? wait_for_stable_page+0x16/0x50
Mar 16 09:44:00 teuthology kernel: [3307617.939747] [<ffffffffa0397eec>] ceph_write_begin+0x6c/0xc0 [ceph]
Mar 16 09:44:00 teuthology kernel: [3307617.946321] [<ffffffff8115a9e6>] generic_perform_write+0xc6/0x1e0
Mar 16 09:44:00 teuthology kernel: [3307617.952726] [<ffffffffa03936b0>] ceph_write_iter+0x890/0xa30 [ceph]
Mar 16 09:44:00 teuthology kernel: [3307617.959268] [<ffffffff817295fb>] ? _raw_spin_unlock+0x2b/0x40
Mar 16 09:44:00 teuthology kernel: [3307617.965348] [<ffffffff81187454>] ? do_wp_page+0x374/0x740
Mar 16 09:44:00 teuthology kernel: [3307617.971021] [<ffffffff81189efc>] ? handle_mm_fault+0x86c/0x1060
Mar 16 09:44:00 teuthology kernel: [3307617.977335] [<ffffffff81046a19>] ? __do_page_fault+0x159/0x580
Mar 16 09:44:00 teuthology kernel: [3307617.983512] [<ffffffff811cdc13>] ? vfs_write+0x1d3/0x1f0
Mar 16 09:44:00 teuthology kernel: [3307617.989111] [<ffffffff811cdc13>] ? vfs_write+0x1d3/0x1f0
Mar 16 09:44:00 teuthology kernel: [3307617.994711] [<ffffffff811ccdcb>] new_sync_write+0x7b/0xb0
Mar 16 09:44:00 teuthology kernel: [3307618.000386] [<ffffffff811cdb07>] vfs_write+0xc7/0x1f0
Mar 16 09:44:00 teuthology kernel: [3307618.005842] [<ffffffff811ce012>] SyS_write+0x52/0xb0
Mar 16 09:44:00 teuthology kernel: [3307618.011081] [<ffffffff81729cd6>] system_call_fastpath+0x1a/0x1f
Mar 16 09:44:00 teuthology kernel: [3307618.018451] Code: fe ff ff 48 89 da 48 c7 c6 90 5b 3b a0 48 c7 c7 08 38 3c a0 31 c0 e8 46 c7 02 e1 31 c0 e9 ca fc ff ff 0f 1f 80 00 00 00 00 0f 0b <0f> 0b 4c 89 e9 48 89 da 48 c7 c6 58 5b 3b a0 48 c7 c7 e0 37 3c
Mar 16 09:44:00 teuthology kernel: [3307618.038808] RIP [<ffffffffa0397e0a>] ceph_update_writeable_page+0x40a/0x480 [ceph]
Mar 16 09:44:00 teuthology kernel: [3307618.046715] RSP <ffff88036c177b18>
Mar 16 09:44:00 teuthology kernel: [3307618.050701] ---[ end trace 8bcfe0f2cc71a190 ]---
Mar 16 09:44:01 teuthology kernel: [3307618.959925] ------------[ cut here ]------------
Mar 16 09:44:01 teuthology kernel: [3307618.964777] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/fs/ceph/addr.c:1024!
Mar 16 09:44:01 teuthology kernel: [3307618.973050] invalid opcode: 0000 [#2] SMP
Mar 16 09:44:01 teuthology kernel: [3307618.977473] Modules linked in: ipmi_devintf(E) ipmi_si(E) ipmi_msghandler(E) ip6table_filter(E) ip6_tables(E) ebtable_nat(E) ebtables(E) ipt_MASQUERADE(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_state(E) nf_conntrack(E) ipt_REJECT(E) xt_CHECKSUM(E) iptable_mangle(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E) x_tables(E) bridge(E) stp(E) llc(E) ceph(E) libceph(E) fscache(E) gpio_ich(E) psmouse(E) serio_raw(E) i7core_edac(E) edac_core(E) joydev(E) tpm_infineon(E) tpm_tis(E) lpc_ich(E) xfs(E) lp(E) parport(E) hid_generic(E) usbhid(E) hid(E) btrfs(E) e1000e(E) ahci(E) ptp(E) raid6_pq(E) libahci(E) pps_core(E) arcmsr(E) xor(E) libcrc32c(E)
Mar 16 09:44:01 teuthology kernel: [3307619.040699] CPU: 1 PID: 14498 Comm: python Tainted: G D E 3.16.3-ceph-00306-g76c7fd1 #1
Mar 16 09:44:01 teuthology kernel: [3307619.049847] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.1 05/27/2010
Mar 16 09:44:01 teuthology kernel: [3307619.056623] task: ffff8802d9b1c3e0 ti: ffff880278134000 task.ti: ffff880278134000
Mar 16 09:44:01 teuthology kernel: [3307619.064453] RIP: 0010:[<ffffffffa0397e0a>] [<ffffffffa0397e0a>] ceph_update_writeable_page+0x40a/0x480 [ceph]
Mar 16 09:44:01 teuthology kernel: [3307619.074745] RSP: 0018:ffff880278137b18 EFLAGS: 00010246
Mar 16 09:44:01 teuthology kernel: [3307619.080335] RAX: 02ffff0000000005 RBX: ffffea0004fa7440 RCX: ffffea0004fa7440
Mar 16 09:44:01 teuthology kernel: [3307619.087707] RDX: 0000000000000a9d RSI: 000000000019aa9d RDI: ffff8801f7b4a140
Mar 16 09:44:01 teuthology kernel: [3307619.095123] RBP: ffff880278137ba8 R08: 0000000000000001 R09: 0000000000000000
Mar 16 09:44:01 teuthology kernel: [3307619.102492] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800572116d0
Mar 16 09:44:01 teuthology kernel: [3307619.110005] R13: ffff8801f7b4a140 R14: ffff880425859950 R15: 0000000000000078
Mar 16 09:44:01 teuthology kernel: [3307619.117473] FS: 00007f1f4b5b5700(0000) GS:ffff88043fc40000(0000) knlGS:0000000000000000
Mar 16 09:44:01 teuthology kernel: [3307619.125918] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 16 09:44:01 teuthology kernel: [3307619.131884] CR2: 00007f1256380000 CR3: 0000000126f53000 CR4: 00000000000007e0
Mar 16 09:44:01 teuthology kernel: [3307619.139255] Stack:
Mar 16 09:44:01 teuthology kernel: [3307619.141613] ffff8800572118b0 0000007800000000 000000000019aa9d 00000a2500000a9d
Mar 16 09:44:01 teuthology kernel: [3307619.149441] 000000000019a000 ffff8801f7b4a140 0000000000001000 000000000000019a
Mar 16 09:44:01 teuthology kernel: [3307619.157313] ffff880278137b78 000000000019aa25 ffff880278137b88 ffffffff81166536
Mar 16 09:44:01 teuthology kernel: [3307619.165144] Call Trace:
Mar 16 09:44:01 teuthology kernel: [3307619.167822] [<ffffffff81166536>] ? wait_for_stable_page+0x16/0x50
Mar 16 09:44:01 teuthology kernel: [3307619.174303] [<ffffffffa0397eec>] ceph_write_begin+0x6c/0xc0 [ceph]
Mar 16 09:44:01 teuthology kernel: [3307619.180794] [<ffffffff8115a9e6>] generic_perform_write+0xc6/0x1e0
Mar 16 09:44:01 teuthology kernel: [3307619.187198] [<ffffffffa03936b0>] ceph_write_iter+0x890/0xa30 [ceph]
Mar 16 09:44:01 teuthology kernel: [3307619.193840] [<ffffffff817295fb>] ? _raw_spin_unlock+0x2b/0x40
Mar 16 09:44:01 teuthology kernel: [3307619.199895] [<ffffffff81187454>] ? do_wp_page+0x374/0x740
Mar 16 09:44:01 teuthology kernel: [3307619.205677] [<ffffffff81189efc>] ? handle_mm_fault+0x86c/0x1060
Mar 16 09:44:01 teuthology kernel: [3307619.211908] [<ffffffff81046a19>] ? __do_page_fault+0x159/0x580
Mar 16 09:44:01 teuthology kernel: [3307619.218049] [<ffffffff811cdc13>] ? vfs_write+0x1d3/0x1f0
Mar 16 09:44:01 teuthology kernel: [3307619.223706] [<ffffffff811cdc13>] ? vfs_write+0x1d3/0x1f0
Mar 16 09:44:01 teuthology kernel: [3307619.229328] [<ffffffff811ccdcb>] new_sync_write+0x7b/0xb0
Mar 16 09:44:01 teuthology kernel: [3307619.235033] [<ffffffff811cdb07>] vfs_write+0xc7/0x1f0
Mar 16 09:44:01 teuthology kernel: [3307619.240420] [<ffffffff811ce012>] SyS_write+0x52/0xb0
Mar 16 09:44:01 teuthology kernel: [3307619.245698] [<ffffffff81729cd6>] system_call_fastpath+0x1a/0x1f
Mar 16 09:44:01 teuthology kernel: [3307619.251951] Code: fe ff ff 48 89 da 48 c7 c6 90 5b 3b a0 48 c7 c7 08 38 3c a0 31 c0 e8 46 c7 02 e1 31 c0 e9 ca fc ff ff 0f 1f 80 00 00 00 00 0f 0b <0f> 0b 4c 89 e9 48 89 da 48 c7 c6 58 5b 3b a0 48 c7 c7 e0 37 3c
Mar 16 09:44:01 teuthology kernel: [3307619.274600] RIP [<ffffffffa0397e0a>] ceph_update_writeable_page+0x40a/0x480 [ceph]
Mar 16 09:44:01 teuthology kernel: [3307619.282588] RSP <ffff880278137b18>
Mar 16 09:44:01 teuthology kernel: [3307619.286356] ---[ end trace 8bcfe0f2cc71a191 ]---
Mar 16 09:44:45 teuthology kernel: [3307663.995782] libceph: mds0 10.214.134.10:6801 socket closed (con state OPEN)
Mar 16 09:44:46 teuthology kernel: [3307664.493813] libceph: mds0 10.214.134.10:6801 connection reset
Mar 16 09:44:46 teuthology kernel: [3307664.499881] libceph: reset on mds0
Mar 16 09:44:46 teuthology kernel: [3307664.503568] ceph: mds0 closed our session
Mar 16 09:44:46 teuthology kernel: [3307664.507899] ceph: mds0 reconnect start
Mar 16 09:44:46 teuthology kernel: [3307664.670763] ceph: mds0 reconnect denied
Mar 16 09:44:46 teuthology kernel: [3307664.676400] libceph: mds0 10.214.134.10:6801 socket closed (con state OPEN)
Mar 16 09:44:47 teuthology kernel: [3307665.476927] libceph: mds0 10.214.134.10:6801 connection reset
Mar 16 09:44:47 teuthology kernel: [3307665.482993] libceph: reset on mds0
Mar 16 09:44:47 teuthology kernel: [3307665.486692] ceph: mds0 closed our session
Mar 16 09:44:47 teuthology kernel: [3307665.490961] ceph: mds0 reconnect start
Mar 16 09:44:47 teuthology kernel: [3307665.496364] ceph: mds0 reconnect denied
Mar 16 09:44:47 teuthology kernel: [3307665.501573] libceph: mds0 10.214.134.10:6801 socket closed (con state OPEN)
Mar 16 09:44:48 teuthology kernel: [3307666.476213] libceph: mds0 10.214.134.10:6801 connection reset
Mar 16 09:44:48 teuthology kernel: [3307666.482235] libceph: reset on mds0
Mar 16 09:44:48 teuthology kernel: [3307666.485895] ceph: mds0 closed our session
Mar 16 09:44:48 teuthology kernel: [3307666.490218] ceph: mds0 reconnect start
Mar 16 09:44:48 teuthology kernel: [3307666.495752] ceph: mds0 reconnect denied
The syslog then ends and the machine restarts.
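For triage, a log like this can be summarized by splitting it on the syslog prefix and counting the message types that matter here (dropped dirty state, BUG hits, denied reconnects). The following is only a minimal sketch: the helper names split_entries and summarize are hypothetical, and the prefix pattern assumes the "Mon DD HH:MM:SS host kernel: [uptime]" shape seen in this report. It also tolerates text whose line breaks were lost when pasted.

```python
import re
from collections import Counter

# Zero-width match in front of each syslog prefix, e.g.
# "Mar 16 09:43:10 teuthology kernel: [3307568.154466] "
# (assumed shape, taken from the log in this report).
PREFIX = re.compile(
    r"(?=[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ kernel: \[ *\d+\.\d+\] )"
)


def split_entries(text):
    """Return one string per kernel log entry, even if newlines were lost."""
    flat = " ".join(text.split())  # collapse newlines and repeated spaces
    return [e.strip() for e in PREFIX.split(flat) if e.strip()]


def summarize(entries):
    """Count the message types relevant to this crash."""
    counts = Counter()
    for entry in entries:
        # Drop everything up to and including the "[uptime] " field.
        msg = entry.split("] ", 1)[1] if "] " in entry else entry
        if msg.startswith("ceph: dropping dirty"):
            counts["dropped_dirty"] += 1
        elif msg.startswith("kernel BUG at"):
            counts["kernel_bug"] += 1
        elif msg.endswith("reconnect denied"):
            counts["reconnect_denied"] += 1
    return counts
```

Typical use would be summarize(split_entries(open(path).read())), where path points at a saved copy of the syslog; requires Python 3.7 or later, since it splits on an empty match.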
Updated by Sage Weil about 9 years ago
- Subject changed from kceph: crash after failure to reconnect to mds to kceph: crash in ceph_update_writeable_page after failure to reconnect to mds
- Description updated (diff)
Updated by Zheng Yan about 9 years ago
Should be a duplicate of #9928. Fixed by commit 97c85a828f36bbfffe9d77b977b65a5872b6cad4 (ceph: introduce global empty snap context).