Bug #5480
libceph: unexpected old state in con_sock_state_change
Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This happened after running this job in a loop for a while:
machine_type: mira
interactive-on-error: true
overrides:
  admin_socket:
    branch: cuttlefish
  ceph:
    conf:
      global:
        ms inject socket failures: 500
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    branch: cuttlefish
  install:
    ceph:
      branch: cuttlefish
  workunit:
    branch: cuttlefish
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- exec:
    client.0:
    - modprobe rbd
    - echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
    - echo 'module rbd +p' > /sys/kernel/debug/dynamic_debug/control
- workunit:
    clients:
      client.0:
      - rbd/map-unmap.sh
The crash:
<4>[131857.709460] WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/net/ceph/messenger.c:360 ceph_sock_state_change+0x128/0x220 [libceph]()
<4>[131857.758949] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.10.0-rc7-ceph-00046-g2eb04fe #1
<4>[131857.768041] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.1 05/27/2010
<4>[131857.774684] ffffffffa0707448 ffff88043fc038d8 ffffffff81630d8f ffff88043fc03918
<4>[131857.782298] ffffffff8103fae0 ffff88043fc038f8 000000003148e000 ffff88042c743418
<4>[131857.789931] ffff8800234665a2 0000000000000000 ffff88002346658e ffff88043fc03928
<4>[131857.797570] Call Trace:
<4>[131857.800126] <IRQ> [<ffffffff81630d8f>] dump_stack+0x19/0x1b
<4>[131857.806028] [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
<4>[131857.812151] [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
<4>[131857.818111] [<ffffffffa06ecd58>] ceph_sock_state_change+0x128/0x220 [libceph]
<4>[131857.825502] [<ffffffff815760d9>] tcp_fin+0x149/0x1c0
<4>[131857.830668] [<ffffffff81576ef0>] tcp_data_queue+0x750/0xcb0
<4>[131857.836444] [<ffffffff81579b78>] tcp_rcv_established+0x258/0x880
<4>[131857.842648] [<ffffffff815852cc>] tcp_v4_do_rcv+0x2cc/0x500
<4>[131857.848337] [<ffffffff81586915>] tcp_v4_rcv+0x895/0xc00
<4>[131857.853763] [<ffffffff8155e2f8>] ? ip_local_deliver_finish+0x48/0x370
<4>[131857.860415] [<ffffffff815490cb>] ? sch_direct_xmit+0x6b/0x280
<4>[131857.866411] [<ffffffff8155e3bd>] ip_local_deliver_finish+0x10d/0x370
<4>[131857.872967] [<ffffffff8155e2f8>] ? ip_local_deliver_finish+0x48/0x370
<4>[131857.879619] [<ffffffff81533147>] ? __skb_dst_set_noref+0x27/0xb0
<4>[131857.885879] [<ffffffff8155edd8>] ip_local_deliver+0x88/0x90
<4>[131857.891649] [<ffffffff8155e7a7>] ip_rcv_finish+0x187/0x5f0
<4>[131857.897340] [<ffffffff8155f031>] ip_rcv+0x251/0x320
<4>[131857.903773] [<ffffffff8152a853>] __netif_receive_skb_core+0x563/0x750
<4>[131857.910591] [<ffffffff8152a357>] ? __netif_receive_skb_core+0x67/0x750
<4>[131857.917448] [<ffffffff8152aa66>] __netif_receive_skb+0x26/0x70
<4>[131857.923846] [<ffffffff8152ac7d>] netif_receive_skb+0x2d/0xf0
<4>[131857.929911] [<ffffffff8152b608>] napi_gro_receive+0xe8/0x140
<4>[131857.936020] [<ffffffffa0083ada>] e1000_receive_skb+0x7a/0xf0 [e1000e]
<4>[131857.943028] [<ffffffffa0084cae>] e1000_clean_rx_irq+0x24e/0x430 [e1000e]
<4>[131857.950045] [<ffffffffa008c96f>] e1000e_poll+0xaf/0x310 [e1000e]
<4>[131857.956262] [<ffffffff810d9603>] ? handle_irq_event+0x53/0x70
<4>[131857.962212] [<ffffffff8152b404>] net_rx_action+0xb4/0x1d0
<4>[131857.967814] [<ffffffff81048195>] __do_softirq+0xe5/0x2a0
<4>[131857.973324] [<ffffffff810484ce>] irq_exit+0x9e/0xc0
<4>[131857.978424] [<ffffffff81641d83>] do_IRQ+0x63/0xe0
<4>[131857.983328] [<ffffffff81637caf>] common_interrupt+0x6f/0x6f
<4>[131857.989174] <EOI> [<ffffffff814ed5f3>] ? cpuidle_enter_state+0x63/0xe0
<4>[131857.996033] [<ffffffff814ed5ef>] ? cpuidle_enter_state+0x5f/0xe0
<4>[131858.002240] [<ffffffff814ed73e>] cpuidle_idle_call+0xce/0x240
<4>[131858.008241] [<ffffffff8100ac1e>] arch_cpu_idle+0xe/0x30
<4>[131858.013663] [<ffffffff81091dee>] cpu_startup_entry+0xce/0x2b0
<4>[131858.019615] [<ffffffff8161ba0c>] rest_init+0xbc/0xd0
<4>[131858.024784] [<ffffffff8161b955>] ? rest_init+0x5/0xd0
<4>[131858.030035] [<ffffffff81d00e79>] start_kernel+0x3f1/0x3fe
<4>[131858.035635] [<ffffffff81d00888>] ? repair_env_string+0x5a/0x5a
<4>[131858.041669] [<ffffffff81d005a6>] x86_64_start_reservations+0x2a/0x2c
<4>[131858.048225] [<ffffffff81d0068d>] x86_64_start_kernel+0xe5/0xec
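The raw stack words printed ahead of the call trace can be cross-checked against values that appear in the dmesg output. The following is a minimal sketch, not part of the original report: the helper name `stack_words` is hypothetical, and the stack rows are copied from the warning with their log prefixes removed. It checks whether the connection pointer and the bogus old state reported by `con_sock_state_closing` both appear on the stack.

```python
import re

# The three rows of raw stack words from the WARNING, log prefixes removed.
STACK_DUMP = """
ffffffffa0707448 ffff88043fc038d8 ffffffff81630d8f ffff88043fc03918
ffffffff8103fae0 ffff88043fc038f8 000000003148e000 ffff88042c743418
ffff8800234665a2 0000000000000000 ffff88002346658e ffff88043fc03928
"""


def stack_words(dump):
    """Return every 16-hex-digit word in the dump as an integer."""
    return [int(word, 16) for word in re.findall(r"\b[0-9a-f]{16}\b", dump)]


words = stack_words(STACK_DUMP)

# Values taken from the dmesg excerpt in this report.
bogus_old_state = 826859520        # "unexpected old state 826859520"
con_pointer = 0xFFFF88042C743418   # "con_sock_state_closing con ffff88042c743418"

print(hex(bogus_old_state))        # 0x3148e000
print(bogus_old_state in words)    # True
print(con_pointer in words)        # True
```

Both values are present on the stack, which is consistent with the warning having been raised while handling that connection. On its own this does not show why the stored socket state was invalid.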
From dmesg:
<7>[131856.801962] rbd: rbd_client_release: rbdc ffff88042d17da60
<7>[131856.801966] libceph: destroy_client ffff88041eedb000
<7>[131856.801972] libceph: try_read tag 1 in_base_pos 0
<7>[131856.801979] libceph: try_read done on ffff8800592fb030 ret 0
<7>[131856.801984] libceph: try_write start ffff8800592fb030 state 5
<7>[131856.801988] libceph: try_write out_kvec_bytes 0
<7>[131856.802100] libceph: prepare_write_ack ffff8800592fb030 16 -> 17
<7>[131856.802107] libceph: try_write out_kvec_bytes 9
<7>[131856.802108] libceph: write_partial_kvec ffff8800592fb030 9 left
<7>[131856.802112] libceph: write_partial_kvec ffff8800592fb030 0 left in 0 kvecs ret = 1
<7>[131856.802114] libceph: try_write nothing else to write.
<7>[131856.802115] libceph: try_write done on ffff8800592fb030 ret 0
<7>[131856.802116] libceph: put_osd ffff8800592fb000 2 -> 1
<7>[131856.802128] libceph: osdmap_destroy ffff8802bf78c500
<7>[131856.802140] libceph: remove_all_osds ffff88041eedb950
<7>[131856.802143] libceph: __remove_osd ffff8800592fe800
<7>[131856.802148] libceph: con_close ffff8800592fe830 peer 10.214.134.130:6800
<7>[131856.802152] libceph: reset_connection ffff8800592fe830
<7>[131856.802155] libceph: con_close_socket on ffff8800592fe830 sock ffff880411f5bc00
<7>[131856.802170] libceph: ceph_sock_state_change ffff8800592fe830 state = 1 sk_state = 4
<7>[131856.802181] libceph: con_sock_state_closed con ffff8800592fe830 sock 3 -> 1
<7>[131856.802184] libceph: put_osd ffff8800592fe800 4 -> 3
<7>[131856.802187] libceph: __remove_osd ffff8800592fb000
<7>[131856.802191] libceph: con_close ffff8800592fb030 peer 10.214.134.134:6800
<7>[131856.802194] libceph: reset_connection ffff8800592fb030
<7>[131856.802197] libceph: con_close_socket on ffff8800592fb030 sock ffff88042cef7c00
<7>[131856.802205] libceph: ceph_sock_state_change ffff8800592fb030 state = 1 sk_state = 4
<7>[131856.802213] libceph: con_sock_state_closed con ffff8800592fb030 sock 3 -> 1
<7>[131856.802222] libceph: put_osd ffff8800592fb000 1 -> 0
<7>[131856.802226] libceph: buffer_release ffff88020e9c2200
<7>[131856.802233] libceph: msgpool osd_op destroy
<7>[131856.802236] libceph: msgpool_release osd_op ffff880412e53c18
<7>[131856.802239] libceph: ceph_msg_put last one on ffff880412e53c18
<7>[131856.802242] libceph: msg_kfree ffff880412e53c18
<7>[131856.802245] libceph: msgpool_release osd_op ffff880412e52cb0
<7>[131856.802248] libceph: ceph_msg_put last one on ffff880412e52cb0
<7>[131856.802251] libceph: msg_kfree ffff880412e52cb0
<7>[131856.802254] libceph: msgpool_release osd_op ffff880412e52e80
<7>[131856.802257] libceph: ceph_msg_put last one on ffff880412e52e80
<7>[131856.802263] libceph: msg_kfree ffff880412e52e80
<7>[131856.802264] libceph: msgpool_release osd_op ffff880412e520e8
<7>[131856.802266] libceph: ceph_msg_put last one on ffff880412e520e8
<7>[131856.802267] libceph: msg_kfree ffff880412e520e8
<7>[131856.802268] libceph: msgpool_release osd_op ffff880412e52d98
<7>[131856.802269] libceph: ceph_msg_put last one on ffff880412e52d98
<7>[131856.802270] libceph: msg_kfree ffff880412e52d98
<7>[131856.802272] libceph: msgpool_release osd_op ffff880412e52000
<7>[131856.802273] libceph: ceph_msg_put last one on ffff880412e52000
<7>[131856.802274] libceph: msg_kfree ffff880412e52000
<7>[131856.802275] libceph: msgpool_release osd_op ffff880412e53138
<7>[131856.802277] libceph: ceph_msg_put last one on ffff880412e53138
<7>[131856.802278] libceph: msg_kfree ffff880412e53138
<7>[131856.802279] libceph: msgpool_release osd_op ffff880412e52740
<7>[131856.802280] libceph: ceph_msg_put last one on ffff880412e52740
<7>[131856.802281] libceph: msg_kfree ffff880412e52740
<7>[131856.802283] libceph: msgpool_release osd_op ffff880412e52488
<7>[131856.802285] libceph: ceph_msg_put last one on ffff880412e52488
<7>[131856.802286] libceph: msg_kfree ffff880412e52488
<7>[131856.802287] libceph: msgpool_release osd_op ffff880412e534d8
<7>[131856.802288] libceph: ceph_msg_put last one on ffff880412e534d8
<7>[131856.802289] libceph: msg_kfree ffff880412e534d8
<7>[131856.802291] libceph: msgpool osd_op_reply destroy
<7>[131856.802292] libceph: msgpool_release osd_op_reply ffff880412e53de8
<7>[131856.802294] libceph: ceph_msg_put last one on ffff880412e53de8
<7>[131856.802295] libceph: msg_kfree ffff880412e53de8
<7>[131856.802296] libceph: msgpool_release osd_op_reply ffff880412e52828
<7>[131856.802297] libceph: ceph_msg_put last one on ffff880412e52828
<7>[131856.802299] libceph: msg_kfree ffff880412e52828
<7>[131856.802300] libceph: msgpool_release osd_op_reply ffff880412e53ed0
<7>[131856.802301] libceph: ceph_msg_put last one on ffff880412e53ed0
<7>[131856.802302] libceph: msg_kfree ffff880412e53ed0
<7>[131856.802304] libceph: msgpool_release osd_op_reply ffff880412e53960
<7>[131856.802305] libceph: ceph_msg_put last one on ffff880412e53960
<7>[131856.802306] libceph: msg_kfree ffff880412e53960
<7>[131856.802307] libceph: msgpool_release osd_op_reply ffff880412e52ae0
<7>[131856.802308] libceph: ceph_msg_put last one on ffff880412e52ae0
<7>[131856.802310] libceph: msg_kfree ffff880412e52ae0
<7>[131856.802311] libceph: msgpool_release osd_op_reply ffff880412e535c0
<7>[131856.802312] libceph: ceph_msg_put last one on ffff880412e535c0
<7>[131856.802313] libceph: msg_kfree ffff880412e535c0
<7>[131856.802315] libceph: msgpool_release osd_op_reply ffff880412e52f68
<7>[131856.802316] libceph: ceph_msg_put last one on ffff880412e52f68
<7>[131856.802317] libceph: msg_kfree ffff880412e52f68
<7>[131856.802318] libceph: msgpool_release osd_op_reply ffff880412e529f8
<7>[131856.802319] libceph: ceph_msg_put last one on ffff880412e529f8
<7>[131856.802320] libceph: msg_kfree ffff880412e529f8
<7>[131856.802322] libceph: msgpool_release osd_op_reply ffff880412e53050
<7>[131856.802323] libceph: ceph_msg_put last one on ffff880412e53050
<7>[131856.802324] libceph: msg_kfree ffff880412e53050
<7>[131856.802325] libceph: msgpool_release osd_op_reply ffff880412e53790
<7>[131856.802326] libceph: ceph_msg_put last one on ffff880412e53790
<7>[131856.802328] libceph: msg_kfree ffff880412e53790
<7>[131856.802329] libceph: stop
<7>[131856.802331] libceph: __close_session closing mon1
<7>[131856.802333] libceph: ceph_msg_revoke_incoming msg ffff880412e53d00 null con
<7>[131856.802334] libceph: ceph_msg_revoke_incoming msg ffff880412e53a48 null con
<7>[131856.802336] libceph: con_close ffff88041eedb418 peer 10.214.134.134:6789
<7>[131856.802337] libceph: reset_connection ffff88041eedb418
<7>[131856.802339] libceph: con_close_socket on ffff88041eedb418 sock ffff88042cef6c00
<7>[131856.802343] libceph: ceph_sock_state_change ffff88041eedb418 state = 1 sk_state = 4
<7>[131856.802347] libceph: con_sock_state_closed con ffff88041eedb418 sock 3 -> 1
<7>[131856.802349] libceph: auth_reset ffff880412790700
<7>[131856.802350] libceph: reset
<7>[131856.802354] libceph: auth_destroy ffff880412790700
<7>[131856.802356] libceph: ceph_x_destroy ffff880412790700
<7>[131856.802357] libceph: remove_ticket_handler ffff880429dd79c0 1
<7>[131856.802359] libceph: buffer_release ffff8802bf687400
<7>[131856.802360] libceph: remove_ticket_handler ffff880429dd7300 2
<7>[131856.802361] libceph: buffer_release ffff8802bf687080
<7>[131856.802363] libceph: remove_ticket_handler ffff880429dd71e0 4
<7>[131856.802364] libceph: buffer_release ffff8802bf687700
<7>[131856.802365] libceph: remove_ticket_handler ffff880429dd7060 32
<7>[131856.802366] libceph: buffer_release ffff8802bf687f40
<7>[131856.802368] libceph: buffer_release ffff8802bf687000
<7>[131856.802369] libceph: ceph_msg_put last one on ffff880412e52910
<7>[131856.802370] libceph: msg_kfree ffff880412e52910
<7>[131856.802372] libceph: ceph_msg_put last one on ffff880412e53d00
<7>[131856.802373] libceph: msg_kfree ffff880412e53d00
<7>[131856.802374] libceph: ceph_msg_put last one on ffff880412e533f0
<7>[131856.802375] libceph: msg_kfree ffff880412e533f0
<7>[131856.802376] libceph: ceph_msg_put last one on ffff880412e53a48
<7>[131856.802378] libceph: msg_kfree ffff880412e53a48
<7>[131856.802379] libceph: ceph_debugfs_client_cleanup ffff88041eedb000
<7>[131856.802397] libceph: destroy_options ffff880412791900
<7>[131856.802399] libceph: destroy_client ffff88041eedb000 done
<7>[131857.704702] libceph: ceph_sock_state_change ffff88042c743418 state = 4096 sk_state = 8
<7>[131857.704716] libceph: ceph_sock_state_change TCP_CLOSE_WAIT
<4>[131858.058988] con_sock_state_closing: unexpected old state 826859520
<7>[131858.065326] libceph: con_sock_state_closing con ffff88042c743418 sock 826859520 -> 4
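The numeric states in the debug output are easier to follow with names attached. The sketch below is an illustration only and rests on assumptions: the tables are meant to mirror the TCP state numbering in the kernel's include/net/tcp_states.h and the CON_STATE_* and CON_SOCK_STATE_* definitions in net/ceph/messenger.c around this kernel version, and they should be checked against the exact tree that was built. The helper `name` is hypothetical.

```python
# ASSUMPTION: values mirror include/net/tcp_states.h and the definitions in
# net/ceph/messenger.c from roughly this kernel version. Verify against the
# exact source tree before drawing conclusions.
TCP_STATES = {
    1: "TCP_ESTABLISHED", 2: "TCP_SYN_SENT", 3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1", 5: "TCP_FIN_WAIT2", 6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE", 8: "TCP_CLOSE_WAIT", 9: "TCP_LAST_ACK",
    10: "TCP_LISTEN", 11: "TCP_CLOSING",
}
CON_SOCK_STATES = {
    0: "CON_SOCK_STATE_NEW", 1: "CON_SOCK_STATE_CLOSED",
    2: "CON_SOCK_STATE_CONNECTING", 3: "CON_SOCK_STATE_CONNECTED",
    4: "CON_SOCK_STATE_CLOSING",
}
CON_STATES = {
    1: "CON_STATE_CLOSED", 2: "CON_STATE_PREOPEN", 3: "CON_STATE_CONNECTING",
    4: "CON_STATE_NEGOTIATING", 5: "CON_STATE_OPEN", 6: "CON_STATE_STANDBY",
}


def name(table, value):
    """Map a numeric state to its assumed name, flagging unknown values."""
    return table.get(value, f"<not a known state: {value} / {value:#x}>")


# Normal teardown lines: "state = 1 sk_state = 4" and "sock 3 -> 1".
print(name(CON_STATES, 1), name(TCP_STATES, 4))
print(name(CON_SOCK_STATES, 3), "->", name(CON_SOCK_STATES, 1))

# The failing connection: "state = 4096 sk_state = 8" and "sock 826859520 -> 4".
print(name(CON_STATES, 4096), name(TCP_STATES, 8))
print(name(CON_SOCK_STATES, 826859520), "->", name(CON_SOCK_STATES, 4))
```

Under these assumed tables, the three connections closed during destroy_client go from connected to closed as expected, while both the connection state (4096) and the old socket state (826859520) logged for ffff88042c743418 fall outside the known values. The `sk_state = 8` decoding agrees with the TCP_CLOSE_WAIT line in the log itself.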
mira087 is in kdb.
Updated by Sage Weil almost 10 years ago
- Status changed from New to Can't reproduce