Bug #37779

closed

msg/async: connection race + winner fault can leave connection in standby

Added by Sage Weil over 5 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: AsyncMessenger
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

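On one end, the accepting connection detects a connection race with the existing connection (0x557db5f00000) and sends WAIT: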

2019-01-02 16:28:53.247 7f13ea225700 10 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5ee3800 legacy :6802 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept existing 0x557db5f00000.gseq 0 <= 1, looks ok
2019-01-02 16:28:53.247 7f13ea225700  1 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5ee3800 legacy :6802 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept connect_seq 0 vs existing csq=0 existing_state=STATE_CONNECTING_RE
2019-01-02 16:28:53.247 7f13ea225700 10 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5ee3800 legacy :6802 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept connection race, existing 0x557db5f00000.cseq 0 == 0, sending WAIT

But then the existing connection, which won the race, hits an injected socket failure and goes to standby:

2019-01-02 16:28:53.247 7f13eba28700  0 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send injecting socket failure
2019-01-02 16:28:53.247 7f13eba28700  1 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send send error: (32) Broken pipe
2019-01-02 16:28:53.247 7f13eba28700 20 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_client_banner_write r=-32
2019-01-02 16:28:53.247 7f13eba28700  1 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_client_banner_write write client banner failed
2019-01-02 16:28:53.247 7f13eba28700 20 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=CONNECTING pgs=0 cs=0 l=0).fault
2019-01-02 16:28:53.247 7f13eba28700 10 -- 172.21.15.125:6802/33738 >> 172.21.15.6:6809/33741 conn(0x557db5f00000 legacy :-1 s=CONNECTING pgs=0 cs=0 l=0).fault with nothing to send, going to standby

And then nothing: the connection stays in standby.

/a/sage-2019-01-02_14:51:32-rados-master-distro-basic-smithi/3414810
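
The sequence in the logs above can be sketched as a toy model. This is not Ceph's AsyncConnection code: `ToyConn`, `State`, `send_wait`, `fault`, and `simulate` are hypothetical names made up for illustration, and the assumption that the peer simply waits after receiving WAIT is inferred from the log messages, not checked against the source. Under those assumptions, the sketch shows why neither side makes progress once the race winner faults with an empty send queue.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    CONNECTING = auto()
    WAIT = auto()
    STANDBY = auto()


@dataclass
class ToyConn:
    """Hypothetical stand-in for one endpoint's connection object."""
    name: str
    state: State = State.CONNECTING
    out_queue: list = field(default_factory=list)

    def fault(self) -> None:
        # Modeled on "fault with nothing to send, going to standby":
        # with an empty queue, nothing schedules a reconnect.
        self.state = State.CONNECTING if self.out_queue else State.STANDBY


def send_wait(loser: ToyConn) -> None:
    # Modeled on "accept connection race ... sending WAIT": the losing
    # attempt is told to wait for the winner (assumption, see lead-in).
    loser.state = State.WAIT


def simulate() -> tuple:
    winner = ToyConn("existing outgoing conn (race winner)")
    peer = ToyConn("peer's incoming attempt (race loser)")
    send_wait(peer)   # race resolved in favor of the existing connection
    winner.fault()    # injected socket failure while writing the banner
    # Stuck: the winner idles in standby while the peer waits for it.
    stuck = winner.state is State.STANDBY and peer.state is State.WAIT
    return winner.state, peer.state, stuck


if __name__ == "__main__":
    print(simulate())
```

In this model, the winner only retries when its queue is non-empty, so the combination of WAIT on one side and standby on the other has no event that would restart the handshake.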


Related issues 3 (0 open, 3 closed)

Related to Messengers - Bug #37799: msg/async: RESETSESSION due to connection reset during initial connection (Can't reproduce, 01/06/2019)
Copied to Messengers - Backport #38241: mimic: msg/async: connection race + winner fault can leave connection in standby (Rejected)
Copied to Messengers - Backport #38242: luminous: msg/async: connection race + winner fault can leave connection in standby (Resolved, xie xingguo)
