Project

General

Profile

Actions

Bug #38118

closed

msgr2: race connection results in hang

Added by Sage Weil about 5 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

osd.5 connects to osd.1, sending the client ident


2019-01-30 22:55:29.751 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTING l=0).process
2019-01-30 22:55:29.751 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTING_RE l=0).process nonblock connect inprogress
2019-01-30 22:55:29.751 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTING_RE l=0).process
2019-01-30 22:55:29.751 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTING_RE l=0).process connect successfully, ready to send banner
2019-01-30 22:55:29.751 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=START_CONNECT pgs=0 cs=0 l=0).read_event
2019-01-30 22:55:29.751 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=START_CONNECT pgs=0 cs=0 l=0).start_client_banner_exchange
2019-01-30 22:55:29.751 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0)._banner_exchange
2019-01-30 22:55:29.751 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 27 remaining bytes 0
2019-01-30 22:55:29.751 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=27
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=27
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0)._banner_exchange_handle_peer_banner r=0
2019-01-30 22:55:29.755 7f30af1dd700  1 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0)._banner_exchange_handle_peer_banner peer_type=4 supported=0 required=0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).post_client_banner_exchange
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).send_auth_request peer_type 4 auth_client 0x7fff9dbfa438
2019-01-30 22:55:29.755 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 198 remaining bytes 0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).read_frame
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=8
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag r=0
2019-01-30 22:55:29.755 7f30af1dd700 10 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag next payload_len=32 tag=3
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=32
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_frame_payload r=0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_auth_reply_more payload_len=32
2019-01-30 22:55:29.755 7f30af1dd700  5 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_auth_reply_more auth reply more len=32
2019-01-30 22:55:29.755 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 182 remaining bytes 0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).read_frame
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=8
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag r=0
2019-01-30 22:55:29.755 7f30af1dd700 10 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag next payload_len=48 tag=5
2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=48
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_frame_payload r=0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_auth_done payload_len=48
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).send_client_ident
2019-01-30 22:55:29.755 7f30af1dd700  5 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).send_client_ident sending identification: addrs=v2:172.21.15.96:6811/13292 gid=5 global_seq=6 features_supported=3ffddff8ffacffff features_required=2e01020002046201 flags=0
2019-01-30 22:55:29.755 7f30af1dd700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 88 remaining bytes 0
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).read_frame

osd.0 gets it, responds with server_ident

2019-01-30 22:55:29.878 7f5a42cc7700  5 --2- v2:172.21.15.121:6806/13202 >>  conn(0xe624d00 0xe625180 :-1 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_ident received client identification: addrs=v2:172.21.15.96:6811/13292 gid=5 global_seq=6 features_supported=3ffddff8ffacffff features_required=2e01020002046201 flags=0
2019-01-30 22:55:29.878 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=ACCEPTING_SESSION pgs=6 cs=0 l=0).send_server_ident
2019-01-30 22:55:29.878 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=ACCEPTING_SESSION pgs=6 cs=0 l=0).discard_requeued_up_to 0
2019-01-30 22:55:29.878 7f5a42cc7700  5 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=ACCEPTING_SESSION pgs=6 cs=0 l=0).send_server_ident sending identification: addrs=v2:172.21.15.121:6806/13202 peer_addr=v2:172.21.15.96:6811/13292 gid=1 global_seq=7 features_supported=3ffddff8ffacffff features_required=2e01020002046201 flags=0 cookie=5914555333950814737
2019-01-30 22:55:29.878 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 accept_conn 0xe624d00 v2:172.21.15.96:6811/13292
2019-01-30 22:55:29.878 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 131 remaining bytes 0
2019-01-30 22:55:29.878 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=ACCEPTING_SESSION pgs=6 cs=0 l=0).server_ready
2019-01-30 22:55:29.878 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0).ready entity=osd.5 cookie=5214b8bd5487ba11 in_seq=0 out_seq=0
2019-01-30 22:55:29.878 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0).read_frame

osd.5 injects an error during its read. it never gets the server_ident.

2019-01-30 22:55:29.755 7f30af1dd700 20 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:29.755 7f30af1dd700  0 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until injecting socket failure
2019-01-30 22:55:29.755 7f30af1dd700  1 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 76
2019-01-30 22:55:29.755 7f30af1dd700  1 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 msgr2=0xe498480 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2019-01-30 22:55:29.755 7f30af1dd700 20 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag r=-1
2019-01-30 22:55:29.755 7f30af1dd700  1 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0).handle_read_frame_length_and_tag read frame length and tag failed r=-1 ((1) Operation not permitted)
2019-01-30 22:55:29.755 7f30af1dd700 10 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0)._fault
2019-01-30 22:55:29.755 7f30af1dd700  1 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe498000 0xe498480 :-1 s=CONNECTING pgs=0 cs=0 l=0)._fault with nothing to send, going to standby

osd.0 sees the error too, faults. reconnects.

2019-01-30 22:55:29.878 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:29.878 7f5a42cc7700  1 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 95
2019-01-30 22:55:29.878 7f5a42cc7700  1 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2019-01-30 22:55:29.878 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0).handle_read_frame_length_and_tag r=-1
2019-01-30 22:55:29.878 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0).handle_read_frame_length_and_tag read frame length and tag failed r=-1 ((1) Operation not permitted)
2019-01-30 22:55:29.878 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0)._fault
2019-01-30 22:55:29.878 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=READY pgs=6 cs=0 l=0)._fault with nothing to send, going to standby
2019-01-30 22:55:29.902 7f5a434c8700 10 -- v2:172.21.15.121:6806/13202 connect_to v2:172.21.15.96:6811/13292 existing 0xe624d00
2019-01-30 22:55:29.906 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 connect_to v2:172.21.15.96:6811/13292 existing 0xe624d00
2019-01-30 22:55:31.401 7f5a1f813700 10 -- v2:172.21.15.121:6806/13202 connect_to v2:172.21.15.96:6811/13292 existing 0xe624d00
2019-01-30 22:55:31.401 7f5a1f813700 10 osd.1 14 send_incremental_map 13 -> 14 to 0xe624d00 v2:172.21.15.96:6811/13292
2019-01-30 22:55:31.401 7f5a1f813700  1 -- v2:172.21.15.121:6806/13202 --> v2:172.21.15.96:6811/13292 -- osd_map(14..14 src has 1..14) v4 -- 0xe6ccd00 con 0xe624d00
2019-01-30 22:55:31.401 7f5a1f813700  5 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).send_message enqueueing message m=0xe6ccd00 type=41 osd_map(14..14 src has 1..14) v4
2019-01-30 22:55:31.401 7f5a1f813700 15 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).send_message inline write is denied, reschedule m=0xe6ccd00
2019-01-30 22:55:31.401 7f5a1f813700  1 -- v2:172.21.15.121:6806/13202 --> v2:172.21.15.96:6811/13292 -- pg_query(1.0 epoch 14) v4 -- 0xafcc200 con 0xe624d00
2019-01-30 22:55:31.401 7f5a1f813700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).prepare_send_message m=pg_query(1.0 epoch 14) v4
2019-01-30 22:55:31.401 7f5a1f813700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).prepare_send_message encoding features 4611087854031667199 0xafcc200 pg_query(1.0 epoch 14) v4
2019-01-30 22:55:31.401 7f5a1f813700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).send_message clear encoded buffer previous 4611087854031667199 != 4611087854031667199
2019-01-30 22:55:31.401 7f5a1f813700  5 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).send_message enqueueing message m=0xafcc200 type=81 pg_query(1.0 epoch 14) v4
2019-01-30 22:55:31.401 7f5a1f813700 15 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).send_message inline write is denied, reschedule m=0xafcc200
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).handle_write
2019-01-30 22:55:31.401 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).write_event
2019-01-30 22:55:31.401 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=0 l=0).write_event policy.server is false
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._connect
2019-01-30 22:55:31.401 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=STANDBY pgs=6 cs=1 l=0).connect
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTING l=0).process
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTING_RE l=0).process nonblock connect inprogress
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTING_RE l=0).process
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTING_RE l=0).process connect successfully, ready to send banner
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=START_CONNECT pgs=6 cs=1 l=0).read_event
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=START_CONNECT pgs=6 cs=1 l=0).start_client_banner_exchange
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0)._banner_exchange
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 27 remaining bytes 0
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=27
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=27
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0)._banner_exchange_handle_peer_banner r=0
2019-01-30 22:55:31.401 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0)._banner_exchange_handle_peer_banner peer_type=4 supported=0 required=0
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).post_client_banner_exchange
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).send_auth_request peer_type 4 auth_client 0x7ffd8e557708
2019-01-30 22:55:31.401 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 198 remaining bytes 0
2019-01-30 22:55:31.401 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).read_frame
2019-01-30 22:55:31.401 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag r=0
2019-01-30 22:55:31.405 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag next payload_len=32 tag=3
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=32
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_frame_payload r=0
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_auth_reply_more payload_len=32
2019-01-30 22:55:31.405 7f5a42cc7700  5 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_auth_reply_more auth reply more len=32
2019-01-30 22:55:31.405 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 182 remaining bytes 0
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).read_frame
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag r=0
2019-01-30 22:55:31.405 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag next payload_len=48 tag=5
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=48
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_frame_payload r=0
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_auth_done payload_len=48
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).send_reconnect
2019-01-30 22:55:31.405 7f5a42cc7700  5 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).send_reconnect reconnect to session: cookie=5914555333950814737 gs=9 cs=1 ms=0
2019-01-30 22:55:31.405 7f5a42cc7700 10 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 80 remaining bytes 0

osd.5 accepts, sees the reconnect with a bad cookie, resets

2019-01-30 22:55:31.283 7f30ae9dc700  5 --2- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe4b6400 0xe4b6880 :-1 s=ACCEPTING pgs=0 cs=0 l=0).handle_reconnect no cookie set, reseting client
2019-01-30 22:55:31.283 7f30ae9dc700 10 -- v2:172.21.15.96:6811/13292 >> v2:172.21.15.121:6806/13202 conn(0xe4b6400 msgr2=0xe4b6880 :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 8 remaining bytes 0

osd.0 gets the reset, discards outgoing queued message

2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).read_frame
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-30 22:55:31.405 7f5a42cc7700 20 -- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 msgr2=0xe625180 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=8
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag r=0
2019-01-30 22:55:31.405 7f5a42cc7700 10 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_read_frame_length_and_tag next payload_len=0 tag=9
2019-01-30 22:55:31.405 7f5a42cc7700 20 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_session_reset
2019-01-30 22:55:31.405 7f5a42cc7700  1 --2- v2:172.21.15.121:6806/13202 >> v2:172.21.15.96:6811/13292 conn(0xe624d00 0xe625180 :-1 s=CONNECTING pgs=6 cs=1 l=0).handle_session_reset received session reset

/a/sage-2019-01-30_22:09:10-rados-wip-sage-testing-2019-01-30-1251-distro-basic-smithi/3527478


Related issues 1 (0 open1 closed)

Related to CephFS - Bug #38128: msgr: unexpected "handle_cephx_auth got bad authorizer, auth_reply_len=0"ResolvedSage Weil

Actions
Actions #1

Updated by Sage Weil about 5 years ago

  • Related to Bug #38128: msgr: unexpected "handle_cephx_auth got bad authorizer, auth_reply_len=0" added
Actions #2

Updated by Sage Weil about 5 years ago

  • Status changed from 12 to Resolved
Actions #3

Updated by Greg Farnum about 5 years ago

  • Project changed from RADOS to Messengers
Actions

Also available in: Atom PDF