Bug #40613

kclient: .handle_message_footer got old message 1 <= 648 0x558ceadeaac0 client_session(request_renewcaps seq 12), discarding

Added by Patrick Donnelly over 4 years ago. Updated over 3 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Got this assertion with the testing kernel. We haven't seen this type of failure in a while. Last time was #18690.

2019-06-22T20:41:05.217+0000 7f44ce537700  0 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=READ_FOOTER_AND_DISPATCH pgs=259 cs=3 l=0).handle_message_footer got old message 1 <= 648 0x558ceadeaac0 client_session(request_renewcaps seq 12), discarding
2019-06-22T20:41:05.221+0000 7f44ce537700 -1 /build/ceph-15.0.0-2080-g3e9d828/src/msg/async/ProtocolV1.cc: In function 'Ct<ProtocolV1>* ProtocolV1::handle_message_footer(char*, int)' thread 7f44ce537700 time 2019-06-22T20:41:05.219375+0000
/build/ceph-15.0.0-2080-g3e9d828/src/msg/async/ProtocolV1.cc: 965: FAILED ceph_assert(0 == "old msgs despite reconnect_seq feature")

 ceph version 15.0.0-2080-g3e9d828 (3e9d8282615ffe44f24e38a7b3893bc71ecd561f) octopus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x7f44d3c2e037]
 2: (()+0x2d3221) [0x7f44d3c2e221]
 3: (ProtocolV1::handle_message_footer(char*, int)+0xaaf) [0x7f44d3f0c0ff]
 4: (()+0x59e5da) [0x7f44d3ef95da]
 5: (AsyncConnection::process()+0x5d0) [0x7f44d3ee8070]
 6: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x585) [0x7f44d3f4a075]
 7: (()+0x5f4bc2) [0x7f44d3f4fbc2]
 8: (()+0x95aef0) [0x7f44d42b5ef0]
 9: (()+0x76db) [0x7f44d34d36db]
 10: (clone()+0x3f) [0x7f44d2c5a88f]

2019-06-22T20:41:05.225+0000 7f44ce537700 -1 *** Caught signal (Aborted) **
 in thread 7f44ce537700 thread_name:msgr-worker-1

 ceph version 15.0.0-2080-g3e9d828 (3e9d8282615ffe44f24e38a7b3893bc71ecd561f) octopus (dev)
 1: (()+0x12890) [0x7f44d34de890]
 2: (gsignal()+0xc7) [0x7f44d2b77e97]
 3: (abort()+0x141) [0x7f44d2b79801]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1aa) [0x7f44d3c2e092]
 5: (()+0x2d3221) [0x7f44d3c2e221]
 6: (ProtocolV1::handle_message_footer(char*, int)+0xaaf) [0x7f44d3f0c0ff]
 7: (()+0x59e5da) [0x7f44d3ef95da]
 8: (AsyncConnection::process()+0x5d0) [0x7f44d3ee8070]
 9: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x585) [0x7f44d3f4a075]
 10: (()+0x5f4bc2) [0x7f44d3f4fbc2]
 11: (()+0x95aef0) [0x7f44d42b5ef0]
 12: (()+0x76db) [0x7f44d34d36db]
 13: (clone()+0x3f) [0x7f44d2c5a88f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
...
  -207> 2019-06-22T20:40:52.965+0000 7f44cb531700 20 mds.0.13 get_session have 0x558cebcc9680 client.4761 v1:172.21.15.137:0/910809705 state open
  -205> 2019-06-22T20:40:52.965+0000 7f44cb531700  1 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] <== client.4761 v1:172.21.15.137:0/910809705 648 ==== client_caps(flush ino 0x10000001e4d 7760 seq 0 tid 7824 caps=pAsLsXsFsc dirty=Fxw wanted=Fc follows 1 size 941/0 mtime 2009-10-09T17:55:48.000000+0000 tws 2 xattrs(v=4033554479 l=0)) v10 ==== 236+0+0 (unknown 4148257304 0 0) 0x558cec5dd800 con 0x558cebcc8900
  -203> 2019-06-22T20:40:52.965+0000 7f44cb531700 20 mds.0.13 get_session have 0x558cebcc9680 client.4761 v1:172.21.15.137:0/910809705 state open
  -184> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >>  conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is v1:172.21.15.137:0/910809705
  -183> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept of host_type 0, policy.lossy=1 policy.server=0 policy.standby=0 policy.resetcheck=0 features 0x1000003b8
  -182> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept my proto 32, their proto 65536
  -181> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).send_connect_message_reply reply features 0x13b8 = (policy sup 0x3f01cfb8ffadffff & connect 0x1000003b8) | policy req 0x1000
  -180> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send sent bytes 26 remaining bytes 0
  -179> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept of host_type 0, policy.lossy=1 policy.server=0 policy.standby=0 policy.resetcheck=0 features 0x10000015500
  -178> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept my proto 32, their proto 2048
  -177> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).send_connect_message_reply reply features 0x10000015500 = (policy sup 0x3f01cfb8ffadffff & connect 0x10000015500) | policy req 0x1000
  -176> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send sent bytes 26 remaining bytes 0
  -175> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept of host_type 0, policy.lossy=1 policy.server=0 policy.standby=0 policy.resetcheck=0 features 0x0
  -174> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept my proto 32, their proto 0
  -173> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).send_connect_message_reply reply features 0x1000 = (policy sup 0x3f01cfb8ffadffff & connect 0x0) | policy req 0x1000
  -172> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send sent bytes 26 remaining bytes 0
  -171> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept of host_type 256, policy.lossy=1 policy.server=0 policy.standby=0 policy.resetcheck=0 features 0x1550000000000
  -170> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_2 accept my proto 32, their proto 0
  -169> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).send_connect_message_reply reply features 0x1450000001000 = (policy sup 0x3f01cfb8ffadffff & connect 0x1550000000000) | policy req 0x1000
  -168> 2019-06-22T20:40:54.017+0000 7f44cdd36700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send sent bytes 26 remaining bytes 0
  -167> 2019-06-22T20:40:54.025+0000 7f44cdd36700  1 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1).read_bulk peer close file descriptor 33
  -166> 2019-06-22T20:40:54.025+0000 7f44cdd36700  1 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 legacy=0x558cebb6b000 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=1).read_until read failed
  -165> 2019-06-22T20:40:54.025+0000 7f44cdd36700  1 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_message_auth read connect authorizer failed
  -164> 2019-06-22T20:40:54.025+0000 7f44cdd36700  1 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).fault on lossy channel, failing
  -163> 2019-06-22T20:40:54.025+0000 7f44cdd36700  2 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).stop
  -162> 2019-06-22T20:40:54.025+0000 7f44cdd36700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcca880 0x558cebb6b000 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).discard_out_queue started
  -154> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >>  conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is v1:172.21.15.137:0/910809705
  -153> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept of host_type 8, policy.lossy=0 policy.server=1 policy.standby=1 policy.resetcheck=1 features 0x2f018fb86aa42ada
  -152> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept my proto 32, their proto 32
  -151> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
  -149> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2: challenging authorizer
  -148> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).send_connect_message_reply reply features 0x2f018fb86aa42ada = (policy sup 0x3f01cfb8ffadffff & connect 0x2f018fb86aa42ada) | policy req 0x0
  -147> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 legacy=0x558cebb6b800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 58 remaining bytes 0
  -146> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept of host_type 8, policy.lossy=0 policy.server=1 policy.standby=1 policy.resetcheck=1 features 0x2f018fb86aa42ada
  -145> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept my proto 32, their proto 32
  -144> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
  -142> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept setting up session_security.
  -141> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 existing=0x558cebcc8900 exproto=0x558cebb6a800
  -140> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept existing 0x558cebcc8900.gseq 254 <= 256, looks ok
  -139> 2019-06-22T20:40:54.209+0000 7f44ce537700  1 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept connect_seq 1 vs existing csq=1 existing_state=STATE_CONNECTION_ESTABLISHED
  -138> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept connection race, existing 0x558cebcc8900.cseq 1 == 1, OPEN|STANDBY, RETRY_SESSION
  -137> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).send_connect_message_reply reply features 0x2f018fb86aa42ada = (policy sup 0x3f01cfb8ffadffff & connect 0x2f018fb86aa42ada) | policy req 0x0
  -136> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 legacy=0x558cebb6b800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 62 remaining bytes 0
  -135> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept of host_type 8, policy.lossy=0 policy.server=1 policy.standby=1 policy.resetcheck=1 features 0x2f018fb86aa42ada
  -134> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept my proto 32, their proto 32
  -133> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
  -131> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept setting up session_security.
  -130> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 existing=0x558cebcc8900 exproto=0x558cebb6a800
  -129> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept existing 0x558cebcc8900.gseq 254 <= 257, looks ok
  -128> 2019-06-22T20:40:54.209+0000 7f44ce537700  1 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept connect_seq 2 vs existing csq=1 existing_state=STATE_CONNECTION_ESTABLISHED
  -127> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept peer sent cseq 2 > 1
  -126> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).replace accept replacing 0x558cebcc8900
  -125> 2019-06-22T20:40:54.209+0000 7f44ce537700  2 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).stop
  -124> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).discard_out_queue started
  -123> 2019-06-22T20:40:54.209+0000 7f44ce537700  1 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 0x558cebb6b800 :6835 s=CLOSED pgs=0 cs=0 l=0).replace stop myself to swap existing
  -122> 2019-06-22T20:40:54.209+0000 7f44cb531700  5 mds.j ms_handle_reset on v1:172.21.15.137:0/910809705
  -121> 2019-06-22T20:40:54.209+0000 7f44cb531700  1 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebccad00 legacy=0x558cebb6b800 unknown :6835 s=STATE_CLOSED l=0).mark_down
  -120> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING pgs=254 cs=1 l=0).send_connect_message_reply reply features 0x2f018fb86aa42ada = (policy sup 0x3f01cfb8ffadffff & connect 0x2f018fb86aa42ada) | policy req 0x0
  -119> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 62 remaining bytes 0
  -118> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept of host_type 8, policy.lossy=0 policy.server=1 policy.standby=1 policy.resetcheck=1 features 0x2f018fb86aa42ada
  -117> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept my proto 32, their proto 32
  -116> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
  -114> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2: challenging authorizer
  -113> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).send_connect_message_reply reply features 0x2f018fb86aa42ada = (policy sup 0x3f01cfb8ffadffff & connect 0x2f018fb86aa42ada) | policy req 0x0
  -112> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 58 remaining bytes 0
  -111> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept of host_type 8, policy.lossy=0 policy.server=1 policy.standby=1 policy.resetcheck=1 features 0x2f018fb86aa42ada
  -110> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept my proto 32, their proto 32
  -109> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
  -107> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept setting up session_security.
  -106> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=254 cs=1 l=0).handle_connect_message_2 accept new session
  -105> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=259 cs=3 l=0).open accept success, connect_seq = 3 in_seq=648, sending READY
  -104> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=259 cs=3 l=0).open accept features 3387146416985221850 authorizer_protocol 2
  -103> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] accept_conn 0x558cebcc8900 v1:172.21.15.137:0/910809705
  -102> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 70 remaining bytes 0
  -101> 2019-06-22T20:40:54.209+0000 7f44cb531700 10 mds.j  existing session 0x558cebcc9680 for client.4761 v1:172.21.15.137:0/910809705 existing con 0x558cebcc8900, new/authorizing con 0x558cebcc8900
   -99> 2019-06-22T20:40:54.209+0000 7f44cb531700 10 mds.j ms_handle_accept v1:172.21.15.137:0/910809705 con 0x558cebcc8900 session 0x558cebcc9680
   -98> 2019-06-22T20:40:54.209+0000 7f44ce537700  2 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_HANDLED_CONNECT_MSG pgs=259 cs=3 l=0).handle_seq accept get newly_acked_seq 0
   -97> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=ACCEPTING_HANDLED_CONNECT_MSG pgs=259 cs=3 l=0).discard_requeued_up_to 0
   -96> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=OPENED pgs=259 cs=3 l=0).append_keepalive_or_ack
   -95> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0).handle_write
   -94> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=OPENED pgs=259 cs=3 l=0).write_event
   -93> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=OPENED pgs=259 cs=3 l=0).write_event try send msg ack, acked 648 messages
   -92> 2019-06-22T20:40:54.209+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 18 remaining bytes 0
   -41> 2019-06-22T20:41:00.093+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=OPENED pgs=259 cs=3 l=0).append_keepalive_or_ack
   -40> 2019-06-22T20:41:00.097+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0).handle_write
   -39> 2019-06-22T20:41:00.097+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=OPENED pgs=259 cs=3 l=0).write_event
   -38> 2019-06-22T20:41:00.097+0000 7f44ce537700 10 -- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 legacy=0x558cebb6a800 unknown :6835 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 9 remaining bytes 0
    -3> 2019-06-22T20:41:05.217+0000 7f44ce537700 10 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=READ_FOOTER_AND_DISPATCH pgs=259 cs=3 l=0).handle_message_footer aborted = 0
    -2> 2019-06-22T20:41:05.217+0000 7f44ce537700  0 --1- [v2:172.21.15.97:6834/3705416733,v1:172.21.15.97:6835/3705416733] >> v1:172.21.15.137:0/910809705 conn(0x558cebcc8900 0x558cebb6a800 :6835 s=READ_FOOTER_AND_DISPATCH pgs=259 cs=3 l=0).handle_message_footer got old message 1 <= 648 0x558ceadeaac0 client_session(request_renewcaps seq 12), discarding

From: /ceph/teuthology-archive/pdonnell-2019-06-21_01:51:23-multimds-wip-pdonnell-testing-20190620.220400-distro-basic-smithi/4053722/remote/smithi097/log/ceph-mds.j.log.gz
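
For context, the check that fires here lives in ProtocolV1::handle_message_footer() (ProtocolV1.cc:965 in this build). A paraphrased sketch of that logic, reconstructed from the log output and the assert string rather than copied verbatim from the source:

    uint64_t cur_seq = in_seq;
    if (message->get_seq() <= cur_seq) {
      // Duplicate: this seq was already delivered on an earlier
      // incarnation of the session, so the message is discarded.
      ldout(cct, 0) << __func__ << " got old message " << message->get_seq()
                    << " <= " << cur_seq << " " << message << " " << *message
                    << ", discarding" << dendl;
      message->put();
      // With the RECONNECT_SEQ feature negotiated, a replayed old message
      // should be impossible, so debug/QA configurations abort instead of
      // silently dropping it (gated on ms_die_on_old_message).
      if (connection->has_feature(CEPH_FEATURE_RECONNECT_SEQ) &&
          cct->_conf->ms_die_on_old_message) {
        ceph_assert(0 == "old msgs despite reconnect_seq feature");
      }
      return CONTINUE(wait_message);
    }

In the log trail above, the reconnect completed with "connect_seq = 3 in_seq=648" and the MDS then acked 648 messages back to the client, yet the client's next message (client_session(request_renewcaps seq 12)) arrived with messenger seq 1, which is what trips the check.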


Related issues

Related to Linux kernel client - Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature") Resolved 01/26/2017
Related to CephFS - Bug #50696: nautilus: qa: multimds/thrash tasks/cfuse_workunit_suites_fsstress failure Won't Fix

History

#1 Updated by Patrick Donnelly over 4 years ago

  • Description updated (diff)

#2 Updated by Patrick Donnelly over 4 years ago

  • Description updated (diff)

#3 Updated by Patrick Donnelly over 4 years ago

  • Related to Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature") added

#4 Updated by Patrick Donnelly over 4 years ago

  • Status changed from New to Need More Info

Waiting to see if this happens again.

#5 Updated by Patrick Donnelly about 4 years ago

  • Target version deleted (v15.0.0)

#6 Updated by Zheng Yan almost 4 years ago

  • Status changed from Need More Info to Can't reproduce

Seems like a temporary issue with the testing branch.

#7 Updated by Patrick Donnelly over 3 years ago

  • Status changed from Can't reproduce to New

This one is back:

2020-09-24T22:58:20.631 INFO:tasks.ceph.mds.c.smithi150.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-5895-ge3a9554f/rpm/el8/BUILD/ceph-16.0.0-5895-ge3a9554f/src/msg/async/ProtocolV1.cc: In function 'Ct<ProtocolV1>* ProtocolV1::handle_message_footer(char*, int)' thread 7fb146f38700 time 2020-09-24T22:58:20.631671+0000
2020-09-24T22:58:20.632 INFO:tasks.ceph.mds.c.smithi150.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-5895-ge3a9554f/rpm/el8/BUILD/ceph-16.0.0-5895-ge3a9554f/src/msg/async/ProtocolV1.cc: 966: FAILED ceph_assert(0 == "old msgs despite reconnect_seq feature")
2020-09-24T22:58:20.632 INFO:tasks.ceph.mds.c.smithi150.stderr: ceph version 16.0.0-5895-ge3a9554f (e3a9554f85f926ca9d6cf59545ac390783fe8fc5) pacific (dev)
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7fb14d2bd5a8]
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 2: ()
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 3: (ProtocolV1::handle_message_footer(char*, int)+0xebe) [0x7fb14d5a20ee]
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 4: ()
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 5: (AsyncConnection::process()+0x8a9) [0x7fb14d5881f9]
2020-09-24T22:58:20.633 INFO:tasks.ceph.mds.c.smithi150.stderr: 6: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xcb7) [0x7fb14d5e1147]
2020-09-24T22:58:20.634 INFO:tasks.ceph.mds.c.smithi150.stderr: 7: ()
2020-09-24T22:58:20.634 INFO:tasks.ceph.mds.c.smithi150.stderr: 8: ()
2020-09-24T22:58:20.634 INFO:tasks.ceph.mds.c.smithi150.stderr: 9: ()
2020-09-24T22:58:20.634 INFO:tasks.ceph.mds.c.smithi150.stderr: 10: clone()
2020-09-24T22:58:20.634 INFO:tasks.ceph.mds.c.smithi150.stderr:*** Caught signal (Aborted) **

From: /ceph/teuthology-archive/pdonnell-2020-09-24_08:08:06-multimds-wip-pdonnell-testing-20200924.050041-distro-basic-smithi/5465655/teuthology.log
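
Note that this assert only fires when ms_die_on_old_message is enabled; otherwise the old message is merely discarded with the log line above. The QA suites appear to run with it on, which would be why these jobs abort rather than just log the discard. A minimal conf sketch, assuming the standard option name:

    [global]
            ms die on old message = true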

#8 Updated by Zheng Yan over 3 years ago

  • Assignee deleted (Zheng Yan)

#9 Updated by Ramana Raja almost 3 years ago

  • Related to Bug #50696: nautilus: qa: multimds/thrash tasks/cfuse_workunit_suites_fsstress failure added
