Bug #37802

closed

msg/async/ProtocolV2.cc: 956: FAILED ceph_assert(0 == "old msgs despite reconnect_seq feature")

Added by Sage Weil over 5 years ago. Updated about 5 years ago.

Status:
Rejected
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

 -2529> 2019-01-06 16:55:05.379 7f44cb846f00  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] --> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] -- mon_probe(probe abbe56b3-25c7-4790-8a34-5842dabf304a name d) v6 -- 0x55ecda2ff8c0 con 0x55ecda546d00
 -2304> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] <== mon.6 v2:172.21.15.112:3302/0 1 ==== mon_probe(probe abbe56b3-25c7-4790-8a34-5842dabf304a name g) v6 ==== 58+0+0 (541348649 0 0) 0x55ecda301180 con 0x55ecda546d00
 -2291> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] --> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] -- mon_probe(reply abbe56b3-25c7-4790-8a34-5842dabf304a name d paxos( fc 1 lc 201 )) v6 -- 0x55ecda5ba580 con 0x55ecda546d00
 -2183> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] <== mon.6 v2:172.21.15.112:3302/0 2 ==== mon_probe(probe abbe56b3-25c7-4790-8a34-5842dabf304a name g) v6 ==== 58+0+0 (541348649 0 0) 0x55ecda5263c0 con 0x55ecda546d00
 -2173> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] --> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] -- mon_probe(reply abbe56b3-25c7-4790-8a34-5842dabf304a name d paxos( fc 1 lc 201 )) v6 -- 0x55ecda5bcc00 con 0x55ecda546d00
 -2028> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] <== mon.6 v2:172.21.15.112:3302/0 3 ==== election(abbe56b3-25c7-4790-8a34-5842dabf304a propose 243) v7 ==== 2214+0+0 (2518065110 0 0) 0x55ecda2f8900 con 0x55ecda546d00
 -1906> 2019-01-06 16:55:05.383 7f44b3054700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] <== mon.6 v2:172.21.15.112:3302/0 4 ==== mon_probe(reply abbe56b3-25c7-4790-8a34-5842dabf304a name g quorum 1,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 paxos( fc 1 lc 243 )) v6 ==== 2270+0+0 (2636899176 0 0) 0x55ecda526ec0 con 0x55ecda546d00
 -1655> 2019-01-06 16:55:05.383 7f44b7473700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] >> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] conn(0x55ecda546d00 msgr2 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 30
 -1650> 2019-01-06 16:55:05.383 7f44b7473700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] >> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] conn(0x55ecda546d00 msgr2 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
 -1646> 2019-01-06 16:55:05.383 7f44b7473700  1 --2- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] >> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] conn(0x55ecda546d00 msgr2 :-1 s=OPENED pgs=53 cs=1 l=0).handle_message read tag failed
  -269> 2019-01-06 16:56:05.386 7f44b5859700  1 -- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] --> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] -- mon_probe(probe abbe56b3-25c7-4790-8a34-5842dabf304a name d) v6 -- 0x55ecda2fedc0 con 0x55ecda546d00
  -193> 2019-01-06 16:56:05.386 7f44b7c74700  0 --2- [v2:172.21.15.112:3301/0,v1:172.21.15.112:6790/0] >> [v2:172.21.15.112:3302/0,v1:172.21.15.112:6791/0] conn(0x55ecda546d00 msgr2 :-1 s=READ_FOOTER_AND_DISPATCH pgs=56 cs=3 l=0).handle_message_footer got old message 2 <= 4 0x55ecda301180 mon_probe(reply abbe56b3-25c7-4790-8a34-5842dabf304a name g quorum 1,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 paxos( fc 1 lc 291 )) v6, discarding
    -1> 2019-01-06 16:56:05.386 7f44b7c74700 -1 /build/ceph-14.0.1-2309-g5f4e8b4/src/msg/async/ProtocolV2.cc: In function 'Ct<ProtocolV2>* ProtocolV2::handle_message_footer(char*, int)' thread 7f44b7c74700 time 2019-01-06 16:56:05.388832
/build/ceph-14.0.1-2309-g5f4e8b4/src/msg/async/ProtocolV2.cc: 956: FAILED ceph_assert(0 == "old msgs despite reconnect_seq feature")

 ceph version 14.0.1-2309-g5f4e8b4 (5f4e8b42ac9b87ac64ea4171336396209061e043) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f44c2aa9479]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f44c2aa9654]
 3: (ProtocolV2::handle_message_footer(char*, int)+0xcbd) [0x7f44c2dda20d]
 4: (()+0x57488d) [0x7f44c2dd188d]
 5: (AsyncConnection::process()+0x186) [0x7f44c2da82a6]
 6: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x1595) [0x7f44c2df2d15]
 7: (()+0x5a16fa) [0x7f44c2dfe6fa]
 8: (()+0xbe733) [0x7f44c1209733]
 9: (()+0x76db) [0x7f44c16e46db]

/a/sage-2019-01-06_15:12:50-rados:multimon-wip-sage3-testing-2019-01-04-1436-distro-basic-smithi/3431278
/a/sage-2019-01-06_15:12:50-rados:multimon-wip-sage3-testing-2019-01-04-1436-distro-basic-smithi/3431274
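
For readers unfamiliar with the check that fires here, the following is a minimal, self-contained sketch of the kind of sequence-number test implied by the "got old message 2 <= 4 ... discarding" log line and the assertion text. It is not the code in ProtocolV2.cc: the names RecvState, SeqCheck, check_incoming_seq, peer_has_reconnect_seq and die_on_old_message are hypothetical, and an exception stands in for ceph_assert. The assumption is that a message whose sequence number is not greater than the last delivered one is discarded, and that this is treated as fatal when the peer is supposed to resynchronize sequence numbers on reconnect.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical outcome of checking one incoming message.
enum class SeqCheck { Deliver, DiscardOld };

// Hypothetical, simplified per-connection receive state.
struct RecvState {
  std::uint64_t in_seq = 0;            // highest sequence number delivered so far
  bool peer_has_reconnect_seq = true;  // assumed: peer resyncs seq on reconnect
  bool die_on_old_message = true;      // assumed: debug setting that makes this fatal
};

// Decide what to do with a message carrying sequence number msg_seq.
SeqCheck check_incoming_seq(RecvState& st, std::uint64_t msg_seq) {
  if (msg_seq <= st.in_seq) {
    // Corresponds to "got old message <msg_seq> <= <in_seq> ..., discarding".
    if (st.peer_has_reconnect_seq && st.die_on_old_message) {
      // Stand-in for the failed assertion in the report.
      throw std::logic_error("old msgs despite reconnect_seq feature");
    }
    return SeqCheck::DiscardOld;
  }
  st.in_seq = msg_seq;  // new message: advance the delivered sequence
  return SeqCheck::Deliver;
}
```

Under these assumptions, delivering messages 1 through 4 and then receiving sequence 2 again after the reconnect reproduces the "2 <= 4" condition shown in the log.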

Actions #1

Updated by Sage Weil over 5 years ago

  • Description updated (diff)
Actions #2

Updated by Sage Weil over 5 years ago

  • Description updated (diff)
Actions #3

Updated by Sage Weil over 5 years ago

  • Status changed from 12 to Rejected

I was hitting this as a side effect of one or more of #36497, #37778, or #37779.

Actions #4

Updated by Greg Farnum about 5 years ago

  • Project changed from RADOS to Messengers