Bug #22846

Updated by Kefu Chai about 6 years ago

/a/kchai-2018-01-31_01:48:16-rados-wip-kefu-testing-2018-01-31-0034-distro-basic-mira/2130028 


 I am not sure this is caused by the fastclose.yaml setting. mon.b failed to respond to mon.a's paxos(begin) message in a timely manner, and it was also unable to rejoin the quorum within 15 seconds. It kept trying to start an election and did not respond to mon.a's probe message. mon.a was the leader before the election started. 

 <pre> 

 2018-01-31 10:11:24.574 7f894510d700 10 mon.a@0(leader).paxos(paxos updating c 1..164)    sending begin to mon.1 
 2018-01-31 10:11:24.574 7f894510d700 10 mon.a@0(leader).paxos(paxos updating c 1..164)    sending begin to mon.2 
 ... 
 2018-01-31 10:11:24.578 7f8942908700    1 -- 172.21.7.104:6789/0 <== mon.0 172.21.6.138:6789/0 <== mon.2 172.21.6.138:6790/0 489 603 ==== paxos(accept paxos(begin lc 164 163 fc 0 pn 300 opn 0) v4 ==== 84+0+0 (4169294484 2738+0+0 (515352010 0 0) 0x556113b51c00 0x5598e2230100 con 0x556113faf500 0x5598e1f5ee00 
 ... 
 2018-01-31 10:11:33.826 7f8942908700 10:11:22.577 7f03c0e17700    1 -- 172.21.7.104:6789/0 --> 172.21.6.138:6789/0 <== mon.1 172.21.7.104:6789/0 648 ==== -- paxos(accept lc 164 163 fc 0 pn 300 opn 0) v4 ==== 84+0+0 (3515085151 -- 0x5598e2a25e00 con 0 0) 0x55611429f900 con 0x556113faee00 
 </pre>
