The client is waiting for a session close message from mds.8.
2014-05-19 15:33:27.831973 7f38df361700 0 -- 10.214.131.11:6815/12204 >> 10.214.131.21:0/2397708351 pipe(0x2e81a00 sd=25 :6815 s=2 pgs=21 cs=23 l=0 c=0x2fcb6e0).fault with nothing to send, going to standby
2014-05-19 15:33:31.126319 7f38e0c6c700 10 mds.8.1 beacon_send up:active seq 2720 (currently up:active)
2014-05-19 15:33:31.126354 7f38e0c6c700 1 -- 10.214.131.11:6815/12204 --> 10.214.131.10:6789/0 -- mdsbeacon(4098/e up:active seq 2720 v8) v2 -- ?+0 0x322cb00 con 0x2e5f9a0
2014-05-19 15:33:31.127380 7f38e3673700 1 -- 10.214.131.11:6815/12204 <== mon.0 10.214.131.10:6789/0 2818 ==== mdsbeacon(4098/e up:active seq 2720 v8) v2 ==== 103+0+0 (425398902 0 0) 0x32348c0 con 0x2e5f9a0
2014-05-19 15:33:31.127415 7f38e3673700 10 mds.8.1 handle_mds_beacon up:active seq 2720 rtt 0.001081
2014-05-19 15:33:32.051766 7f38e3673700 1 -- 10.214.131.11:6815/12204 <== osd.2 10.214.131.10:6805/10269 1 ==== osd_op_reply(41 208.00000001 [write 3628~198] v5'1367 uv1367 ondisk = 0) v6 ==== 179+0+0 (1233982911 0 0) 0x2fadc80 con 0x2ffd9a0
2014-05-19 15:33:32.051824 7f38e3673700 10 mds.8.server _session_logged client.4182 10.214.131.21:0/2397708351 state_seq 3 close 17
2014-05-19 15:33:32.051837 7f38e3673700 1 -- 10.214.131.11:6815/12204 mark_disposable 0x2fcb6e0 -- 0x2e81a00
2014-05-19 15:33:32.051843 7f38e3673700 10 mds.8.1 send_message_client client.4182 10.214.131.21:0/2397708351 client_session(close) v1
2014-05-19 15:33:32.051851 7f38e3673700 1 -- 10.214.131.11:6815/12204 --> 10.214.131.21:0/2397708351 -- client_session(close) v1 -- ?+0 0x31ffc40 con 0x2fcb6e0
2014-05-19 15:33:32.051894 7f38e3673700 1 -- 10.214.131.11:6815/12204 <== osd.2 10.214.131.10:6805/10269 2 ==== osd_op_reply(43 208.00000001 [write 3826~198] v5'1369 uv1369 ondisk = 0) v6 ==== 179+0+0 (109633772 0 0) 0x2f55280 con 0x2ffd9a0
The session close message was dropped because the MDS first marked the connection disposable and only then sent the session close message (the connection was already in standby state).
Server::_session_logged() has the following comment:
if (session->is_closing()) {
  // mark con disposable.  if there is a fault, we will get a
  // reset and clean it up.  if the client hasn't received the
  // CLOSE message yet, they will reconnect and get an
  // ms_handle_remote_reset() and realize they had in fact closed.
  // do this *before* sending the message to avoid a possible
  // race.
  mds->messenger->mark_disposable(session->connection.get());

  // reset session
  mds->send_message_client(new MClientSession(CEPH_SESSION_CLOSE), session);
  mds->sessionmap.set_state(session, Session::STATE_CLOSED);
  session->clear();
}
I have no idea what race the comment refers to.
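
The ordering problem seen in the log can be illustrated with a toy model. This is only a sketch, not Ceph's messenger code: ToyConnection, its fields, and close_session() are hypothetical names. The rule that a standby connection which is already disposable discards a send is an assumption taken from the behaviour observed above, not from the messenger source.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a messenger connection; NOT Ceph's Pipe/Connection classes.
struct ToyConnection {
  bool standby = false;                // peer faulted with nothing to send
  bool disposable = false;             // set by mark_disposable()
  std::vector<std::string> delivered;  // messages that reached the client

  void mark_disposable() { disposable = true; }

  // Assumed rule (from the log above): a standby connection reconnects to
  // deliver a message unless it is already disposable, in which case the
  // message is silently discarded.
  void send(const std::string& msg) {
    assert(!msg.empty());
    if (standby && disposable)
      return;                          // dropped, the client never sees it
    standby = false;                   // reconnect (or connection already open)
    delivered.push_back(msg);
  }
};

// Returns true if the client receives the close message.
bool close_session(bool mark_before_send) {
  ToyConnection con;
  con.standby = true;  // as in the log: "fault with nothing to send, going to standby"
  if (mark_before_send) {
    con.mark_disposable();             // order used by Server::_session_logged()
    con.send("client_session(close)");
  } else {
    con.send("client_session(close)"); // reversed order, shown only for contrast
    con.mark_disposable();
  }
  return !con.delivered.empty();
}
```

In this model close_session(true) returns false (the close message is lost and the client keeps waiting), while close_session(false) returns true. The reversed order is shown only for contrast; whether it would reintroduce the race mentioned in the code comment is not something this sketch can answer.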