Bug #4564
closed
client: Close session doesn't wait for outstanding requests
Description
Ran into another failure related to testing #4451 on the client where the following occurs:
client sends create/unlink requests
gets back unsafe replies
kill mds on openc
mds restarts
client does unmount request
client sends replay requests for unsafe requests
mds queues replay requests for later
client sends reconnect
mds enters clientreplay state
mds starts replaying queued requests
mds waits for root inode (mydir) -- again queuing replayed requests
once mydir is populated, starts first replay request
session enters closing state
replay requests are dropped because session is closing
This uncovers two problems:
1. The client now fails an assertion because requests are still outstanding when the session is closed:
../../src/include/xlist.h: In function 'xlist<T>::~xlist() [with T = MetaRequest*]' thread 7f84ea7fc700 time 2013-03-26 17:34:04.329235
../../src/include/xlist.h: 69: FAILED assert(size == 0)
ceph version 0.59-524-g0dcb897 (0dcb897e680de5119b77a216ca5133a2265bc446)
1: (ceph::_ceph_assert_fail(char const*, char const*, int, char const*)+0x9b) [0x7f84f338a791]
2: (xlist<MetaRequest*>::~xlist()+0x3c) [0x7f84f31a7ff2]
3: (MetaSession::~MetaSession()+0x65) [0x7f84f31f29f5]
4: (Client::_closed_mds_session(MetaSession*)+0xbf) [0x7f84f314dd5b]
5: (Client::handle_client_session(MClientSession*)+0x456) [0x7f84f314e1c0]
6: (Client::ms_dispatch(Message*)+0x2d9) [0x7f84f3150785]
7: (Messenger::ms_deliver_dispatch(Message*)+0xa1) [0x7f84f3261a75]
8: (DispatchQueue::entry()+0x54f) [0x7f84f326103f]
9: (DispatchQueue::DispatchThread::entry()+0x22) [0x7f84f336f96a]
2. The mds drops replay requests that the client has already received unsafe replies to.
Proposed fix at the client: wait for all outstanding requests on a session to complete before tearing the session down. The mds shouldn't return session_close until all outstanding requests have completed, but for mds restart scenarios like this one, we might want the client to wait instead of asserting.
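A minimal sketch of the client-side idea, not the actual Client/MetaSession code: the session close is deferred while requests are still outstanding and finishes once the last one completes. `SketchSession`, `handle_session_close`, and `complete_request` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the client's per-mds session; not Ceph's MetaSession.
struct SketchSession {
    std::set<uint64_t> outstanding;  // tids of requests that have not completed yet
    bool close_pending = false;      // a close arrived while requests were outstanding
    bool closed = false;

    void add_request(uint64_t tid) { outstanding.insert(tid); }

    // Called when the mds reports the session closed.
    // Returns true if the session could be torn down immediately.
    bool handle_session_close() {
        if (!outstanding.empty()) {
            close_pending = true;  // wait instead of asserting
            return false;
        }
        finish_close();
        return true;
    }

    // Called when a request completes (or is abandoned).
    void complete_request(uint64_t tid) {
        outstanding.erase(tid);
        if (close_pending && outstanding.empty())
            finish_close();
    }

private:
    void finish_close() {
        // Same invariant that the xlist destructor checks in the backtrace above.
        assert(outstanding.empty());
        close_pending = false;
        closed = true;
    }
};
```

With two requests outstanding, `handle_session_close()` returns false and the session only becomes `closed` after both `complete_request()` calls. How a real client should complete requests that the mds dropped is left open here.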
Proposed fix at the mds: delay handling a session close request until clientreplay is complete, so that queued replay requests are not dropped.
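The mds-side idea can be sketched the same way. This is a toy model under stated assumptions, not the MDS implementation: close requests that arrive during clientreplay are parked and only run after every queued replay request has been processed. `SketchMds` and all of its members are hypothetical.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical model of an mds that postpones session closes during clientreplay.
class SketchMds {
public:
    enum class State { ClientReplay, Active };

    // Queue a replayed client request for later processing.
    void queue_replay(std::function<void()> req) {
        replay_queue_.push_back(std::move(req));
    }

    // Session close request from a client; do_close performs the actual close.
    void handle_close_request(std::function<void()> do_close) {
        if (state_ == State::ClientReplay)
            deferred_closes_.push_back(std::move(do_close));  // delay until replay is done
        else
            do_close();
    }

    // Process every queued replay request, then run the deferred closes.
    void finish_clientreplay() {
        assert(state_ == State::ClientReplay);
        while (!replay_queue_.empty()) {
            auto req = std::move(replay_queue_.front());
            replay_queue_.pop_front();
            req();
        }
        state_ = State::Active;
        while (!deferred_closes_.empty()) {
            auto do_close = std::move(deferred_closes_.front());
            deferred_closes_.pop_front();
            do_close();
        }
    }

    State state() const { return state_; }

private:
    State state_ = State::ClientReplay;
    std::deque<std::function<void()>> replay_queue_;
    std::deque<std::function<void()>> deferred_closes_;
};
```

In this model, a close requested while two replay requests are queued does not run until `finish_clientreplay()` has processed both, which is the ordering the proposed fix asks for.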
Logs attached.
Files