Bug #10915
closed
client: hangs on umount if it had an MDS session evicted
Description
Seen like this with fuse client:
* Start 2 active MDSs
* Do some activity such that sessions are open with both MDSs
* ceph daemon mds.b session evict <client id>
* Umount client
Client does this:
2015-02-19 10:20:38.580641 7f0e2b7c87c0 2 client.4121 _close_mds_session mds.0 seq 4
2015-02-19 10:20:38.580657 7f0e2b7c87c0 2 client.4121 _close_mds_session mds.1 seq 3
2015-02-19 10:20:38.580664 7f0e2b7c87c0 2 client.4121 waiting for 2 mds sessions to close
2015-02-19 10:20:38.591696 7f0e227fc700 10 client.4121 handle_client_session client_session(close) v1 from mds.0
2015-02-19 10:20:38.591771 7f0e227fc700 10 client.4121 remove_session_caps mds.0
2015-02-19 10:20:38.591783 7f0e227fc700 10 client.4121 kick_requests_closed for mds.0
2015-02-19 10:20:38.591792 7f0e227fc700 10 client.4121 unmounting: trim pass, size was 0+0
2015-02-19 10:20:38.591794 7f0e227fc700 20 client.4121 trim_cache size 0 max 0
2015-02-19 10:20:38.591796 7f0e227fc700 10 client.4121 unmounting: trim pass, size still 0+0
2015-02-19 10:20:38.591801 7f0e2b7c87c0 2 client.4121 waiting for 1 mds sessions to close
But the MDS where we evicted the session ignored it:
2015-02-19 10:20:36.922299 7f2266057700 1 -- 172.16.79.251:6813/35691 <== client.4120 172.16.79.251:0/35776 2093931643 ==== client_session(request_close seq 2) v1 ==== 28+0+0 (817140596 0 0) 0x4680000 con 0x4a27e80
2015-02-19 10:20:36.922319 7f2266057700 20 mds.1.server get_session have 0x4a7a700 client.4120 172.16.79.251:0/35776 state closed
2015-02-19 10:20:36.922323 7f2266057700 3 mds.1.server handle_client_session client_session(request_close seq 2) v1 from client.4120
2015-02-19 10:20:36.922326 7f2266057700 10 mds.1.server already closed|closing|killing, dropping this req
I suppose we should always acknowledge client request_close messages, even for a session that is already closed, so that the client can terminate itself.
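The difference between dropping and acknowledging a request_close for an evicted session can be sketched with a toy model. This is not Ceph code: `ToyMDS`, `SessionState`, `handle_request_close` and the `always_ack` flag are hypothetical names made up for illustration, and acknowledging in every non-open state is a simplification.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class SessionState(Enum):
    OPEN = auto()
    CLOSING = auto()
    CLOSED = auto()
    KILLING = auto()


@dataclass
class ToyMDS:
    """Toy stand-in for an MDS (hypothetical, not the real Ceph server)."""
    sessions: dict = field(default_factory=dict)  # client id -> SessionState
    sent: list = field(default_factory=list)      # (client id, message) replies

    def evict(self, client_id):
        # Eviction closes the session without the client having asked for it.
        self.sessions[client_id] = SessionState.CLOSED

    def handle_request_close(self, client_id, always_ack):
        state = self.sessions.get(client_id, SessionState.CLOSED)
        if state is not SessionState.OPEN:
            # Reported behaviour: "already closed|closing|killing, dropping this req".
            if always_ack:
                # Proposed behaviour (assumption): reply anyway so the client can exit.
                self.sent.append((client_id, "close"))
            return
        self.sessions[client_id] = SessionState.CLOSED
        self.sent.append((client_id, "close"))


mds = ToyMDS()
mds.sessions[4120] = SessionState.OPEN
mds.evict(4120)

mds.handle_request_close(4120, always_ack=False)
dropped_replies = list(mds.sent)   # nothing was sent, so the client keeps waiting

mds.handle_request_close(4120, always_ack=True)
acked_replies = list(mds.sent)     # one close reply, so the client can finish umount
```

With `always_ack=False` the evicted client never receives a reply, which matches the "waiting for 1 mds sessions to close" state in the client log.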
Updated by Greg Farnum about 9 years ago
Mmmm, that should be a pretty easy change MDS-side; I'm trying to figure out if it could get us in trouble though. And do we really want the client to be clean if we evicted it? There's probably going to be dirty data... Actually, I think if there is dirty data it will block on that rather than on the missing close.
On the other hand, we might also want the client to be able to shut down happily if a server or the network goes away but it has no dirty data. I don't think there's much harm cluster-side to the client disappearing in that case, so maybe it should time out the close session request and just exit?
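The timeout idea can be sketched in the same toy style. Again, this is only an illustration under assumptions: `ToyClient`, `handle_session_close` and `wait_for_sessions_to_close` are hypothetical names, and the sketch does not model dirty data, which per the comment above would block the unmount separately.

```python
import threading


class ToyClient:
    """Toy stand-in for the client's unmount path (hypothetical names)."""

    def __init__(self, mds_ranks):
        self._cond = threading.Condition()
        self._open = set(mds_ranks)  # ranks we still expect a close from

    def handle_session_close(self, rank):
        # Called when a client_session(close) reply arrives from an MDS.
        with self._cond:
            self._open.discard(rank)
            self._cond.notify_all()

    def wait_for_sessions_to_close(self, timeout):
        """Wait up to `timeout` seconds; return the ranks that never replied."""
        with self._cond:
            self._cond.wait_for(lambda: not self._open, timeout=timeout)
            return set(self._open)


client = ToyClient([0, 1])
client.handle_session_close(0)  # mds.0 acknowledges; mds.1 stays silent
unacknowledged = client.wait_for_sessions_to_close(timeout=0.05)
# The caller can now log the silent ranks and exit instead of hanging forever.
```

Returning the set of silent ranks, rather than a boolean, lets the caller report which MDS never answered before giving up.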
Updated by Greg Farnum almost 8 years ago
- Category set to Administration/Usability
Updated by Patrick Donnelly about 6 years ago
- Subject changed from client hangs on umount if it had an MDS session evicted to client: hangs on umount if it had an MDS session evicted
- Target version set to v13.0.0
- Tags set to intern
- Component(FS) Client, MDS added
Updated by Rishabh Dave about 6 years ago
The issue is also reproducible with the kernel client.
Patrick, can you assign this issue to me?
Updated by Patrick Donnelly about 6 years ago
- Status changed from New to In Progress
Updated by Patrick Donnelly about 6 years ago
- Status changed from In Progress to Fix Under Review
Updated by Patrick Donnelly almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
- Tags deleted (intern)
- Backport set to luminous,jewel
- Component(FS) deleted (MDS)
- Labels (FS) task(intern) added
Updated by Patrick Donnelly almost 6 years ago
- Related to Bug #23975: qa: TestVolumeClient.test_lifecycle needs updated for new eviction behavior added
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #23990: jewel: client: hangs on umount if it had an MDS session evicted added
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #23991: luminous: client: hangs on umount if it had an MDS session evicted added
Updated by Patrick Donnelly almost 6 years ago
- Related to Bug #24053: qa: kernel_mount.py umount must handle timeout arg added
Updated by Patrick Donnelly almost 6 years ago
- Precedes Bug #24054: kceph: umount on evicted client blocks forever added
Updated by Patrick Donnelly over 5 years ago
- Status changed from Pending Backport to Resolved