Bug #14357

Delay in clientreplay on quiet clusters

Added by John Spray almost 7 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Because we check for clientreplay_done only at the end of _dispatch, a request that completes via a commit context such as C_MDS_inode_update_finish does not trigger the check, so we don't recognise that clientreplay is done until the next time some other message comes in.

In practice, that means that on a quiet cluster with one MDS and one client, we don't make it out of clientreplay until the client sends its next cap renewal (up to 30s later).

In normal usage this is mildly annoying; in testing it is especially annoying because it causes an overly wide variance in the expected timing of failover.
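The behaviour described above can be illustrated with a toy model. This is only a sketch, not the Ceph MDS code: ToyMds, maybe_clientreplay_done, reply_to_client, run_commit_contexts and extra_messages_needed are hypothetical names invented for illustration, and the real state machine is considerably more involved. It assumes a single replayed request whose reply is deferred to a commit callback that runs outside of dispatch.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy model only: all names here are hypothetical, not the Ceph MDS API.
enum class State { ClientReplay, Active };

struct ToyMds {
    State state = State::ClientReplay;
    int pending_replay_requests = 0;  // replayed requests not yet answered
    bool check_on_reply = false;      // false: behaviour in this report; true: check when replying
    std::deque<std::function<void()>> commit_contexts;  // deferred completions

    // Leave clientreplay once every replayed request has been answered.
    void maybe_clientreplay_done() {
        if (state == State::ClientReplay && pending_replay_requests == 0)
            state = State::Active;
    }

    // Completing a request means replying to the client.
    void reply_to_client() {
        assert(pending_replay_requests > 0);
        --pending_replay_requests;
        if (check_on_reply)
            maybe_clientreplay_done();
    }

    // Message handler. A replayed request only queues its completion here;
    // the reply happens later, from the commit context.
    void dispatch(bool is_replay_request) {
        if (is_replay_request)
            commit_contexts.push_back([this] { reply_to_client(); });
        maybe_clientreplay_done();  // the check at the end of dispatch
    }

    // Stand-in for commit callbacks firing outside of dispatch.
    void run_commit_contexts() {
        while (!commit_contexts.empty()) {
            auto fn = std::move(commit_contexts.front());
            commit_contexts.pop_front();
            fn();
        }
    }
};

// Counts how many unrelated messages (for example a cap renewal) must
// arrive before the toy MDS leaves clientreplay.
int extra_messages_needed(bool check_on_reply) {
    ToyMds mds;
    mds.check_on_reply = check_on_reply;
    mds.pending_replay_requests = 1;
    mds.dispatch(true);         // replayed request arrives, reply is deferred
    mds.run_commit_contexts();  // commit completes the request
    int extra = 0;
    while (mds.state != State::Active) {
        mds.dispatch(false);    // some other message comes in
        ++extra;
    }
    return extra;
}
```

In this model, extra_messages_needed(false) returns 1 because the state only advances when another message reaches dispatch, which on a quiet cluster is the next cap renewal. With the check also made when replying, extra_messages_needed(true) returns 0.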

Associated revisions

Revision 24de350d (diff)
Added by John Spray almost 7 years ago

mds: advance clientreplay when replying

...not just at the end of _dispatch. Often we reply
to clients (i.e. complete a request) outside of
_dispatch, and currently in these cases we fail
to check for clientreplay completion (only hitting
that next time someone talks to _dispatch)

Fixes: #14357
Signed-off-by: John Spray <>

History

#1 Updated by John Spray almost 7 years ago

  • Status changed from In Progress to Fix Under Review

#2 Updated by Zheng Yan almost 7 years ago

  • Status changed from Fix Under Review to Resolved
