Bug #14357: Delay in clientreplay on quiet clusters - CephFS - Ceph

Bug #14357

closed

Delay in clientreplay on quiet clusters

Added by John Spray over 8 years ago. Updated over 8 years ago.

Status: Resolved

Priority: Normal

Assignee: John Spray

Source: other

Severity: 3 - minor

Description

Because we check for clientreplay_done only at the end of _dispatch, a request that completes via a commit context such as C_MDS_inode_update_finish does not lead us to recognise that clientreplay is done until the next time some other message comes in.

In practice, that means that on a quiet cluster of an MDS and a client, we don't make it out of clientreplay until the client sends its next cap renewal (up to 30s later).
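
The timing problem described above can be illustrated with a small toy model. This is only a sketch and not Ceph code: `ToyMDS`, `journal_commit`, the `check_on_commit` flag, and the message names are hypothetical stand-ins chosen for illustration. The `check_on_commit=True` variant shows one possible remedy (re-checking when a request finishes in a commit callback); it is an assumption, not necessarily the fix that was applied.

```python
class ToyMDS:
    """Hypothetical stand-in for an MDS in clientreplay; not Ceph code."""

    def __init__(self, pending_replays, check_on_commit=False):
        self.state = "clientreplay"
        self.pending_replays = pending_replays
        self.check_on_commit = check_on_commit
        self.commit_callbacks = []  # stands in for journal commit contexts

    def dispatch(self, msg):
        # Handle one incoming message, then check for completion at the end,
        # mirroring the "check at the end of _dispatch" pattern in the report.
        if msg == "replay_request":
            # The request only finishes later, when its journal entry commits.
            self.commit_callbacks.append(self._finish_request)
        self._maybe_finish_clientreplay()

    def journal_commit(self):
        # Run queued commit callbacks outside of dispatch().
        callbacks, self.commit_callbacks = self.commit_callbacks, []
        for cb in callbacks:
            cb()

    def _finish_request(self):
        self.pending_replays -= 1
        if self.check_on_commit:
            # Assumed remedy: also check when a request completes here.
            self._maybe_finish_clientreplay()

    def _maybe_finish_clientreplay(self):
        if self.state == "clientreplay" and self.pending_replays == 0:
            self.state = "active"


def run(check_on_commit):
    """Return (state after the commit, state after the next message)."""
    mds = ToyMDS(pending_replays=1, check_on_commit=check_on_commit)
    mds.dispatch("replay_request")
    mds.journal_commit()
    after_commit = mds.state
    mds.dispatch("cap_renewal")  # next message on an otherwise quiet cluster
    return after_commit, mds.state
```

With `run(False)` the toy MDS is still in `clientreplay` after the commit and only becomes `active` once the later `cap_renewal` message is dispatched; with `run(True)` it becomes `active` as soon as the commit callback runs.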

In normal usage this is mildly annoying; in testing it is especially annoying, because it causes an overly wide variance in the expected timing for failover.