Feature #22446

mds: ask idle client to trim more caps

Added by Zheng Yan about 1 year ago. Updated about 1 month ago.

Status: In Progress
Priority: Urgent
Category: -
Target version:
Start date: 12/15/2017
Due date:
% Done: 0%
Source: Development
Tags:
Backport: mimic,luminous
Reviewed:
Affected Versions:
Component(FS): Client, MDS, kceph
Labels (FS):
Pull request ID:

Description

We can add a decay counter to the client session, tracking the rate at which we add new caps to the client.


Related issues

Related to fs - Feature #21156: mds: speed up recovery with many open inodes Resolved 08/28/2017

History

#1 Updated by Patrick Donnelly about 1 year ago

  • Status changed from New to Feedback
  • Priority changed from Normal to Low
  • Target version set to v13.0.0
  • Component(FS) Client added

What's the goal? Prevent the situation where the client has ~1M caps for an indefinite period like what we saw on the mailing list?

#2 Updated by Patrick Donnelly about 1 year ago

Ah, the problem is recovery of the MDS takes too long. (from follow-up posts to "[ceph-users] cephfs mds millions of caps")

I'd like to see measurements to see how long it takes the MDS to load metadata during recovery. Is there a way we could make that faster?

tracking the rate at which we add new caps to the client

Do we really care about the rate we add caps? If the client is bumping into its cache limit then it will begin trimming. We should only care if the caps are old and unused?

#3 Updated by Zheng Yan about 1 year ago

It's not about recovery time; clients already trim their cache aggressively when the MDS recovers.

An idle client holding so many caps is wasteful, because it increases the chance that the MDS trims other, relatively hot objects from the cache.

#4 Updated by Patrick Donnelly 12 months ago

Zheng Yan wrote:

An idle client holding so many caps is wasteful, because it increases the chance that the MDS trims other, relatively hot objects from the cache.

Okay, that makes sense. I think the decay counter idea is good; let's do it.

#5 Updated by Patrick Donnelly 12 months ago

  • Related to Feature #21156: mds: speed up recovery with many open inodes added

#6 Updated by Patrick Donnelly 7 months ago

  • Status changed from Feedback to New
  • Assignee set to Rishabh Dave
  • Priority changed from Low to High
  • Target version changed from v13.0.0 to v14.0.0
  • Backport set to mimic,luminous
  • Component(FS) MDS, kceph added

#7 Updated by Webert Lima 7 months ago

Glad to see this :)

- Backport set to mimic,luminous

Thanks.

#8 Updated by Rishabh Dave 6 months ago

Can I get a few implementation-specific details to get started on this issue?

And for clarity on my side, we would trim caps when the count reaches zero (and the counter reaches zero when there are no ops on the session for a certain while, i.e. it is idle), right?

#9 Updated by Patrick Donnelly 5 months ago

Rishabh Dave wrote:

Can I get a few implementation-specific details to get started on this issue?

Copying from your email:

  • The issue description talks about a "decay counter". I haven't found
    a source to verify what I understand by the term, but in my
    understanding we'll increase the count with every cap that the
    client acquires, and then if there are no ops for a certain period of
    time, the count must naturally decrease until either there are no more
    caps or an op is conducted on the filesystem. Please correct me
    if I am wrong here.

DecayCounter is a data structure used in the MDS in various places to track popularity of metadata (for the purposes of subtree migration).
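
To make the term concrete, here is a minimal sketch of an exponentially decaying counter. It is not Ceph's actual DecayCounter code; the class layout, the `half_life` parameter, and the explicit `now` argument are all assumptions chosen for illustration. Each hit adds to the value, and the value halves for every `half_life` seconds that pass without hits.

```python
import math


class DecayCounter:
    """Illustrative exponentially decaying counter (hypothetical, not Ceph's API)."""

    def __init__(self, half_life: float, now: float = 0.0):
        if half_life <= 0:
            raise ValueError("half_life must be positive")
        # Decay constant chosen so the value halves every `half_life` seconds.
        self._k = math.log(2) / half_life
        self._value = 0.0
        self._last = now

    def _decay(self, now: float) -> None:
        # Apply the decay accumulated since the last update.
        elapsed = now - self._last
        if elapsed > 0:
            self._value *= math.exp(-self._k * elapsed)
            self._last = now

    def hit(self, now: float, amount: float = 1.0) -> float:
        # Decay first, then record the new event(s).
        self._decay(now)
        self._value += amount
        return self._value

    def get(self, now: float) -> float:
        self._decay(now)
        return self._value


counter = DecayCounter(half_life=10.0)
counter.hit(now=0.0, amount=8.0)   # e.g. eight caps handed out at t=0
after_one_half_life = counter.get(now=10.0)
after_two_half_lives = counter.get(now=20.0)
```

With no further hits, the value shrinks toward zero but never reaches it exactly, so any "idle" decision built on such a counter would need a threshold rather than a comparison with zero.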

  • The issue description says "tracking the rate we add new cap to
    client". What is the purpose behind knowing the rate? In my
    understanding, idle clients must trim caps regardless. Is it to
    vary the timeout for each session accordingly?

I'm not sure having a DecayCounter for each session is quite right. We'd probably want a DecayCounter per-cap and trim caps that haven't been used in a long time (like 1 week?).
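
As a rough sketch of the per-cap idea, the following tracks when each cap was last used and trims those unused for longer than a limit. Note the substitution: it uses a plain last-used timestamp per cap instead of a DecayCounter, purely to keep the example short. `CapTable`, `touch`, `trim_idle`, and the one-week default are hypothetical names and values, not anything from the Ceph client.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ONE_WEEK = 7 * 24 * 60 * 60  # seconds; the "like 1 week?" figure from the comment


@dataclass
class CapTable:
    """Hypothetical client-side table mapping inode number -> last use time."""

    idle_limit: float = ONE_WEEK
    last_used: Dict[int, float] = field(default_factory=dict)

    def touch(self, ino: int, now: float) -> None:
        # Called whenever some file I/O relies on the cap for this inode.
        self.last_used[ino] = now

    def trim_idle(self, now: float) -> List[int]:
        # Collect caps unused for longer than idle_limit, then drop them.
        stale = [ino for ino, t in self.last_used.items()
                 if now - t > self.idle_limit]
        for ino in stale:
            del self.last_used[ino]
        return sorted(stale)


table = CapTable()
table.touch(ino=1, now=0.0)
table.touch(ino=2, now=float(ONE_WEEK))
trimmed = table.trim_idle(now=ONE_WEEK + 1.0)  # only inode 1 is past the limit
```

A real implementation would release the caps back to the MDS rather than just forgetting them, and would likely bound how many it trims per pass.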

  • Should this feature be implemented in the MDS
    (idle sessions are requested to trim caps) or in the client (the session
    trims its caps after a certain period of time)?

In the client. Only the client knows how often it uses its capabilities (i.e. how often some file I/O relies on a capability).

  • Currently, on the MDS side we already trim caps for idle sessions
    after `session_timeout`[2]. So, how is this feature different from
    that?

Idle sessions are not the same as idle clients in this context. An idle client would be a client that hasn't done any significant file I/O recently ("recently" is to be defined), whereas an idle session is a concept in the MDS where the client has not spoken to the MDS in session_timeout seconds (e.g. hasn't renewed caps as required).
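
The distinction can be sketched as two independent checks. Everything here is hypothetical: `ClientState`, its fields, and `io_idle_window` are illustrative names, and the numeric values in the example are arbitrary placeholders, since the threshold for "recently" has not been defined.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    """Hypothetical per-client timestamps, in seconds."""

    last_mds_contact: float  # last message to the MDS (e.g. a cap renewal)
    last_file_io: float      # last file I/O that relied on a capability


def is_idle_session(c: ClientState, now: float, session_timeout: float) -> bool:
    # MDS-side notion: the client has not spoken to the MDS in session_timeout seconds.
    return now - c.last_mds_contact > session_timeout


def is_idle_client(c: ClientState, now: float, io_idle_window: float) -> bool:
    # Client-side notion: no significant file I/O within the (undefined) window.
    return now - c.last_file_io > io_idle_window


# A client that keeps renewing caps but has done no file I/O for a day.
now = 86_400.0
client = ClientState(last_mds_contact=now - 5.0, last_file_io=0.0)
session_idle = is_idle_session(client, now, session_timeout=60.0)
client_idle = is_idle_client(client, now, io_idle_window=3_600.0)
```

In this example the session is healthy from the MDS's point of view while the client is idle, which is the case this feature targets.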

#10 Updated by Patrick Donnelly 4 months ago

  • Assignee deleted (Rishabh Dave)

#11 Updated by Patrick Donnelly about 1 month ago

  • Status changed from New to In Progress
  • Assignee set to Patrick Donnelly
  • Priority changed from High to Urgent
  • Source set to Development
