Feature #21156

mds: speed up recovery with many open inodes

Added by Zheng Yan over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Performance/Resource Usage
Target version:
% Done: 0%
Source: Development
Tags: perf
Backport:
Reviewed:
Affected Versions:
Component(FS): MDS
Labels (FS):
Pull request ID:

Description

Opening inodes during the rejoin stage is slow when clients hold a large number of caps.

Currently the MDS journals open inodes (those whose caps clients want) in each log segment. This does not work well when there are a large number of open inodes.


Related issues

Related to fs - Feature #22446: mds: ask idle client to trim more caps Resolved

History

#1 Updated by Zheng Yan over 2 years ago

  • Subject changed from speed mds recovery to speed up mds recovery

#2 Updated by dongdong tao over 2 years ago

Hi Zheng,

I'm not sure I understand this correctly. Do you mean the MDS cannot recover the open inodes from the journal alone, and needs to fetch them from the corresponding CDir in the metadata pool?

#3 Updated by Zheng Yan over 2 years ago

The MDS needs to open all inodes with client caps during recovery. Some of these inodes may not be in the journal.

#4 Updated by Zheng Yan over 2 years ago

Besides, when there are lots of open inodes, it's not efficient to journal all of them in each log segment.
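
To make the cost argument concrete, here is a toy model of the two bookkeeping strategies. It is only a sketch and not Ceph code: the class names, the `simulate` helper, and the idea of counting "records written" are assumptions made up for illustration. The second tracker stands in for any scheme that persists only changes to the open set, and is not claimed to be the design that was eventually merged.

```python
from dataclasses import dataclass, field


@dataclass
class PerSegmentJournal:
    """Hypothetical model: re-journal every open inode at each new log segment."""
    open_inodes: set = field(default_factory=set)
    records_written: int = 0

    def open(self, ino):
        self.open_inodes.add(ino)

    def start_segment(self):
        # Every currently open inode is written again, changed or not.
        self.records_written += len(self.open_inodes)


@dataclass
class IncrementalTable:
    """Hypothetical model: persist only inodes whose open state changed."""
    open_inodes: set = field(default_factory=set)
    dirty: set = field(default_factory=set)
    records_written: int = 0

    def open(self, ino):
        if ino not in self.open_inodes:
            self.open_inodes.add(ino)
            self.dirty.add(ino)

    def start_segment(self):
        # Only entries touched since the last flush are written.
        self.records_written += len(self.dirty)
        self.dirty.clear()


def simulate(initial, segments, opens_per_segment):
    """Return (per-segment records, incremental records) for the same workload."""
    full, incr = PerSegmentJournal(), IncrementalTable()
    next_ino = 0
    for _ in range(initial):
        full.open(next_ino)
        incr.open(next_ino)
        next_ino += 1
    for _ in range(segments):
        for _ in range(opens_per_segment):
            full.open(next_ino)
            incr.open(next_ino)
            next_ino += 1
        full.start_segment()
        incr.start_segment()
    return full.records_written, incr.records_written


if __name__ == "__main__":
    print(simulate(initial=1000, segments=10, opens_per_segment=5))
```

In this model the per-segment cost grows with the total number of open inodes, while the incremental cost grows only with the number of changes between segments.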

#5 Updated by dongdong tao over 2 years ago

Thanks, that explains the scenario we have run into:
sometimes my standby-replay MDS spends too much time in the rejoin state (almost 70%),
and sometimes rejoin is fast.

#6 Updated by Patrick Donnelly over 2 years ago

  • Related to Feature #22446: mds: ask idle client to trim more caps added

#7 Updated by Patrick Donnelly over 2 years ago

  • Subject changed from speed up mds recovery to mds: speed up recovery with many open inodes
  • Component(FS) MDS added

#8 Updated by Zheng Yan over 2 years ago

  • Status changed from New to In Progress

#9 Updated by Zheng Yan over 2 years ago

  • Status changed from In Progress to Fix Under Review

#10 Updated by Patrick Donnelly about 2 years ago

  • Category set to Performance/Resource Usage
  • Status changed from Fix Under Review to Resolved
  • Assignee set to Zheng Yan
  • Target version set to v13.0.0
  • Source set to Development
  • Tags set to perf

#11 Updated by Webert Lima about 2 years ago

Hi, thank you very much for this.

I see this

Target version: Ceph - v13.0.0

So I'm not even asking for a backport to Jewel, but how likely would this be backported to Luminous? My 3 current CephFS clusters run Jewel but the next one could run Luminous.

Thanks!

#12 Updated by Patrick Donnelly about 2 years ago

Webert Lima wrote:

Hi, thank you very much for this.

I see this

Target version: Ceph - v13.0.0

So I'm not even asking for a backport to Jewel, but how likely would this be backported to Luminous? My 3 current CephFS clusters run Jewel but the next one could run Luminous.

Thanks!

Very unlikely, because the new structure in the metadata pool adds unacceptable risk for a backport.

#13 Updated by Webert Lima about 2 years ago

Patrick Donnelly wrote:

Very unlikely, because the new structure in the metadata pool adds unacceptable risk for a backport.

Oh, I see. It has changed that much in v13, huh?
Thank you and your whole team for putting effort into this improvement anyway. I see that this came from an email I sent to the list.

I'm glad to rely on Ceph =]
