Feature #61863

mds: issue a health warning with estimated time to complete replay

Added by Patrick Donnelly 8 months ago. Updated 10 days ago.

Status: Fix Under Review
Priority: Urgent
Assignee:
Category: Administration/Usability
Target version:
% Done: 0%
Source: Development
Tags:
Backport: reef,quincy
Reviewed:
Affected Versions:
Component(FS): MDS
Labels (FS):
Pull request ID:

Description

When the MDS is in up:replay, it gives the operator no indication of when replay will complete. We do have this information, though: the MDS knows the end of the journal, the write position, and how quickly it is consuming (replaying) events. Issue a periodic health warning with the current completion percentage (e.g. 40% of journal read), the time spent in replay, and the expected time remaining.
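As a rough illustration of the arithmetic involved, the estimate could be derived from journal positions and elapsed time along these lines. This is only a sketch, not the MDS implementation: the names `estimate_replay`, `start_pos`, `read_pos`, and `end_pos` are hypothetical, and it assumes replay proceeds at a roughly constant rate in journal bytes per second.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplayProgress:
    percent_done: float           # share of the journal replayed so far, 0..100
    elapsed_s: float              # time spent in replay so far, in seconds
    remaining_s: Optional[float]  # None until a replay rate can be measured


def estimate_replay(start_pos: int, read_pos: int, end_pos: int,
                    elapsed_s: float) -> ReplayProgress:
    """Estimate replay progress from (hypothetical) journal byte offsets.

    start_pos: offset where replay began
    read_pos:  offset replay has reached
    end_pos:   offset of the end of the journal
    """
    total = end_pos - start_pos
    if total <= 0:
        # Nothing to replay: report completion immediately.
        return ReplayProgress(100.0, elapsed_s, 0.0)

    # Clamp so a stale or out-of-range read position cannot yield <0% or >100%.
    done = min(max(read_pos - start_pos, 0), total)
    percent = 100.0 * done / total

    if done == 0 or elapsed_s <= 0:
        # No measurable rate yet, so no time estimate.
        return ReplayProgress(percent, elapsed_s, None)

    rate = done / elapsed_s            # bytes replayed per second (assumed steady)
    remaining = (total - done) / rate  # seconds left at that rate
    return ReplayProgress(percent, elapsed_s, remaining)


def format_warning(p: ReplayProgress) -> str:
    """Render the three values the description asks for as one message."""
    eta = "unknown" if p.remaining_s is None else f"~{p.remaining_s:.0f}s"
    return (f"MDS in up:replay: {p.percent_done:.0f}% of journal read, "
            f"{p.elapsed_s:.0f}s elapsed, {eta} remaining")
```

For example, with 400 of 1000 journal bytes replayed after 20 seconds, the sketch reports 40% done and about 30 seconds remaining. A real implementation would likely smooth the rate over a recent window, since replay speed can vary by event type.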

History

#1 Updated by Greg Farnum 8 months ago

  • Assignee set to Manish Yathnalli

#2 Updated by Manish Yathnalli 8 months ago

  • Status changed from New to In Progress

#3 Updated by Venky Shankar 8 months ago

Patrick, the "MDS behind trimming" warning during up:replay is to be expected when there are many journal events/segments to replay. I think it makes sense to show this warning when the MDS is up:active; during up:replay, the MDS could instead warn about long replay times based on the expected replay completion time/percentage.

#4 Updated by Patrick Donnelly 7 months ago

Venky Shankar wrote:

Patrick, the "MDS behind trimming" warning during up:replay is to be expected when there are many journal events/segments to replay.

Sorry, I don't understand where the "MDS behind trimming" warning fits into this particular issue (beyond causing longer replay).

I think it makes sense to show this warning when the MDS is up:active; during up:replay, the MDS could instead warn about long replay times based on the expected replay completion time/percentage.

If "this warning" means "MDS behind on trimming", I think we're on the same page.

#5 Updated by Greg Farnum 6 months ago

  • Pull request ID set to 52527

#6 Updated by Manish Yathnalli 6 months ago

  • Status changed from In Progress to Fix Under Review

#7 Updated by Venky Shankar 6 months ago

Manish Yathnalli wrote:

https://github.com/ceph/ceph/pull/52527

Manish, the PR ID is linked in the "Pull request ID" field.

#8 Updated by Venky Shankar 10 days ago

  • Assignee changed from Manish Yathnalli to Venky Shankar
  • Backport changed from reef,quincy,pacific to reef,quincy
  • Pull request ID changed from 52527 to 55616

Manish, I'm taking ownership of this one. I retained your contribution tags in the new pull request (of course!).
