Bug #46648

mds: cannot handle hundreds+ of subtrees

Added by Patrick Donnelly 15 days ago. Updated 15 days ago.

Status: New
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): task(hard)
Pull request ID:
Crash signature:

Description

The MDS has a lot of trouble scaling to hundreds or thousands of subtrees. From discussions with Zheng, one reason is that the MDS needs to write the subtree map whenever it starts a new journal segment, which can cause long delays if the subtree map is large. It would be more efficient to write out incremental changes to the subtree map as the MDS goes.
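
To illustrate the cost difference, here is a minimal Python sketch of a toy journal that can record either a whole subtree map or an incremental delta. It is not MDS code: `SubtreeJournal`, `write_full_map`, `write_delta`, `replay`, and `entry_size` are hypothetical names, and the subtree map is modelled as a plain dict from path to rank.

```python
from dataclasses import dataclass, field


@dataclass
class SubtreeJournal:
    """Toy journal holding full subtree maps and incremental deltas (hypothetical)."""
    entries: list = field(default_factory=list)

    def write_full_map(self, subtrees: dict) -> None:
        # Cost grows with the total number of subtrees.
        self.entries.append(("full", dict(subtrees)))

    def write_delta(self, added: dict, removed: set) -> None:
        # Cost grows only with the number of subtrees that changed.
        self.entries.append(("delta", dict(added), set(removed)))

    def replay(self) -> dict:
        """Rebuild the subtree map by applying entries in order."""
        state = {}
        for entry in self.entries:
            if entry[0] == "full":
                state = dict(entry[1])
            else:
                _, added, removed = entry
                for path in removed:
                    state.pop(path, None)
                state.update(added)
        return state


def entry_size(entry) -> int:
    """Number of subtree records carried by one journal entry."""
    return sum(len(part) for part in entry[1:])
```

With 1000 subtrees, a full-map entry in this sketch carries 1000 records, while a delta that reassigns one subtree and drops another carries 2; replaying both still yields the current map.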

Additionally, there are various places in the MDS where we iterate over the subtrees and spam debug messages. Generally, this information is useful, but we should try to find ways to compact it down into fewer messages. Writing out all the subtrees to the debug log just does not scale along with the workload.
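
One possible shape for a compacted message, sketched in Python under the assumption that the subtree map is a dict from path to rank: emit a single line with the total, per-rank counts, and a small sample of paths. `summarize_subtrees` and its output format are hypothetical, not an existing MDS log format.

```python
from collections import Counter


def summarize_subtrees(subtrees: dict, sample: int = 3) -> str:
    """Return one log line: total, per-rank counts, and a few sample paths."""
    per_rank = Counter(subtrees.values())
    ranks = " ".join(
        f"mds.{rank}={count}" for rank, count in sorted(per_rank.items())
    )
    shown = sorted(subtrees)[:sample]
    hidden = len(subtrees) - len(shown)
    suffix = f" (+{hidden} more)" if hidden > 0 else ""
    return f"subtrees total={len(subtrees)} [{ranks}] sample={shown}{suffix}"
```

The line length here depends on the number of ranks and the sample size rather than on the number of subtrees.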

This ticket is part of a refactor Zheng has planned to take up.


Related issues

Related to fs - Fix #46696: mds: pre-fragment distributed ephemeral pin directories to distribute the subtree bounds (Status: New)

History

#1 Updated by Patrick Donnelly 15 days ago

I should add, it's trivial to set up a test for this: just create a distributed ephemeral pinned directory with large fan-out (~1000 sub-dirs). Or use manual pinning; it does not matter. Then try to do any kind of workload in any of the directories.
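
A minimal sketch of that setup in Python, assuming a CephFS mount on Linux and that the distributed ephemeral pin is enabled through the `ceph.dir.pin.distributed` extended attribute. The function name `make_pinned_fanout`, the sub-directory naming, and the mount path in the usage note are hypothetical; the `setxattr` parameter exists only so the helper can be exercised without a real mount.

```python
import os


def make_pinned_fanout(root: str, count: int = 1000, setxattr=None) -> list:
    """Create `root` with `count` sub-directories and mark it distributed-pinned."""
    if setxattr is None:
        setxattr = os.setxattr  # Linux only; assumes `root` is on CephFS
    os.makedirs(root, exist_ok=True)
    # Assumption: this xattr turns on distributed ephemeral pinning for `root`.
    setxattr(root, "ceph.dir.pin.distributed", b"1")
    children = []
    for i in range(count):
        child = os.path.join(root, f"sub{i:04d}")
        os.makedirs(child, exist_ok=True)
        children.append(child)
    return children
```

For example, `make_pinned_fanout("/mnt/cephfs/fanout")` (hypothetical mount point) would create the directories; any workload can then be run inside one of the returned paths.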

#2 Updated by Patrick Donnelly 12 days ago

  • Related to Fix #46696: mds: pre-fragment distributed ephemeral pin directories to distribute the subtree bounds added
