Bug #17731 (closed): MDS stuck in stopping with other rank's strays

Added by John Spray over 7 years ago. Updated about 5 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags: -
Backport: -
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Component(FS): MDS
Labels (FS): multimds
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

Kraken v11.0.2

Seen on a max_mds=2 MDS cluster with a fuse client doing an rsync -av --delete on a dir that included hard links.

I ran a backup job overnight with two active MDS daemons, then set max_mds=1 and deactivated rank 1 (with the client still mounted).
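
For reference, a sketch of that sequence on a Kraken-era cluster; the filesystem name is a placeholder, and "ceph mds deactivate" was the pre-Luminous way to stop a rank:

    # Reduce the number of active ranks from 2 to 1
    ceph fs set <fsname> max_mds 1
    # Ask rank 1 to migrate its subtrees to rank 0 and stop
    ceph mds deactivate 1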

Log and cache dump attached from mds.gravel1, which held rank 1. It got most of the way through stopping and then got stuck with 6 items in cache, all of them in ~mds0.

The log indicates that we're somehow not making it as far as trim_dentry on those items, but I can't see why.

I tried flushing the journals and killing and evicting the client, but no progress. Interestingly, when I set mds_cache_size to 100 on rank 0, it also wouldn't trim past 500-something entries, so something was going wrong with the trimming there too.
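
The steps above roughly correspond to admin-socket commands like the following; the daemon names and client id are placeholders, and exact syntax varied across releases:

    # Flush the MDS journal to the metadata pool
    ceph daemon mds.gravel1 flush journal
    # List sessions, then evict the fuse client by id
    ceph daemon mds.gravel1 session ls
    ceph daemon mds.gravel1 session evict <client_id>
    # Shrink rank 0's cache (mds_cache_size was an inode-count limit pre-Luminous)
    ceph daemon mds.<rank0-id> config set mds_cache_size 100
    # Dump the remaining cache entries for inspection
    ceph daemon mds.gravel1 dump cache /tmp/mds.gravel1.cache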


#1 - Updated by John Spray over 7 years ago

  • Priority changed from Normal to High
  • Target version set to v12.0.0

#2 - Updated by John Spray over 7 years ago

  • Assignee set to John Spray

#3 - Updated by John Spray almost 7 years ago

  • Status changed from New to Can't reproduce

This code has all changed a lot since then.

#4 - Updated by Patrick Donnelly about 5 years ago

  • Category deleted (90)
  • Labels (FS) multimds added