Bug #1850

closed

mds sometimes crashes removing trees with plenty of hardlinks

Added by Alexandre Oliva over 12 years ago. Updated over 12 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%

Description

To reproduce: run rsync -aH /usr/share/zoneinfo/ /mnt/ceph-fuse/subdir/ (the -H option and a hardlink-heavy /usr/share/zoneinfo are essential to trigger the bug), let the mds stabilize, and then (perhaps after restarting the mdses or re-mounting; I'm not sure) run rm -rf /mnt/ceph-fuse/subdir. Odds are high that the mds will fail assert(anchor_map.count(ino)) within AnchorServer::dec(inodeno_t) and die. Upon restart it might be lucky enough to recover (I'm not sure about that), but more likely it will rejoin, become active, and shortly thereafter process deferred messages that fail assert(anchor_map.count(curino) == 1) within AnchorServer::handle_query(MMDSTableRequest*).
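The reproduction steps could be scripted roughly as follows. This is only a sketch: the function names count_hardlinked and reproduce are made up for illustration, the default paths are the ones from the report, and the fixed sleep standing in for "let the mds stabilize" is an assumption, since the report does not say how long to wait. The counting helper assumes file names contain no newlines.

```shell
#!/bin/sh
# Count regular files under a directory that have more than one hard link.
# Hypothetical helper; -links is a standard find primary.
count_hardlinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Sketch of the reproduction. Arguments (all optional):
#   $1  source tree      (default: /usr/share/zoneinfo/, as in the report)
#   $2  destination dir  (default: /mnt/ceph-fuse/subdir/, as in the report)
#   $3  seconds to wait before removal (assumed value, not from the report)
reproduce() {
    src=${1:-/usr/share/zoneinfo/}
    dst=${2:-/mnt/ceph-fuse/subdir/}
    settle=${3:-60}

    echo "hard-linked files in source: $(count_hardlinked "$src")"

    # -a: archive mode, -H: preserve hard links (essential per the report)
    rsync -aH "$src" "$dst" || return 1

    # Stand-in for "let the mds stabilize"
    sleep "$settle"

    # Removing the whole tree at once is what seems to trigger the crash
    rm -rf "$dst"
}
```

Calling count_hardlinked on the source first is a quick way to check that the tree really has plenty of hard links before spending time on the copy.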

Removing only one link at a time seems to avoid this failure mode, presumably (wild guess) because the mds then has enough time to move the inode whose second hard link was removed back into its containing dir. If the next removal arrives before that operation completes, some internal consistency assumption doesn't hold.
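One way to approximate "one link at a time" is sketched below. The report does not say how the links were removed or how long to wait between removals, so the function name remove_one_at_a_time and the delay are assumptions. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Remove a tree one directory entry at a time, pausing between removals.
#   $1  directory to remove
#   $2  seconds to pause after each removal (assumed default: 1)
remove_one_at_a_time() {
    dir=$1
    delay=${2:-1}

    # Unlink every non-directory entry individually
    find "$dir" -depth ! -type d -print | while IFS= read -r path; do
        rm -f -- "$path"
        sleep "$delay"
    done

    # -depth visits children before parents, so each directory is
    # already empty by the time rmdir reaches it
    find "$dir" -depth -type d -exec rmdir {} \;
}
```

For example, remove_one_at_a_time /mnt/ceph-fuse/subdir 1 would unlink each entry with a one-second pause, which is much slower than rm -rf but gives the mds time between removals.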

Updated by Greg Farnum over 12 years ago

  • Status changed from New to Duplicate

I'm pretty sure you're looking at #1047 here. :)
