Feature #44455

open

cephfs: add recursive unlink RPC

Added by Patrick Donnelly about 4 years ago. Updated over 1 year ago.

Status: In Progress
Priority: High
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS): MDS
Labels (FS): task(intern), task(medium)
Pull request ID:

Description

This is a fairly common operation [1] and there's no particular reason we can't support it. The PurgeQueue (I think) is already well architected enough to support this with some modification. In particular, it needs to permit an unlinked directory to have children.

I see two immediate use-cases:

  • The pybind/mgr/volumes plugin currently has an asynchronous unlink module that cleans up deleted volumes. It'd be much simpler to just tell the MDS to unlink the directory tree.
  • cephfs-shell can provide a command which may be used out-of-band to unlink some subtree (thinking HPC)
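For context, here is a minimal sketch of what a client has to do today without such an RPC: walk the tree bottom-up and unlink every entry itself. It uses only the Python standard library against a mounted path; it is not the pybind/mgr/volumes code, and the function name `recursive_unlink` is made up for illustration.

```python
import os


def recursive_unlink(root: str) -> int:
    """Remove `root` and everything under it; return the number of entries removed."""
    removed = 0
    # Walk bottom-up so every directory is already empty when we rmdir it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.unlink(os.path.join(dirpath, name))
            removed += 1
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Symlinks to directories are listed in dirnames but are not descended into.
                os.unlink(path)
            else:
                os.rmdir(path)
            removed += 1
    os.rmdir(root)
    return removed + 1
```

Every entry costs the client at least one request, which is the overhead a single recursive unlink request to the MDS would avoid.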

Keep in mind we can relax some POSIX consistency requirements: the link counts on all the descendants may not change. Logically, this is just renaming the directory to an internal Trash directory that's slowly purged by the MDS. One moderate challenge that needs to be addressed is revoking any capabilities that allow clients to asynchronously create files in the subtree. Likewise, the MDS shouldn't create any files in the unlinked subtree via a create RPC.
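The "rename into a Trash directory, then purge slowly" idea can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyNamespace`, `recursive_unlink`, and `purge_step` are hypothetical names, the tree is a nested dict, and nothing here reflects the actual MDS or PurgeQueue code. Capability revocation and blocking of new creates are not modeled.

```python
class ToyNamespace:
    """In-memory stand-in for a directory tree; not MDS code."""

    def __init__(self):
        self.tree = {}    # name -> dict (directory) or None (file)
        self.trash = []   # detached subtrees awaiting background purge

    def recursive_unlink(self, parent: dict, name: str) -> None:
        # Constant-time for the caller: detach the subtree and queue it.
        self.trash.append(parent.pop(name))

    def purge_step(self, budget: int) -> int:
        """Purge up to `budget` entries from the trash; return how many were purged."""
        purged = 0
        while self.trash and purged < budget:
            node = self.trash.pop()
            if isinstance(node, dict) and node:
                # Non-empty directory: detach one child and handle it first.
                _, child = node.popitem()
                self.trash.append(node)
                self.trash.append(child)
            else:
                # A file or an empty directory can be purged outright.
                purged += 1
        return purged
```

The subtree disappears from the visible namespace as soon as `recursive_unlink` returns, while the per-entry work is spread over later `purge_step` calls. Note that the trash holds a directory that still has children, which is the invariant change the description asks for.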

[1] For example, HDFS has long had a recursive unlink command.

Actions #1

Updated by Patrick Donnelly about 4 years ago

  • Description updated (diff)
  • Labels (FS) task(intern), task(medium) added
Actions #2

Updated by Venky Shankar about 4 years ago

  • Assignee set to Venky Shankar

(Self-assigning this.) I will start looking into adding this support soon.

Actions #3

Updated by Greg Farnum about 4 years ago

Hmm one problem the issue description skips over is that this will need to deal with hard-linked files underneath the directory we're trying to recursively delete. That probably means we have to check the full child tree when queueing it up for deletion, and put everything onto the purge queue individually?
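To make the cost of "check the full child tree" concrete, here is a rough client-side sketch of such a scan using only the Python standard library. It is an assumption-laden illustration, not how the MDS would do it: the function name `find_hardlinked_files` is hypothetical, and the scan runs against a mounted path rather than MDS metadata.

```python
import os
import stat


def find_hardlinked_files(root: str) -> list[str]:
    """Return regular files under `root` whose link count is greater than one."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                hits.append(path)
    return sorted(hits)
```

A link count above one does not say where the other links live: they may all be inside the subtree being deleted. Either way, the scan has to visit every entry, which is the work the request is trying to avoid.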

Actions #4

Updated by Venky Shankar about 4 years ago

Greg Farnum wrote:

> Hmm one problem the issue description skips over is that this will need to deal with hard-linked files underneath the directory we're trying to recursively delete. That probably means we have to check the full child tree when queueing it up for deletion, and put everything onto the purge queue individually?

We should try to avoid the scan if possible. The purge queue would need to deal with unlinking non-empty directories. For the hardlink case you mention, if for a given directory we can tell that there are no hardlinked files in the subtree, that would be a win (not sure how atm!).

Actions #5

Updated by Greg Farnum about 4 years ago

Venky Shankar wrote:

> Greg Farnum wrote:
>
> > Hmm one problem the issue description skips over is that this will need to deal with hard-linked files underneath the directory we're trying to recursively delete. That probably means we have to check the full child tree when queueing it up for deletion, and put everything onto the purge queue individually?
>
> We should try to avoid the scan if possible. The purge queue would need to deal with unlinking non-empty directories.

Well, yes, not having to do a scan would be better. But having the purge queue deal with it probably breaks a whole lot of invariants there. (But maybe not? I haven't looked at that code in much depth!)

> For the hardlink case you mention, if for a given directory we can tell that there are no hardlinked files in the subtree, that would be a win (not sure how atm!).

Yeah, we can't tell this right now. It would require propagating some kind of "hardlink children" count up in the rstats, but even those are updated asynchronously and without blocking, so we can't actually rely on them. There's really not a good solution here and I don't think it's something you'll be able to count on.
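A toy model may help show why a lazily propagated count cannot be trusted at decision time. Everything here is hypothetical: `ToyDir`, `rhardlinks`, and `propagate` are invented for illustration and do not correspond to the real rstat implementation, and the "hardlink children" counter does not exist today.

```python
class ToyDir:
    """Toy directory with a lazily propagated recursive counter; not CephFS rstat code."""

    def __init__(self, parent=None):
        self.parent = parent
        self.local_hardlinks = 0   # hard-linked files directly in this directory
        self.child_totals = {}     # last total each child reported upward
        self.dirty = False         # True while a change has not been reported upward

    def rhardlinks(self) -> int:
        # Recursive total as currently known here; may lag behind reality.
        return self.local_hardlinks + sum(self.child_totals.values())

    def add_hardlinked_file(self) -> None:
        self.local_hardlinks += 1
        self.dirty = True          # the parent has not been told yet

    def propagate(self) -> None:
        """Report this directory's total one level up (one asynchronous hop)."""
        if self.parent is not None and self.dirty:
            self.parent.child_totals[self] = self.rhardlinks()
            self.parent.dirty = True
            self.dirty = False


root = ToyDir()
a = ToyDir(parent=root)
b = ToyDir(parent=a)
b.add_hardlinked_file()
stale = root.rhardlinks()    # still 0: nothing has propagated yet
b.propagate()
a.propagate()
fresh = root.rhardlinks()    # 1 once both hops have run
```

Between the hard link being created and the last hop completing, the top-level directory reports zero, so a recursive unlink that trusted the counter in that window would wrongly conclude there are no hard links below.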

Actions #6

Updated by Patrick Donnelly over 3 years ago

  • Target version changed from v16.0.0 to v17.0.0
Actions #7

Updated by Patrick Donnelly almost 2 years ago

  • Target version deleted (v17.0.0)
Actions #8

Updated by Patrick Donnelly over 1 year ago

  • Status changed from New to In Progress
  • Assignee changed from Venky Shankar to Patrick Donnelly