Bug #63814

open

File deletion is 20x slower on kernel mount compared to libcephfs

Added by Niklas Hambuechen 5 months ago. Updated 2 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
fs/ceph
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I am trying to delete a directory that has 200k subdirs with 1000 files in each, i.e. approximately 200M files in total.

Doing this on the kernel mount using `rm -r` is very slow, around 200 `unlink()`s per second according to strace.

However, doing it with `libcephfs` is surprisingly 20x faster.

This ticket is for finding out why, and ideally making the kernel mount equally fast.

Details

  • Ceph 16.2.7, kernel mount on Linux 6.1.51, x86_64
  • 3-node cluster, 10 spinning-disk BlueStore OSDs, each with its DB on an NVMe SSD
  • CephFS metadata pool on NVMe SSD
  • Single-threaded `libcephfs` `unlink()` does around 2000 per second; 2-threaded (using Python's `ThreadPoolExecutor`, as in the sketch after this list) does 4000 unlinks per second; more threads don't help
  • Trying to parallelise the kernel mount `rm -r` by deleting multiple dirs in parallel using `xargs` or GNU `parallel` does not provide any speedup
  • No other Ceph clients are interacting with the dir on which the deletion is happening (they have the FS mounted but do not perform operations on that dir).
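
For reference, below is a minimal sketch of the kind of libcephfs-py deletion loop behind the numbers above; the path, directory layout, and connection settings are assumptions, not the exact benchmark script.

```python
# Minimal sketch of a 2-threaded libcephfs-py delete loop, assuming a
# default /etc/ceph/ceph.conf and keyring; TARGET_DIR and the directory
# layout are hypothetical.
from concurrent.futures import ThreadPoolExecutor

import cephfs

TARGET_DIR = b"/bench/subdir-000001"  # hypothetical path

fs = cephfs.LibCephFS()
fs.conf_read_file()   # read the default ceph.conf
fs.mount()            # mount the default CephFS filesystem

# List the files in one subdirectory.
d = fs.opendir(TARGET_DIR)
names = []
entry = fs.readdir(d)
while entry is not None:
    if entry.d_name not in (b".", b".."):
        names.append(entry.d_name)
    entry = fs.readdir(d)
fs.closedir(d)

# Unlink with 2 worker threads, mirroring the 2-threaded setup above.
def unlink_one(name):
    fs.unlink(TARGET_DIR + b"/" + name)

with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(unlink_one, names))

fs.rmdir(TARGET_DIR)
fs.unmount()
fs.shutdown()
```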

What could be the reasons that the kernel mount is so much slower?

Actions #1

Updated by Niklas Hambuechen 5 months ago

Potentially related: https://tracker.ceph.com/issues/57898

Though that ticket is supposedly about a kernel regression in deletion speed, so it may not be the same issue.

In this ticket, there is a very clear benchmark criterion for any given version (kernel `rm -r` should be as fast as `libcephfs` operations).

Actions #2

Updated by Xiubo Li 2 months ago

What do you mean by libcephfs? Do you mean ceph-fuse, or a third-party app directly calling the libcephfs APIs? And have you tried ceph-fuse?

Thanks
- Xiubo

Actions #3

Updated by Niklas Hambuechen 2 months ago

I mean the libcephfs Python bindings with docs at https://docs.ceph.com/en/latest/cephfs/api/libcephfs-py/

In the issue description, when I referred to `unlink()`, I meant the `unlink()` function from Python, which for some unknown reason is not documented on that page. It's this one:

https://github.com/ceph/ceph/blob/716316e0c5f01cce3131320d3066aabd8ddb666a/src/pybind/cephfs/cephfs.pyx#L2245

I have not tried `ceph-fuse` yet, as the kernel mount works much better for us. I agree that this may give useful additional info, but my goal with this bug is really to have the kernel mount delete files just as fast as libcephfs-py does.
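
For clarity, a minimal, hypothetical example of that call; the path and the use of the default ceph.conf are assumptions:

```python
# Hypothetical minimal example of the libcephfs-py unlink() meant above,
# assuming a reachable cluster and a default ceph.conf/keyring.
import cephfs

fs = cephfs.LibCephFS()
fs.conf_read_file()
fs.mount()
fs.unlink("/some/dir/some-file")  # wraps ceph_unlink() in libcephfs
fs.unmount()
fs.shutdown()
```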

Actions #4

Updated by Xiubo Li 2 months ago

Niklas Hambuechen wrote:

> I mean the libcephfs Python bindings with docs at https://docs.ceph.com/en/latest/cephfs/api/libcephfs-py/
>
> In the issue description, when I referred to `unlink()`, I meant the `unlink()` function from Python, which for some unknown reason is not documented on that page. It's this one:
>
> https://github.com/ceph/ceph/blob/716316e0c5f01cce3131320d3066aabd8ddb666a/src/pybind/cephfs/cephfs.pyx#L2245
>
> I have not tried `ceph-fuse` yet, as the kernel mount works much better for us. I agree that this may give useful additional info, but my goal with this bug is really to have the kernel mount delete files just as fast as libcephfs-py does.

Yeah, if we can compare this with ceph-fuse, it will be much easier to understand where the problem is.

The kclient works under the VFS logic, so I'm inclined to think the VFS logic is what causes this. The unlink() code paths in libcephfs and the kclient are mostly similar. Let me check this again later. If we can get test results from ceph-fuse, it will be much clearer and will help narrow it down.

Thanks
