Bug #63461

open

Long delays when two threads modify the same directory

Added by Xavi Hernandez 6 months ago. Updated 10 days ago.

Status: Triaged
Priority: Normal
Assignee:
Category: Performance/Resource Usage
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've identified an issue with a CephFS kernel mount when it is accessed through Samba.

The workload simply creates and deletes a file in the root directory of the mount from two or more clients (each client uses a different file name).

What I've observed is that some of the operations (create or delete) take nearly 5 seconds, and all clients tend to complete their pending operations at the same time. As a result, almost all operations complete in batches every 5 seconds, regardless of when they were started.

After analyzing what Samba does, I've been able to create a reproducer that doesn't depend on Samba. This is the subset of operations that triggers the issue:

    dirfd = openat(AT_FDCWD, mount_path, O_RDONLY | O_PATH);
    fstatat(dirfd, "", &st, AT_EMPTY_PATH);
    fd = openat(dirfd, file_name, O_CREAT | O_TRUNC | O_RDWR, 0644);
    fstatat(dirfd, "", &st, AT_EMPTY_PATH);
    unlinkat(dirfd, file_name, 0);
    close(fd);
    close(dirfd);

If this code is run in a loop from two threads accessing the same CephFS mount point, several "fstatat" calls take nearly 5 seconds to complete.
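For reference, here is a minimal sketch of how the sequence above could be driven from two threads while timing each "fstatat" call. It is not the original reproducer: the helper names (timed_dir_stat, run_iteration, run_reproducer), the worker struct, and the file names repro_a.tmp and repro_b.tmp are assumptions made for illustration.

```c
#define _GNU_SOURCE /* for O_PATH and AT_EMPTY_PATH */
#include <fcntl.h>
#include <pthread.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Fallbacks (generic Linux ABI values) in case _GNU_SOURCE took effect too late. */
#ifndef O_PATH
#define O_PATH 010000000
#endif
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif

/* Hypothetical helper: stat the directory itself and return the elapsed
 * time in seconds, or -1.0 if fstatat() fails. */
double timed_dir_stat(int dirfd)
{
    struct stat st;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (fstatat(dirfd, "", &st, AT_EMPTY_PATH) != 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* One pass of the sequence from the report. Returns the slower of the two
 * fstatat() latencies in seconds, or -1.0 if any step fails. */
double run_iteration(const char *mount_path, const char *file_name)
{
    int dirfd = openat(AT_FDCWD, mount_path, O_RDONLY | O_PATH);
    if (dirfd < 0)
        return -1.0;

    double before = timed_dir_stat(dirfd);
    int fd = openat(dirfd, file_name, O_CREAT | O_TRUNC | O_RDWR, 0644);
    double after = timed_dir_stat(dirfd);
    if (fd >= 0) {
        unlinkat(dirfd, file_name, 0);
        close(fd);
    }
    close(dirfd);

    if (fd < 0 || before < 0.0 || after < 0.0)
        return -1.0;
    return before > after ? before : after;
}

/* Per-thread parameters and result (hypothetical layout). */
struct worker {
    const char *mount_path;
    const char *file_name;
    int iterations;
    double slowest; /* output: slowest fstatat() seen, or -1.0 on error */
};

void *worker_main(void *arg)
{
    struct worker *w = arg;

    w->slowest = 0.0;
    for (int i = 0; i < w->iterations; i++) {
        double s = run_iteration(w->mount_path, w->file_name);
        if (s < 0.0) {
            w->slowest = -1.0;
            break;
        }
        if (s > w->slowest)
            w->slowest = s;
    }
    return NULL;
}

/* Runs two threads against the same directory, each using its own file
 * name. Returns the slowest fstatat() latency in seconds, or -1.0 on error. */
double run_reproducer(const char *mount_path, int iterations)
{
    struct worker w[2] = {
        { mount_path, "repro_a.tmp", iterations, 0.0 },
        { mount_path, "repro_b.tmp", iterations, 0.0 },
    };
    pthread_t tid[2];
    int started = 0;

    while (started < 2 &&
           pthread_create(&tid[started], NULL, worker_main, &w[started]) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started < 2 || w[0].slowest < 0.0 || w[1].slowest < 0.0)
        return -1.0;
    return w[0].slowest > w[1].slowest ? w[0].slowest : w[1].slowest;
}
```

To try it, call run_reproducer() from a small main() with the CephFS mount path and an iteration count, print the returned value, and build with cc -pthread. Based on the behaviour described above, an affected mount would presumably report a slowest latency close to 5 seconds, while a local file system should stay far below that.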
