Bug #63461
Long delays when two threads modify the same directory
Description
I've identified an issue with a CephFS kernel mount while accessing it from Samba.
The workload simply creates and deletes a file in the root directory of the mount from two or more clients (each client uses a different file name).
What I've observed is that some of the operations (create or delete) take nearly 5 seconds, and all clients tend to complete their pending operation at the same time, so almost all operations complete in batches every 5 seconds, regardless of when they were started.
After analyzing what Samba does, I've been able to create a reproducer that doesn't depend on Samba. This is the subset of operations that triggers the issue:
dirfd = openat(AT_FDCWD, mount_path, O_RDONLY | O_PATH);          /* open the mount's root directory */
fstatat(dirfd, "", &st, AT_EMPTY_PATH);                           /* stat the directory itself */
fd = openat(dirfd, file_name, O_CREAT | O_TRUNC | O_RDWR, 0644);  /* create the per-client file */
fstatat(dirfd, "", &st, AT_EMPTY_PATH);                           /* stat the directory again */
unlinkat(dirfd, file_name, 0);                                    /* delete the file */
close(fd);
close(dirfd);
If this code is run in a loop from two threads accessing the same CephFS mount point, several of the fstatat calls take nearly 5 seconds to complete.
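For completeness, here is a minimal standalone sketch of the reproducer, assuming two threads in a single process are enough to trigger the behaviour. The mount path, file names, iteration count, and the 1-second reporting threshold are all placeholders:

/*
 * Sketch: two threads run the openat/fstatat/unlinkat sequence above in
 * a loop against the same directory, each using its own file name, and
 * report any iteration that stalls.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

#define MOUNT_PATH "/mnt/cephfs"   /* assumption: adjust to your mount */
#define ITERATIONS 1000            /* assumption: arbitrary loop count */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static void *worker(void *arg)
{
    const char *file_name = arg;   /* unique per thread */
    struct stat st;

    for (int i = 0; i < ITERATIONS; i++) {
        double start = now();

        int dirfd = openat(AT_FDCWD, MOUNT_PATH, O_RDONLY | O_PATH);
        if (dirfd < 0) { perror("openat dir"); break; }

        fstatat(dirfd, "", &st, AT_EMPTY_PATH);
        int fd = openat(dirfd, file_name, O_CREAT | O_TRUNC | O_RDWR, 0644);
        if (fd < 0) { perror("openat file"); close(dirfd); break; }
        fstatat(dirfd, "", &st, AT_EMPTY_PATH);
        unlinkat(dirfd, file_name, 0);
        close(fd);
        close(dirfd);

        double elapsed = now() - start;
        if (elapsed > 1.0)   /* flag the multi-second stalls */
            printf("%s: iteration %d took %.2f s\n", file_name, i, elapsed);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, "file-a");
    pthread_create(&t2, NULL, worker, "file-b");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Build with gcc -pthread (the file names "file-a"/"file-b" are arbitrary). When run against the kernel mount, the slow iterations show up in batches, matching what I described above.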
Updated by Xavi Hernandez 6 months ago
I've just seen that the delay corresponds roughly to the value of the mds_tick_interval option. Changing this value also changes the delays seen during the test to a very similar value.
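To confirm the correlation, the interval can be changed at runtime; a quick sketch, assuming a disposable test cluster (the value 2 is only illustrative, not a production tuning):

ceph config set mds mds_tick_interval 2   # lower the tick interval, then re-run the test
ceph config get mds mds_tick_interval     # verify the new value

With a lower interval, the observed stall duration tracks the new value.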
Updated by Venky Shankar 6 months ago
Xavi Hernandez wrote:
I've just seen that the delay corresponds roughly to the value of the mds_tick_interval option. Changing this value also changes the delays seen during the test to a very similar value.
Generally, any operation that gets kicked at ~5 s (the tick interval) is related to flushing of the mdlog. The MDS can, however, flush the mdlog earlier if it finds that necessary to satisfy a client request. There have been a couple of fixes related to this in the past.
Which ceph version are you using and what's max_mds set to?
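For reference, both can be checked from the command line; a sketch assuming the file system is named "cephfs":

ceph versions                        # daemon versions running in the cluster
ceph fs get cephfs | grep max_mds    # current max_mds for the file system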
Updated by Xavi Hernandez 6 months ago
Venky Shankar wrote:
Which ceph version are you using and what's max_mds set to?
I'm using a recent build from the main branch (commit 8858839c) on CentOS 9 Stream.
max_mds is 1.
The test I described is the only thing accessing the CephFS volume.
Updated by Venky Shankar 6 months ago
Xavi Hernandez wrote:
Venky Shankar wrote:
Which ceph version are you using and what's max_mds set to?
I'm using a recent build from the main branch (commit 8858839c) on CentOS 9 Stream.
max_mds is 1.
The test I described is the only thing accessing the CephFS volume.
Thanks, Xavi. I'll recreate this in my test cluster and see what's going on.
Updated by Venky Shankar 4 months ago
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Target version set to v19.0.0