Bug #45153
fsync locking up in certain conditions (status: Closed)
Description
I've been setting up and testing cephfs on a server, and during one of the tests (with an application using nedb) I noticed a huge delay, which I tracked down to cephfs being slow and locking up on an fsync call.
I was able to reproduce the problem with a simple command line. If a new file is created, then synced, then renamed, then synced again (nedb's method of overwriting its database file in a crash-safe manner), the final fsync takes about 5 seconds to return as it locks up.
Doing:
# touch to-rename && sync to-rename && mv to-rename final && time sync final
real	0m4.506s
user	0m0.000s
sys	0m0.001s
will take about 5 seconds to resolve (according to the 'time' command on the final sync), while the same sequence without the sync of the newly created file (before the rename) takes 0.003s:
# touch to-rename && mv to-rename final && time sync final
real	0m0.003s
user	0m0.001s
sys	0m0.000s
Note that in this context the file did not previously exist and its size is 0 bytes, but it could be any size; I only use 0 bytes to show that the time is not spent actually syncing large amounts of data. I have also tested as a regular user and as root, with no difference.
See attached screenshot.
I have also run the command in a while loop and I get a consistent ~5 second time, which leads me to think a timeout is happening and the fsync is blocked waiting for something.
# while true; do touch to-rename && sync to-rename && mv to-rename final && time sync final; done
real	0m3.172s  user	0m0.001s  sys	0m0.000s
real	0m4.990s  user	0m0.002s  sys	0m0.000s
real	0m4.985s  user	0m0.001s  sys	0m0.000s
real	0m4.993s  user	0m0.001s  sys	0m0.000s
real	0m4.993s  user	0m0.001s  sys	0m0.000s
real	0m4.994s  user	0m0.001s  sys	0m0.000s
real	0m4.993s  user	0m0.001s  sys	0m0.000s
real	0m4.993s  user	0m0.001s  sys	0m0.000s
real	0m4.995s  user	0m0.001s  sys	0m0.000s
real	0m4.991s  user	0m0.000s  sys	0m0.001s
real	0m4.995s  user	0m0.000s  sys	0m0.001s
real	0m4.993s  user	0m0.001s  sys	0m0.000s
I have also tested that running 'ls' on the file in question makes no difference, but running 'ls' on the parent directory immediately unblocks the fsync, which then returns right away, before the 5 second timeout. So a readdir seems to unblock the frozen task.
I am filing this under the kernel client because I cannot reproduce the same behavior with ceph-fuse. I have also noticed that if I use ceph-fuse to mount the same cluster in a different directory on the same machine, the kernel mount stops behaving this way as well; once I unmount the ceph-fuse directory, the 5 second delay starts happening again. A ceph-fuse mount on its own does not exhibit the problem.
My setup uses ceph 15.2.1, installed with cephadm. I have 3 mons, 2 mgrs, 2 mds and 1 osd (for now, as I'm still testing things), spread over 3 hosts within the same network. The local machine I was testing on ran a mon, a mgr and the osd.
The system is Debian 10.3 with kernel 4.19.0-8-cloud-amd64. The machine also had at least a few GB of free/available RAM to use and its CPU usage didn't seem to go beyond 3% to 5%.
I hope this helps find/resolve the issue, and that it's not a duplicate.
I've marked it as major since it's a major performance issue (my apps started freezing for 10 to 20 seconds), but I've switched to ceph-fuse for now, so it's not a critical issue for me at the moment.
Thank you.
Updated by Jeff Layton about 4 years ago
A ~5s delay is usually a telltale sign that the client is stuck waiting for an MDS tick and journal flush.
I suspect this is probably a duplicate of #44744. If so, it will likely be fixed in v5.8, though the fix may be backportable as well. If you're able to test recent "testing" series kernels here:
https://shaman.ceph.com/builds/kernel/
...then that would be helpful.
Updated by Youness Alaoui about 4 years ago
Jeff Layton wrote:
A ~5s delay is usually a telltale sign that the client is stuck waiting for an MDS tick and journal flush.
I suspect this is probably a duplicate of #44744. If so, it will likely be fixed in v5.8, though the fix may be backportable as well. If you're able to test recent "testing" series kernels here:
https://shaman.ceph.com/builds/kernel/
...then that would be helpful.
Thank you,
I hope it gets backported. Unfortunately, I cannot test with v5.8. Do you have links to the actual commits that would fix it? I could try backporting them to my kernel myself and see if that works.
Note: the shaman.ceph.com builds you linked to all give 404 because of a double '//' in the URL, and even after fixing that, the page shows empty content.
Thanks.
Updated by Jeff Layton about 4 years ago
- Assignee set to Jeff Layton
Updated by Jeff Layton over 3 years ago
- Status changed from New to Duplicate
- Parent task set to #44744
Closing as duplicate of #44744
Updated by Patrick Donnelly over 2 years ago
- Is duplicate of Bug #44744: Slow file creation/sync on kernel cephfs added