Bug #20938

open

CephFS: concurrent access to file from multiple nodes blocks for seconds

Added by Andras Pataki over 6 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When the same file is opened for read/write on multiple nodes via ceph-fuse, performance drops by about 3 orders of magnitude, from thousands of operations/second to 1-2 operations/second. Tested on both Jewel 10.2.9 and the latest Luminous RC, 12.1.2.

The core of the problem boils down to the following operation being run on the same file on multiple nodes (in a loop in the test program):

    int fd = open(filename, mode);
    read(fd, buffer, 100);
    close(fd);

Here are some results on our cluster:
One node, mode=read-only: 7000 opens/second
One node, mode=read-write: 7000 opens/second
Two nodes, mode=read-only: 7000 opens/second/node
Two nodes, mode=read-write: around 0.5 opens/second/node (!!!)
Two nodes, one read-only, one read-write: around 0.5 opens/second/node (!!!)
Two nodes, mode=read-write, with the 'read(fd, buffer, 100)' line removed from the code: 500 opens/second/node
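
To put the per-node rates above in terms of time per open, the conversion is just the reciprocal of the rate. The helper names below (seconds_per_open, slowdown_factor) are made up for illustration and are not part of the attached test program; the input figures are the approximate ones from the list above.

```c
/* Average wall-clock time per open, in seconds, for a measured rate. */
double seconds_per_open(double opens_per_second)
{
    return 1.0 / opens_per_second;
}

/* How many times slower the contended case is than the baseline. */
double slowdown_factor(double baseline_rate, double contended_rate)
{
    return baseline_rate / contended_rate;
}
```

With the numbers reported here, seconds_per_open(0.5) is 2.0 seconds per open on each node, and slowdown_factor(7000.0, 0.5) is 14000.0.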

There seems to be a problem with opening the same file read/write and reading from it on multiple nodes. That operation is about 3 orders of magnitude slower than other parallel access patterns to the same file. The roughly 1 second it takes to open the file looks as if a timeout is being hit somewhere. I suspect this has to do with capability management between the fuse client and the MDS, but I don't know enough about that protocol to make an educated assessment.

The attached small C program reproduces the issue. Run it on two different nodes with "timed_openrw_read <filename> rw" where <filename> is a file in cephfs with some data in it.
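
The attachment itself is not reproduced here. As a rough illustration only, a timing loop around the open/read/close sequence described above might look like the sketch below. The function name measure_opens_per_second, its parameters, and the use of CLOCK_MONOTONIC are assumptions made for this sketch and may differ from what timed_openrw_read.c actually does.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper (not from the attached program): run
 * open/read/close `iterations` times on `path` with the given open(2)
 * flags (O_RDONLY or O_RDWR) and return the measured opens per second.
 * Returns -1.0 if an open or read fails.
 */
double measure_opens_per_second(const char *path, int flags, int iterations)
{
    char buffer[100];
    double start = now_seconds();

    for (int i = 0; i < iterations; i++) {
        int fd = open(path, flags);
        if (fd < 0) {
            perror("open");
            return -1.0;
        }
        if (read(fd, buffer, sizeof(buffer)) < 0) {
            perror("read");
            close(fd);
            return -1.0;
        }
        close(fd);
    }

    double elapsed = now_seconds() - start;
    /* Guard against a zero interval on very fast runs. */
    return elapsed > 0.0 ? iterations / elapsed : (double)iterations;
}
```

Calling this from a small main() on two nodes at once, with O_RDWR and a path to the same file inside the CephFS mount, should exercise the same access pattern as the report. The iteration count is a free choice; at the slow rates seen here a small value keeps the run short.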


Files

timed_openrw_read.c (1.28 KB): C test program to reproduce issue. Andras Pataki, 08/07/2017 03:26 PM
ceph-fuse.patch (3.14 KB): Zheng Yan, 08/09/2017 10:39 AM