Bug #1206: NFS reexport file creation lags 1-3 seconds - CephFS - Ceph

Actions

Copy link

Bug #1206

closed

NFS reexport file creation lags 1-3 seconds

Added by Brian Chrisman almost 13 years ago. Updated over 7 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Category:

NFS (Linux Kernel)

Target version:

% Done:

Source:

Tags:

Backport:

Regression:

Severity:

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

Labels (FS):

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

I'm attaching the kernel logs and mds logs for the creation of a file called scale-product/testfoo3 via a touch running on a remote NFS client.
I've looked through the logs, but they're a little difficult to track.

There are a few other file accesses in these logs for other services we're using.
However, I think it unlikely that those are causing problems, as I can create files locally on the ceph kernel mount without experiencing this lag.

Files

Download all files

CREATE_LAG.mds.log (153 KB) CREATE_LAG.mds.log		Brian Chrisman, 06/20/2011 05:02 PM
CREATE_LAG.messages.log (108 KB) CREATE_LAG.messages.log		Brian Chrisman, 06/20/2011 05:02 PM
mds.scale-192-168-98-109.log (228 KB) mds.scale-192-168-98-109.log		Brian Chrisman, 07/28/2011 02:51 PM
messages (165 KB) messages		Brian Chrisman, 07/28/2011 02:51 PM

Actions

Copy link

Updated by Brian Chrisman almost 13 years ago

If I reexport the ceph filesystem via fuse/nfs, there is no lag in creating files.

Actions

Copy link

Updated by Sage Weil almost 13 years ago

Target version set to v0.31

Actions

Copy link

Updated by Sage Weil almost 13 years ago

It looks like after the mknod nfsd is calling write_inode via commit_metadata() in fs/nfsd/vfs.c. This triggers cap writeback, which does not flush teh mds cache immediately because normally nobody (except maybe sync(2)) waits on it.

If you export with option async and it should skip this step and be fast.

We can also defined a ->commit_metadata hook in the export_operations that does something different/smarter. What this is, I'm not sure. In this specific case, mknod is setting the mtime/atime/ctime, which is pretty useless (the mds did that too). In other cases safety might matter more.

Wanna try the async option just to verify this is in fact what's going on?

Actions

Copy link

Updated by Sage Weil almost 13 years ago

Project changed from Ceph to Linux kernel client
Target version deleted (~~v0.31~~)

Actions

Copy link

Updated by Sage Weil almost 13 years ago

Category set to NFS
Target version set to v3.0

Actions

Copy link

Updated by Brian Chrisman almost 13 years ago

That's faster... 0.5s for a touch create.
As a reference local filesystem and another cfs exported over nfs runs sub-0.01s for touch.

Actions

Copy link

Updated by Brian Chrisman almost 13 years ago

Ahh okay... file create isn't really any slower than on native ceph kernel mount (rather than re-export)... so this isn't an issue.
I'm going to close it.

Actions

Copy link

Updated by Brian Chrisman almost 13 years ago

Ahh I don't seem to have access to close it...

Actions

Copy link

Updated by Sage Weil almost 13 years ago

ok, but are you saying that a native mount is also .5s for touch? that's no good either!

Actions

Copy link

#10

Updated by Brian Chrisman almost 13 years ago

0.3 seconds for kernel mount... 0.01s for cfuse.. I'm going to work on tracking this down..

Actions

Copy link

#11

Updated by Sage Weil almost 13 years ago

Target version changed from v3.0 to v3.1

Actions

Copy link Download all files

#12

Updated by Brian Chrisman over 12 years ago

File mds.scale-192-168-98-109.log mds.scale-192-168-98-109.log added
File messages messages added

I've reproduced this half-second-NFS-create bug (f3de1a506d6c2debb399a1ae71f8f50714a31c8a), and it only appears through NFS (and occurs with async export and mount options set).
With local ceph mount, I get < 0.02s file creates, so there's something mucked up with the nfs side of this.
I'm going to attach mds and messages logs... (and move on to testing for stale fhs).

Here I'm creating a sequence of files, foo42, foo43 ... there's are being created manually (not being run as soon as last 'touch' returns) via nfs.

(Note, I previously exported non-ceph filesystems from the same server/node without any significant file create lag)

Example:
[root@buildrhel6 ~]# time touch /root/testmnt/foo42

real 0m0.603s
user 0m0.000s
sys 0m0.005s
[root@buildrhel6 ~]# time touch /root/testmnt/foo43

real 0m0.463s
user 0m0.000s
sys 0m0.005s
[root@buildrhel6 ~]# time touch /root/testmnt/foo44

real 0m0.465s
user 0m0.000s
sys 0m0.004s
[root@buildrhel6 ~]#

Actions

Copy link

#13

Updated by Brian Chrisman over 12 years ago

Looks like that commit-id may be an internal reference (we host a local repo)
This is a recent master.. 0.31 is the latest tag.
I'm working on archiving exactly which commit we build rpms from.

Actions

Copy link

#14