Bug #3625

client: EEXIST error on multiple clients to create

Added by Sam Lang over 11 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Discovered with the IOR shared-file test on ceph-fuse: if multiple clients attempt to create a file at the same time (an open with O_CREAT), one of them wrongly gets back EEXIST.

Associated revisions

Revision 67bc849c (diff)
Added by Sam Lang about 11 years ago

mds: Return created inode in mds reply to create

If multiple clients race to create a file, multiple clients will send a
create request and get back a valid dentry+inode, but only one client
will actually win the race to create the file. All other clients should
treat the reply as an open of an existing file and check permissions.
This patch adds the created inode number to the mds create reply if that
request actually created the inode/file (and the feature is supported),
so the client can properly check permissions if the inode number isn't
returned. Fixes #3625.

Signed-off-by: Sam Lang <>
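
The client-side decision described in this commit message can be sketched roughly as follows. This is a minimal illustration in C, not the actual Ceph client or MDS code: the struct create_reply layout, the may_access() and finish_create() helpers, and the owner/other-only permission check (no group or root handling) are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical, simplified view of an MDS reply to a create request. */
struct create_reply {
    uint64_t target_ino;   /* inode the dentry points at after the request */
    uint64_t created_ino;  /* inode created by THIS request, 0 if none */
    uid_t    owner;        /* owner of the target inode */
    mode_t   mode;         /* permission bits of the target inode */
};

/* Simplified check: owner bits or "other" bits only (assumption). */
static bool may_access(const struct create_reply *r, uid_t uid, unsigned want)
{
    unsigned bits = (uid == r->owner) ? (r->mode >> 6) & 7 : r->mode & 7;
    return (bits & want) == want;
}

/*
 * Returns 0 if the open may proceed, -EACCES otherwise.
 * want is a mask of 4 (read), 2 (write), 1 (execute).
 */
int finish_create(const struct create_reply *r, uid_t uid, unsigned want)
{
    assert(r != NULL);

    /* This request created the inode: the creator gets its fd. */
    if (r->created_ino != 0 && r->created_ino == r->target_ino)
        return 0;

    /* Lost the race: treat the reply as an open of an existing file. */
    return may_access(r, uid, want) ? 0 : -EACCES;
}
```

With a file created as mode 0400 by uid 1000, the winning request returns 0, while a racing request from another uid that asks for read access gets -EACCES, because its reply carries no created inode number and so goes through the permission check.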

History

#1 Updated by Sam Lang over 11 years ago

  • Description updated (diff)

I made some commits to wip-3625, which resolve the EEXIST, but now the test returns an EIO...

#2 Updated by Sam Lang over 11 years ago

David and I have posted comments on github about the fix to allow multiple
clients opening the same file to get a valid fd (instead of EEXIST). The
summary is that the changes made to wip-3625 successfully resolve the EEXIST
problem, but introduce another race with permissions checking, because we are
only checking permissions on open if the file already exists. With two
clients that both send create requests, one will perform the create at the
mds, and the other will perform an open, but both clients will think that
they've created the file, and skip the permissions checking. There are a few
ways to resolve this:

1. Send the create request with O_EXCL, and if it returns EEXIST, send a
lookup request.

2. Do the permissions checking at the mds. The problem with this currently
is that we don't send the secondary group ids to the mds on the create
request, so its permissions checks can currently only check against uid and
gid. We could add the secondary group ids to the create request to resolve
this.

3. Send back a created flag with the reply. We don't have request specific
reply structures, so this seems non-trivial.

4. Ignore this for now, assuming that permissions checking is handled
properly for existing files, so if we end up in the create path, we are in
the multi-client create race, and can assume they will all open the file with
the same permissions. This isn't really POSIX-compliant, and clients could
perform an open/create with:

open("foo", O_CREAT|O_RDONLY, 0400);

in which case POSIX requires that the client that wins the create gets a
valid fd, while the other clients get EACCES. If we ignore it, it means that
someone could gain access to a file they don't have permission to access by
timing an open of the file at the same time that it's being created.
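
For comparison, option 1 (exclusive create, then fall back to a plain open on EEXIST) looks roughly like this when expressed with local POSIX calls. It is only an analogy for the request sequence a client would send to the mds, not Ceph client code, and create_or_open() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>   /* close(), unlink() for callers */

/*
 * Try an exclusive create first. If the file already exists, fall back
 * to a plain open, which goes through the normal permission checks.
 * On success returns an fd and sets *created to 1 (we made the file)
 * or 0 (someone else did). On failure returns -1 with errno set.
 */
int create_or_open(const char *path, int flags, mode_t mode, int *created)
{
    assert(path != NULL && created != NULL);

    for (;;) {
        int fd = open(path, flags | O_CREAT | O_EXCL, mode);
        if (fd >= 0) {
            *created = 1;
            return fd;
        }
        if (errno != EEXIST)
            return -1;

        /* Lost the race: open the existing file without O_CREAT. */
        fd = open(path, flags & ~(O_CREAT | O_EXCL));
        if (fd >= 0) {
            *created = 0;
            return fd;
        }
        if (errno != ENOENT)
            return -1;   /* e.g. EACCES for a caller without permission */

        /* The file was removed between the two calls; try again. */
    }
}
```

The loser of the race pays for a second call, which is why the EEXIST reply would need to carry the dentry+inode for this option to perform as well as the others.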


From Sage:

Hrm. I like #3 the best. Although we probably also need to do #2.
Probably. Here's the thing:

We will never enforce all security at the MDS; the whole premise is that
we can delegate cache consistency to the client and do things there. But:
we also want to get the semantics right (here it might help). And more
importantly, we want to support a mode where the client is only allowed to
be a specific user (e.g., a libcephfs user with a cephx capability of
uid=1001). I suspect that is a big project, though, with a whole host of
issues that we don't want to get into right now.

Which leads me back to 2 being the cleanest approach for this particular
problem (where we just want to get the semantics right).

1 might work if we can coax the mds into including the dentry+inode in
the EEXIST reply... in that case the subsequent lookup/open will be 100%
client-side and performance won't be any worse.

#3 Updated by Sam Lang over 11 years ago

  • Status changed from In Progress to 7

Pushed fixes to wip-3625 (ceph and ceph-client repos) that implement #3 (mds sends back the created flag in reply to a create request). Testing now.

#4 Updated by Greg Farnum about 11 years ago

I know you guys did a couple rounds on this one, what's the status?

#5 Updated by Sam Lang about 11 years ago

The kernel side has been reviewed and tested, but needs to be merged. The fuse side has been tested, but I think it still needs to be reviewed (and merged).

#6 Updated by Sam Lang about 11 years ago

  • Status changed from 7 to Fix Under Review

#7 Updated by Greg Farnum about 11 years ago

  • Assignee changed from Sam Lang to Sage Weil

Maybe you already handled this?

#8 Updated by Sage Weil about 11 years ago

  • Status changed from Fix Under Review to Resolved

commit:b4d3bd06d4083d780755f6ef506df1643932fa2f

#9 Updated by Greg Farnum over 7 years ago

  • Component(FS) Client added
