Bug #2753 (closed)

Writes to mounted Ceph FS fail silently if client has no write capability on data pool

Added by Florian Haas almost 12 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Component(FS): Client

Description

Originally reported in http://marc.info/?l=ceph-devel&m=134151023912148&w=2:

How to reproduce (this is on a 3.2.0 kernel):

1. Create a client, mine is named "test", with the following capabilities:

client.test
    key: <key>
    caps: [mds] allow
    caps: [mon] allow r
    caps: [osd] allow rw pool=testpool

Note the client only has access to a single pool, "testpool".
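
For reference, a client with these caps can be created and its secret exported roughly as follows (the exact ceph auth syntax may vary by release; the keyring output path is illustrative, and the secretfile path matches the mount command below):

ceph auth get-or-create client.test mds 'allow' mon 'allow r' osd 'allow rw pool=testpool' -o /etc/ceph/ceph.client.test.keyring
ceph auth print-key client.test > /etc/ceph/test.secret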

2. Export the client's secret and mount a Ceph FS.

mount -t ceph -o name=test,secretfile=/etc/ceph/test.secret daisy,eric,frank:/ /mnt

This succeeds, despite us not even having read access to the "data" pool.
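
The missing access can be confirmed out of band by pointing rados at the same client (assuming its keyring is at the default /etc/ceph/ceph.client.test.keyring path); depending on the version this either returns a permission error or simply hangs:

rados --id test -p data ls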

3. Write something to a file.

root@alice:/mnt# echo "hello world" > hello.txt
root@alice:/mnt# cat hello.txt

This too succeeds.

4. Sync and clear caches.

root@alice:/mnt# sync
root@alice:/mnt# echo 3 > /proc/sys/vm/drop_caches

5. Check file size and contents.

root@alice:/mnt# ls -la
total 5
drwxr-xr-x  1 root root    0 Jul  5 17:15 .
drwxr-xr-x 21 root root 4096 Jun 11 09:03 ..
-rw-r--r--  1 root root   12 Jul  5 17:15 hello.txt
root@alice:/mnt# cat hello.txt
root@alice:/mnt#

Note the reported file size is unchanged, but the file is empty.

Checking the "data" pool with client.admin credentials obviously shows
that that pool is empty, so objects are never written. Interestingly,
"cephfs hello.txt show_location" does list an object_name, identifying
an object which doesn't exist.
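
Concretely, those checks look roughly like this when run with the admin keyring (the mount path is illustrative; cephfs is the old file-layout utility mentioned above):

rados -p data ls
cephfs /mnt/hello.txt show_location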

Is there any way to make the client fail with -EIO, -EPERM,
-EOPNOTSUPP or whatever else is appropriate, rather than pretending to
write when it can't?

#1

Updated by Sage Weil over 11 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (26)
#2

Updated by Greg Farnum over 11 years ago

  • Tracker changed from Bug to Feature
  • Category set to 46

This is a great suggestion but falls into the feature rather than bug-fix category. My initial thought is keeping a list of which pools we've successfully written to, and on the first read or write to a file stored in a pool we haven't looked at, we block that access until we've managed to do a "touch" or equivalent through the OSDs.
Which means first-file accesses to new pools will be slow, but...meh.

More complicated possibilities include actually looking at caps, but I don't remember if clients can even do that.

Unless I turn out to be confused about implementation, this will need a duplicate request in the kernel client. But I'll hold off on that until we come up with a more detailed solution.

#3

Updated by Florian Haas over 11 years ago

No, please. A write pretending to succeed while actually not writing data is a bug. The filesystem not lying to its users doesn't qualify as a feature.

#4

Updated by Greg Farnum over 11 years ago

I agree it's a bug, but given the procedures we have now (ack! changing procedures coming alert!) I don't think we want to file it that way. It will take development as opposed to debugging and then fixing a line somewhere. ;) Just bureaucratic mangling.

#5

Updated by Florian Haas over 11 years ago

Fair enough, but if I can just make a suggestion, perhaps you might want to explain these procedures somewhere in the official docs? Reading stuff like this ("this is broken -- why is this a feature?!") can leave a terrible impression on novices or casual observers and earn the project a lot of undeserved ridicule, especially when the public message is "we're focusing on stabilizing the filesystem." Just my $.02.

#6

Updated by Ian Colle over 11 years ago

  • Tracker changed from Feature to Bug

This is clearly a bug, bureaucracy or not. It should not be a feature. We can do new development to fix a bug. If you feel strongly about it, open a new feature ticket to track that development and tie it to this bug, but this should remain a bug ticket.

#7

Updated by Sage Weil over 11 years ago

  • Priority changed from Normal to High

we should return an error code on fsync().. that is the quick fix.

a more polite feature will be opened to return an error early (e.g., during open) to complain about misconfiguration.
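
Once fsync() returns an error, the path can be exercised from the shell on the same mount, for example with dd's conv=fsync, which calls fsync before exiting (the filename is illustrative). With the fix, dd should fail with a permission or I/O error instead of reporting success:

dd if=/dev/zero of=/mnt/fsync-test.bin bs=4k count=1 conv=fsync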

#8

Updated by Greg Farnum over 11 years ago

  • Assignee set to Greg Farnum
#9

Updated by Greg Farnum about 11 years ago

Looked at this briefly; I see that the way we do fsyncs is attached to a "FIXME: this could starve" comment, and I believe the proper fix will deal with that too. ;)
Anyway, the ObjectCacher already takes a callback and gives that callback the return code from the OSD (well, it's a C_Gather so it provides at least one error return code if there are any, which works for our purposes).
Sadly the Client's internal interfaces don't match up very well with looking at that callback's result (currently, on fsync it just waits until the file has no dirty buffers remaining!), but those are all that will need to be changed. I'll probably add an optional callback parameter to _flush (or see if I can convert everybody to use it) and get rid of the silly "C_ClientPutInode" (which just grabs and puts a ref) in exchange for something that reports errors, which we can check in _fsync and other callers which care.

Although I don't think that makes reporting issues on close() much easier; maybe a more comprehensive reporting overhaul is in order but that's a lot more work...

#10

Updated by Greg Farnum about 11 years ago

wip-2753-fsync-errors has a patch which makes fsync return an error if the client gets back an error from the Objecter. Is that sufficient or do we want to make it do something on close() as well?

However, while testing (and I'm running into FUSE problems with that which are slowing me down), I realized that if you try to fsync a file which can't be written, it will actually hang forever; you aren't getting a false success indication back. So I went back to the original reported case, and fixing that will indeed require #3826, since all it's doing is writes and a sync (which doesn't block for data to actually go out).

#11

Updated by Sam Lang about 11 years ago

I don't see wip-2753-fsync-errors in the repo. Also, note that this problem was reported on the cephfs kernel client, your comments suggest you're looking at the fuse/client code.

#12

Updated by Greg Farnum about 11 years ago

...yes, yes it is. I've been working in FUSE so far. *sigh* Well, it needed the fix too.

#13

Updated by Greg Farnum about 11 years ago

  • Status changed from New to Fix Under Review

Okay, I've checked that the kernel client deals correctly with an fsync (it'll return EPERM). The client branch wip-2753-fsync-errors still needs to be reviewed and merged, though.

#14

Updated by Greg Farnum about 11 years ago

  • Status changed from Fix Under Review to Resolved

wip-2753-fsync-errors merged and pushed in commit:b3ffc718c93b7daa75841778b5d50ea3bc5fcc53, and fsync works properly on the kernel client.

#15

Updated by Greg Farnum almost 8 years ago

  • Component(FS) Client added