Documentation #24641

Document behaviour of fsync-after-close

Added by Niklas Hambuechen about 1 year ago. Updated 4 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Tags:
Backport: nautilus,mimic
Reviewed:
Affected Versions:
Labels (FS):
Pull request ID:

Description

The following should be documented:

Does close()/re-open()/fsync() provide the same durability and visibility-to-other-clients guarantees as fsync()/close() does?

POSIX leaves this open; for many local file systems the answer is "yes" (modulo bugs; even the relatively recent Linux v4.17 had some issues here).

The coreutils "sync" utility now also accepts individual files as arguments, in which case it does open()+fsync(); the answer to the above question determines whether this will work reliably on CephFS.

For more details see: https://stackoverflow.com/questions/37288453/calling-fsync2-after-close2

It would be great to know the answer for CephFS, and have it documented.


Related issues

Copied to fs - Backport #40130: mimic: Document behaviour of fsync-after-close In Progress
Copied to fs - Backport #40131: nautilus: Document behaviour of fsync-after-close In Progress

History

#1 Updated by Patrick Donnelly about 1 year ago

  • Assignee set to Patrick Donnelly
  • Target version set to v14.0.0

#2 Updated by Zheng Yan about 1 year ago

They are the same. In CephFS, there is no dirty data/metadata associated with the file handle.

#3 Updated by Niklas Hambuechen about 1 year ago

Great, that certainly makes things easier from the application perspective.

It would be great if we could write it down in the docs, perhaps http://docs.ceph.com/docs/mimic/cephfs/posix/ is a good place?

#4 Updated by Patrick Donnelly 7 months ago

  • Target version changed from v14.0.0 to v15.0.0

#5 Updated by Jeff Layton 4 months ago

  • Assignee changed from Patrick Donnelly to Jeff Layton

#6 Updated by Jeff Layton 4 months ago

Niklas Hambuechen wrote:

The following should be documented:

Does close()/re-open()/fsync() provide the same durability and visibility-to-other-clients guarantees as fsync()/close() does?

No, but that's true across filesystems in Linux. Once no one has the file opened, the inode is subject to eviction from the kernel's caches. We can never guarantee that you'll see errors at fsync() time that occurred before you opened the file.

kcephfs isn't any different from most other Linux filesystems here. The client's fsync() routine uses the kernel's standard file_write_and_wait_range call, which uses the errseq_t infrastructure to track writeback errors. So, you'll get "standard" kernel behavior for this (which unfortunately varies a bit between versions, but should be stable in anything v4.17 or later).

I'm not opposed to documenting this, but it's a bit redundant. We should probably just refer to the excellent explanation in the postgresql wiki:

https://wiki.postgresql.org/wiki/Fsync_Errors

#7 Updated by Jeff Layton 4 months ago

Niklas replied via email:

I think it makes sense to document it the way you say it, e.g. "kcephfs's guarantees with respect to this are the same as for kernel file system XYZ, whose guarantees you can read about at <link>".

Already the fact that it behaves just like other FSs is very valuable information.

The more that can be written about it, the better, though, since as I wrote on the issue, "POSIX leaves that open" for many questions, and even "like other file systems" can easily leave people uncertain, because what exactly those file systems do isn't very well documented. In fact, until postgres's "fsyncgate", close to no information could be found about it, so I asked on the `linux-fsdevel` mailing list, and learned about fsyncgate only there (then wrote it up on https://stackoverflow.com/questions/37288453/calling-fsync2-after-close2/50158433#50158433).

So it seems that, in general, anybody with deeper knowledge writing this sort of thing up would be of huge value.

Second, your answer sounds kcephfs-specific -- do the same guarantees still hold for ceph-fuse?

#8 Updated by Jeff Layton 4 months ago

Second, your answer sounds kcephfs-specific -- do the same guarantees still hold for ceph-fuse?

FUSE just farms out the fsync call to the fuse daemon, and it reports errors on a per-Fh basis. So, ceph-fuse is actually not subject to the differences that we're talking about between kernel versions, and should basically always "just work" like you'd expect.

#9 Updated by Jeff Layton 4 months ago

Proposed documentation update here:

https://github.com/ceph/ceph/pull/28300

Niklas, please take a look and let me know if this is what you had in mind.

#10 Updated by Patrick Donnelly 4 months ago

  • Status changed from New to Need Review
  • Start date deleted (06/24/2018)
  • Backport set to nautilus,mimic,luminous
  • Pull request ID set to 28300

#11 Updated by Patrick Donnelly 4 months ago

  • Status changed from Need Review to Pending Backport
  • Backport changed from nautilus,mimic,luminous to nautilus,mimic

#12 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #40130: mimic: Document behaviour of fsync-after-close added

#13 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #40131: nautilus: Document behaviour of fsync-after-close added
