Feature #7333

open

client: evaluate multiple O_APPEND writers

Added by Sage Weil about 10 years ago. Updated about 4 years ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS): Client, kceph
Labels (FS): task(intern)
Pull request ID:

Description

This needs to be done for both kclient and libcephfs. Extending the size of the file can race with updates to the inode's i_size.

Do clients do atomic extend writes (writes that fail if they would overwrite existing data)? Is there a RADOS op for this? That might simplify handling the race condition: i_size may be inconsistent (eventually consistent) across clients, but multiple appenders would not race.
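
To make the "atomic extend write" idea concrete, here is a minimal in-memory sketch. It is only a toy model: `ToyObject`, `extend_write` and `append` are hypothetical names invented for illustration and do not correspond to any RADOS, kclient or libcephfs API. The point it shows is that a writer holding a stale size fails instead of overwriting, and can simply retry.

```python
import threading


class ToyObject:
    """In-memory stand-in for a file's backing store (hypothetical, not a Ceph API)."""

    def __init__(self):
        self._data = bytearray()
        self._lock = threading.Lock()  # models atomicity of a single op on the store

    def size(self):
        with self._lock:
            return len(self._data)

    def extend_write(self, offset, payload):
        """Write payload at offset only if offset is the current end of data.

        Returns False, and writes nothing, if the write would overwrite
        existing bytes or leave a hole.
        """
        with self._lock:
            if offset != len(self._data):
                return False
            self._data += payload
            return True

    def read(self):
        with self._lock:
            return bytes(self._data)


def append(obj, payload):
    """Retry with a fresh size until the extend write lands at the true end."""
    while True:
        if obj.extend_write(obj.size(), payload):
            return
```

In this model a client's cached size may lag behind, but the worst case is a failed write and a retry, never one appender clobbering another.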

Actions #1

Updated by Greg Farnum about 10 years ago

  • Target version deleted (v0.77)
Actions #2

Updated by Greg Farnum almost 8 years ago

  • Category set to Correctness/Safety

Do we have some known issues with how we handle EOF and synchronous IO?

Actions #3

Updated by Patrick Donnelly over 4 years ago

  • Subject changed from client: compare sync read/write vs eof logic with kclient to client: evaluate multiple O_APPEND writers
  • Description updated (diff)
  • Assignee set to Xiubo Li
  • Source changed from other to Development
  • Component(FS) Client, kceph added
  • Labels (FS) task(intern) added
Actions #4

Updated by Xiubo Li over 4 years ago

  • Status changed from New to In Progress
Actions #5

Updated by Xiubo Li about 4 years ago

Here is one fix for the O_APPEND & O_DIRECT case:

In O_APPEND & O_DIRECT mode, the data from different writers can
overlap when the writers hold only the shared lock.

For example, both Writer1 and Writer2 are in O_APPEND and O_DIRECT
mode:

Writer1                         Writer2
shared_lock()                   shared_lock()
getattr(CAP_SIZE)               getattr(CAP_SIZE)
iocb->ki_pos = EOF              iocb->ki_pos = EOF
write(data1)
                                write(data2)
shared_unlock()                 shared_unlock()

data2 then overwrites data1, because both writes start at the same
file offset, the old EOF.

Switch to exclusive lock instead when O_APPEND is specified.

https://www.spinics.net/lists/ceph-devel/msg47445.html
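
The interleaving described in this comment can be reproduced with a small user-space model. This is only a sketch under stated assumptions: `ToyFile`, `racy_interleaving` and `append_exclusive` are hypothetical names, the "file" is a byte buffer, and a plain mutex stands in for the exclusive lock. It is not the kernel client code from the linked patch.

```python
import threading


class ToyFile:
    """Byte buffer standing in for a file (hypothetical model, not kclient code)."""

    def __init__(self):
        self.data = bytearray()
        self.excl = threading.Lock()  # stands in for the exclusive lock

    def eof(self):
        return len(self.data)

    def pwrite(self, pos, payload):
        """Write payload at pos, zero-filling if pos is past the current end."""
        end = pos + len(payload)
        if end > len(self.data):
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[pos:end] = payload


def racy_interleaving(f, data1, data2):
    """Replay the shared-lock timeline: both writers sample EOF before either writes."""
    pos1 = f.eof()        # Writer1: iocb->ki_pos = EOF
    pos2 = f.eof()        # Writer2: iocb->ki_pos = EOF (same value)
    f.pwrite(pos1, data1)
    f.pwrite(pos2, data2)  # lands on top of data1


def append_exclusive(f, payload):
    """Sample EOF and write while holding the exclusive lock."""
    with f.excl:
        f.pwrite(f.eof(), payload)
```

Replaying the racy timeline with two 4-byte payloads leaves a 4-byte file holding only the second payload, whereas two calls to `append_exclusive` leave both payloads back to back.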
