Project

General

Profile

Actions

Tasks #64819

closed

Tasks #63293: Implement fscrypt in libcephfs and cephfs-fuse

Tasks #64160: RMW race detection

data corruption during rmw after lseek

Added by Christopher Hoffman about 2 months ago. Updated 22 days ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Tags:
Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:

Description

There's data corruption during a rmw after a seek.

reproducer

import os

def write_fill(fd, fill, size, offset):
  s = ''
  for i in range(0,size):
    s += fill

  os.lseek(fd, offset - int(size / 2), 0)
  os.write(fd, str.encode(s))

file = 'file.log'
fd = os.open(file, os.O_RDWR|os.O_CREAT)

s = write_fill(fd, 's', 32, 16)
s = write_fill(fd, 't', 8, 16)

os.close(fd)

Run the above on fscrypt enabled dir and non-fscrypt dir.

fscrypt enabled dir

hexdump /mnt/mycephfs/fscrypt_test_unlocked_ffsb/file.log 
0000000 7373 7373 7373 7373 7373 7373 7474 7474
0000010 7474 7474 0000 0000 0000 0000 0000 0000
0000020

non-fscrypt dir

hexdump file.log 
0000000 7373 7373 7373 7373 7373 7373 7474 7474
0000010 7474 7474 7373 7373 7373 7373 7373 7373
0000020

fscrypt case, file.log does not have trailing ssss... as it should

Actions #1

Updated by Christopher Hoffman 24 days ago · Edited

The RC for this issue is fixed by:

diff --git a/src/client/Client.cc b/src/client/Client.cc
index 9ccb39501a4..80c84a920d8 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -11824,12 +11824,11 @@ int Client::WriteEncMgr::read_modify_write(Context *_iofinish)
   ldout(cct, 10) << "start_block=" << start_block << " start_block_ofs=" << start_block_ofs
                 << " ofs_in_start_block=" << ofs_in_start_block << " end_block=" << end_block
                 << " end_block_ofs=" << end_block_ofs << " ofs_in_end_block=" << ofs_in_end_block
-                << dendl;
+                << " offset=" << offset
+                << " effective_size=" <<  in->effective_size() << " endoff=" << endoff << dendl;
   need_read_start = ofs_in_start_block > 0;
   need_read_end = (endoff < in->effective_size() && ofs_in_end_block < FSCRYPT_BLOCK_SIZE - 1 && start_block != end_block);
-  read_start_size = (need_read_start && need_read_end && start_block == end_block ?
-                     FSCRYPT_BLOCK_SIZE : ofs_in_start_block);
-  
+  read_start_size = FSCRYPT_BLOCK_SIZE; 
   bool need_read = need_read_start | need_read_end;
   ldout(cct, 10) << "need_read_start=" << need_read_start << "need_read_end=" << need_read_end
                  << " read_start_size=" << read_start_size << " need_read=" << need_read << dendl;

read_start_size was always ofs_in_start_block as it depends on need_read_end and have conflicting bool statements about start/end block. This caused data after the write portion to not be read. read_async would read up to the offset of where the write was occuring, write the pending data and leave the rest of fscrypt block empty. In fscrypt, the whole block needs to be read, overwrite starting at offset to len then recommit the whole fscrypt block.

Actions #2

Updated by Christopher Hoffman 24 days ago · Edited

There's also another case that is failing rmw:

reproducer:

import os

file = 'datafile31'
fd = os.open(file, os.O_RDONLY)
os.close(fd)

fd = os.open(file, os.O_RDWR|os.O_CREAT)

#reproducing sys calls after ffsb bench has started
fill = '\0'
size = 3192
offset = 60653568

s = ''
for i in range(0,size):
  s += fill

os.lseek(fd, offset, 0)
os.write(fd, str.encode(s))

os.close(fd)

The error we're getting

Inode.cc: In function 'int Inode::put_cap_ref(int)' thread 7f8aabfff6c0 time 2024-04-10T19:30:45.442916+0000
Inode.cc: 212: FAILED ceph_assert(cap_refs[c] > 0)

There's probably a missing buffered read that isn't happening that should.

Actions #3

Updated by Christopher Hoffman 22 days ago

As seen in note 2, the case of modification happening at start of block and contained within a single block is now addressed by:

diff --git a/src/client/Client.cc b/src/client/Client.cc
index 9ccb39501a4..3b45ed8453f 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -11825,11 +11825,10 @@ int Client::WriteEncMgr::read_modify_write(Context *_iofinish)
                 << " ofs_in_start_block=" << ofs_in_start_block << " end_block=" << end_block
                 << " end_block_ofs=" << end_block_ofs << " ofs_in_end_block=" << ofs_in_end_block
                 << dendl;
-  need_read_start = ofs_in_start_block > 0;
+  need_read_start = (ofs_in_start_block >= 0 && (endoff - offset) <= FSCRYPT_BLOCK_SIZE);

Actions #4

Updated by Christopher Hoffman 22 days ago

  • Status changed from In Progress to Resolved

The reproducers above were simplifications of failures/errors from running the ffsb test suite on a fscrypt enabled directory. Now with the above changes, ffsb test suite is now completing and passing.

Actions

Also available in: Atom PDF