Bug #63896

open

client: contiguous read fails for non-contiguous write (in async I/O api)

Added by Dhairya Parmar 5 months ago. Updated 10 days ago.

Status: In Progress
Priority: Normal
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags:
Backport: reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With the following test case, which waits on the onfinish contexts (trivial setup and teardown omitted):

  for(int i = 0; i < NUM_BUF; ++i) {
    writefinish.reset(new C_SaferCond("test-nonblocking-writefinish-non-contiguous"));
    rc = client->ll_preadv_pwritev(fh, current_iov++, 1, i * NUM_BUF * 10,
                                   true, writefinish.get(), nullptr);
    ASSERT_EQ(rc, 0);
    total_bytes_written += writefinish->wait();
  }
  ASSERT_EQ(total_bytes_written, bytes_to_write);

  readfinish.reset(new C_SaferCond("test-nonblocking-readfinish-contiguous"));
  rc = client->ll_preadv_pwritev(fh, iov_in_contiguous, NUM_BUF, 0, false,
                                 readfinish.get(), &bl);
  ASSERT_EQ(rc, 0);
  total_bytes_read = readfinish->wait();
  // should be less since the data written is non-contiguous but the read
  // was contiguous
  ASSERT_LE(total_bytes_read, bytes_to_write);

execution stalls; see 1_execution_stalled_contiguous_read, attached below.

However, if the code that waits for the read context to complete (readfinish->wait()) is removed, i.e.:

  for(int i = 0; i < NUM_BUF; ++i) {
    writefinish.reset(new C_SaferCond("test-nonblocking-writefinish-non-contiguous"));
    rc = client->ll_preadv_pwritev(fh, current_iov++, 1, i * NUM_BUF * 10,
                                   true, writefinish.get(), nullptr);
    ASSERT_EQ(rc, 0);
    total_bytes_written += writefinish->wait();
  }
  ASSERT_EQ(total_bytes_written, bytes_to_write);

  readfinish.reset(new C_SaferCond("test-nonblocking-readfinish-contiguous"));
  rc = client->ll_preadv_pwritev(fh, iov_in_contiguous, NUM_BUF, 0, false,
                                 readfinish.get(), &bl);
  ASSERT_EQ(rc, 0);

execution aborts with a core dump:

2023-12-28T19:53:08.780+0530 7f0f47a4a9c0 20 client.4423 awaiting reply|forward|kick on 0x7fff7814eaa0
terminate called after throwing an instance of 'std::system_error'
  what():  Invalid argument
*** Caught signal (Aborted) **

See 2_core_dumped (attached below) for more information.

Ideally, both cases should simply have returned fewer bytes than the total bytes written, since the data between the written offsets must be 0. Instead, the client fails to handle this case.
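
For comparison, here is a minimal sketch of the same write/read pattern against a local POSIX file, using plain pwrite/pread rather than the Ceph client API. It is only an illustration of the behaviour the report expects, not a reproduction of the bug. The sizes (kNumBuf, kBufLen) and the helper name sparse_write_then_contiguous_read are hypothetical, since the report does not show the real values; only the offset formula mirrors `i * NUM_BUF * 10` from the test. On a local file the contiguous read completes, the gaps between the written chunks read back as zeros, and the byte count does not exceed the total written, so a check like ASSERT_LE would pass.

```cpp
#include <sys/types.h>
#include <unistd.h>

#include <cstdlib>
#include <string>

// Hypothetical sizes: the report does not show NUM_BUF or the buffer lengths.
constexpr int kNumBuf = 5;
constexpr size_t kBufLen = 10;           // bytes per write
constexpr off_t kStride = kNumBuf * 10;  // mirrors `i * NUM_BUF * 10` in the test

struct SparseResult {
  ssize_t bytes_written = 0;
  ssize_t bytes_read = -1;  // stays -1 if any step failed
  std::string data;         // bytes returned by the contiguous read
};

// Writes kNumBuf chunks at non-contiguous offsets into a temporary file,
// then issues one contiguous read from offset 0 whose length equals the
// total number of bytes written.
SparseResult sparse_write_then_contiguous_read() {
  SparseResult r;
  char path[] = "/tmp/sparse-demo-XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return r;
  unlink(path);  // the file disappears once fd is closed

  const std::string chunk(kBufLen, 'x');
  for (int i = 0; i < kNumBuf; ++i) {
    ssize_t n = pwrite(fd, chunk.data(), chunk.size(), i * kStride);
    if (n < 0) {
      close(fd);
      return r;
    }
    r.bytes_written += n;
  }

  std::string buf(kNumBuf * kBufLen, '?');
  r.bytes_read = pread(fd, &buf[0], buf.size(), 0);
  if (r.bytes_read > 0) r.data.assign(buf, 0, r.bytes_read);
  close(fd);
  return r;
}
```

Calling sparse_write_then_contiguous_read() and inspecting the result shows the first chunk followed by zero bytes from the hole. Whether the Ceph client is meant to match this local-filesystem behaviour exactly is an assumption here.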


Files

1_execution_stalled_contiguous_read (102 KB): log for the case where the contiguous read (after a non-contiguous write) stalls while waiting on context completion. Dhairya Parmar, 12/28/2023 02:17 PM
2_core_dumped (268 KB): log for the case where the contiguous read (after a non-contiguous write) is issued without waiting for context completion. Dhairya Parmar, 12/28/2023 02:25 PM