Bug #16385: rados bench seq and rand tests do not work if op_size != object_size - RADOS - Ceph

Added by Anonymous almost 8 years ago. Updated 18 days ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done:
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): rados tool
Pull request ID: 12203
Crash signature (v1):
Crash signature (v2):

Description

rados bench write correctly creates objects whose object_size is a multiple of op_size, but rados bench seq and rand cannot read them, failing with:

benchmark_data_alpha1-p200_7856_object0 is not correct!

There are a number of issues concerning this bug, discovered one at a time (I will create a pull request shortly).

Routine aio_read() in rados.cc is hard-coded to read an object at offset zero, no matter what offset it is passed.
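
To illustrate why ignoring the offset breaks verification, here is a toy model, not the librados API and not the code in rados.cc. It assumes each op_size chunk of an object carries different contents, so a read that always starts at offset zero returns the first chunk regardless of which chunk was requested. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for an object read; NOT the librados API.
// Models the reported behaviour: the offset argument is accepted but ignored.
inline std::string read_ignoring_offset(const std::string& object,
                                        std::size_t len,
                                        std::size_t /*offset*/) {
  return object.substr(0, len);
}

// Models the intended behaviour: the read starts at the requested offset.
inline std::string read_at_offset(const std::string& object,
                                  std::size_t len,
                                  std::size_t offset) {
  assert(offset <= object.size());
  return object.substr(offset, len);
}
```

With an 8-byte object "AAAABBBB" and 4-byte ops, asking for the second chunk through read_ignoring_offset() yields "AAAA", which would not match the expected "BBBB".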

Routine seq_read_bench() in obj_bencher.cc stops reading after "num_objects" I/Os, when it should issue "num_objects * writes_per_object" reads (but see below), so the test run times are unexpectedly short.
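
As a rough sketch of the arithmetic, assuming object_size is an exact multiple of op_size and every object is complete: the helper names below are hypothetical and are not taken from obj_bencher.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of op_size-sized I/Os needed to cover one object.
// Assumes object_size is an exact multiple of op_size.
inline uint64_t writes_per_object(uint64_t object_size, uint64_t op_size) {
  assert(op_size > 0 && object_size % op_size == 0);
  return object_size / op_size;
}

// Reads a full sequential pass would issue if every object were complete.
inline uint64_t expected_seq_reads(uint64_t num_objects,
                                   uint64_t object_size,
                                   uint64_t op_size) {
  return num_objects * writes_per_object(object_size, op_size);
}
```

For example, 100 objects of 4 MiB read with 64 KiB ops need 100 * 64 = 6400 reads, whereas stopping after num_objects I/Os issues only 100.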

In routine rand_read_bench(), the first "concurrentios" I/Os are actually sequential reads of the first few objects, not random. Also, the routine only generates random offsets modulo "num_objects", not "num_objects * writes_per_object", so the I/O distribution ignores most of the higher-numbered objects (but see next).
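
The skew can be sketched as follows, under my reading of the report: the random value is treated as an op index, and the object is found by dividing that index by writes_per_object. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: highest object index that can be selected when the
// random op index is drawn modulo `modulus` and then divided by
// writes_per_object to find the object (an assumption about the mapping).
inline uint64_t highest_reachable_object(uint64_t modulus,
                                         uint64_t writes_per_object) {
  assert(modulus > 0 && writes_per_object > 0);
  return (modulus - 1) / writes_per_object;
}
```

With num_objects = 100 and writes_per_object = 64, drawing modulo num_objects reaches only objects 0 and 1, while drawing modulo num_objects * writes_per_object reaches all 100.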

If a rados bench write test finishes based on time, not number of objects, then the last object written is usually shorter than the rest. Fixing the above issues causes the seq and rand tests to issue I/Os beyond the end of that final shorter object.

The solution is to record the number of writes issued by rados bench write as an additional parameter in the "bench_metadata" file (num_writes). The seq test then issues "num_writes" reads and doesn't run off the end of the last object. The rand test generates a random number modulo "num_writes", then converts that to an object and offset pair, also avoiding any reads beyond the end of the last object, and the I/O distribution across the entire set of objects and across offsets within objects is balanced.
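
A minimal sketch of the index-to-location conversion described above, assuming object_size is an exact multiple of op_size. The struct and function names are hypothetical and do not come from the pull request; only num_writes is a name from this report.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical result type: which object to read and where within it.
struct ReadTarget {
  uint64_t object_index;
  uint64_t offset;
};

// Convert a global op index (0 .. num_writes-1) into an object/offset pair.
inline ReadTarget locate_op(uint64_t op_index,
                            uint64_t object_size,
                            uint64_t op_size) {
  assert(op_size > 0 && object_size % op_size == 0);
  const uint64_t per_object = object_size / op_size;
  return ReadTarget{op_index / per_object, (op_index % per_object) * op_size};
}

// Pick a random read target by drawing an op index modulo num_writes, so the
// target never lies beyond the last write that was actually issued.
inline ReadTarget pick_random_read(std::mt19937_64& rng,
                                   uint64_t num_writes,
                                   uint64_t object_size,
                                   uint64_t op_size) {
  assert(num_writes > 0);
  return locate_op(rng() % num_writes, object_size, op_size);
}
```

With 4 MiB objects, 64 KiB ops and num_writes = 130, the last valid op index is 129, which maps to object 2 at offset 65536. Nothing beyond the second chunk of that short final object is ever requested.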

A separate issue is that the rados -o option is defined twice. I haven't addressed this, as I don't know which usage is intended. It looks like -o no longer works as a synonym for --output. It seems like the --object-size usage is the wrong one?

    } else if (ceph_argparse_witharg(args, i, &val, "--object-size", (char*)NULL)) {
      opts["object-size"] = val;
    ...
    } else if (ceph_argparse_witharg(args, i, &val, "-o", (char*)NULL)) {
      opts["object-size"] = val;
    ...
    } else if (ceph_argparse_witharg(args, i, &val, "-o", "--output", (char*)NULL)) {
      opts["output"] = val;
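
The effect of the duplicate can be shown with a toy model of an else-if chain. This is not ceph_argparse, and it assumes the three branches above belong to the same chain in the order quoted (the "..." lines elide whatever sits between them). The function name is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the quoted option chain; NOT ceph_argparse.
// The first branch that matches wins, so later branches never see "-o".
inline std::map<std::string, std::string> parse_one(const std::string& flag,
                                                    const std::string& val) {
  std::map<std::string, std::string> opts;
  if (flag == "--object-size") {
    opts["object-size"] = val;
  } else if (flag == "-o") {
    opts["object-size"] = val;  // "-o" is consumed here
  } else if (flag == "-o" || flag == "--output") {
    opts["output"] = val;       // reachable only through "--output"
  }
  return opts;
}
```

In this model, parse_one("-o", "out.txt") sets "object-size" and leaves "output" unset, which is consistent with -o no longer acting as a synonym for --output.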