Bug #2196

closed

`rados bench` will write test objects with a constant oid, under-reporting performance.

Added by David McBride about 12 years ago. Updated about 12 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development

Description

(As discussed on #ceph, 2012/03/21 -- with thanks to joshd)

The rados bench command generates a sequence of named objects for bandwidth tests. Each object needs a distinct name; otherwise every write targets the same object, and therefore the same PG, and the test will (probably substantially) under-report the true bandwidth capacity of the cluster.

The code that generates a unique object name is in generate_object_name() in src/osdc/rados_bencher.h:

void generate_object_name(char *s, int objnum, int pid = 0)
{
  char hostname[30];
  gethostname(hostname, sizeof(hostname)-1);
  hostname[sizeof(hostname)-1] = 0;
  if (pid) {
    snprintf(s, sizeof(hostname), "%s_%d_object%d", hostname, pid, objnum);
  } else {
    snprintf(s, sizeof(hostname), "%s_%d_object%d", hostname, getpid(), objnum);
  }
}

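Note that it works with a character array of length 30, and that the same size, sizeof(hostname), is also passed to snprintf() as the limit for the output string s.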

Now, on my test host, I run a rados bench, and discover that all of the objects are being written to the same OSDs:

% rados -p rbd bench 10 write --debug-objecter 10 --log-to-stderr 2>&1|grep op_submit
2012-03-21 19:05:07.298927 7f8a2a3e9760 client.4272.objecter op_submit oid illustrious.doc.ic.ac.uk_1492 @2 [write 0~4194304] tid 1 osd.3
2012-03-21 19:05:07.299189 7f8a2a3e9760 client.4272.objecter op_submit oid illustrious.doc.ic.ac.uk_1492 @2 [write 0~4194304] tid 2 osd.3
2012-03-21 19:05:07.299362 7f8a2a3e9760 client.4272.objecter op_submit oid illustrious.doc.ic.ac.uk_1492 @2 [write 0~4194304] tid 3 osd.3
2012-03-21 19:05:07.299584 7f8a2a3e9760 client.4272.objecter op_submit oid illustrious.doc.ic.ac.uk_1492 @2 [write 0~4194304] tid 4 osd.3
2012-03-21 19:05:07.299805 7f8a2a3e9760 client.4272.objecter op_submit oid illustrious.doc.ic.ac.uk_1492 @2 [write 0~4194304] tid 5 osd.3

Note how the oid is constant. The gethostname() call returns a string 24 characters long, and the following underscore and PID (possibly truncated!) take up the rest of the 29 characters that fit within the 30-byte limit, leaving no room for the object's serial number. Hence the object id is constant; hence the PG it is mapped to is constant; hence most of the OSDs are left out of the benchmark.
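The truncation can be reproduced outside of Ceph. The following is only a sketch: truncated_object_name() is a hypothetical helper, not code from the tree, and it takes the hostname and PID as arguments (an assumption made for reproducibility; the real function calls gethostname() and getpid()). It applies the same format string and the same 30-byte limit as generate_object_name().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only.
// Uses the same format string and the same 30-byte limit that
// generate_object_name() passes to snprintf() via sizeof(hostname).
std::string truncated_object_name(const std::string &hostname, int pid, int objnum)
{
  char s[128];                   // roomy destination; only the limit below truncates
  const std::size_t limit = 30;  // same value as sizeof(hostname) in the original
  assert(limit <= sizeof(s));
  // snprintf() writes at most limit-1 characters plus the terminating NUL.
  std::snprintf(s, limit, "%s_%d_object%d", hostname.c_str(), pid, objnum);
  return std::string(s);
}
```

With the hostname and PID from the log above, truncated_object_name("illustrious.doc.ic.ac.uk", 1492, n) yields "illustrious.doc.ic.ac.uk_1492" for every n, matching the oid seen in the op_submit lines. A short hostname such as "host" leaves room for the suffix and gives "host_1492_object7" for n = 7.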

There are several obvious fixes: make the character array larger, reorder the fields so that the hostname is placed last, or restructure the code to use std::string instead.
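As an illustration of the third option, here is a minimal sketch of a std::string-based generator. This is not the change that was eventually committed; make_object_name() and its signature are assumptions, it takes the hostname and PID as arguments rather than calling gethostname() and getpid(), and it relies on std::to_string, which requires C++11.

```cpp
#include <cassert>
#include <string>

// Hypothetical replacement, for illustration only.
// Building the name as a std::string removes the fixed-size buffer,
// so the object number can never be truncated away.
std::string make_object_name(const std::string &hostname, int pid, int objnum)
{
  assert(objnum >= 0);  // object serial numbers are expected to be non-negative
  return hostname + "_" + std::to_string(pid) + "_object" + std::to_string(objnum);
}
```

With this version the same long hostname gives "illustrious.doc.ic.ac.uk_1492_object0", "illustrious.doc.ic.ac.uk_1492_object1", and so on, so each write gets its own oid.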

(As I'm about to build 0.44 anyway, I'll try tweaking the constant upwards from 30 to 128 and retest.)

Cheers,
David

Actions #1

Updated by Samuel Just about 12 years ago

  • Status changed from New to In Progress
  • Assignee set to Samuel Just

Actions #2

Updated by Samuel Just about 12 years ago

  • Status changed from In Progress to Resolved

Actions #3

Updated by Samuel Just about 12 years ago

2ec8f27f58adca40d125051a23547b639ee7d5f6
