Bug #538: Write performance does not scale over multiple computers

Added by Ed Burnette over 13 years ago. Updated over 12 years ago.

Status:         Closed
Priority:       Normal
Assignee:       -
Category:       -
Target version: -
% Done:         0%


Description

I have Ceph 0.22.1 installed on a cluster of 208 lightly loaded 64-bit Linux nodes (RHEL 5.5, ext3). The configuration is pretty much out of the box (no changes to replication or CRUSH options). I've tested write performance with a couple of different benchmarks, and they indicate that total write throughput does not scale with the number of computers doing the writes. With 1 computer I get about 63 MB/sec, but with 4 I get as little as 33 MB/sec per node, with 8 as little as 10 MB/sec, with 16 as little as 6 MB/sec, and so on up to 208 nodes, where I get under 1 MB/sec per node. CPU usage on the machine running cmon and cmds stays relatively low.

I tried it with high-level cfuse filesystem writes and with low-level RADOS writes, with similar results. For example, here are some results from the "rados bench" command run on one machine and on multiple machines. (mpirun runs a program simultaneously on multiple computers; the number of copies is given by the -np option.)

> rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    62.686
> mpirun -np 4 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    33.410
Bandwidth (MB/sec):    37.618
Bandwidth (MB/sec):    38.106
Bandwidth (MB/sec):    59.590
> mpirun -np 8 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    9.931
Bandwidth (MB/sec):    15.875
...
Bandwidth (MB/sec):    27.801
Bandwidth (MB/sec):    34.115
> mpirun -np 16 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    6.240
Bandwidth (MB/sec):    7.860
...
Bandwidth (MB/sec):    12.906
Bandwidth (MB/sec):    15.725
> mpirun -np 32 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    4.035
Bandwidth (MB/sec):    4.168
...
Bandwidth (MB/sec):    8.686
Bandwidth (MB/sec):    8.821
> mpirun -np 64 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    2.435
Bandwidth (MB/sec):    2.537
...
Bandwidth (MB/sec):    5.009
Bandwidth (MB/sec):    6.364
> mpirun -np 128 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    1.452
Bandwidth (MB/sec):    1.479
...
Bandwidth (MB/sec):    3.152
Bandwidth (MB/sec):    3.733
> mpirun -np 208 rados bench 10 write -p benchpool | grep Bandwidth | sort -n -k3
Bandwidth (MB/sec):    0.930
Bandwidth (MB/sec):    0.950
...
Bandwidth (MB/sec):    2.006
Bandwidth (MB/sec):    2.136
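
Summing the per-client figures, the aggregate across all writers only grows to a few times the single-client 63 MB/sec even with 208 clients, rather than scaling with client count. A quick way to get the aggregate directly, reusing the same pipeline as above plus an awk sum I added (the command I have in mind, output not shown):

> mpirun -np 208 rados bench 10 write -p benchpool | grep Bandwidth | awk '{sum += $3} END {print "Aggregate (MB/sec): " sum}'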

I would expect bandwidth to drop slightly going from the 1-computer to the 4-computer case, but then remain steady until it hits a network bottleneck. The disk drives on these machines give about 50 MB/sec of write throughput, and every node has a 1 Gbps Ethernet connection to a fast switch that should do up to 40 Gbps of bisection bandwidth.
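
For reference, the per-node ceilings I am assuming from those figures work out roughly as follows (my own back-of-the-envelope, ignoring replication traffic and protocol overhead):

> echo "NIC:       $((1000/8)) MB/sec per node"        # 1 Gbps Ethernet per node
> echo "Disk:      50 MB/sec per node"                 # quoted drive write throughput
> echo "Bisection: $((40000/8/208)) MB/sec per node"   # 40 Gbps switch shared by 208 simultaneous writers

Even the tightest of those, roughly 24 MB/sec per node with all 208 writing, is more than an order of magnitude above the 1-2 MB/sec per node observed at -np 208.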

I tried longer runs (same results), and I tried -t 1, i.e. a single concurrent operation per client (worse results).
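
In case it helps with reproducing, the concurrency variation I mean is along these lines (the -np and -t values here are just illustrative; -t sets the number of concurrent operations per rados bench client):

> for t in 1 4 16 64; do mpirun -np 8 rados bench 10 write -p benchpool -t $t | grep Bandwidth; done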
