Support #40303
Average latency output in rados bench seems incorrect
Description
Seen on ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
# /usr/bin/rados bench -c /etc/ceph/fast.conf -p pool 60 write -t 2 -b 65536 --no-cleanup
hints = 1
Maintaining 2 concurrent writes of 65536 bytes to objects of size 65536 for up to 60 seconds or 0 objects
Object prefix: benchmark_data_bn336.convoke.cloud_16758
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 2 501 499 31.182 31.1875 0.00156332 0.00372242
2 2 992 990 30.9324 30.6875 0.0014457 0.00403083
3 1 1638 1637 34.0986 40.4375 0.00119016 0.00359742
4 2 2323 2321 36.2598 42.75 0.00161445 0.00344049
5 2 2861 2859 35.7317 33.625 0.00126015 0.00348331
6 2 3414 3412 35.5359 34.5625 0.00168682 0.00349687
7 2 3922 3920 34.9943 31.75 0.00189924 0.00356997
8 2 4495 4493 35.0959 35.8125 0.00114464 0.00355285
9 1 4875 4874 33.8417 23.8125 0.00170197 0.00368152
10 2 5421 5419 33.8632 34.0625 0.00152365 0.00368276
11 2 5924 5922 33.6422 31.4375 0.00116316 0.00371197
12 2 6370 6368 33.1613 27.875 0.00131946 0.00374918
13 2 6834 6832 32.8409 29 0.00101812 0.00379321
14 2 7363 7361 32.8565 33.0625 0.00154718 0.00379322
15 2 7759 7757 32.3159 24.75 0.00139544 0.00386386
16 2 8243 8241 32.1865 30.25 0.00127026 0.00387422
17 2 8910 8908 32.7451 41.6875 0.00135777 0.00380918
18 2 9386 9384 32.5785 29.75 0.00114425 0.00382886
19 2 10055 10053 33.0642 41.8125 0.0014974 0.00377908
2019-06-12 10:40:28.220542 min lat: 0.000867134 max lat: 0.282099 avg lat: 0.00371032
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
20 2 10780 10778 33.6763 45.3125 0.00148318 0.00371032
21 2 11221 11219 33.385 27.5625 0.0013521 0.00371883
22 2 11839 11837 33.623 38.625 0.00109343 0.00371319
23 2 12464 12462 33.8592 39.0625 0.00105072 0.00369039
24 2 12928 12926 33.6566 29 0.00113544 0.00370982
25 2 13485 13483 33.7026 34.8125 0.00134962 0.00370475
26 2 14034 14032 33.7259 34.3125 0.00182426 0.00369088
27 2 14560 14558 33.6942 32.875 0.00115982 0.00370405
28 2 15307 15305 34.1581 46.6875 0.00139306 0.00365811
29 2 15893 15891 34.243 36.625 0.00133283 0.00364906
30 2 16608 16606 34.5909 44.6875 0.00113804 0.00361237
31 2 17359 17357 34.989 46.9375 0.00113668 0.00356149
32 2 18059 18057 35.2626 43.75 0.00138045 0.00354355
33 2 18808 18806 35.6124 46.8125 0.00121765 0.00350637
34 2 19530 19528 35.892 45.125 0.00127839 0.00347845
35 1 20106 20105 35.8968 36.0625 0.0015811 0.00347737
36 2 20742 20740 36.0019 39.6875 0.00138548 0.00346994
37 2 21319 21317 36.0034 36.0625 0.00117204 0.00347066
38 2 21961 21959 36.1118 40.125 0.00155697 0.0034602
39 2 22628 22626 36.2546 41.6875 0.00124545 0.00344635
2019-06-12 10:40:48.223129 min lat: 0.000769916 max lat: 0.282099 avg lat: 0.00341283
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
40 2 23434 23432 36.6075 50.375 0.00134771 0.00341283
41 2 24074 24072 36.69 40 0.00154871 0.00339875
42 2 24676 24674 36.7122 37.625 0.00158016 0.0034028
43 2 25329 25327 36.8074 40.8125 0.00195395 0.00339488
44 1 25974 25973 36.8884 40.375 0.00165454 0.0033867
45 2 26760 26758 37.1587 49.0625 0.00173819 0.00336172
46 2 27643 27641 37.5505 55.1875 0.00150727 0.00332601
47 2 28436 28434 37.8059 49.5625 0.00122901 0.00330491
48 2 29227 29225 38.0481 49.4375 0.00158476 0.00328129
49 2 29782 29780 37.9795 34.6875 0.00148581 0.00329006
50 2 30578 30576 38.2147 49.75 0.00122827 0.00326905
51 1 31278 31277 38.3244 43.8125 0.00137523 0.00326043
52 2 32067 32065 38.5344 49.25 0.00173384 0.00324264
53 2 32810 32808 38.6834 46.4375 0.00104165 0.0032298
54 2 33632 33630 38.9183 51.375 0.0012043 0.00320996
55 2 34362 34360 39.0402 45.625 0.00199642 0.00320062
56 2 35204 35202 39.2827 52.625 0.00123679 0.00317964
57 2 36163 36161 39.6449 59.9375 0.00149645 0.00314994
58 2 36972 36970 39.833 50.5625 0.00136391 0.00313687
59 2 37863 37861 40.1016 55.6875 0.00189514 0.00311385
2019-06-12 10:41:08.225652 min lat: 0.000769916 max lat: 0.282099 avg lat: 0.00310966
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
60 2 38561 38559 40.1602 43.625 0.00159082 0.00310966
Total time run: 60.0687
Total writes made: 38562
Write size: 65536
Object size: 65536
Bandwidth (MB/sec): 40.1228
Stddev Bandwidth: 8.31216
Max bandwidth (MB/sec): 59.9375
Min bandwidth (MB/sec): 23.8125
Average IOPS: 641
Stddev IOPS: 132.995
Max IOPS: 959
Min IOPS: 381
Average Latency(s): 0.00311428
Stddev Latency(s): 0.01138
Max latency(s): 0.282099
Min latency(s): 0.000769916
I would expect Avg Latency to be much less based on the last latency readings.
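For what it's worth, the reported average does check out against the other summary numbers. rados bench is a closed-loop benchmark at a fixed queue depth, so by Little's Law the average latency should be roughly concurrency divided by IOPS. A quick sanity check using the figures from the summary above (the only assumption is that the `-t 2` queue depth stays saturated):

```python
# Little's Law sanity check: in a closed loop at fixed queue depth,
#   avg_latency ≈ concurrency / IOPS
concurrency = 2                # -t 2 from the rados bench invocation
avg_iops = 641                 # "Average IOPS" from the summary
reported_avg_lat = 0.00311428  # "Average Latency(s)" from the summary

predicted_lat = concurrency / avg_iops
print(predicted_lat)  # ~0.00312 s, within 1% of the reported value
assert abs(predicted_lat - reported_avg_lat) / reported_avg_lat < 0.01
```

So the 0.00311 s average is internally consistent; the open question is why the per-line "last lat(s)" readings sit so far below it.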
Updated by Matt Dunavant almost 5 years ago
Edit: We've run some more tests, and it seems like it might be the Last Latency that is inconsistent with expected results. As we've scaled up tests, the Avg Latency falls in line with what we expect, and the Last Latency is the field that hangs around 0.001 - 0.002.
Is there any documentation on how the last and average latencies are calculated in rados bench, and on how we may not be measuring what we think we're measuring with this tool?
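The two columns answer different questions, which explains the gap. A sketch of how such stats are typically maintained (this is illustrative, not the actual rados bench source): "avg lat(s)" is a running mean over every completed op since the start of the run, while "last lat(s)" is just the latency of the single most recently completed op. A few slow outliers (note the 0.282 s max above) pull the mean up without ever being likely to appear in the "last" column:

```python
from dataclasses import dataclass

@dataclass
class LatencyStats:
    """Running latency stats, sketching (not reproducing) the
    'avg lat(s)' and 'last lat(s)' columns of rados bench."""
    count: int = 0
    total: float = 0.0
    last: float = 0.0

    def record(self, lat: float) -> None:
        self.count += 1
        self.total += lat
        self.last = lat  # only the single most recent op

    @property
    def avg(self) -> float:
        return self.total / self.count if self.count else 0.0

# Mostly fast ops with three slow outliers mid-run: the running mean
# is inflated, but 'last' usually reflects the fast common case.
stats = LatencyStats()
for lat in [0.0013] * 500 + [0.28] * 3 + [0.0013] * 497:
    stats.record(lat)
print(f"avg={stats.avg:.5f} last={stats.last}")  # avg=0.00214 last=0.0013
```

With a skewed latency distribution like this one, the mean and the typical per-op latency legitimately disagree; a percentile (p50/p99) would be a more informative summary than either column.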
Updated by Greg Farnum almost 5 years ago
- Tracker changed from Bug to Support
- Status changed from New to Closed
rados bench isn't documented much outside of the source code, and it is a very primitive tool. I'd take questions on specific details to the mailing list. If you want good latency measurements, using one of the standard storage testing tools on top of RBD or CephFS is likely to be more in line with what your applications see, and it will provide more usable output anyway. :)
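As a concrete example of that suggestion, a comparable closed-loop test against an RBD image with fio might look like the job file below. This is a sketch only: the pool, image, and client names are placeholders, and it assumes fio was built with librbd support.

```ini
; fio job approximating the rados bench run above:
; 64 KiB sequential writes, queue depth 2, 60 seconds, against RBD.
[global]
ioengine=rbd          ; requires fio built with the rbd engine
clientname=admin      ; placeholder cephx user
pool=testpool         ; placeholder pool name
rbdname=testimage     ; placeholder RBD image name
rw=write
bs=64k
iodepth=2
runtime=60
time_based=1
direct=1

[seq-write-latency]
```

fio's output includes full latency percentiles (p50/p99/p99.9), which avoids the mean-vs-last ambiguity entirely.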