Bug #10163
Status: closed
rados bench parameter -b producing wrong values when different blocksize used in writes
Description
The -b (blocksize) parameter of rados bench produces wrong measurements if and only if a preceding rados bench write with a different blocksize (the default is 4M if -b is not given) has been run on the same pool.
Assumption: when running rados bench rand or rados bench seq with -b set, the reads use the blocksize of the objects written by the preceding rados bench write. The performance information, however, is displayed as if each read were -b bytes large. This falsifies the displayed information and causes confusion.
Example / steps to reproduce:
Preparation: Write with no -b parameter, so using 4M object sizes, into a fresh pool:
# rados -p rados bench 30 write -t 16 --no-cleanup
Test 1: Read 10 seconds with no -b parameter, so using the correct 4M object size:
# rados -p rados bench 10 rand -t 16
  sec   Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0         0         0         0         0         0         -         0
    1        16       456       440   1759.47      1760  0.022893 0.0352574
    2        16       906       890   1779.62      1800  0.028388 0.0350016
    3        16      1289      1273   1696.99      1532  0.009703 0.0342429
    4        16      1671      1655    1654.7      1528  0.019726 0.0373331
    5        16      2052      2036   1628.53      1524  0.017198 0.0386522
    6        16      2467      2451   1633.75      1660  0.033833 0.0389542
    7        16      2879      2863   1635.75      1648  0.024425 0.0387982
    8        16      3293      3277   1638.26      1656  0.021366 0.0386628
    9        16      3602      3586   1593.55      1236  0.027967 0.0399032
   10        16      3972      3956   1582.17      1480  0.011962 0.0398027
Total time run:       10.151817
Total reads made:     3972
Read size:            4194304
Bandwidth (MB/sec):   1565.040
Average Latency:      0.0407493
Max latency:          0.705831
Min latency:          0.0064
Test 2: Read with -b 4096 (4K) set, so reading 4M blocks but displaying as if only 4K blocks were read:
# rados -p rados bench 10 rand -t 16 -b 4096
  sec   Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0         0         0         0         0         0         -         0
    1        16       440       424   1.65593   1.65625  0.011644 0.0331078
    2        16       822       806   1.57395   1.49219  0.013107 0.0341816
    3        16      1316      1300   1.69244   1.92969  0.022842 0.0366934
    4        16      1824      1808   1.76535   1.98438  0.024441 0.0352421
    5        15      2283      2268    1.7716   1.79688  0.017336 0.0349558
    6        16      2666      2650   1.72499   1.49219  0.305033 0.0354981
    7        16      3057      3041   1.69673   1.52734  0.024233  0.036623
    8        16      3527      3511    1.7141   1.83594  0.014953    0.0356
    9        16      4006      3990   1.73151   1.87109  0.008816 0.0357023
   10        16      4215      4199   1.63999  0.816406  0.224568 0.0364528
Total time run:       10.251141
Total reads made:     4215
Read size:            4194304
Bandwidth (MB/sec):   1644.695
Average Latency:      0.0388754
Max latency:          1.06756
Min latency:          0.00647
Notice that in the summary block of Test 2 the real read size is displayed as 4M and, consequently, the bandwidth in the summary correctly reflects what happened. During runtime, however, the once-per-second lines are wrong by a factor of about 1000 (4194304 / 4096 = 1024)! The whole test is also "wrong" in the sense that the -b parameter was ignored and full object-size reads were done.
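The mismatch can be checked by hand from the output above. The following is only a sketch of the arithmetic, not rados code: the helper name bandwidth_mb_per_s is made up for illustration, and it assumes that rados bench counts 1 MB as 2**20 bytes, which is what the Test 1 numbers suggest (440 reads of 4194304 bytes in one second shown as 1760).

```python
MIB = 1024 * 1024  # assumption: rados bench reports MB as 2**20 bytes


def bandwidth_mb_per_s(ops_finished: int, op_size_bytes: int, seconds: float) -> float:
    """Bandwidth in MB/s for a number of fixed-size operations (hypothetical helper)."""
    return ops_finished * op_size_bytes / seconds / MIB


# First second of Test 2: 424 reads finished.
ops = 424
reported = bandwidth_mb_per_s(ops, 4096, 1.0)    # computed with the -b value
actual = bandwidth_mb_per_s(ops, 4194304, 1.0)   # computed with the real object size

print(reported)           # 1.65625, the "cur MB/s" value shown at sec 1
print(actual)             # 1696.0
print(actual / reported)  # 1024.0

# Summary block of Test 2: 4215 reads of 4194304 bytes in 10.251141 s.
print(bandwidth_mb_per_s(4215, 4194304, 10.251141))  # roughly 1644.695
```

The per-second value only matches when the 4K size is plugged in, while the summary value only matches with the 4M size, which is consistent with the assumption stated in the description.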
Expectation: rados should either refuse to run a read benchmark when the given -b does not match the size of the written objects, or read only up to -b bytes per object (if -b is smaller than what was written before), so that the displayed values are correct at all times.
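To make the two options concrete, here is a rough sketch of the decision logic I have in mind. It is not taken from the rados source; the function choose_read_size and its strict flag are hypothetical names used only for illustration.

```python
from typing import Optional


def choose_read_size(requested: Optional[int], written: int, strict: bool = True) -> int:
    """Pick the read size for a read benchmark (hypothetical sketch).

    requested: value of -b, or None if -b was not given
    written:   object size used by the preceding write benchmark
    strict:    True  -> refuse a non-matching -b (first option)
               False -> read only up to -b bytes if -b is smaller (second option)
    """
    if requested is None or requested == written:
        return written
    if strict:
        raise ValueError(
            f"-b {requested} does not match written object size {written}"
        )
    if requested < written:
        return requested
    # A larger -b cannot be satisfied by the existing objects.
    raise ValueError(
        f"-b {requested} is larger than written object size {written}"
    )


print(choose_read_size(None, 4194304))                # 4194304
print(choose_read_size(4096, 4194304, strict=False))  # 4096
```

In either variant the size used for the reads and the size used for the displayed bandwidth would be the same number, which is the actual point of the request.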
Ceph is Giant release 0.87-1trusty, rados is:
# rados -v
ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)