Bug #41027

rbd bench write does not properly count in flight IOs

Added by Dan van der Ster over 4 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While benchmarking a heavily IO-saturated HDD Ceph cluster, I noticed that the rbd bench --io-type=write mode does not seem to wait for outstanding writes to be synced to the OSDs.

First, as proof that the cluster is IO-saturated, here is rbd bench in read mode:

# rbd bench test/testdan --io-type=read --io-size=4096 --io-threads=1 --io-total=1M --io-pattern rand
bench  type read io_size 4096 io_threads 1 bytes 1048576 pattern random
  SEC       OPS   OPS/SEC   BYTES/SEC
    1         7      6.19  25361.76
    2        17      8.64  35377.47
    3        39     12.00  49170.77
    4        52     13.23  54189.72
    5        81     15.22  62324.34
    6       102     20.16  82579.13
    7       116     19.65  80487.61
    8       138     21.05  86239.17
    9       159     20.56  84216.75
   10       169     19.06  78068.31
   11       185     16.39  67119.80
   12       200     17.09  70001.57
   13       232     18.81  77065.04
elapsed:    14  ops:      256  ops/sec:    17.98  bytes/sec: 73644.95

Now, writing to the same cluster, I get ~15k IOPS, which is clearly wrong:

# rbd bench test/testdan --io-type=write --io-size=4096 --io-threads=1 --io-total=100M --io-pattern rand
bench  type write io_size 4096 io_threads 1 bytes 104857600 pattern random
  SEC       OPS   OPS/SEC   BYTES/SEC
    1     15397  15413.13  63132185.43
elapsed:     2  ops:    25600  ops/sec:  9906.94  bytes/sec: 40578823.57

Here, with --debug_ms=1, I see that even though I set --io-threads=1, it sends 5 writes before the first osd_op_reply comes in:

[15:17][root@p05517715y58557 (production:ceph/beesly/mon*2:peon) ~]# rbd --debug_ms=1 bench test/testdan --io-type=write --io-size=4096 --io-threads=1 --io-total=20000 --io-pattern rand 2>&1 | grep write

bench  type write io_size 4096 io_threads 1 bytes 20000 pattern random

2019-07-31 15:17:59.393 7f987affd700  1 -- 188.184.36.164:0/3854274056 --> 128.142.162.23:6884/927919 -- osd_op(unknown.0.0:16 78.333 78:ccc2236b:::rbd_data.33b6d06b8b4567.0000000000000321:head [write 3112960~4096] snapc 0=[] ondisk+write+known_if_redirected e2428362) v8 -- 0x7f98700121d0 con 0
2019-07-31 15:17:59.393 7f987affd700  1 -- 188.184.36.164:0/3854274056 --> 128.142.161.230:6824/3839248 -- osd_op(unknown.0.0:18 78.3df 78:fbeced39:::rbd_data.33b6d06b8b4567.0000000000000592:head [write 425984~4096] snapc 0=[] ondisk+write+known_if_redirected e2428362) v8 -- 0x7f9870016d80 con 0
2019-07-31 15:17:59.393 7f987affd700  1 -- 188.184.36.164:0/3854274056 --> 128.142.161.213:6852/3058740 -- osd_op(unknown.0.0:19 78.3b1 78:8df00f11:::rbd_data.33b6d06b8b4567.00000000000007f4:head [write 3579904~4096] snapc 0=[] ondisk+write+known_if_redirected e2428362) v8 -- 0x7f987001be20 con 0
2019-07-31 15:17:59.393 7f987affd700  1 -- 188.184.36.164:0/3854274056 --> 128.142.162.25:6820/897497 -- osd_op(unknown.0.0:20 78.3d 78:bc39328a:::rbd_data.33b6d06b8b4567.0000000000000620:head [write 3719168~4096] snapc 0=[] ondisk+write+known_if_redirected e2428362) v8 -- 0x7f9870020ec0 con 0
2019-07-31 15:17:59.393 7f987affd700  1 -- 188.184.36.164:0/3854274056 --> 128.142.161.221:6888/3981167 -- osd_op(unknown.0.0:21 78.3ff 78:ffd7215d:::rbd_data.33b6d06b8b4567.0000000000000073:head [write 1892352~4096] snapc 0=[] ondisk+write+known_if_redirected e2428362) v8 -- 0x7f9870026040 con 0

2019-07-31 15:17:59.396 7f9894ddf700  1 -- 188.184.36.164:0/3854274056 <== osd.751 128.142.162.25:6820/897497 1 ==== osd_op_reply(20 rbd_data.33b6d06b8b4567.0000000000000620 [write 3719168~4096] v2428362'96100 uv96100 ondisk = 0) v8 ==== 184+0+0 (2063549112 0 0) 0x7f988c0054e0 con 0x7f987001d750
2019-07-31 15:17:59.396 7f98945de700  1 -- 188.184.36.164:0/3854274056 <== osd.1390 128.142.161.221:6888/3981167 1 ==== osd_op_reply(21 rbd_data.33b6d06b8b4567.0000000000000073 [write 1892352~4096] v2428362'95936 uv95936 ondisk = 0) v8 ==== 184+0+0 (4292782517 0 0) 0x7f9884007d30 con 0x7f9870022810
2019-07-31 15:17:59.397 7f9893ddd700  1 -- 188.184.36.164:0/3854274056 <== osd.534 128.142.161.213:6852/3058740 1 ==== osd_op_reply(19 rbd_data.33b6d06b8b4567.00000000000007f4 [write 3579904~4096] v2428362'96979 uv96979 ondisk = 0) v8 ==== 184+0+0 (253385391 0 0) 0x7f9888005430 con 0x7f98700186b0
2019-07-31 15:17:59.398 7f9894ddf700  1 -- 188.184.36.164:0/3854274056 <== osd.664 128.142.162.23:6884/927919 1 ==== osd_op_reply(16 rbd_data.33b6d06b8b4567.0000000000000321 [write 3112960~4096] v2428362'96098 uv96098 ondisk = 0) v8 ==== 184+0+0 (1504253969 0 0) 0x7f988c0054e0 con 0x7f987000eac0
2019-07-31 15:17:59.512 7f98945de700  1 -- 188.184.36.164:0/3854274056 <== osd.497 128.142.161.230:6824/3839248 1 ==== osd_op_reply(18 rbd_data.33b6d06b8b4567.0000000000000592 [write 425984~4096] v2428362'96712 uv96712 ondisk = 0) v8 ==== 184+0+0 (1159898189 0 0) 0x7f9884007d30 con 0x7f9870013670

Thanks

#1

Updated by Jason Dillaman over 4 years ago

  • Status changed from New to Need More Info

Do you have the RBD in-memory writeback cache enabled?

#2

Updated by Dan van der Ster over 4 years ago

Jason Dillaman wrote:

Do you have the RBD in-memory writeback cache enabled?

Doh.. with `rbd cache = false` I get the expected (low) IOPS.

My config before had the default `rbd cache writethrough until flush = true`... is rbd bench sending a flush early on to turn on writeback?
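(Side note for anyone reproducing this: instead of editing ceph.conf, the cache setting can also be overridden per invocation on the command line, the same way `--debug_ms=1` is passed above. A sketch, reusing the pool/image from this ticket; the override syntax is the generic Ceph client config-override mechanism:)

```shell
# Disable librbd's in-memory writeback cache for this run only, so each
# write must be acknowledged by the OSDs before it completes.
rbd --rbd_cache=false bench test/testdan \
    --io-type=write --io-size=4096 --io-threads=1 \
    --io-total=100M --io-pattern rand
```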

#3

Updated by Jason Dillaman over 4 years ago

Dan van der Ster wrote:

My config before had the default `rbd cache writethrough until flush = true`... is rbd bench sending a flush early on to turn on writeback?

Yes, `rbd bench` will send a flush prior to starting the benchmark to ensure the cache is in writeback mode.
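For reference, the client-side options discussed here can be pinned in ceph.conf; a sketch of the relevant section (option names as used in this ticket; defaults may vary by release):

```ini
; ceph.conf, client section -- librbd cache behavior
[client]
rbd cache = true                             ; in-memory writeback cache
rbd cache writethrough until flush = true    ; stay in writethrough until the first flush arrives
```

With the second option at its default of true, the cache behaves as writethrough until the client sends a flush, which is exactly the flush `rbd bench` issues up front.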

#4

Updated by Dan van der Ster over 4 years ago

Jason Dillaman wrote:

Dan van der Ster wrote:

My config before had the default `rbd cache writethrough until flush = true`... is rbd bench sending a flush early on to turn on writeback?

Yes, `rbd bench` will send a flush prior to starting the benchmark to ensure the cache is in writeback mode.

OK apologies for the ticket in that case. Thanks for your time, and please close.

#5

Updated by Jason Dillaman over 4 years ago

  • Status changed from Need More Info to Closed