Bug #63014

osd perf: Effects of osd_op_num_shards_[hdd/ssd] on op latency and bandwidth when using mclock_scheduler

Added by jianwei zhang 7 months ago. Updated 7 months ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
v18.1.0
Backport:
Regression:
Yes
Severity:
2 - major
Reviewed:
09/28/2023
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

background:
  • add patch1: https://github.com/ceph/ceph/pull/53417 (ref: https://tracker.ceph.com/issues/62812)
  • add patch2: https://github.com/ceph/ceph/pull/52809 (ref: https://tracker.ceph.com/issues/62293)
    # ceph daemon osd.0 config show | grep -e osd_mclock -e shards_hdd -e shard_hdd -e memstore_
        "memstore_debug_omit_block_device_write": "false",
        "memstore_device_bytes": "32212254720",
        "memstore_page_set": "false",
        "memstore_page_size": "65536",
        "osd_mclock_force_run_benchmark_on_init": "false",
        "osd_mclock_iops_capacity_threshold_hdd": "500.000000",
        "osd_mclock_iops_capacity_threshold_ssd": "80000.000000",
        "osd_mclock_max_capacity_iops_hdd": "240.000000",
        "osd_mclock_max_capacity_iops_ssd": "21500.000000",
        "osd_mclock_max_sequential_bandwidth_hdd": "251658240",
        "osd_mclock_max_sequential_bandwidth_ssd": "1258291200",
        "osd_mclock_override_recovery_settings": "false",
        "osd_mclock_profile": "custom",
        "osd_mclock_scheduler_anticipation_timeout": "0.000000",
        "osd_mclock_scheduler_background_best_effort_lim": "0.100000",
        "osd_mclock_scheduler_background_best_effort_res": "0.000000",
        "osd_mclock_scheduler_background_best_effort_wgt": "20",
        "osd_mclock_scheduler_background_recovery_lim": "0.500000",
        "osd_mclock_scheduler_background_recovery_res": "0.000000",
        "osd_mclock_scheduler_background_recovery_wgt": "20",
        "osd_mclock_scheduler_client_lim": "1.000000", //100%
        "osd_mclock_scheduler_client_res": "1.000000", //100%
        "osd_mclock_scheduler_client_wgt": "60",       
        "osd_mclock_skip_benchmark": "true",
        "osd_op_num_shards_hdd": "5",            
        "osd_op_num_threads_per_shard_hdd": "1",
    
problem:
  • In scenarios where the configured limits are not exceeded, multiple mclock shard queues can increase the latency of a single op.
  • Here’s why (the arithmetic for both cases is reproduced in the sketch after case-2):
    case-1:
       * osd_op_num_shards_hdd = 5 / osd_op_num_threads_per_shard_hdd = 1
       * osd_bandwidth_capacity_per_shard = 240 / 5 = 48 MBps
       * osd_bandwidth_cost_per_io = 240 / 240 = 1 MB
       * IO buffer size: 200K --> aligned up to 1M in calc_scaled_cost
       * client_res = 1 / client_lim = 1 / client_wgt = 60
       * res_bw = 48 MBps / lim_bw = 48 MBps
       * op delay = 1M / 48 MBps = 0.02083 s = 20.83 ms; expected bw = 1000 ms / 20.83 ms * 200K / 1024K * 5 (shards) = 46.88 MBps
         * every shard dequeues ops at an interval of 20.83 ms
         * this means each op has to wait at least 20.83 ms in the mclock queue
           * In fact, since the shard queue an op enters is chosen by a hash of the pgid that op.oid maps to,
             * absolute balance cannot be achieved.
           * On a macro scale, the number of ops processed by each shard queue is roughly balanced,
             * with no order-of-magnitude difference.
           * But on a micro scale, at any given moment,
             * some shard queues have few ops queued,
             * while others have many,
             * which directly increases op latency.
       * test-case-1:
         * res=1/lim=1/wgt=60/bw=240M/iops=240, num_shards=5, concurrency="-t 5"
         * rados bench
           * avg_lat = 0.0272721 s = 27.27 ms  // 27.27 - 20.83 = 6.44 ms, avg_lat is 30.91% higher than expected (6.44/20.83 = 0.3091), bad!
           * bw = 35.77 MBps  // 46.88 - 35.77 = 11.11 MBps, bw is 23.69% lower than expected (11.11/46.88 = 0.2369), bad!
    

    case-2:
       * osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5
       * osd_bandwidth_capacity_per_shard = 240 MBps
       * osd_bandwidth_cost_per_io = 240 / 240 = 1 MB
       * IO buffer size: 200K --> aligned up to 1M in calc_scaled_cost
       * client_res = 1 / client_lim = 1 / client_wgt = 60
       * res_bw = 240 MBps / lim_bw = 240 MBps
       * op delay = 1M / 240 MBps = 0.00416 s = 4.16 ms
         * the single shard dequeues ops at an interval of 4.16 ms
         * this means each op has to wait at least 4.16 ms in the mclock queue
           * 4.16 ms is only 1/5 of 20.83 ms
       * test-case-2
         * res=1/lim=1/wgt=60/bw=240M/iops=240, num_shards=1, num_shards_thread=5, concurrency="-t 5"
         * rados bench
           * avg_lat = 0.0209406 s = 20.94 ms  // 20.94 - 20.83 = 0.11 ms, 0.11/20.83 = 0.52%, very close, good!
           * bw = 46.61 MBps  // 46.88 - 46.61 = 0.27 MBps, 0.27/46.88 = 0.57%, very close, good!
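
To make the expectations above easy to re-check, here is a small Python sketch of the arithmetic (my own illustration of the formulas described in case-1/case-2, not the actual mClockScheduler code; the helper name and the simplified cost rounding are assumptions):

# Sketch of the mclock expectations above; it mirrors the formulas in this
# description, not the real OSD implementation.
import math

def mclock_expectation(bw_mb=240, iops=240, io_kb=200, num_shards=5, t=5):
    cost_mb = bw_mb / iops                        # osd_bandwidth_cost_per_io = 1 MB
    shard_bw = bw_mb / num_shards                 # osd_bandwidth_capacity_per_shard
    scaled_cost = math.ceil((io_kb / 1024) / cost_mb) * cost_mb  # 200K aligned up to 1M
    op_delay_ms = scaled_cost / shard_bw * 1000   # per-shard dequeue interval
    expected_lat_ms = op_delay_ms * math.ceil(t / num_shards)    # ops queued per shard
    expected_bw = 1000 / op_delay_ms * (io_kb / 1024) * num_shards
    return op_delay_ms, expected_lat_ms, expected_bw

print(mclock_expectation(num_shards=5, t=5))  # case-1: ~20.83 ms interval, ~20.83 ms lat, ~46.88 MBps
print(mclock_expectation(num_shards=1, t=5))  # case-2: ~4.17 ms interval,  ~20.83 ms lat, ~46.88 MBps

Both configurations have the same theoretical bandwidth and latency at "-t 5"; the measured gap in case-1 comes from splitting the 5 in-flight ops across 5 independently rate-limited queues that are never perfectly balanced at any instant.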
    

reproduce:

1. MON=1 MGR=1 OSD=1 FS=0 MDS=0 RGW=0 ../src/vstart.sh -n -l -X -b --msgr2 --memstore
2. modify ceph.conf
  memstore_device_bytes = 32212254720
  osd_op_queue = "mclock_scheduler" 
  osd_mclock_skip_benchmark          = true
  osd_mclock_max_capacity_iops_hdd                = 240
  osd_mclock_max_sequential_bandwidth_hdd         = 251658240
  osd_mclock_profile                              = "custom" 
  osd_mclock_scheduler_client_wgt                 = 60
  osd_mclock_scheduler_client_res                 = 1
  osd_mclock_scheduler_client_lim                 = 1
  osd_mclock_scheduler_background_recovery_wgt    = 20
  osd_mclock_scheduler_background_recovery_res    = 0
  osd_mclock_scheduler_background_recovery_lim    = 0.5
  osd_mclock_scheduler_background_best_effort_wgt = 20
  osd_mclock_scheduler_background_best_effort_res = 0
  osd_mclock_scheduler_background_best_effort_lim = 0.1
3. create pool
ceph osd pool create test-pool 128 128 replicated replicated_rule 0 1 off
ceph osd pool application enable test-pool rgw

4. ceph osd pool ls detail
pool 1 'test-pool' replicated size 1 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode off last_change 274 flags hashpspool stripe_width 0 application rgw

5. ceph osd df tree
ID  CLASS  WEIGHT   REWEIGHT  SIZE    RAW USE  DATA  OMAP  META  AVAIL   %USE   VAR   PGS  STATUS  TYPE NAME         
-1         0.00099         -  30 GiB   11 GiB   0 B   0 B   0 B  19 GiB  35.83  1.00    -          root default      
-3         0.00099         -  30 GiB   11 GiB   0 B   0 B   0 B  19 GiB  35.83  1.00    -              host zjw-q-dev
 0    hdd  0.00099   1.00000  30 GiB   11 GiB   0 B   0 B   0 B  19 GiB  35.83  1.00  128      up          osd.0     
                       TOTAL  30 GiB   11 GiB   0 B   0 B   0 B  19 GiB  35.83                                       
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

6. ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    30 GiB  19 GiB  11 GiB    11 GiB      35.83
TOTAL  30 GiB  19 GiB  11 GiB    11 GiB      35.83

--- POOLS ---
POOL       ID  PGS  STORED  OBJECTS    USED  %USED  MAX AVAIL
test-pool   1  128  11 GiB   56.36k  11 GiB  37.72     18 GiB

7. recreate the PGs after an OSD restart (needed for memstore)
cat force-create-pgid.sh 
#!/bin/sh
set -x

# keep only rows that contain a timestamp (+0800) so header lines are skipped,
# then force-create every listed PG in parallel
for i in $(ceph pg ls | grep +0800 | awk '{print $1}'); do
    ceph osd force-create-pg $i --yes-i-really-mean-it &
done

wait

8. rados bench test
rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 5 -p test-pool -b 204800 --show-time

test-case-1 : osd_op_num_shards_hdd = 5 / osd_op_num_threads_per_shard_hdd = 1 (default value)

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=5,concurrency="-t 5" 
* rados bench 
  * avg_lat=0.0272721s = 27.27ms
  * bw = 35.77 MBps

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 5 -p test-pool -b 204800 --show-time
2023-09-27T14:51:49.401628+0800 min lat: 0.00063724 max lat: 0.1238 avg lat: 0.0272496 lat p50: 0.0197331 lat p90: 0.0657348 lat p99: 0.104353 lat p999: 0.109506 lat p100: 0.1238
2023-09-27T14:51:49.401628+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T14:51:49.401628+0800    40       5      7336      7331   35.7908   33.5938  0.00264146   0.0272496
2023-09-27T14:51:50.401830+0800    41       5      7527      7522   35.8276   37.3047  0.00134314   0.0272309
2023-09-27T14:51:51.401970+0800    42       5      7728      7723   35.9091   39.2578  0.00109788   0.0271673
2023-09-27T14:51:52.402096+0800    43       5      7918      7913   35.9369   37.1094   0.0024957   0.0271583
2023-09-27T14:51:53.402254+0800    44       5      8108      8103   35.9634   37.1094   0.0418387   0.0271351
2023-09-27T14:51:54.402424+0800    45       5      8285      8280   35.9323   34.5703 0.000866966   0.0271543
2023-09-27T14:51:55.402541+0800    46       5      8464      8459   35.9111   34.9609 0.000913727   0.0271692
2023-09-27T14:51:56.402654+0800    47       5      8639      8634   35.8742   34.1797 0.000983625   0.0271867
2023-09-27T14:51:57.402782+0800    48       5      8825      8820   35.8836   36.3281   0.0219999   0.0271934
2023-09-27T14:51:58.402913+0800    49       5      9012      9007   35.8965   36.5234  0.00206822   0.0271933
2023-09-27T14:51:59.403064+0800    50       5      9183      9178   35.8465   33.3984   0.0914843   0.0271902
2023-09-27T14:52:00.403192+0800    51       5      9371      9366   35.8635   36.7188  0.00104644   0.0272126
2023-09-27T14:52:01.403348+0800    52       5      9555      9550   35.8648   35.9375  0.00179941   0.0272075
2023-09-27T14:52:02.403497+0800    53       5      9733      9728    35.844   34.7656   0.0218633   0.0272322
2023-09-27T14:52:03.403625+0800    54       5      9917      9912   35.8456   35.9375   0.0202004   0.0272326
2023-09-27T14:52:04.403778+0800    55       5     10101     10096   35.8472   35.9375  0.00126869   0.0272176
2023-09-27T14:52:05.403914+0800    56       5     10293     10288   35.8766      37.5  0.00117897   0.0271972
2023-09-27T14:52:06.404071+0800    57       5     10463     10458   35.8296   33.2031  0.00110588   0.0272368
2023-09-27T14:52:07.404198+0800    58       5     10639     10634   35.8044    34.375  0.00136368   0.0272497
2023-09-27T14:52:08.404305+0800    59       5     10826     10821   35.8166   36.5234    0.021253   0.0272467
2023-09-27T14:52:09.404413+0800 min lat: 0.000626773 max lat: 0.173957 avg lat: 0.0272559 lat p50: 0.0199041 lat p90: 0.0649641 lat p99: 0.104244 lat p999: 0.10962 lat p100: 0.173957
2023-09-27T14:52:09.404413+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T14:52:09.404413+0800    60       5     11005     11000   35.8022   34.9609   0.0443305   0.0272559
2023-09-27T14:52:10.404683+0800 Total time run:         60.0778
Total writes made:      11005
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     35.7772
Stddev Bandwidth:       1.69037
Max bandwidth (MB/sec): 39.2578
Min bandwidth (MB/sec): 31.6406
Average IOPS:           183
Stddev IOPS:            8.65471
Max IOPS:               201
Min IOPS:               162
Average Latency(s):     0.0272721
Stddev Latency(s):      0.0246501
Max latency(s):         0.173957
Min latency(s):         0.000626773
Latency P50(s):         0.0199112
Latency P90(s):         0.0649971
Latency P99(s):         0.10425
Latency P99.9(s):       0.10962
Latency P100(s):        0.173957

Every 2.0s: ceph daemon osd.0 dump_op_pq_state | grep scheduler
zjw-q-dev: Wed Sep 27 15:41:37 2023
            "scheduler": 0
            "scheduler": 0
            "scheduler": 2
            "scheduler": 3
            "scheduler": 0

Every 2.0s: ceph daemon osd.0 dump_op_pq_state | grep scheduler
zjw-q-dev: Wed Sep 27 15:41:49 2023
            "scheduler": 5
            "scheduler": 0
            "scheduler": 0
            "scheduler": 0
            "scheduler": 0

Every 2.0s: ceph daemon osd.0 dump_op_pq_state | grep scheduler
zjw-q-dev: Wed Sep 27 15:41:58 2023
            "scheduler": 0
            "scheduler": 0
            "scheduler": 1
            "scheduler": 4
            "scheduler": 0

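The fluctuating "scheduler" counts above show the micro-scale imbalance described in the problem section. As a toy illustration only (my own sketch, not Ceph's actual op-to-shard mapping; it simply keys a random pgid modulo the shard count), the following shows how a burst of 5 in-flight ops lands unevenly on 5 shard queues at any given instant:

# Toy illustration of micro-scale shard imbalance (not Ceph code): distribute
# a burst of in-flight ops to shard queues keyed by pgid.
import random
from collections import Counter

NUM_SHARDS = 5
PG_NUM = 128

for _ in range(3):  # three snapshots of 5 in-flight ops
    burst = [random.randrange(PG_NUM) for _ in range(5)]
    depths = Counter(pgid % NUM_SHARDS for pgid in burst)
    print([depths.get(s, 0) for s in range(NUM_SHARDS)])

Over time the placement averages out, but with only 5 ops in flight the instantaneous queue depths are almost always uneven, which matches the watch output above.
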
test-case-2 : osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5 (new value)

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1,num_shards_thread=5, concurrency= "-t 5" 
* rados bench
  * avg_lat = 0.0209406s = 20.94 ms
  * bw = 46.61 MBps

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 5 -p test-pool -b 204800 --show-time
2023-09-27T17:11:02.876277+0800 min lat: 0.00288863 max lat: 0.0895693 avg lat: 0.0209096 lat p50: 0.0180092 lat p90: 0.0212207 lat p99: 0.0219432 lat p999: 0.071195 lat p100: 0.0895693
2023-09-27T17:11:02.876277+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:11:02.876277+0800    40       5      9566      9561   46.6765    46.875   0.0209049   0.0209096
2023-09-27T17:11:03.876505+0800    41       5      9806      9801   46.6811    46.875   0.0209343   0.0209077
2023-09-27T17:11:04.876666+0800    42       5     10046     10041   46.6855    46.875    0.020779   0.0209057
2023-09-27T17:11:05.876832+0800    43       5     10286     10281   46.6897    46.875   0.0209019   0.0209038
2023-09-27T17:11:06.877002+0800    44       5     10526     10521   46.6938    46.875   0.0209936   0.0209021
2023-09-27T17:11:07.877171+0800    45       5     10766     10761   46.6976    46.875   0.0207428   0.0209004
2023-09-27T17:11:08.877355+0800    46       5     11006     11001   46.7013    46.875   0.0208264   0.0208988
2023-09-27T17:11:09.877532+0800    47       5     11246     11241   46.7048    46.875   0.0208299   0.0208972
2023-09-27T17:11:10.877715+0800    48       5     11486     11481   46.7082    46.875   0.0207058   0.0208957
2023-09-27T17:11:11.877908+0800    49       4     11726     11722   46.7154   47.0703   0.0209205   0.0208942
2023-09-27T17:11:12.878075+0800    50       5     11967     11962   46.7184    46.875    0.020859   0.0208928
2023-09-27T17:11:13.878280+0800    51       5     12207     12202   46.7213    46.875   0.0207391   0.0208915
2023-09-27T17:11:14.878477+0800    52       5     12406     12401   46.5701   38.8672   0.0207016   0.0209585
2023-09-27T17:11:15.878683+0800    53       5     12646     12641   46.5757    46.875   0.0207462    0.020956
2023-09-27T17:11:16.878854+0800    54       5     12886     12881   46.5811    46.875   0.0208307   0.0209536
2023-09-27T17:11:17.879013+0800    55       5     13126     13121   46.5863    46.875   0.0208078   0.0209513
2023-09-27T17:11:18.879170+0800    56       5     13366     13361   46.5913    46.875   0.0207605    0.020949
2023-09-27T17:11:19.879299+0800    57       5     13606     13601   46.5962    46.875   0.0208917   0.0209468
2023-09-27T17:11:20.879458+0800    58       5     13846     13841   46.6009    46.875   0.0208574   0.0209447
2023-09-27T17:11:21.879600+0800    59       5     14086     14081   46.6054    46.875   0.0208224   0.0209426
2023-09-27T17:11:22.879773+0800 min lat: 0.00288863 max lat: 0.20594 avg lat: 0.0209407 lat p50: 0.0180095 lat p90: 0.0212212 lat p99: 0.0219438 lat p999: 0.07239 lat p100: 0.20594
2023-09-27T17:11:22.879773+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:11:22.879773+0800    60       2     14324     14322    46.613   47.0703   0.0208395   0.0209407
2023-09-27T17:11:23.880038+0800 Total time run:         60.0186
Total writes made:      14324
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     46.6132
Stddev Bandwidth:       1.2887
Max bandwidth (MB/sec): 47.0703
Min bandwidth (MB/sec): 38.8672
Average IOPS:           238
Stddev IOPS:            6.59815
Max IOPS:               241
Min IOPS:               199
Average Latency(s):     0.0209406
Stddev Latency(s):      0.00376967
Max latency(s):         0.20594
Min latency(s):         0.00288863
Latency P50(s):         0.0180095
Latency P90(s):         0.0212212
Latency P99(s):         0.0219438
Latency P99.9(s):       0.07238
Latency P100(s):        0.20594

Actions #1

Updated by jianwei zhang 7 months ago

  • solution:
      * Reduce the number of shard queues, osd_op_num_shards_[hdd/ssd], to 1,
      * and increase osd_op_num_threads_per_shard_[hdd/ssd] correspondingly (5 shards x 1 thread becomes 1 shard x 5 threads, keeping the total thread count the same).
    
  • PR https://github.com/ceph/ceph/pull/53708
    commit 75f6c8b327444fa0d549f27e7ee2d63df19df543 (HEAD -> osd_op_num_shard_for_mclock)
    Author: zhangjianwei <zhangjianwei2_yewu@cmss.chinamobile.com>
    Date:   Thu Sep 28 13:54:23 2023 +0800
    
        perf: optimize osd_op_num_shard and thread to reduce op latency
    
        rados bench 60 write -t 5 -p test-pool -b 204800 --show-time
    
        case-1: res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=5, thread=1
          - res_bw = 48MBps / lim_bw = 48MBps
          - op delay latency = 1M / 48 MBps = 0.02083s = 20.83ms,
          - bw : 1000ms/20.83ms * 200K / 1024K * 5(shard) = 46.88MBps
          - avg_lat=0.0272721s = 27.27ms
            - it's op avg_lat higher than expected(6.44/20.83=0.3091) 30.91%
          - bw = 35.77 MBps
            - it's bw lower than expected (11.11/46.88=0.2369) 23.69%
        case-2: res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1, thread=5
          - res_bw = 240 MBps / lim_bw = 240 MBps
          - op delay latency = 1M / 240 MBps = 0.00416 s = 4.16 ms
            - concurrency=5, 5 * 4.166 = 20.83 ms
          - bw : 1000 / 4.16 * 200 / 1024 = 46.95 MBps
          - avg_lat = 0.0209406s = 20.94 ms
            - very close
          - bw = 46.61 MBps
            - very close
        reason:
        - since the shard queue the op enters is based on
          - the random hash of the pgid where op.oid is located,
          - absolute balance cannot be achieved.
        - On a macro scale, the number of ops processed by each shard queue
          - is basically balanced,
          - and there will be no order of magnitude difference.
        - But on a micro scale, at the same time,
          - some shard queues have a small number of ops accumulated,
          - some shard queues have a large number of ops accumulated,
          - which will directly cause op latency.
    
        issue: https://tracker.ceph.com/issues/63014
    
        co-author: yanghonggang <yanghonggang_yewu@cmss.chinamobile.com>
        Signed-off-by: zhangjianwei <zhangjianwei2_yewu@cmss.chinamobile.com>
    

Please review.

Actions #2

Updated by jianwei zhang 7 months ago

test-case-3 : osd_op_num_shards_hdd = 5 / osd_op_num_threads_per_shard_hdd = 1 (default value), rados bench -t 10

* res=1/lim=1/wgt=60/bw=240M/iops=240, num_shards=5, concurrency="-t 10" 
* single osd shard-queue:
  * op_count = 10 / 5 = 2
  * op latency = 20.83 * 2 = 41.66 ms (see the sketch after this list)
* rados bench 
  * avg_lat = 0.04882 s = 48.82 ms  // 48.82 - 41.66 = 7.16 ms, 7.16/41.66 = 0.1718 = 17.18% higher latency
  * bw = 39.96 MBps  // 46.88 - 39.96 = 6.92 MBps, 6.92/46.88 = 0.1476 = 14.76% lower bw
    * Even with a concurrency of 10, the expected bandwidth is still not reached.
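
As a generalization of this arithmetic (a sketch under the same simplifying assumptions as the one in the description, not OSD code), the expected queue latency grows with the number of ops that pile up in each shard queue:

# Expected mclock queue latency for t concurrent 200K ops spread over
# num_shards shard queues, each dequeuing one 1 MB-cost op per interval.
import math

def expected_latency_ms(t, num_shards, bw_mb=240):
    per_shard_interval_ms = 1.0 / (bw_mb / num_shards) * 1000  # 1 MB cost / shard bandwidth
    return per_shard_interval_ms * math.ceil(t / num_shards)

print(expected_latency_ms(t=10, num_shards=5))   # ~41.7 ms (this test case)
print(expected_latency_ms(t=100, num_shards=5))  # ~416.7 ms (test-case-4 below)
print(expected_latency_ms(t=100, num_shards=1))  # ~416.7 ms (test-case-9 below)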

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 10 -p test-pool -b 204800 --show-time
2023-09-27T17:02:00.149474+0800 min lat: 0.000764786 max lat: 0.208813 avg lat: 0.049061 lat p50: 0.0401827 lat p90: 0.106608 lat p99: 0.201124 lat p999: 0.208813 lat p100: 0.208813
2023-09-27T17:02:00.149474+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:02:00.149474+0800    40      10      8156      8146   39.7691   36.9141   0.0624894    0.049061
2023-09-27T17:02:01.149679+0800    41      10      8349      8339   39.7183   37.6953   0.0637159   0.0491289
2023-09-27T17:02:02.149846+0800    42      10      8554      8544   39.7258   40.0391  0.00269757   0.0491222
2023-09-27T17:02:03.150020+0800    43      10      8765      8755   39.7601   41.2109   0.0230551   0.0490833
2023-09-27T17:02:04.150205+0800    44      10      8981      8971   39.8151   42.1875   0.0832663   0.0490166
2023-09-27T17:02:05.150336+0800    45      10      9185      9175   39.8157   39.8438   0.0287703   0.0490149
2023-09-27T17:02:06.150510+0800    46      10      9399      9389   39.8586   41.7969   0.0228984   0.0489489
2023-09-27T17:02:07.150667+0800    47      10      9605      9595   39.8664   40.2344   0.0429812   0.0489387
2023-09-27T17:02:08.150836+0800    48      10      9793      9783   39.8007   36.7188   0.0413746    0.049001
2023-09-27T17:02:09.151020+0800    49      10     10001      9991   39.8174    40.625   0.0614177   0.0490218
2023-09-27T17:02:10.151161+0800    50      10     10200     10190   39.7983   38.8672  0.00135033   0.0490209
2023-09-27T17:02:11.151322+0800    51      10     10414     10404   39.8373   41.7969  0.00106855   0.0489749
2023-09-27T17:02:12.151499+0800    52      10     10622     10612   39.8523    40.625   0.0237813   0.0489732
2023-09-27T17:02:13.151668+0800    53      10     10839     10829      39.9   42.3828  0.00225313   0.0489165
2023-09-27T17:02:14.151790+0800    54      10     11046     11036   39.9097   40.4297   0.0012074   0.0488299
2023-09-27T17:02:15.151928+0800    55      10     11256     11246   39.9297   41.0156   0.0219607   0.0488638
2023-09-27T17:02:16.152091+0800    56      10     11461     11451   39.9315   40.0391   0.0433643   0.0488833
2023-09-27T17:02:17.152244+0800    57      10     11684     11674    39.995   43.5547   0.0417148   0.0487993
2023-09-27T17:02:18.152407+0800    58      10     11897     11887   40.0225   41.6016   0.0634135   0.0487619
2023-09-27T17:02:19.152597+0800    59      10     12081     12071   39.9532   35.9375   0.0620354   0.0488264
2023-09-27T17:02:20.152724+0800 min lat: 0.000724906 max lat: 0.239877 avg lat: 0.0488142 lat p50: 0.0400258 lat p90: 0.106086 lat p99: 0.200645 lat p999: 0.239877 lat p100: 0.239877
2023-09-27T17:02:20.152724+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:02:20.152724+0800    60      10     12294     12284   39.9806   41.6016  0.00282966   0.0488142
2023-09-27T17:02:21.153042+0800 Total time run:         60.0745
Total writes made:      12294
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     39.9699
Stddev Bandwidth:       1.89358
Max bandwidth (MB/sec): 43.75
Min bandwidth (MB/sec): 35.3516
Average IOPS:           204
Stddev IOPS:            9.69513
Max IOPS:               224
Min IOPS:               181
Average Latency(s):     0.0488245
Stddev Latency(s):      0.0410119
Max latency(s):         0.239877
Min latency(s):         0.000724906
Latency P50(s):         0.0400488
Latency P90(s):         0.106076
Latency P99(s):         0.200604
Latency P99.9(s):       0.239877
Latency P100(s):        0.239877

test-case-4 : osd_op_num_shards_hdd = 5 / osd_op_num_threads_per_shard_hdd = 1 (default value), rados bench -t 100

* res=1/lim=1/wgt=60/bw=240M/iops=240, num_shards=5, concurrency="-t 100" 
* single osd shard-queue:
  * op_count = 100 / 5 = 20
  * op latency = 20.83 * 20 = 416.6 ms
* rados bench 
  * avg_lat = 0.424939 s = 424.93 ms  // 424.93 - 416.6 = 8.33 ms, 8.33/416.6 = 0.0200 = 2.00% higher latency
  * bw = 45.5307 MBps  // 46.88 - 45.53 = 1.35 MBps, 1.35/46.88 = 0.0288 = 2.88% lower bw

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 100 -p test-pool -b 204800 --show-time
2023-09-27T17:06:43.480487+0800 min lat: 0.000751464 max lat: 1.31278 avg lat: 0.420266 lat p50: 0.357611 lat p90: 0.966935 lat p99: 1.19508 lat p999: 1.31278 lat p100: 1.31278
2023-09-27T17:06:43.480487+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:06:43.480487+0800    40     100      9530      9430   46.0369   46.4844    0.894509    0.420266
2023-09-27T17:06:44.480744+0800    41     100      9728      9628    45.857   38.6719    0.936132    0.420333
2023-09-27T17:06:45.480924+0800    42     100      9982      9882   45.9462   49.6094     1.12484    0.421395
2023-09-27T17:06:46.481067+0800    43     100     10216     10116   45.9404   45.7031   0.0209113    0.421793
2023-09-27T17:06:47.481284+0800    44     100     10448     10348   45.9259   45.3125    0.502089    0.422033
2023-09-27T17:06:48.481460+0800    45     100     10688     10588   45.9468    46.875    0.499153    0.422226
2023-09-27T17:06:49.481650+0800    46     100     10928     10828   45.9668    46.875    0.330506    0.422106
2023-09-27T17:06:50.481815+0800    47      99     11168     11069   45.9901   47.0703    0.165617     0.42167
2023-09-27T17:06:51.481989+0800    48     100     11404     11304    45.988   45.8984     0.16773    0.421349
2023-09-27T17:06:52.482162+0800    49     100     11636     11536   45.9741   45.3125   0.0415597    0.420838
2023-09-27T17:06:53.482342+0800    50     100     11871     11771   45.9724   45.8984   0.0010504    0.421411
2023-09-27T17:06:54.482524+0800    51     100     12105     12005   45.9669   45.7031    0.102835    0.421473
2023-09-27T17:06:55.482710+0800    52     100     12335     12235   45.9467   44.9219    0.646227    0.422253
2023-09-27T17:06:56.482908+0800    53     100     12568     12468   45.9382   45.5078    0.310422    0.422572
2023-09-27T17:06:57.483084+0800    54     100     12800     12700   45.9265   45.3125     0.22921    0.422627
2023-09-27T17:06:58.483210+0800    55     100     13028     12928    45.901   44.5312   0.0223929    0.422707
2023-09-27T17:06:59.483411+0800    56     100     13216     13116   45.7369   36.7188    0.228842    0.423003
2023-09-27T17:07:00.483609+0800    57     100     13491     13391   45.8766   53.7109    0.395777    0.423745
2023-09-27T17:07:01.483771+0800    58     100     13731     13631   45.8937    46.875    0.229148    0.423636
2023-09-27T17:07:02.483932+0800    59     100     13971     13871   45.9102    46.875    0.283619    0.423531
2023-09-27T17:07:03.484095+0800 min lat: 0.000751464 max lat: 1.31278 avg lat: 0.4236 lat p50: 0.352323 lat p90: 0.981818 lat p99: 1.19678 lat p999: 1.31278 lat p100: 1.31278
2023-09-27T17:07:03.484095+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:07:03.484095+0800    60      96     14195     14099   45.8871   44.5312    0.187543      0.4236
2023-09-27T17:07:04.484967+0800 Total time run:         60.8921
Total writes made:      14195
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     45.5307
Stddev Bandwidth:       2.18691
Max bandwidth (MB/sec): 53.7109
Min bandwidth (MB/sec): 36.7188
Average IOPS:           233
Stddev IOPS:            11.197
Max IOPS:               275
Min IOPS:               188
Average Latency(s):     0.424939
Stddev Latency(s):      0.321742
Max latency(s):         1.31278
Min latency(s):         0.000751464
Latency P50(s):         0.354194
Latency P90(s):         0.980933
Latency P99(s):         1.19663
Latency P99.9(s):       1.31278
Latency P100(s):        1.31278

Only in high-concurrency scenarios can the expected bandwidth basically be reached, because the imbalance of the op-to-shard-queue mapping is smoothed out.

Actions #3

Updated by jianwei zhang 7 months ago

test-case-5 : osd_op_num_shards_hdd = 5 / osd_op_num_threads_per_shard_hdd = 1 (default value), rados bench -t 1

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=5,concurrency="-t 1" 
* single osd shard-queue:
  * op_count = 1
  * op latency = 20.83 * 1 = 20.83 ms
* rados bench 
  * avg_lat = 0.0087478 s = 8.74 ms
    * 8.74 - 4.16 = 4.58 ms, 4.58 / 4.16 = 1.100 = 110% higher than the 4.16 ms single-queue expectation
    * 20.83 - 8.74 = 12.09 ms, 12.09 / 20.83 = 0.5804 = 58.04% lower than the 20.83 ms per-shard interval
    * Since the five shard mclock queues work independently, ops interleave across them:
      * when consecutive ops land in the same shard queue, the scheduling interval is 20.83 ms;
      * when an op lands in a different shard queue than the previous op, the scheduling interval may be ~0 ms,
        * because that queue has no preceding op.
  * bw = 22.3079 MBps  // 46.88 - 22.30 = 24.58 MBps, 24.58/46.88 = 0.5243 = 52.43% lower bw

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 1 -p test-pool -b 204800 --show-time
2023-09-27T10:11:38.064485+0800 min lat: 0.000849524 max lat: 0.021689 avg lat: 0.00880033 lat p50: 0.00179681 lat p90: 0.0200112 lat p99: 0.021689 lat p999: 0.021689 lat p100: 0.021689
2023-09-27T10:11:38.064485+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T10:11:38.064485+0800    40       1      4543      4542   22.1746   19.5312   0.0180289  0.00880033
2023-09-27T10:11:39.064615+0800    41       1      4657      4656   22.1767   22.2656   0.0208662  0.00879872
2023-09-27T10:11:40.064712+0800    42       1      4765      4764   22.1509   21.0938  0.00131755  0.00880755
2023-09-27T10:11:41.064816+0800    43       1      4878      4877    22.149   22.0703  0.00117934  0.00880899
2023-09-27T10:11:42.064939+0800    44       1      4988      4987   22.1338   21.4844  0.00144502  0.00881508
2023-09-27T10:11:43.065091+0800    45       1      5109      5108    22.167   23.6328  0.00141641  0.00880296
2023-09-27T10:11:44.065221+0800    46       1      5226      5225   22.1819   22.8516  0.00110214  0.00879735
2023-09-27T10:11:45.065403+0800    47       1      5338      5337   22.1752    21.875   0.0014958  0.00879824
2023-09-27T10:11:46.065543+0800    48       1      5453      5452   22.1811   22.4609  0.00118351   0.0087973
2023-09-27T10:11:47.065701+0800    49       1      5567      5566   22.1828   22.2656  0.00144899  0.00879675
2023-09-27T10:11:48.065868+0800    50       1      5683      5682   22.1922   22.6562   0.0207746  0.00879325
2023-09-27T10:11:49.066020+0800    51       1      5795      5794   22.1859    21.875  0.00138659  0.00879438
2023-09-27T10:11:50.066157+0800    52       1      5899      5898   22.1498   20.3125  0.00127605  0.00880794
2023-09-27T10:11:51.066309+0800    53       1      6021      6020   22.1814   23.8281  0.00139287  0.00879587
2023-09-27T10:11:52.066499+0800    54       1      6129      6128   22.1612   21.0938   0.0013527  0.00880495
2023-09-27T10:11:53.066651+0800    55       1      6259      6258   22.2198   25.3906  0.00127788  0.00878224
2023-09-27T10:11:54.066791+0800    56       1      6372      6371   22.2171   22.0703  0.00120844   0.0087809
2023-09-27T10:11:55.066926+0800    57       1      6490      6489   22.2316   23.0469  0.00106357  0.00877779
2023-09-27T10:11:56.067087+0800    58       1      6608      6607   22.2456   23.0469  0.00213832   0.0087692
2023-09-27T10:11:57.067236+0800    59       1      6739      6738   22.3022   25.5859 0.000968881  0.00874903
2023-09-27T10:11:58.067434+0800 Total time run:         60
Total writes made:      6853
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     22.3079
Stddev Bandwidth:       1.38429
Max bandwidth (MB/sec): 25.9766
Min bandwidth (MB/sec): 19.5312
Average IOPS:           114
Stddev IOPS:            7.08758
Max IOPS:               133
Min IOPS:               100
Average Latency(s):     0.0087478
Stddev Latency(s):      0.00904938
Max latency(s):         0.0281679
Min latency(s):         0.000779053
Latency P50(s):         0.00178837
Latency P90(s):         0.0200079
Latency P99(s):         0.0218087
Latency P99.9(s):       0.0219888
Latency P100(s):        0.0281679

In low-concurrency scenarios, op latency deviates from the expected result.

Actions #4

Updated by jianwei zhang 7 months ago

test-case-6 : osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5 (new value), rados bench -t 1

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1,thread=5, concurrency="-t 1" 
* single osd shard-queue:
  * op_count = 1
  * op latency = 4.16 * 1 = 4.16 ms
* rados bench 
  * avg_lat =  0.00418212 = 4.18ms  //achieve expected results
  * bw = 46.631 MBps. //achieve expected results

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 1 -p test-pool -b 204800 --show-time
2023-09-28T08:59:08.215753+0800 min lat: 0.00096535 max lat: 0.0607635 avg lat: 0.00417728 lat p50: 0.00367809 lat p90: 0.00433169 lat p99: 0.00630797 lat p999: 0.00750567 lat p100: 0.0607635
2023-09-28T08:59:08.215753+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-28T08:59:08.215753+0800    40       1      9563      9562   46.6828    46.875  0.00417911  0.00417728
2023-09-28T08:59:09.215869+0800    41       1      9803      9802   46.6873    46.875  0.00425973  0.00417689
2023-09-28T08:59:10.216018+0800    42       1     10043     10042   46.6916    46.875  0.00410263  0.00417645
2023-09-28T08:59:11.216156+0800    43       1     10283     10282   46.6958    46.875   0.0040753  0.00417606
2023-09-28T08:59:12.216302+0800    44       1     10523     10522   46.6997    46.875  0.00404351  0.00417568
2023-09-28T08:59:13.216419+0800    45       1     10763     10762   46.7034    46.875  0.00418563  0.00417532
2023-09-28T08:59:14.216533+0800    46       1     11003     11002   46.7071    46.875  0.00416281    0.004175
2023-09-28T08:59:15.216687+0800    47       1     11208     11207   46.5651   40.0391  0.00419329  0.00418775
2023-09-28T08:59:16.216845+0800    48       1     11448     11447   46.5714    46.875  0.00429851  0.00418716
2023-09-28T08:59:17.216964+0800    49       1     11688     11687   46.5774    46.875  0.00423865  0.00418661
2023-09-28T08:59:18.217107+0800    50       1     11928     11927   46.5833    46.875  0.00413176  0.00418606
2023-09-28T08:59:19.217263+0800    51       1     12168     12167   46.5888    46.875  0.00392387  0.00418556
2023-09-28T08:59:20.217403+0800    52       1     12408     12407   46.5942    46.875  0.00402929  0.00418525
2023-09-28T08:59:21.217546+0800    53       1     12648     12647   46.5994    46.875  0.00393544  0.00418478
2023-09-28T08:59:22.217688+0800    54       1     12888     12887   46.6044    46.875   0.0041457  0.00418435
2023-09-28T08:59:23.217825+0800    55       1     13127     13126   46.6056   46.6797  0.00405358  0.00418405
2023-09-28T08:59:24.217955+0800    56       1     13367     13366   46.6103    46.875  0.00423642  0.00418365
2023-09-28T08:59:25.218104+0800    57       1     13607     13606   46.6148    46.875  0.00431051  0.00418328
2023-09-28T08:59:26.218206+0800    58       1     13848     13847   46.6226   47.0703  0.00404745  0.00418286
2023-09-28T08:59:27.218326+0800    59       1     14088     14087   46.6268    46.875  0.00411271  0.00418247
2023-09-28T08:59:28.218550+0800 Total time run:         60.004
Total writes made:      14326
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     46.631
Stddev Bandwidth:       1.14194
Max bandwidth (MB/sec): 47.0703
Min bandwidth (MB/sec): 40.0391
Average IOPS:           238
Stddev IOPS:            5.84672
Max IOPS:               241
Min IOPS:               205
Average Latency(s):     0.00418212
Stddev Latency(s):      0.00116668
Max latency(s):         0.0817517
Min latency(s):         0.00096334
Latency P50(s):         0.00367646
Latency P90(s):         0.00429972
Latency P99(s):         0.0063091
Latency P99.9(s):       0.00796828
Latency P100(s):        0.0817517

A concurrency of 1 achieves the expected bandwidth and latency.

Actions #5

Updated by jianwei zhang 7 months ago

test-case-7 : osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5 (new value), rados bench -t 5

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1,thread=5, concurrency="-t 5" 
* single osd shard-queue:
  * op_count = 5
  * op latency = 4.16 * 5 = 20.83 ms
* rados bench 
  * avg_lat =  0.0209406 = 20.94ms  //achieve expected results
  * bw = 46.6132 MBps. //achieve expected results

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 5 -p test-pool -b 204800 --show-time
2023-09-27T17:11:02.876277+0800 min lat: 0.00288863 max lat: 0.0895693 avg lat: 0.0209096 lat p50: 0.0180092 lat p90: 0.0212207 lat p99: 0.0219432 lat p999: 0.071195 lat p100: 0.0895693
2023-09-27T17:11:02.876277+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:11:02.876277+0800    40       5      9566      9561   46.6765    46.875   0.0209049   0.0209096
2023-09-27T17:11:03.876505+0800    41       5      9806      9801   46.6811    46.875   0.0209343   0.0209077
2023-09-27T17:11:04.876666+0800    42       5     10046     10041   46.6855    46.875    0.020779   0.0209057
2023-09-27T17:11:05.876832+0800    43       5     10286     10281   46.6897    46.875   0.0209019   0.0209038
2023-09-27T17:11:06.877002+0800    44       5     10526     10521   46.6938    46.875   0.0209936   0.0209021
2023-09-27T17:11:07.877171+0800    45       5     10766     10761   46.6976    46.875   0.0207428   0.0209004
2023-09-27T17:11:08.877355+0800    46       5     11006     11001   46.7013    46.875   0.0208264   0.0208988
2023-09-27T17:11:09.877532+0800    47       5     11246     11241   46.7048    46.875   0.0208299   0.0208972
2023-09-27T17:11:10.877715+0800    48       5     11486     11481   46.7082    46.875   0.0207058   0.0208957
2023-09-27T17:11:11.877908+0800    49       4     11726     11722   46.7154   47.0703   0.0209205   0.0208942
2023-09-27T17:11:12.878075+0800    50       5     11967     11962   46.7184    46.875    0.020859   0.0208928
2023-09-27T17:11:13.878280+0800    51       5     12207     12202   46.7213    46.875   0.0207391   0.0208915
2023-09-27T17:11:14.878477+0800    52       5     12406     12401   46.5701   38.8672   0.0207016   0.0209585
2023-09-27T17:11:15.878683+0800    53       5     12646     12641   46.5757    46.875   0.0207462    0.020956
2023-09-27T17:11:16.878854+0800    54       5     12886     12881   46.5811    46.875   0.0208307   0.0209536
2023-09-27T17:11:17.879013+0800    55       5     13126     13121   46.5863    46.875   0.0208078   0.0209513
2023-09-27T17:11:18.879170+0800    56       5     13366     13361   46.5913    46.875   0.0207605    0.020949
2023-09-27T17:11:19.879299+0800    57       5     13606     13601   46.5962    46.875   0.0208917   0.0209468
2023-09-27T17:11:20.879458+0800    58       5     13846     13841   46.6009    46.875   0.0208574   0.0209447
2023-09-27T17:11:21.879600+0800    59       5     14086     14081   46.6054    46.875   0.0208224   0.0209426
2023-09-27T17:11:22.879773+0800 min lat: 0.00288863 max lat: 0.20594 avg lat: 0.0209407 lat p50: 0.0180095 lat p90: 0.0212212 lat p99: 0.0219438 lat p999: 0.07239 lat p100: 0.20594
2023-09-27T17:11:22.879773+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-27T17:11:22.879773+0800    60       2     14324     14322    46.613   47.0703   0.0208395   0.0209407
2023-09-27T17:11:23.880038+0800 Total time run:         60.0186
Total writes made:      14324
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     46.6132
Stddev Bandwidth:       1.2887
Max bandwidth (MB/sec): 47.0703
Min bandwidth (MB/sec): 38.8672
Average IOPS:           238
Stddev IOPS:            6.59815
Max IOPS:               241
Min IOPS:               199
Average Latency(s):     0.0209406
Stddev Latency(s):      0.00376967
Max latency(s):         0.20594
Min latency(s):         0.00288863
Latency P50(s):         0.0180095
Latency P90(s):         0.0212212
Latency P99(s):         0.0219438
Latency P99.9(s):       0.07238
Latency P100(s):        0.20594

A concurrency of 5 achieves the expected bandwidth and latency.

Actions #6

Updated by jianwei zhang 7 months ago

test-case-8 : osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5 (new value), rados bench -t 10

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1,thread=5, concurrency="-t 10" 
* single osd shard-queue:
  * op_count = 10
  * op latency = 4.16 * 10 = 41.6 ms
* rados bench 
  * avg_lat = 0.0418179 = 41.81 ms  //achieve expected results
  * bw = 46.6831  MBps. //achieve expected results

# rados -c ./ceph.conf --osd_client_op_priority=63 bench 60 write --no-cleanup -t 10 -p test-pool -b 204800 --show-time
2023-09-28T14:52:08.526529+0800 min lat: 0.00465967 max lat: 0.148879 avg lat: 0.0417454 lat p50: 0.0410025 lat p90: 0.0474422 lat p99: 0.0488911 lat p999: 0.0982306 lat p100: 0.148879
2023-09-28T14:52:08.526529+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-28T14:52:08.526529+0800    40      10      9587      9577   46.7556    46.875   0.0416502   0.0417454
2023-09-28T14:52:09.526675+0800    41      10      9827      9817   46.7584    46.875   0.0415091   0.0417433
2023-09-28T14:52:10.526820+0800    42      10     10067     10057    46.761    46.875   0.0419229   0.0417414
2023-09-28T14:52:11.526933+0800    43      10     10307     10297   46.7635    46.875    0.041696   0.0417396
2023-09-28T14:52:12.527092+0800    44      10     10547     10537   46.7659    46.875   0.0415556   0.0417377
2023-09-28T14:52:13.527243+0800    45      10     10787     10777   46.7681    46.875   0.0416958   0.0417359
2023-09-28T14:52:14.527397+0800    46      10     10992     10982   46.6217   40.0391   0.0416719   0.0418662
2023-09-28T14:52:15.527558+0800    47      10     11232     11222   46.6269    46.875   0.0416568   0.0418618
2023-09-28T14:52:16.527700+0800    48      10     11472     11462    46.632    46.875   0.0414449   0.0418575
2023-09-28T14:52:17.527858+0800    49      10     11712     11702   46.6368    46.875   0.0418099   0.0418534
2023-09-28T14:52:18.527999+0800    50      10     11952     11942   46.6414    46.875   0.0416113   0.0418496
2023-09-28T14:52:19.528161+0800    51      10     12192     12182   46.6458    46.875   0.0414765   0.0418458
2023-09-28T14:52:20.528274+0800    52      10     12432     12422   46.6501    46.875   0.0417257   0.0418422
2023-09-28T14:52:21.528438+0800    53      10     12672     12662   46.6542    46.875   0.0414387   0.0418387
2023-09-28T14:52:22.528635+0800    54      10     12913     12903   46.6618   47.0703   0.0415033   0.0418354
2023-09-28T14:52:23.528766+0800    55      10     13153     13143   46.6655    46.875   0.0416341   0.0418322
2023-09-28T14:52:24.528881+0800    56      10     13393     13383   46.6692    46.875    0.041721   0.0418291
2023-09-28T14:52:25.529022+0800    57      10     13633     13623   46.6727    46.875   0.0416514   0.0418261
2023-09-28T14:52:26.529151+0800    58      10     13873     13863   46.6761    46.875   0.0416045   0.0418233
2023-09-28T14:52:27.529268+0800    59      10     14113     14103   46.6793    46.875   0.0416218   0.0418205
2023-09-28T14:52:28.529392+0800 min lat: 0.00465967 max lat: 0.223459 avg lat: 0.0418179 lat p50: 0.0410028 lat p90: 0.0474373 lat p99: 0.0488851 lat p999: 0.11657 lat p100: 0.223459
2023-09-28T14:52:28.529392+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-28T14:52:28.529392+0800    60       8     14351     14343   46.6825    46.875   0.0418207   0.0418179
2023-09-28T14:52:29.529668+0800 Total time run:         60.0416
Total writes made:      14351
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     46.6831
Stddev Bandwidth:       1.03
Max bandwidth (MB/sec): 47.0703
Min bandwidth (MB/sec): 40.0391
Average IOPS:           239
Stddev IOPS:            5.27362
Max IOPS:               241
Min IOPS:               205
Average Latency(s):     0.0418179
Stddev Latency(s):      0.00510051
Max latency(s):         0.223459
Min latency(s):         0.00465967
Latency P50(s):         0.0410028
Latency P90(s):         0.0474373
Latency P99(s):         0.0488851
Latency P99.9(s):       0.11649
Latency P100(s):        0.223459

A concurrency of 10 achieves the expected bandwidth and latency.

Actions #7

Updated by jianwei zhang 7 months ago

test-case-9 : osd_op_num_shards_hdd = 1 / osd_op_num_threads_per_shard_hdd = 5 (new value), rados bench -t 100

* res=1/lim=1/wgt=60/bw=240M/iops=240 ,num_shards=1,thread=5, concurrency="-t 100" 
* single osd shard-queue:
  * op_count = 100
  * op latency = 4.16 * 100 = 416 ms
* rados bench 
  * avg_lat = 0.415216 = 415.21 ms  //achieve expected results
  * bw = 46.877  MBps. //achieve expected results

A concurrency of 100 achieves the expected bandwidth and latency:
2023-09-28T14:55:21.279701+0800 min lat: 0.0335749 max lat: 0.536941 avg lat: 0.414483 lat p50: 0.463869 lat p90: 0.536941 lat p99: 0.536941 lat p999: 0.536941 lat p100: 0.536941
2023-09-28T14:55:21.279701+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-28T14:55:21.279701+0800    40     100      9702      9602   46.8773    46.875    0.416707    0.414483
2023-09-28T14:55:22.279913+0800    41     100      9942      9842    46.877    46.875    0.416608    0.414536
2023-09-28T14:55:23.280089+0800    42     100     10182     10082   46.8767    46.875    0.416621    0.414587
2023-09-28T14:55:24.280271+0800    43     100     10422     10322   46.8765    46.875    0.416774    0.414634
2023-09-28T14:55:25.280406+0800    44     100     10662     10562   46.8763    46.875    0.416772    0.414681
2023-09-28T14:55:26.280541+0800    45     100     10902     10802   46.8761    46.875    0.416772    0.414725
2023-09-28T14:55:27.280684+0800    46     100     11142     11042    46.876    46.875    0.416756    0.414766
2023-09-28T14:55:28.280867+0800    47     100     11382     11282   46.8758    46.875    0.416577    0.414807
2023-09-28T14:55:29.281024+0800    48     100     11622     11522   46.8756    46.875    0.416715    0.414845
2023-09-28T14:55:30.281189+0800    49     100     11862     11762   46.8754    46.875    0.416584    0.414882
2023-09-28T14:55:31.281360+0800    50     100     12102     12002   46.8753    46.875    0.416817    0.414917
2023-09-28T14:55:32.281514+0800    51     100     12342     12242   46.8751    46.875     0.41655    0.414952
2023-09-28T14:55:33.281691+0800    52     100     12582     12482    46.875    46.875     0.41679    0.414983
2023-09-28T14:55:34.281855+0800    53     100     12822     12722   46.8748    46.875    0.416809    0.415015
2023-09-28T14:55:35.282014+0800    54     100     13062     12962   46.8747    46.875    0.416698    0.415046
2023-09-28T14:55:36.282185+0800    55     100     13302     13202   46.8745    46.875    0.416668    0.415075
2023-09-28T14:55:37.282358+0800    56     100     13542     13442   46.8744    46.875    0.416718    0.415104
2023-09-28T14:55:38.282536+0800    57     100     13782     13682   46.8743    46.875    0.416708     0.41513
2023-09-28T14:55:39.282704+0800    58     100     14022     13922   46.8741    46.875    0.416679    0.415157
2023-09-28T14:55:40.282846+0800    59     100     14262     14162    46.874    46.875    0.416556    0.415182
2023-09-28T14:55:41.283013+0800 min lat: 0.0335749 max lat: 0.592423 avg lat: 0.415207 lat p50: 0.464046 lat p90: 0.540895 lat p99: 0.558186 lat p999: 0.559915 lat p100: 0.592423
2023-09-28T14:55:41.283013+0800   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
2023-09-28T14:55:41.283013+0800    60      98     14500     14402   46.8739    46.875    0.416469    0.415207
2023-09-28T14:55:42.283323+0800 Total time run:         60.414
Total writes made:      14500
Write size:             204800
Object size:            204800
Bandwidth (MB/sec):     46.877
Stddev Bandwidth:       0.0353555
Max bandwidth (MB/sec): 47.0703
Min bandwidth (MB/sec): 46.875
Average IOPS:           240
Stddev IOPS:            0.18102
Max IOPS:               241
Min IOPS:               240
Average Latency(s):     0.415216
Stddev Latency(s):      0.0220192
Max latency(s):         0.592423
Min latency(s):         0.0335749
Latency P50(s):         0.464053
Latency P90(s):         0.540895
Latency P99(s):         0.558185
Latency P99.9(s):       0.559914
Latency P100(s):        0.592423

Actions #8

Updated by Neha Ojha 7 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 53708