Bug #63197

open

mClockScheduler: Reservation and limit settings on SSD-based OSDs result in nearly twice the allocated OSD bandwidth consumption.

Added by Sridhar Seshasayee 7 months ago. Updated 6 months ago.

Status:
In Progress
Priority:
High
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Test Environment:

Ceph cluster configured with 4 OSDs with NVMe SSD backing devices.
1 OSD per NVMe device.
Replication factor = 3
OSD IOPS Capacity: 35000 IOPS (measured by OSD Bench on startup)
Fio benchmark measured on RBD pool (3x replication): ~24000 IOPS

Test 1:
mClock Profile: 'custom'
reservation: 0 (MIN)
weight: 1
limit: 0.1 -> 10% of 35000 = 3500 IOPS (Max Expected)
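
For reference, the settings above could be applied with `ceph config set`. The report does not say which options were changed, so the following is only a sketch under the assumption that the client class options (`osd_mclock_scheduler_client_res`, `osd_mclock_scheduler_client_wgt`, `osd_mclock_scheduler_client_lim`) were set as fractions of the OSD IOPS capacity. The helper name `mclock_custom_commands` is hypothetical; it only builds the command strings and does not run them.

```python
import shlex


def mclock_custom_commands(reservation, weight, limit, target="osd"):
    """Build `ceph config set` command lines for a custom mClock profile.

    Assumption (not stated in the report): the client class options are the
    ones being tuned, and reservation/limit are fractions in [0.0, 1.0].
    """
    if not 0.0 <= reservation <= 1.0:
        raise ValueError("reservation must be a fraction between 0.0 and 1.0")
    if not 0.0 <= limit <= 1.0:
        raise ValueError("limit must be a fraction between 0.0 and 1.0")

    settings = [
        ("osd_mclock_profile", "custom"),
        ("osd_mclock_scheduler_client_res", reservation),
        ("osd_mclock_scheduler_client_wgt", weight),
        ("osd_mclock_scheduler_client_lim", limit),
    ]
    return [
        shlex.join(["ceph", "config", "set", target, key, str(value)])
        for key, value in settings
    ]


if __name__ == "__main__":
    # Test 1 settings from the report: reservation 0, weight 1, limit 0.1
    for command in mclock_custom_commands(reservation=0, weight=1, limit=0.1):
        print(command)
```

Tests 2 and 3 would differ only in the `limit` argument (0.25 and 0.50).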

Fio Result Summary:

write: IOPS=6057, BW=23.7MiB/s (24.8MB/s)(7100MiB/300036msec); 0 zone resets
iops        : min=  100, max=12200, avg=6079.70, stdev=755.31, samples=29979

Test 2:
mClock Profile: 'custom'
reservation: 0 (MIN)
weight: 1
limit: 0.25 -> 25% of 35000 = 8750 IOPS (Max Expected)

Fio Result Summary:

write: IOPS=13.8k, BW=54.0MiB/s (56.7MB/s)(15.8GiB/300041msec); 0 zone resets
iops        : min=  100, max=23500, avg=13878.15, stdev=1656.19, samples=29971


Test 3:
mClock Profile: 'custom'
reservation: 0 (MIN)
weight: 1
limit: 0.50 -> 50% of 35000 = 17500 IOPS (Max Expected)

Fio Result Summary:

write: IOPS=24.2k, BW=94.7MiB/s (99.3MB/s)(27.8GiB/300003msec); 0 zone resets
iops        : min=  100, max=29100, avg=24450.37, stdev=2625.99, samples=29852
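
As a sanity check on the three Fio summaries above, the reported bandwidth divided by the reported IOPS gives the implied I/O size. The report does not state the Fio block size, so the sketch below only derives it from the printed figures; the function name is illustrative.

```python
def implied_io_size_kib(bandwidth_mib_per_s, iops):
    """Average I/O size in KiB implied by a bandwidth/IOPS pair."""
    return bandwidth_mib_per_s * 1024 / iops


# (test name, BW in MiB/s, IOPS) as printed in the Fio write summaries
FIO_SUMMARIES = [
    ("Test 1", 23.7, 6_057),
    ("Test 2", 54.0, 13_800),
    ("Test 3", 94.7, 24_200),
]

if __name__ == "__main__":
    for name, bandwidth, iops in FIO_SUMMARIES:
        size = implied_io_size_kib(bandwidth, iops)
        print(f"{name}: about {size:.2f} KiB per write")
```

All three runs work out to roughly 4 KiB per write, so the IOPS figures are comparable across tests.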

The Test 3 result is not nearly 2x the limit because the disks are already saturated with 3x replication.
See the Fio benchmark result above (~24000 IOPS).

The above tests show that the measured IOPS exceed the expected limit in every case: by about 1.7x in Test 1, 1.6x in Test 2, and 1.4x in Test 3, where disk saturation caps the overshoot.
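
The arithmetic behind this comparison can be reproduced with a short script. It is a minimal sketch that uses only the figures quoted in this report (OSD capacity, limit fractions, Fio write IOPS, and the approximate Fio baseline); the function and variable names are illustrative and do not come from Ceph.

```python
OSD_CAPACITY_IOPS = 35_000   # measured by OSD bench on startup
FIO_BASELINE_IOPS = 24_000   # approximate Fio result on the 3x replicated RBD pool

# (test name, limit as a fraction of OSD capacity, write IOPS reported by Fio)
TESTS = [
    ("Test 1", 0.10, 6_057),
    ("Test 2", 0.25, 13_800),
    ("Test 3", 0.50, 24_200),
]


def expected_limit_iops(limit_fraction, capacity_iops=OSD_CAPACITY_IOPS):
    """Maximum IOPS expected if the limit were enforced exactly."""
    return limit_fraction * capacity_iops


def overshoot(measured_iops, limit_fraction):
    """Ratio of measured IOPS to the expected limit."""
    return measured_iops / expected_limit_iops(limit_fraction)


def summarize():
    """Return (name, expected limit, measured IOPS, overshoot ratio) per test."""
    return [
        (name, expected_limit_iops(frac), measured, round(overshoot(measured, frac), 2))
        for name, frac, measured in TESTS
    ]


if __name__ == "__main__":
    for name, limit, measured, ratio in summarize():
        share = measured / FIO_BASELINE_IOPS
        print(f"{name}: limit={limit:.0f} measured={measured} "
              f"overshoot={ratio}x baseline_share={share:.0%}")
```

The last column shows how close each run comes to the ~24000 IOPS baseline, which is why the Test 3 overshoot is smaller than in the other two tests.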

Actions #1

Updated by Sridhar Seshasayee 7 months ago

  • Pull request ID set to 53998

Actions #2

Updated by Radoslaw Zarzynski 7 months ago

  • Priority changed from Normal to High

Actions #3

Updated by Sridhar Seshasayee 6 months ago

  • Pull request ID deleted (53998)