Bug #58530


Pacific: Significant write amplification as compared to Nautilus

Added by Joshua Baergen over 1 year ago. Updated 12 months ago.

Status: Triaged
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags: -
Backport: -
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

After upgrading multiple RBD clusters from 14.2.18 to 16.2.9, we've found that, per client write, OSDs write significantly more to the underlying disks, on average, under Pacific than under Nautilus. This is a 3x replicated configuration.

To show this, we can calculate, across the cluster, the average number of disk I/Os issued per client write received by the cluster (see attached disk-writes-per-client-write.png):

A similar effect can be seen in disk bytes written per client byte written to the cluster (see attached disk-bytes-per-client-write-byte.png):

In both charts, one can see the point where the cluster was upgraded, just after the 15th.
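For reference, the ratios in the two charts above can be reproduced on a per-host basis with something like the following sketch. The OSD id, device name, and sampling window are placeholders; it samples the OSD's op_w/op_w_in_bytes perf counters against the device counters in /proc/diskstats.

    import json
    import subprocess
    import time

    OSD_ID = 42        # hypothetical example OSD on this host
    DEV = "nvme0n1"    # hypothetical backing device for that OSD

    def client_writes():
        # 'op_w' and 'op_w_in_bytes' are the OSD's client write-op counters.
        out = subprocess.check_output(
            ["ceph", "daemon", f"osd.{OSD_ID}", "perf", "dump"])
        osd = json.loads(out)["osd"]
        return osd["op_w"], osd["op_w_in_bytes"]

    def disk_writes():
        # /proc/diskstats: field 8 is writes completed, field 10 is sectors
        # written (512-byte sectors, regardless of the device's sector size).
        for line in open("/proc/diskstats"):
            f = line.split()
            if f[2] == DEV:
                return int(f[7]), int(f[9]) * 512
        raise LookupError(f"{DEV} not found in /proc/diskstats")

    w0, wb0 = client_writes()
    d0, db0 = disk_writes()
    time.sleep(60)  # arbitrary 60-second sampling window
    w1, wb1 = client_writes()
    d1, db1 = disk_writes()

    print("disk writes per client write:", (d1 - d0) / max(w1 - w0, 1))
    print("disk bytes per client byte:  ", (db1 - db0) / max(wb1 - wb0, 1))
    print("average disk write size:     ", (db1 - db0) / max(d1 - d0, 1))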

Here's the average disk I/O size on a per-host basis (see attached average-io-size.png):

The hosts in the bottom cluster of lines have a 4k minimum allocation size (MAS; BlueStore's min_alloc_size), whereas those in the top cluster are 16k MAS. To my eyes, there was an increase in disk I/O size for both 16k and 4k MAS hosts, but the increase was larger for 4k MAS disks. The one yellow line that migrates from the top group to the bottom reflects the recreation of a number of the OSDs on that host, which moved them from 16k to 4k MAS. Unfortunately, we don't have deferred write stats over this period.
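For anyone who wants to compare their own OSDs, the configured allocation sizes and the deferred-write counters can be pulled from the admin socket along these lines. osd.42 is a placeholder; note that the effective MAS is baked in at OSD mkfs time (the running config only shows what a newly created OSD would get), and counter names may differ slightly between releases.

    import json
    import subprocess

    def admin(*cmd):
        # Query a single OSD's admin socket; osd.42 is an example id.
        return json.loads(
            subprocess.check_output(["ceph", "daemon", "osd.42", *cmd]))

    # Configured allocation sizes (a newly created OSD would pick these up).
    for opt in ("bluestore_min_alloc_size",
                "bluestore_min_alloc_size_hdd",
                "bluestore_min_alloc_size_ssd"):
        print(opt, admin("config", "get", opt)[opt])

    # Deferred-write counters from the bluestore section of the perf dump.
    bs = admin("perf", "dump")["bluestore"]
    print("deferred_write_ops:  ", bs["deferred_write_ops"])
    print("deferred_write_bytes:", bs["deferred_write_bytes"])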

At a high level, this doesn't seem to have a major performance impact. We've mostly seen it show up as a small increase in median latency for some clusters, but most customers see a latency improvement over Nautilus (avg, p50, p99). We did have to make some adjustments to older hardware to handle the increased write load, and we are concerned that, as we move on to even older hardware, this will become a larger issue.

The other concern is that this increases wear on SSDs, since more data is written to the drives per client write than before.

We continue to investigate this issue, though we've hit a wall several times in terms of what further information we can gather and which avenues to explore. We'll keep this ticket up to date with any further findings on our side.


Files

disk-writes-per-client-write.png (124 KB), Joshua Baergen, 01/20/2023 09:34 PM
disk-bytes-per-client-write-byte.png (148 KB), Joshua Baergen, 01/20/2023 09:40 PM
average-io-size.png (820 KB), Joshua Baergen, 01/20/2023 09:50 PM
osd.300-perf-dump-2023-01-23T14_59_41+00_00 (29.6 KB), Joshua Baergen, 01/23/2023 04:37 PM
osd.683-perf-dump-2023-01-23T14_59_44+00_00 (29.6 KB), Joshua Baergen, 01/23/2023 04:37 PM
osd.42-perf-dump-2023-01-23T14_57_02+00_00 (29.7 KB), Joshua Baergen, 01/23/2023 04:37 PM
osd.683-perf-dump-2023-01-20T21_14_57+00_00 (28.9 KB), Joshua Baergen, 01/23/2023 04:37 PM
osd.300-perf-dump-2023-01-20T21_13_58+00_00 (28.8 KB), Joshua Baergen, 01/23/2023 04:37 PM
osd.42-perf-dump-2023-01-20T21_12_59+00_00 (28.9 KB), Joshua Baergen, 01/23/2023 04:37 PM
log-compactions-two-clusters.png (35.6 KB), Joshua Baergen, 01/27/2023 02:40 PM
logged-bytes-two-clusters.png (40.4 KB), Joshua Baergen, 01/27/2023 02:40 PM
2023-02-16 Deferred Bytes by Host.png (146 KB), Joshua Baergen, 02/16/2023 03:03 PM
2023-02-16 Deferred Ops by Host.png (157 KB), Joshua Baergen, 02/16/2023 03:03 PM
2023-02-16 Write Byte Amp.png (45.1 KB), Joshua Baergen, 02/16/2023 03:03 PM
2023-02-16 Write Op Amp.png (38.1 KB), Joshua Baergen, 02/16/2023 03:03 PM

Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #61466: Add bluefs write op count metrics (Resolved, Joshua Baergen)

