Bug #17782 (Closed): performance drop with Linux Kernel 4.4.0 (ceph tell osd.* bench)

Added by Yoann Moulin over 7 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,

I found a performance drop between kernels 3.13.0-88 (the default kernel on Ubuntu Trusty 14.04) and 4.2.0-38-generic on the one hand, and kernel 4.4.0.24.14 (the default kernel on Ubuntu Xenial 16.04) on the other.

I noticed this on both Jewel (10.2.2) and Infernalis (9.2.0).

All the benchmarks were run on strictly identical hardware setups per node.

I ran a couple of benchmarks on 3 clusters. Each cluster has 3 strictly identical nodes.
Each node has 10 OSDs. Journals are on the OSD disks.

bench5 : Ubuntu 14.04 / Ceph Infernalis
bench6 : Ubuntu 14.04 / Ceph Jewel
bench7 : Ubuntu 16.04 / Ceph Jewel

This is the average of 2 runs of "ceph tell osd.* bench" on each cluster (2 x 30 OSDs):

bench5 / 14.04 / Infernalis / kernel 3.13 : 54.35 MB/s
bench6 / 14.04 / Jewel / kernel 3.13 : 86.47 MB/s

bench5 / 14.04 / Infernalis / kernel 4.2 : 63.38 MB/s
bench6 / 14.04 / Jewel / kernel 4.2 : 107.75 MB/s
bench7 / 16.04 / Jewel / kernel 4.2 : 101.54 MB/s

bench5 / 14.04 / Infernalis / kernel 4.4 : 53.61 MB/s
bench6 / 14.04 / Jewel / kernel 4.4 : 65.82 MB/s
bench7 / 16.04 / Jewel / kernel 4.4 : 61.57 MB/s
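A quick sketch of how such averages can be computed. The per-OSD throughput numbers below are hypothetical placeholders; in practice they would be parsed from the JSON each OSD returns (e.g. the "bytes_per_sec" field of "ceph tell osd.* bench" output):

```python
def average_mb_per_sec(runs):
    """Average throughput in MiB/s over several bench runs.

    runs: a list of runs, each a list of per-OSD bytes/sec values
    (one entry per OSD, as reported by "ceph tell osd.* bench").
    """
    # Flatten all per-OSD samples from all runs into one list.
    samples = [bps for run in runs for bps in run]
    # Mean in bytes/sec, converted to MiB/s.
    return sum(samples) / len(samples) / (1024 * 1024)

# Two hypothetical runs over three OSDs (bytes/sec):
run1 = [57e6, 60e6, 55e6]
run2 = [58e6, 59e6, 56e6]
print(f"{average_mb_per_sec([run1, run2]):.2f} MB/s")
```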

Cheers,

Yoann

#1

Updated by Sage Weil over 7 years ago

  • Status changed from New to Need More Info

Can you please try kernel 4.8? IIRC this was a kernel regression that has already been fixed.

#2

Updated by Yoann Moulin over 7 years ago

Hello,

Some new benchmarks with the latest Jewel release (10.2.5) on 4 nodes (each node has 10 OSDs).

I ran "ceph tell osd.* bench" 2 times over the 40 OSDs; here is the average speed:

4.2.0-42-generic 97.45 MB/s
4.4.0-53-generic 55.73 MB/s
4.8.15-040815-generic 62.41 MB/s
4.9.0-040900-generic 60.88 MB/s

I see the same behaviour, with at least a 35 to 40% performance drop between kernel 4.2 and kernels >= 4.4.
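The size of the drop can be checked with a quick calculation, using the kernel 4.2 and 4.4 figures from the list above:

```python
# Relative throughput drop from kernel 4.2 to kernel 4.4,
# using the MB/s averages reported above.
v42 = 97.45  # 4.2.0-42-generic
v44 = 55.73  # 4.4.0-53-generic
drop = (v42 - v44) / v42 * 100
print(f"{drop:.1f}% drop")  # → 42.8% drop
```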

I can do further benches if needed.

raw data : ceph-post-file: c18eed71-8fa3-41a8-b1e8-d2ea101c1cdf

Yoann

#3

Updated by Yoann Moulin over 7 years ago

Results of my latest benchmarks are interesting: I don't see the same behaviour on the second cluster, which has different disks. Instead, I get slightly better performance with kernels >= 4.4 (though the difference is not significant in my opinion). You can find the hardware details of the servers below.

So far I have no explanation for this behaviour.

Raw output of "ceph tell osd.* bench" + collectl:

ceph-post-file: 137b4466-5c4b-4b8b-b867-466df376277b

Cluster with 4 NODE1 servers; ran the bench 2 times (40 OSDs x 2 = 80 samples), average speed:

4.2.0-42-generic : 99.22 MB/s
4.4.0-53-generic : 62.24 MB/s
4.8.15-040815-generic : 67.68 MB/s
4.9.0-040900-generic : 67.17 MB/s

Cluster with 3 NODE2 servers; ran the bench 4 times (18 OSDs x 4 = 72 samples), average speed:

4.2.0-42-generic : 62.90 MB/s
4.4.0-57-generic : 69.71 MB/s
4.8.15-040815-generic : 69.58 MB/s
4.9.0-040900-generic : 68.54 MB/s

+++ NODE1 +++

1 x Intel S2600WT Server Board
2 x E5-2680 Processor
128GB DDR4 2133MHz ECC Registered Server Memory
2 x Intel 240GB SSD Drive (Raid-1 OS)
10 x 6TB S-ATAIII Raid
2 x LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2

Disk Information :

=== START OF INFORMATION SECTION ===
Device Model: HGST HUS726060ALE610
LU WWN Device Id: 5 000cca 242c8cd1c
Firmware Version: APGNT517
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

+++ NODE2 +++

1 x Intel S2600WT Server Board
2 x E5-2680 Processor
256GB DDR4 2133MHz ECC Registered Server Memory
2 x Intel 240GB SSD Drive (Raid-1 OS)
6 x 4TB S-ATAIII Raid
2 x LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2

Disk Information :

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Se
Device Model: WDC WD4000F9YZ-09N20L1
LU WWN Device Id: 5 0014ee 003fef9f0
Firmware Version: 01.01A02
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)

+++ ceph osd tree of the cluster with 4 NODE1 +++

ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 212.83098 root default
-2 54.46289 host node007
0 5.44629 osd.0 up 1.00000 1.00000
4 5.44629 osd.4 up 1.00000 1.00000
10 5.44629 osd.10 up 1.00000 1.00000
14 5.44629 osd.14 up 1.00000 1.00000
18 5.44629 osd.18 up 1.00000 1.00000
22 5.44629 osd.22 up 1.00000 1.00000
26 5.44629 osd.26 up 1.00000 1.00000
30 5.44629 osd.30 up 1.00000 1.00000
34 5.44629 osd.34 up 1.00000 1.00000
38 5.44629 osd.38 up 1.00000 1.00000
-3 54.46289 host node015
3 5.44629 osd.3 up 1.00000 1.00000
5 5.44629 osd.5 up 1.00000 1.00000
8 5.44629 osd.8 up 1.00000 1.00000
12 5.44629 osd.12 up 1.00000 1.00000
16 5.44629 osd.16 up 1.00000 1.00000
20 5.44629 osd.20 up 1.00000 1.00000
24 5.44629 osd.24 up 1.00000 1.00000
28 5.44629 osd.28 up 1.00000 1.00000
32 5.44629 osd.32 up 1.00000 1.00000
36 5.44629 osd.36 up 1.00000 1.00000
-4 54.46289 host node019
1 5.44629 osd.1 up 1.00000 1.00000
7 5.44629 osd.7 up 1.00000 1.00000
9 5.44629 osd.9 up 1.00000 1.00000
13 5.44629 osd.13 up 1.00000 1.00000
17 5.44629 osd.17 up 1.00000 1.00000
21 5.44629 osd.21 up 1.00000 1.00000
25 5.44629 osd.25 up 1.00000 1.00000
29 5.44629 osd.29 up 1.00000 1.00000
33 5.44629 osd.33 up 1.00000 1.00000
37 5.44629 osd.37 up 1.00000 1.00000
-5 49.44231 host node023
2 5.44629 osd.2 up 1.00000 1.00000
6 5.44629 osd.6 up 1.00000 1.00000
11 5.44629 osd.11 up 1.00000 1.00000
15 5.44629 osd.15 up 1.00000 1.00000
19 5.44629 osd.19 up 1.00000 1.00000
23 5.44629 osd.23 up 1.00000 1.00000
40 5.44530 osd.40 up 1.00000 1.00000
41 5.44629 osd.41 up 1.00000 1.00000
42 5.44629 osd.42 up 1.00000 1.00000
43 0.42670 osd.43 up 1.00000 1.00000

+++ ceph osd tree of the cluster with 3 NODE2 +++

ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 65.28955 root default
-2 21.76318 host node030
0 3.62720 osd.0 up 1.00000 1.00000
5 3.62720 osd.5 up 1.00000 1.00000
6 3.62720 osd.6 up 1.00000 1.00000
10 3.62720 osd.10 up 1.00000 1.00000
14 3.62720 osd.14 up 1.00000 1.00000
16 3.62720 osd.16 up 1.00000 1.00000
-3 21.76318 host node031
1 3.62720 osd.1 up 1.00000 1.00000
4 3.62720 osd.4 up 1.00000 1.00000
8 3.62720 osd.8 up 1.00000 1.00000
11 3.62720 osd.11 up 1.00000 1.00000
13 3.62720 osd.13 up 1.00000 1.00000
17 3.62720 osd.17 up 1.00000 1.00000
-4 21.76318 host node032
2 3.62720 osd.2 up 1.00000 1.00000
3 3.62720 osd.3 up 1.00000 1.00000
7 3.62720 osd.7 up 1.00000 1.00000
9 3.62720 osd.9 up 1.00000 1.00000
12 3.62720 osd.12 up 1.00000 1.00000
15 3.62720 osd.15 up 1.00000 1.00000

#4

Updated by Sage Weil almost 3 years ago

  • Status changed from Need More Info to Closed