Bug #9192
krbd: poor read (about 10%) vs write performance
Description
We started testing the 3.17-rc1 kernel over the weekend, as it is the only Linus-released
kernel that has the fix for bug http://tracker.ceph.com/issues/8818
We noticed that read performance was much slower than write performance for large
sequential I/O to an XFS file system mounted on a kRBD device.
To verify that the problem was not with our Ceph cluster or XFS, but with the kernel RBD
driver, I wrote a pair of C tools that let me read/write large sequential blocks directly
to RBD, using either the kernel rbd or the librbd interface.
Testing with these tools has shown that, at some thread counts, reads through the librbd
interface are more than 10x faster than reads through the kernel rbd interface.
With a 16MB block size, a 600 second run time, and each thread writing to its own
image in the same pool, the 3-run average throughput values were:
threads   krbd read total   librbd read total   krbd write total   librbd write total
      1      129 MB/sec        1546 MB/sec         879 MB/sec          216 MB/sec
      2      230 MB/sec        2651 MB/sec        1400 MB/sec          377 MB/sec
      4      375 MB/sec        2758 MB/sec        2020 MB/sec          563 MB/sec
      8      563 MB/sec        1216 MB/sec        2560 MB/sec          886 MB/sec
     16      863 MB/sec        1750 MB/sec        2561 MB/sec         1294 MB/sec
     32     1237 MB/sec        2325 MB/sec        2684 MB/sec         1857 MB/sec
     64     1784 MB/sec        2859 MB/sec        2715 MB/sec         2702 MB/sec
    128     1651 MB/sec        3942 MB/sec        2270 MB/sec         2878 MB/sec

NOTE: RBD cache is not enabled for librbd.
NOTE: The images were unmapped while running the librbd tests.
Read loop for librbd:

for (i = 0; i < gcsv.count; i++) {
    /* librbd takes an explicit byte offset on every call */
    off = i * gcsv.blocksize;
    readlen = rbd_read(rbdimage, off, gcsv.blocksize, gbuffer);
    if (readlen != (ssize_t)gcsv.blocksize) {
        printf("ERROR: Read error, read = %ld, blocksize = %ld in loop %ld, byte offset %ld\n",
               (long)readlen, (long)gcsv.blocksize, (long)i, (long)off);
    }
}
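For context, a minimal sketch of the setup this loop assumes, under the assumption that the tool uses the standard librados/librbd C API (the setup() helper, its parameters, and the lack of error checking are illustrative, not the reporter's actual code):

/* Hypothetical setup for the librbd read loop above.
 * Build with: gcc rbdread.c -lrados -lrbd
 * Error checking omitted for brevity. */
#include <rados/librados.h>
#include <rbd/librbd.h>
#include <stdlib.h>

rados_t cluster;
rados_ioctx_t ioctx;
rbd_image_t rbdimage;
char *gbuffer;

void setup(const char *pool, const char *image, size_t blocksize)
{
    rados_create(&cluster, NULL);            /* NULL id = client.admin */
    rados_conf_read_file(cluster, NULL);     /* default ceph.conf search path */
    rados_connect(cluster);
    rados_ioctx_create(cluster, pool, &ioctx);
    rbd_open(ioctx, image, &rbdimage, NULL); /* NULL snap name = image head */
    gbuffer = malloc(blocksize);             /* one shared read buffer */
}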
Read loop for kernel rbd:

for (i = 0; i < gcsv.count; i++) {
    /* read(2) advances the file offset implicitly; off is tracked
       only for the error message */
    readlen = read(fd, gbuffer, gcsv.blocksize);
    if (readlen != (ssize_t)gcsv.blocksize) {
        printf("ERROR: Read error, read = %ld, blocksize = %ld in loop %ld, byte offset %ld\n",
               (long)readlen, (long)gcsv.blocksize, (long)i, (long)off);
    }
    off += readlen;
}
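The kernel-side loop reads the mapped block device with plain buffered read(2), so the device's readahead setting influences sequential throughput. A minimal sketch of the corresponding open (the device path and helper are hypothetical; the report does not show them):

/* Hypothetical open of the mapped krbd device; the udev-created
 * /dev/rbd/<pool>/<image> path is illustrative. Plain read(2) goes
 * through the page cache; O_DIRECT with an aligned buffer would
 * bypass it. Error checking omitted for brevity. */
#include <fcntl.h>
#include <unistd.h>

int fd;
off_t off = 0;   /* advanced by readlen after each read in the loop */

void setup_krbd(const char *devpath)
{
    fd = open(devpath, O_RDONLY);   /* e.g. "/dev/rbd/ERIC-TEST-01/image-00" */
}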
Ceph version on all nodes: 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
OS version on all nodes: Ubuntu 14.04.1 LTS
The test client running kRBD and librbd is a separate system from the cluster nodes and is configured with:
Dual socket CPU E5-2660 @ 2.20GHz (16 total cores)
96 GB RAM
Mellanox dual-port 40Gb Ethernet card, using 1 port
Kernel on the kRBD client:
Linux version 3.17.0-031700rc1-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201408161335 SMP Sat Aug 16 17:36:29 UTC 2014
The kernel was downloaded from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.17-rc1-utopic/
6 OSD nodes, each with:
12 7200RPM 4TB SAS drives
10GbE public network
10GbE cluster network
3 Dedicated monitors
Test pool info:

# ceph osd pool get ERIC-TEST-01 pg_num
pg_num: 8192
# ceph osd pool get ERIC-TEST-01 pgp_num
pgp_num: 8192
# ceph osd pool get ERIC-TEST-01 size
size: 1
# ceph osd pool get ERIC-TEST-01 min_size
min_size: 1

Note: We started with size=3, and our write performance was less than 1 GB/sec for both librbd and krbd. We went to size=1 for this performance testing and plan to reset to size=3 once we are done with these tests. Changing to size=1 slightly decreased read performance.
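For reference, the replica count was presumably switched with ceph osd pool set (the exact invocations are assumed here, mirroring the get commands above; they are not shown in the original report):

# ceph osd pool set ERIC-TEST-01 size 1
# ceph osd pool set ERIC-TEST-01 min_size 1

and to restore after testing:

# ceph osd pool set ERIC-TEST-01 size 3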