Bug #36652: [rbd-mirror] replay performance issue

Added by Sameh Ghane over 5 years ago. Updated over 5 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,

[New to the Ceph community, please pardon any faux pas.]

I have 2 Ceph clusters separated by a 40 ms RTT.

Two rbd-mirror instances are running, each close to one of the clusters.

rados bench run from the rbd-mirror instances shows 300 MB/s in the worst-case scenario (remote writes).
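
For reference, the benchmark invocation was along these lines (the pool name and concurrency shown here are illustrative, not the exact values used):

[root@systasks001 ~]# rados -p rbd bench 30 write -t 16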

The remote cluster (from the rbd-mirror's perspective) is running 12.2.7.
The local cluster is running 12.2.4.
rbd-mirror is running 12.2.7.

When I map and mount a 10 GB rbd(-nbd) image and run this command to fill it:
[root@systasks001 mnt]# dd if=/dev/zero of=TEST bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 32.512 s, 161 MB/s
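
For completeness, the image was mapped and mounted roughly as follows (pool name, image name, and filesystem choice shown here are illustrative):

[root@systasks001 ~]# rbd create mypool/test --size 10G --image-feature exclusive-lock,journaling
[root@systasks001 ~]# rbd-nbd map mypool/test
/dev/nbd0
[root@systasks001 ~]# mkfs.xfs /dev/nbd0
[root@systasks001 ~]# mount /dev/nbd0 /mnt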

This translates to 300k+ entries to replay:

replaying, master_position=[object_number=3077, tag_tid=11, entry_tid=648217], mirror_position=[object_number=2778, tag_tid=11, entry_tid=305374], entries_behind_master=342843
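
(For reference, this status line is the description field reported by rbd mirror image status <pool>/<image>; the pool and image names are omitted here.)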

rbd-mirror reads from the remote cluster and writes to the local cluster (the problem is the same when it reads locally and replays remotely).

It replays entries at a rate of roughly 400 per second, which is about 25 times slower than the initial dd write: at ~400 entries/s, draining the ~342k-entry backlog takes roughly 850 seconds, versus ~33 seconds for the dd itself.

Attached is a tcpdump pcap in which you can observe a chunk of the data being replayed by rbd-mirror in two alternating phases:
  • reads from the remote OSDs (during which no data is sent to the local OSDs);
  • writes to the local OSDs (during which no data is received from the remote OSDs).

I marked this issue as major because, beyond a certain sustained write rate, the journal backlog grows faster than it can be replayed and mirroring effectively breaks.

Cheers,


Files

ceph.rbd-mirror.pcap (52 KB) Sameh Ghane, 10/30/2018 08:41 PM
#1

Updated by Mykola Golub over 5 years ago

It should be much faster in your case if the image journal is created with a large (1 MiB) rbd_journal_max_payload_bytes, and if rbd_mirror_journal_max_fetch_bytes is set to a large value (1 MiB or more) on the rbd-mirror side. A large rbd_journal_max_payload_bytes also improves performance when writing to the journal with large request sizes.

Could you try this?
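
A minimal sketch of what this could look like in ceph.conf (placing both options in a [client] section, and which host each belongs on, are assumptions on my part):

# ceph.conf on the client that creates and writes the image journal
[client]
rbd_journal_max_payload_bytes = 1048576        # 1 MiB per journal entry payload

# ceph.conf on the host running rbd-mirror
[client]
rbd_mirror_journal_max_fetch_bytes = 1048576   # fetch up to 1 MiB per journal read

Since rbd_journal_max_payload_bytes applies when the image journal is created, an existing journal would presumably need to be recreated, e.g. by disabling and re-enabling the journaling feature on the image.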

The rationale for such low defaults is to limit rbd-mirror memory usage when mirroring a pool with many images, and that a typical rbd workload consists of small requests, for which these parameters are not very useful.

#2

Updated by Mykola Golub over 5 years ago

  • Status changed from New to Need More Info
