Bug #3737 (closed)

Higher ping-latency observed in qemu with rbd_cache=true during disk-write

Added by Oliver Francke over 11 years ago. Updated almost 11 years ago.

Status: Resolved
Priority: High
% Done: 0%
Source: Development
Backport: bobtail

Description

Hi Josh,

as per our short conversation in IRC-#ceph, there is an issue with latency/responsiveness with rbd_cache enabled, regardless of the cache= setting.
In the lab we currently have qemu-1.2.2, as well as ceph version 0.56-111-ga14a36e (a14a36ed78d9febb7fbf1f6bf209d9bd58daace6).

Please advise necessary debug-switches to narrow down the problem.

Thnx and have a pretty good year to all of you ;)

Oliver.
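
For context, rbd_cache can be toggled either in ceph.conf or inline in the qemu rbd drive string, independently of the qemu cache= option. A minimal sketch of an affected invocation, with hypothetical pool and image names:

# librbd cache enabled via the rbd: drive string, qemu-side cache set to writeback
qemu-system-x86_64 -m 1024 ... \
    -drive format=raw,if=virtio,cache=writeback,file=rbd:rbd/vm-0905-disk-1:rbd_cache=true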


Files

905_test.log.xz (2.27 MB) - Oliver Francke, 01/21/2013 09:35 AM
ping.log.xz (1.14 KB) - Oliver Francke, 01/21/2013 09:35 AM
test.log (15.6 KB, /tmp/test.log) - Chris Dunlop, 02/20/2013 11:20 PM

Related issues: 1 (0 open, 1 closed)

Related to rbd - Subtask #4091: ObjectCacher: optionally make readx/writex calls never block (Resolved, Josh Durgin, 02/11/2013)

Actions #1

Updated by Sage Weil over 11 years ago

  • Priority changed from Normal to High
Actions #2

Updated by Ian Colle over 11 years ago

  • Project changed from Ceph to rbd
  • Category deleted (qemu)
  • Target version deleted (v0.56)

Actions #3

Updated by Oliver Francke over 11 years ago

Hi Josh,

according to our conversation I did some testing.
I started the dd if=/dev... of=/tmp/doof.dat bs=4k count=256000 at around 18:10:00, as you can see from my ping.log.
I think the highest RTT was 500 ms, and anything above, say, 3-5 ms I do not see at all with rbd_cache=false.

Best regards,

Oliver.
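
A rough reconstruction of the kind of test described above, assuming /dev/zero as the source and a hypothetical guest address; the ping runs on the host while the dd runs inside the guest:

# on the host: log timestamped RTTs to the guest
ping -D 10.0.0.42 | tee ping.log

# inside the guest: large buffered write through the rbd-backed disk
dd if=/dev/zero of=/tmp/doof.dat bs=4k count=256000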

Actions #4

Updated by Chris Dunlop about 11 years ago

Confirmed here, with ceph-0.56.3 and qemu-1.3.1.

See attached test output.

In summary, the average ping time and its standard deviation are much worse with rbd_cache=1:

rbd_cache=0: Avg: 0.493 ms Std: 0.109 ms
rbd_cache=1: Avg: 148.107 ms Std: 219.786 ms
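
For reference, averages and standard deviations like these can be pulled straight out of ping output with awk; a sketch, with a hypothetical guest address:

ping -c 600 10.0.0.42 | awk -F'time=' '/time=/ {
    split($2, a, " "); s += a[1]; ss += a[1]*a[1]; n++
} END {
    m = s/n; printf "Avg: %.3f ms Std: %.3f ms\n", m, sqrt(ss/n - m*m)
}'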

Actions #5

Updated by Chris Dunlop about 11 years ago

Sigh. The attachment might help...

Actions #6

Updated by Josh Durgin about 11 years ago

I've looked at the logs, and I think #4091 should fix this. The high ping times tend to occur around when the cache fills up, making aio_write() block.
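
The point at which writes start to block is governed by the librbd cache sizing options; a sketch of the relevant ceph.conf knobs, assuming the historical defaults of that era (values are illustrative, not a recommendation):

[client]
    rbd cache = true
    rbd cache size = 33554432          # total cache size, 32 MB
    rbd cache max dirty = 25165824     # writes block once this much dirty data accumulates, 24 MB
    rbd cache target dirty = 16777216  # writeback starts at this threshold, 16 MB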

Actions #7

Updated by Sage Weil about 11 years ago

  • Tracker changed from Bug to Fix
Actions #8

Updated by Ian Colle about 11 years ago

  • Target version set to v0.60
Actions #9

Updated by Sage Weil about 11 years ago

  • Story points set to 8.00
Actions #10

Updated by Neil Levine about 11 years ago

  • Status changed from New to 12
Actions #11

Updated by Sage Weil about 11 years ago

  • Status changed from 12 to 7
Actions #12

Updated by Ian Colle about 11 years ago

  • Target version changed from v0.60 to v0.61 - Cuttlefish
Actions #13

Updated by Josh Durgin about 11 years ago

Looks like I finally found a fix - using an explicitly asynchronous flush (instead of the sync flush made async by qemu coroutines) fixes the problem in my environment. The rest of the I/O through qemu already uses explicitly async calls, so it's something about the interaction with coroutines or the way in which qemu uses coroutines to make the sync flush async. I'd still like to dig deeper to see what the underlying issue is, and see whether it's a generic problem in qemu or a known bad idea to mix aio and qemu coroutines.

Actions #14

Updated by Josh Durgin about 11 years ago

There's no way around it - we need an async flush in librbd. Using coroutines vs. callbacks doesn't matter in this case: if the flush is not async, there's no way for the coroutine to yield.

Actions #15

Updated by Josh Durgin about 11 years ago

  • Status changed from 7 to Fix Under Review
Actions #16

Updated by Josh Durgin about 11 years ago

  • Status changed from Fix Under Review to Resolved

Fixed in commit:95c4a81be1af193786d0483fcbe81104d3da7c40. Note that the qemu patch still needs to get merged upstream (#4581).
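
One way to check whether a given ceph checkout already contains that commit is to test ancestry in git (repository location hypothetical):

cd ceph
git merge-base --is-ancestor 95c4a81be1af193786d0483fcbe81104d3da7c40 HEAD \
    && echo "fix present" || echo "fix missing"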

Actions #17

Updated by Josh Durgin about 11 years ago

  • Tracker changed from Fix to Bug
  • Status changed from Resolved to Pending Backport
  • Backport set to bobtail
Actions #18

Updated by Stefan Priebe about 11 years ago

Thanks for your great work! Is there already a way / branch to test this with bobtail?

Actions #19

Updated by Josh Durgin about 11 years ago

  • Status changed from Pending Backport to 7

The branch wip-bobtail-rbd-backports-req-order has the fix for this plus several other bugs backported on top of the current bobtail branch. It passes simple testing, and is going through more thorough testing overnight.
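
To try that branch, something along these lines should work, assuming it is still published in the main ceph repository; build and install steps are abbreviated and depend on your packaging:

git clone https://github.com/ceph/ceph.git
cd ceph
git checkout wip-bobtail-rbd-backports-req-order
./autogen.sh && ./configure && make   # then swap in the resulting librbd on the qemu host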

Actions #20

Updated by Oliver Francke about 11 years ago

Hi Josh,

sounds promising; unfortunately I'm currently on 0.60... in our lab. We are planning to move to the latest bobtail in our production environment next week, perhaps; do you think it will make it into this package?

Thnx n best regards,

Oliver.

Actions #21

Updated by Josh Durgin about 11 years ago

Yeah, the backports should definitely be merged by next week. On your lab cluster, you could try librbd from the 'next' branch, which has the librbd side of the fix for this.

Actions #22

Updated by Oliver Francke about 11 years ago

Well,

could it be that the fix already made it into "ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)"? I did not see any high latencies while writing...

Oliver.

Actions #23

Updated by Oliver Francke about 11 years ago

Ooops, sorry...,

I was a bit misled, because "cache=writeback" was still in the config file.

Oliver.

Actions #24

Updated by Wido den Hollander about 11 years ago

I just tested the Qemu patch, cherry-picked onto Qemu 1.2, together with the wip-bobtail-rbd-backports-req-order branch, and that does indeed seem to improve the write performance a lot.

I saw about a 90% performance increase on this particular system.

Actions #25

Updated by Josh Durgin about 11 years ago

  • Status changed from 7 to Resolved

Thanks for testing it out, everyone. It's now in the bobtail branch too.

Actions #26

Updated by Edwin Peer almost 11 years ago

Using ceph 0.61.2 and qemu 1.4.2 or earlier versions with the patch:

The following hangs after a few iterations:

phobos ~ # i=0; while [ $i -lt 30 ]; do dd if=/dev/zero of=test bs=4k count=1000000 conv=fdatasync; i=$[$i+1]; done
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 141.949 s, 28.9 MB/s
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 115.936 s, 35.3 MB/s

If I revert the qemu patch, then it no longer locks up, but the latency issue is present (even with caching disabled).

Any ideas?

Actions #27

Updated by Edwin Peer almost 11 years ago

Update: seems to work fine if I turn writeback caching back on again (previously turned off before patching).
