Bug #3737
Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Description
Hi Josh,
as per our short conversation in IRC-#ceph, there is an issue with latency/responsiveness when rbd_cache is enabled, regardless of the qemu cache= setting.
In the lab we have qemu-1.2.2 currently as well as ceph version 0.56-111-ga14a36e (a14a36ed78d9febb7fbf1f6bf209d9bd58daace6)
Please advise necessary debug-switches to narrow down the problem.
Thanks, and have a pretty good year, all of you ;)
Oliver.
Related issues
Associated revisions
librbd: add an async flush
At this point it's a simple wrapper around the ObjectCacher or
librados.
This is needed for QEMU so that its main thread can continue while a
flush is occurring. Since this will be backported, don't update the
librbd version yet, just add a #define that QEMU and others can use to
detect the presence of aio_flush().
Refs: #3737
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
librbd: add an async flush
(cherry picked from commit 302b93c478b3f4bc2c82bfb08329e3c98389dd97)
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
History
#1 Updated by Sage Weil about 11 years ago
- Priority changed from Normal to High
#2 Updated by Ian Colle about 11 years ago
- Project changed from Ceph to rbd
- Category deleted (qemu)
- Target version deleted (v0.56)
#3 Updated by Oliver Francke about 11 years ago
- File 905_test.log.xz added
- File ping.log.xz added
Hi Josh,
according to our conversation I did some testing.
I started dd if=/dev... of=/tmp/doof.dat bs=4k count=256000 at around 18:10:00, as you can see from my ping.log.
I think the highest RTT was 500 ms. With rbd_cache=false I see nothing above, say, 3-5 ms.
Best regards,
Oliver.
#4 Updated by Chris Dunlop about 11 years ago
Confirmed here, with ceph-0.56.3 and qemu-1.3.1.
See attached test output.
In summary, both the average ping time and its standard deviation are much worse with rbd_cache=1:
rbd_cache=0: Avg: 0.493 ms Std: 0.109 ms
rbd_cache=1: Avg: 148.107 ms Std: 219.786 ms
#5 Updated by Chris Dunlop about 11 years ago
Sigh. The attachment might help...
#6 Updated by Josh Durgin about 11 years ago
I've looked at the logs, and I think #4091 should fix this. The high ping times tend to occur around when the cache fills up, making aio_write() block.
#7 Updated by Sage Weil about 11 years ago
- Tracker changed from Bug to Fix
#8 Updated by Ian Colle about 11 years ago
- Target version set to v0.60
#9 Updated by Sage Weil about 11 years ago
- Story points set to 8.00
#10 Updated by Neil Levine about 11 years ago
- Status changed from New to 12
#11 Updated by Sage Weil about 11 years ago
- Status changed from 12 to 7
#12 Updated by Ian Colle about 11 years ago
- Target version changed from v0.60 to v0.61 - Cuttlefish
#13 Updated by Josh Durgin about 11 years ago
Looks like I finally found a fix - using an explicitly asynchronous flush (instead of the sync flush made async by qemu coroutines) fixes the problem in my environment. The rest of the I/O through qemu already uses explicitly async calls, so it's something about the interaction with coroutines or the way in which qemu uses coroutines to make the sync flush async. I'd still like to dig deeper to see what the underlying issue is, and see whether it's a generic problem in qemu or a known bad idea to mix aio and qemu coroutines.
#14 Updated by Josh Durgin about 11 years ago
There's no way around it - we need an async flush in librbd. Using coroutines vs. callbacks doesn't matter in this case: if the flush is not async, there's no way for the coroutine to yield.
#15 Updated by Josh Durgin almost 11 years ago
- Status changed from 7 to Fix Under Review
#16 Updated by Josh Durgin almost 11 years ago
- Status changed from Fix Under Review to Resolved
commit:95c4a81be1af193786d0483fcbe81104d3da7c40
Note that the qemu patch still needs to be merged upstream (#4581).
#17 Updated by Josh Durgin almost 11 years ago
- Tracker changed from Fix to Bug
- Status changed from Resolved to Pending Backport
- Backport set to bobtail
#18 Updated by Stefan Priebe almost 11 years ago
Thanks for your great work! Is there already a way / branch to test this with bobtail?
#19 Updated by Josh Durgin almost 11 years ago
- Status changed from Pending Backport to 7
The branch wip-bobtail-rbd-backports-req-order has the fix for this plus several other bugs backported on top of the current bobtail branch. It passes simple testing, and is going through more thorough testing overnight.
#20 Updated by Oliver Francke almost 11 years ago
Hi Josh,
sounds promising; unfortunately I'm currently on 0.60... in our lab. We will probably move to the latest bobtail in our production environment next week. Do you think the fix will make it into that package?
Thanks and best regards,
Oliver.
#21 Updated by Josh Durgin almost 11 years ago
Yeah, the backports should definitely be merged by next week. On your lab cluster, you could try librbd from the 'next' branch, which has the librbd side of the fix for this.
#22 Updated by Oliver Francke almost 11 years ago
Well,
could it be, that the fix already made it into "ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)"? I did not see any high latencies while writing...
Oliver.
#23 Updated by Oliver Francke almost 11 years ago
Oops, sorry...,
I was a bit misled, because "cache=writeback" was still in the config file.
Oliver.
#24 Updated by Wido den Hollander almost 11 years ago
I just tested the Qemu patch, cherry-picked onto Qemu 1.2, together with the wip-bobtail-rbd-backports-req-order branch, and it does indeed seem to improve write performance a lot.
I saw about a 90% performance increase on this particular system.
#25 Updated by Josh Durgin almost 11 years ago
- Status changed from 7 to Resolved
Thanks for testing it out everyone. It's now in the bobtail branch too.
#26 Updated by Edwin Peer almost 11 years ago
Using ceph 0.61.2 and qemu 1.4.2 or earlier versions with the patch:
The following hangs after a few iterations:
phobos ~ # i=0; while [ $i -lt 30 ]; do dd if=/dev/zero of=test bs=4k count=1000000 conv=fdatasync; i=$[$i+1]; done
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 141.949 s, 28.9 MB/s
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 115.936 s, 35.3 MB/s
If I revert the qemu patch, then it no longer locks up, but the latency issue is present (even with caching disabled).
Any ideas?
#27 Updated by Edwin Peer almost 11 years ago
Update: it seems to work fine if I turn writeback caching back on again (it had been turned off before patching).