Bug #6494

High memory consumption of qemu/librbd with cache enabled

Added by Ivan Mironov over 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: High
Target version: -
% Done: 0%
Source: other
Tags: lin
Backport: firefly, dumpling
Severity: 2 - major

Description

I observe very high memory consumption on the client under a write-intensive load with qemu 1.6.0 + librbd 0.67.3.

For benchmarking purposes, I'm trying to run 15 VMs with 3 GiB of RAM each simultaneously on one host. Each VM uses an RBD image cloned from a protected snapshot of a "master image". After each VM boots, "rpm -ihv" with a bunch of really large RPMs (~8 GiB of unpacked small files) is started automatically. Here is the relevant part of the libvirt XML for one of these VMs:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source protocol='rbd' name='storage-benchmark-vms/vm-image-1:rbd_cache=1:rbd_cache_max_dirty=134217728:rbd_cache_size=268435456:rbd_cache_max_dirty_age=20'>
        <host name='192.168.0.1' port='6789'/>
        <host name='192.168.0.2' port='6789'/>
        <host name='192.168.0.3' port='6789'/>
      </source>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
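As a sanity check, the cache settings that librbd actually picked up can be inspected through the Ceph admin socket, if one is enabled for the client (a sketch; the socket path is an assumption and depends on the client's "admin socket" setting in ceph.conf):

    # Assumed socket path; the real one is set by "admin socket" in ceph.conf.
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok config show | grep rbd_cache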

Some time after startup, I can see unexpected growth in the memory consumption of the qemu-kvm processes:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 5565 qemu      20   0 9091m 7.3g  10m S  2.6  7.7   4:41.31 qemu-kvm
 5416 qemu      20   0 8059m 6.4g  10m S 27.8  6.8   4:40.93 qemu-kvm
 5490 qemu      20   0 6723m 5.3g  10m S 26.2  5.6   4:30.51 qemu-kvm
 5591 qemu      20   0 6475m 5.1g  10m S 39.1  5.3   4:35.68 qemu-kvm
 5390 qemu      20   0 6227m 4.9g  10m S  2.0  5.1   4:26.42 qemu-kvm
 5615 qemu      20   0 6203m 4.8g  10m S 27.5  5.1   4:34.56 qemu-kvm
 5692 qemu      20   0 6171m 4.8g  10m S 17.5  5.1   4:28.95 qemu-kvm
 5666 qemu      20   0 6163m 4.8g  10m S  2.0  5.1   4:29.66 qemu-kvm
 5740 qemu      20   0 6139m 4.8g  10m S 23.2  5.1   4:39.22 qemu-kvm
 5716 qemu      20   0 5899m 4.6g  10m S 20.2  4.8   4:30.84 qemu-kvm
 5539 qemu      20   0 5827m 4.5g  10m S  1.7  4.8   4:27.02 qemu-kvm
 5515 qemu      20   0 5651m 4.4g  10m S  4.6  4.7   4:25.20 qemu-kvm
 5640 qemu      20   0 5603m 4.3g  10m S  6.6  4.6   4:28.90 qemu-kvm
 5442 qemu      20   0 5373m 4.1g  10m S  2.3  4.4   4:28.45 qemu-kvm
 5466 qemu      20   0 5387m 4.1g  10m S 41.7  4.3   4:41.00 qemu-kvm

It can grow even further:
 5565 qemu 20 0 22.6g 18g 2772 S 2.6 20.0 6:07.40 qemu-kvm

And then, at some point, part of the memory is freed:
 5565 qemu 20 0 8011m 6.0g 2796 S 2.3 6.3 6:23.10 qemu-kvm
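To capture this grow-and-free pattern over time instead of eyeballing top, the resident set size of a single qemu-kvm process can be sampled in a loop (a sketch; 5565 stands in for the PID of interest):

    # Log a timestamp and the RSS (in KiB) of one qemu-kvm process every 10 s.
    while sleep 10; do
        echo "$(date +%s) $(ps -o rss= -p 5565)"
    done >> rss-5565.log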

I tried reducing the cache size to the defaults, as suggested on #ceph (replacing "rbd_cache=1:rbd_cache_max_dirty=134217728:rbd_cache_size=268435456:rbd_cache_max_dirty_age=20" with just "rbd_cache=1"; the librbd defaults are much smaller, a 32 MiB cache with 24 MiB max dirty), but it didn't help much:
15297 qemu 20 0 7747m 6.1g 10m S 1.0 6.4 4:47.26 qemu-kvm

Then I tried disabling the cache entirely (removing "cache='writeback'" and changing rbd_cache to 0), and memory consumption became normal:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
19590 qemu      20   0 4251m 3.0g  10m S  9.2  3.2   3:33.42 qemu-kvm
19526 qemu      20   0 4251m 3.0g  10m S  8.6  3.1   3:22.01 qemu-kvm
19399 qemu      20   0 4251m 3.0g  10m S  9.6  3.1   3:15.01 qemu-kvm
19612 qemu      20   0 4251m 3.0g  10m S  3.0  3.1   4:12.41 qemu-kvm
19568 qemu      20   0 4251m 3.0g  10m S  3.0  3.1   3:32.04 qemu-kvm
19632 qemu      20   0 4251m 3.0g  10m S  7.3  3.1   3:47.57 qemu-kvm
19419 qemu      20   0 4251m 3.0g  10m S  8.9  3.1   3:20.40 qemu-kvm
19484 qemu      20   0 4251m 3.0g  10m S  7.6  3.1   3:30.56 qemu-kvm
19676 qemu      20   0 4251m 3.0g  10m S  4.0  3.1   3:48.99 qemu-kvm
19654 qemu      20   0 4251m 3.0g  10m S  7.3  3.1   3:49.83 qemu-kvm
19464 qemu      20   0 4251m 3.0g  10m S  8.9  3.1   3:45.45 qemu-kvm
19441 qemu      20   0 4251m 3.0g  10m S  7.3  3.1   3:20.58 qemu-kvm
19377 qemu      20   0 4251m 3.0g  10m S  7.9  3.1   3:16.99 qemu-kvm
19548 qemu      20   0 4251m 3.0g  10m S  9.9  3.1   3:33.59 qemu-kvm
19506 qemu      20   0 4251m 3.0g  10m S  7.6  3.1   3:16.94 qemu-kvm
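For reference, the cache-disabled variant of the disk definition looks roughly like this (a sketch reconstructed from the changes described above, not copied from the actual domain XML):

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='storage-benchmark-vms/vm-image-1:rbd_cache=0'>
        <host name='192.168.0.1' port='6789'/>
        <host name='192.168.0.2' port='6789'/>
        <host name='192.168.0.3' port='6789'/>
      </source>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>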

I also tried dropping all caches inside one of the VMs to see how the memory usage of qemu-kvm would change:
killall -s STOP rpm
sync
echo 3 >/proc/sys/vm/drop_caches

But it didn't make any difference outside the VM (except for the CPU usage, due to the SIGSTOP).
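The attached massif.out.* files (listed under Files below) follow the massif.out.<pid> naming used by Valgrind's massif heap profiler. A profile like that can be collected by running qemu under massif (a sketch; the qemu-kvm arguments are placeholders for the real VM command line):

    # Run qemu under the massif heap profiler; output goes to massif.out.<pid>.
    valgrind --tool=massif --pages-as-heap=yes qemu-kvm <vm args...>
    # Render the recorded profile as text.
    ms_print massif.out.<pid>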


Files

massif.out.8867 (56.1 KB), Ivan Mironov, 10/09/2013 03:00 AM
massif.out.11923 (5.12 MB), Ivan Mironov, 10/10/2013 07:29 AM
plot.svg (15.2 KB), Ivan Mironov, 10/10/2013 07:29 AM