Bug #20388: combination of kvm using librbd from kraken and online resize leads to data corruption - rbd - Ceph

Added by Yann Dupont almost 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Severity: 3 - minor
Affected Versions: Ceph - v11.2.1

Description

Hi everybody. We recently experienced serious data corruption. I've been able to reproduce it, and I suspect librbd from kraken. Here are the steps that lead to reproducible behavior.

-> Start from a fresh standard Debian with Jewel librbd & librados (deb http://download.ceph.com/debian-jewel jessie main)
-> Launch a VM on this machine, using some volumes from a ceph cluster with librbd.
-> Use the VM and do an online resize of the ceph volume: all is OK (of course, qemu needs to be restarted to benefit from the extra space)

Now stop the VM and replace the libraries with the kraken ones (deb http://download.ceph.com/debian-kraken jessie main)

-> Restart the VM and do an online resize of the volume: the resize operation is stuck forever (and you will notice some virtio or scsi errors in your VM).
In fact, the resize is stuck until you stop your VM. As soon as the VM is stopped, the resize operation succeeds, BUT...
your data is lost. The volume is now filled with zeroes.

Please note: unmounting the volume inside the VM isn't sufficient; the resize operation stays stuck (until the VM is stopped), and data corruption occurs.

Stopping the VM (so that qemu exits) and doing the resize with the VM stopped seems safe.
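
For anyone trying to reproduce this, the "filled with zeroes" symptom can be checked mechanically rather than by eye. The following is only a minimal sketch, assuming the volume is readable as a plain file or block device (for example a mapped device or an exported image; the path in the usage comment is hypothetical). The helper names `first_nonzero_offset` and `is_zero_filled` are made up for illustration and are not part of Ceph or librbd.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # read 4 MiB at a time to keep memory use bounded


def first_nonzero_offset(path, chunk_size=CHUNK_SIZE):
    """Return the byte offset of the first non-zero byte in `path`.

    Returns None if every byte is zero (an empty file also returns None).
    """
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-zero byte.
                return None
            stripped = chunk.lstrip(b"\x00")
            if stripped:
                # Number of leading zero bytes in this chunk gives the position.
                return offset + (len(chunk) - len(stripped))
            offset += len(chunk)


def is_zero_filled(path):
    """True if the file or device at `path` contains only zero bytes."""
    return first_nonzero_offset(path) is None


# Hypothetical usage, run before and after the resize:
#   print(is_zero_filled("/dev/rbd0"))
```

Running it on the volume before the resize and again after stopping the VM would show whether the data was replaced by zeroes. Reading a whole device can take a while on large volumes.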

Can somebody try to reproduce the issue?

Updated by Yann Dupont almost 7 years ago

Updated by Jason Dillaman almost 7 years ago

Updated by Yann Dupont almost 7 years ago

Updated by Jason Dillaman almost 7 years ago

Updated by Jason Dillaman over 6 years ago