Bug #13726

QEMU hangs after creating snapshot and stopping VM

Added by Jason Dillaman about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Target version: -
Start date: 11/09/2015
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport: hammer,infernalis
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

When RBD cache is disabled, taking a snapshot and stopping the VM results in a hung QEMU process. Setting "rbd_non_blocking_aio" to false apparently resolves the issue.

http://www.spinics.net/lists/ceph-devel/msg27170.html

Attachment: ceph.client.log (1010 KB), alexandre derumier, 11/09/2015 02:01 PM


Related issues

Related to rbd - Bug #14988: QEMU VM hangs talking to RBD via librbd Resolved 03/05/2016
Copied to rbd - Backport #13755: QEMU hangs after creating snapshot and stopping VM Resolved
Copied to rbd - Backport #13756: QEMU hangs after creating snapshot and stopping VM Resolved

Associated revisions

Revision bfeb90e5 (diff)
Added by Jason Dillaman about 3 years ago

librbd: fixed deadlock while attempting to flush AIO requests

In-flight AIO requests might force a flush if a snapshot was created
out-of-band. The flush completion was previously invoked asynchronously,
potentially via the same thread worker handling the AIO request. This
resulted in the flush operation deadlocking since it can't complete.

Fixes: #13726
Backport: infernalis, hammer
Signed-off-by: Jason Dillaman <>

Revision 9c33dcca (diff)
Added by Jason Dillaman about 3 years ago

librbd: fixed deadlock while attempting to flush AIO requests

In-flight AIO requests might force a flush if a snapshot was created
out-of-band. The flush completion was previously invoked asynchronously,
potentially via the same thread worker handling the AIO request. This
resulted in the flush operation deadlocking since it can't complete.

Fixes: #13726
Backport: infernalis, hammer
Signed-off-by: Jason Dillaman <>
(cherry picked from commit bfeb90e5fe24347648c72345881fd3d932243c98)

Revision 83c38802 (diff)
Added by Jason Dillaman about 3 years ago

librbd: fixed deadlock while attempting to flush AIO requests

In-flight AIO requests might force a flush if a snapshot was created
out-of-band. The flush completion was previously invoked asynchronously,
potentially via the same thread worker handling the AIO request. This
resulted in the flush operation deadlocking since it can't complete.

Fixes: #13726
Backport: infernalis, hammer
Signed-off-by: Jason Dillaman <>
(cherry picked from commit bfeb90e5fe24347648c72345881fd3d932243c98)

History

#1 Updated by alexandre derumier about 3 years ago

I have attached the client log.

The snapshot create command was:

rbd -p pooltest --image vm-162-disk-1 snap create --snap snap1

#2 Updated by Jason Dillaman about 3 years ago

An in-flight AIO read request forces an image refresh because a snapshot was created out-of-band. On detecting the newly created snapshot, librbd flushes all in-flight ops. If there are no in-flight ops to flush, it enqueues the flush completion on the thread pool. However, the thread pool is blocked handling the AIO request, which is itself waiting for the flush to complete, so the queued completion never runs.

The synchronous ImageCtx::flush_async_operations() method cannot use async callbacks.
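
To make the deadlock easier to follow, here is a minimal Python sketch of the pattern described above. It is not librbd code: `SingleWorkerPool`, `flush_async`, `flush_inline`, and `handle_aio_request` are hypothetical names, the pool is assumed to have a single worker, and only the "no in-flight ops" case is modeled. A short timeout stands in for what would be an indefinite hang.

```python
import queue
import threading


class SingleWorkerPool:
    """Hypothetical stand-in for a thread pool with exactly one worker."""

    def __init__(self):
        self._tasks = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        # The worker executes queued tasks one at a time, in order.
        while True:
            task = self._tasks.get()
            task()

    def submit(self, task):
        self._tasks.put(task)


def flush_async(pool, on_done):
    """Old behavior (sketch): nothing is in flight, but the completion
    is still queued on the pool instead of being called directly."""
    pool.submit(on_done)


def flush_inline(pool, on_done):
    """Fixed behavior (sketch): nothing is in flight, so the completion
    is invoked synchronously on the calling thread."""
    on_done()


def handle_aio_request(pool, flush, timeout=0.5):
    """Run an AIO handler on the pool that blocks until a flush completes.

    Returns True if the flush completed, False if the wait timed out.
    The timeout exists only so the demo terminates.
    """
    flushed = threading.Event()
    finished = threading.Event()
    outcome = []

    def handler():
        flush(pool, flushed.set)
        # The handler occupies the only worker while it waits here.
        outcome.append(flushed.wait(timeout))
        finished.set()

    pool.submit(handler)
    finished.wait()
    return outcome[0]
```

With this sketch, `handle_aio_request(SingleWorkerPool(), flush_async)` returns `False`: the queued completion sits behind the handler that is waiting for it. `handle_aio_request(SingleWorkerPool(), flush_inline)` returns `True` because the completion runs before the wait begins.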

#3 Updated by Jason Dillaman about 3 years ago

  • Backport set to hammer,infernalis

#4 Updated by Jason Dillaman about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman

#5 Updated by Jason Dillaman about 3 years ago

  • Status changed from In Progress to Need Review

#6 Updated by alexandre derumier about 3 years ago

I confirm this is fixed with this PR.

(tested on infernalis)

#7 Updated by Loic Dachary about 3 years ago

  • Status changed from Need Review to Pending Backport

#8 Updated by Loic Dachary about 3 years ago

  • Copied to Backport #13755: QEMU hangs after creating snapshot and stopping VM added

#9 Updated by Loic Dachary about 3 years ago

  • Copied to Backport #13756: QEMU hangs after creating snapshot and stopping VM added

#10 Updated by Jason Dillaman about 3 years ago

  • Status changed from Pending Backport to Resolved

#11 Updated by Nathan Cutler almost 3 years ago

  • Related to Bug #14988: QEMU VM hangs talking to RBD via librbd added
