Bug #21009
hammer: librbd: The qemu VMs hang occasionally after a snapshot is created.
Status: Closed
Description
We're hosting hundreds of VMs with qemu and ceph as core infrastructure in our production environment. The ceph base version is 0.94.5, with 'rbd_non_blocking_aio' set to true.
For security reasons, a snapshot of each rbd image is created; that snapshot is then exported from the source cluster and imported into another ceph cluster.
However, we found that several VMs hung during this backup phase.
After some searching, we found that http://tracker.ceph.com/issues/15033 describes the same symptoms we observed.
We therefore applied the patch from https://github.com/ceph/ceph/pull/8011, hoping it would resolve the problem, but during regression testing the VMs still hung occasionally.
Updated by yupeng chen over 6 years ago
After investigating the backtrace and logs, we find a deadlock is possible in the following scenario:
1) OPs issued by qemu are queued in the aio_work_queue, ending with an aio_flush. After the aio_flush is processed, the ImageCtx's async_ops list looks like:
async_ops --> [aio_flush, aio_op_1, ]
                  |
                  |---> flush_contexts [C_AsyncCallback(C_AioWrite(AioCompletion))]
2) A snapshot is then created and more OPs arrive. When a ThreadPool thread tries to process an OP, the newly created snapshot requires that the ImageCtx be refreshed, which leads to flush_async_operations(); the thread therefore blocks waiting for all the OPs in async_ops to complete.
async_ops --> [aio_op_2, ..., aio_flush, aio_op_1, ]
                  |               |
                  |               |---> flush_contexts [C_AsyncCallback(C_AioWrite(AioCompletion))]
                  |
                  |-> flush_contexts [C_SafeCond]
3) aio_op_1 completes, triggering C_AsyncCallback's finish(), which queues C_AioWrite(AioCompletion) onto the ImageCtx's op_work_queue to wait for the ThreadPool to process it. But the ThreadPool thread is blocked in flush_async_operations(), so a deadlock occurs.
To break the deadlock, we queued C_AioWrite(AioCompletion) through the ImageCtx's writeback_handler in C_AsyncCallback's finish(), as https://github.com/ceph/ceph/pull/8011 did.
Updated by Mykola Golub over 6 years ago
- Status changed from New to Fix Under Review
Updated by Nathan Cutler over 6 years ago
- Status changed from Fix Under Review to Rejected
Hammer is EOL.