Bug #52485

open

rbd-mirror: trashed and linked secondary image cannot be restored

Added by Lubo Fr over 2 years ago. Updated over 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Following discussion in PR #41696

When a secondary image has cloned snapshots, removing the primary image leaves the user only one option: remove the clones on the secondary, because rbd-mirror leaves the secondary image in the trash in REMOVING state.
If the secondary image were in NORMAL state instead, the user could also choose to restore it and make it primary.

Steps to reproduce:

```
# ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)

pool=
image=
snapshot=
primary=
secondary=

rbd snap create $pool/$image@$snapshot --cluster=$primary
rbd snap protect $pool/$image@$snapshot --cluster=$primary
rbd mirror image enable $pool/$image snapshot --cluster=$primary
# wait sync

rbd clone $pool/$image@$snapshot $pool/$image-$snapshot  --cluster=$secondary
rbd snap unprotect $pool/$image@$snapshot --cluster=$primary
# in secondary log :
# rbd::mirror::image_replayer::snapshot::ApplyImageStateRequest: 0x55aeea733b00 handle_unprotect_snapshot: failed to unprotect snapshot shot: (16) Device or resource busy
# rbd::mirror::image_replayer::snapshot::Replayer: 0x55aef051d800 handle_apply_image_state: failed to apply remote image state to local image: (16) Device or resource busy

rbd snap purge $pool/$image --cluster=$primary
rbd rm $pool/$image --cluster=$primary
# in secondary log :
# rbd::mirror::ImageReplayer: 0x55aeec032500 [6/80cdd255-5be3-4362-ba48-d39842ee0d80] handle_shut_down: remote image no longer exists: scheduling deletion
# rbd::mirror::ImageReplayer: 0x55aeec032500 [6/80cdd255-5be3-4362-ba48-d39842ee0d80] handle_shut_down: mirror image no longer exists

rbd info $pool/$image-$snapshot --cluster=$secondary
# parent: $pool/$image@$snapshot (trash $trash)
trash=

rbd trash ls -p $pool -l --all --cluster=$secondary
# ID      NAME   SOURCE     DELETED_AT                STATUS                               PARENT
# $trash  image  MIRRORING  Wed Sep  1 12:54:24 2021  expired at Wed Sep  1 12:54:24 2021

rbd trash rm $pool/$trash --force --cluster=$secondary
# rbd: image has snapshots - these must be deleted with 'rbd snap purge' before the image can be removed.

rbd trash restore $pool/$trash --cluster=$secondary
# rbd: restore error: (16) Device or resource busy
# -1 librbd::api::Trash: restore: error restoring image id $trash, which is pending deletion

rados -p $pool listomapvals rbd_trash --cluster=$secondary
#id_$trash
#value (33 bytes) :
#00000000  02 01 1b 00 00 00 01 05  00 00 00 69 6d 61 67 65  |...........image|
#00000010  e0 5b 2f 61 c3 7b 49 1b  e0 5b 2f 61 c3 7b 49 1b  |.[/a.{I..[/a.{I.|
#00000020  02                                                |.|

```

It is possible to restore the image by changing the last byte of this value from 02 to 00, but this isn't user-friendly.
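To make the byte change concrete, here is a minimal sketch that works only on a local copy of the value, not on a cluster. It rebuilds the 33 bytes from the hexdump in the reproduction steps, reads the trailing byte, and overwrites it with 00. The helper names `last_byte_hex` and `clear_last_byte` are made up for this illustration, and the assumption that the trailing byte is the trash state (02 for REMOVING, 00 for NORMAL) comes from the observation in this report rather than from a documented format.

```shell
#!/bin/sh
# Sketch only: operates on a local file, never touches the cluster.
set -eu

# Hypothetical helper: print the last byte of FILE as two hex digits.
last_byte_hex() {
    tail -c 1 "$1" | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: overwrite the last byte of FILE with 0x00,
# leaving every other byte (and the file length) untouched.
clear_last_byte() {
    size=$(($(wc -c < "$1")))
    printf '\000' | dd of="$1" bs=1 seek=$((size - 1)) conv=notrunc 2>/dev/null
}

# Rebuild the 33-byte value shown in the hexdump (octal escapes).
entry=$(mktemp)
{
    printf '\002\001\033\000\000\000\001\005\000\000\000image'
    printf '\340\133\057\141\303\173\111\033'
    printf '\340\133\057\141\303\173\111\033'
    printf '\002'
} > "$entry"

before=$(last_byte_hex "$entry")
clear_last_byte "$entry"
after=$(last_byte_hex "$entry")
echo "state byte: $before -> $after"
```

On a real cluster the value would presumably be fetched with `rados -p $pool getomapval rbd_trash id_$trash entry.bin --cluster=$secondary` and written back with `rados -p $pool setomapval rbd_trash id_$trash --cluster=$secondary < entry.bin`. The rados man page describes `setomapval` as reading the value from standard input when none is given on the command line, but this should be checked against the installed version before touching a production pool.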

From a different perspective, if the removal that rbd-mirror performs on the secondary were carried out by the user instead, the image would end up in NORMAL rather than REMOVING state, and restore would be possible:

rbd snap create $pool/$image@$snapshot
rbd snap protect $pool/$image@$snapshot
rbd clone $pool/$image@$snapshot $pool/$image-$snapshot

# rbd-mirror actions simulated by user on secondary cluster :
rbd snap unprotect $pool/$image@$snapshot
# fails as there is a clone
rbd trash mv $pool/$image
rbd trash rm $pool/$trash
# fails as there are snapshots but image is in NORMAL state and restore is possible

Actions #1

Updated by Lubo Fr over 2 years ago

If an admin could replace ``` with pre, the issue would be easier to read.

Actions #2

Updated by Deepika Upadhyay over 2 years ago

  • Description updated (diff)