Bug #55241

open

rbd export complains about not existing snapshot and fails

Added by Peter Gervai about 2 years ago. Updated almost 2 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rbd
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This may be related to #18367 (which was closed due to lack of feedback), but it is not completely the same.

This fails:

# rbd --cluster ceph --pool triple export --export-format 2 vm-7010-disk-1 -  > /dev/null
error setting snapshot context: (2) No such file or directory
Exporting image: 0% complete...failed.
rbd: export error: (2) No such file or directory

No snapshots:

# rbd  -p triple snap ls vm-7010-disk-1
# 

But the object map is in ruins (which should not matter, but maybe it does), and the snapshot count is, suspiciously, one:

# rbd  -p triple info vm-7010-disk-1
rbd image 'vm-7010-disk-1':
        size 400 GiB in 102400 objects
        order 22 (4 MiB objects)
        snapshot_count: 1
        id: b149126b8b4567
        block_name_prefix: rbd_data.b149126b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, operations
        op_features: snap-trash
        flags: object map invalid, fast diff invalid
        create_timestamp: Wed Oct 16 14:47:11 2019

The beliefs of rados are:

# rados -p triple listomapvals rbd_header.b149126b8b4567  | more
create_timestamp
value (8 bytes) :
00000000  4f 11 a7 5d f9 2f bf 00                           |O..]./..|
00000008

features
value (8 bytes) :
00000000  1d 01 00 00 00 00 00 00                           |........|
00000008

flags
value (8 bytes) :
00000000  03 00 00 00 00 00 00 00                           |........|
00000008

object_prefix
value (27 bytes) :
00000000  17 00 00 00 72 62 64 5f  64 61 74 61 2e 62 31 34  |....rbd_data.b14|
00000010  39 31 32 36 62 38 62 34  35 36 37                 |9126b8b4567|
0000001b

op_features
value (8 bytes) :
00000000  08 00 00 00 00 00 00 00                           |........|
00000008

order
value (1 bytes) :
00000000  16                                                |.|
00000001

size
value (8 bytes) :
00000000  00 00 00 00 64 00 00 00                           |....d...|
00000008

snap_seq
value (8 bytes) :
00000000  99 da 01 00 00 00 00 00                           |........|
00000008

snapshot_000000000001bca5
value (108 bytes) :
00000000  08 08 66 00 00 00 a5 bc  01 00 00 00 00 00 24 00  |..f...........$.|
00000010  00 00 65 61 33 34 39 32  65 62 2d 30 66 38 65 2d  |..ea3492eb-0f8e-|
00000020  34 63 33 65 2d 38 61 33  66 2d 64 37 35 63 30 63  |4c3e-8a3f-d75c0c|
00000030  30 39 31 62 62 63 00 00  00 00 64 00 00 00 00 03  |091bbc....d.....|
00000040  00 00 00 00 00 00 00 01  01 12 00 00 00 02 00 00  |................|
00000050  00 06 00 00 00 76 7a 64  75 6d 70 00 00 00 00 d9  |.....vzdump.....|
00000060  2d 1f 62 7c 38 f7 13 00  00 00 00 00              |-.b|8.......|
0000006c
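
The raw omap values above are little-endian integers, so they can be cross-checked by hand against the `rbd info` output. The following is a minimal sketch in POSIX shell; `le_bytes_to_int` is a hypothetical helper name (not part of any Ceph tool), and the bytes are copied from the dump.

```shell
# Hypothetical helper: interpret space-separated hex bytes as a
# little-endian unsigned integer (least significant byte first).
le_bytes_to_int() {
    value=0
    shift_bits=0
    for byte in "$@"; do
        value=$(( value + (0x$byte << shift_bits) ))
        shift_bits=$(( shift_bits + 8 ))
    done
    echo "$value"
}

le_bytes_to_int 00 00 00 00 64 00 00 00   # size:     429496729600 bytes
le_bytes_to_int 16                        # order:    22
le_bytes_to_int 1d 01 00 00 00 00 00 00   # features: 285 (0x11d)
le_bytes_to_int 03 00 00 00 00 00 00 00   # flags:    3
```

429496729600 bytes is exactly 400 GiB, and order 22 means 2^22-byte (4 MiB) objects, both matching `rbd info`. The features value 285 = 1 + 4 + 8 + 16 + 256 has five bits set, consistent with the five features listed, and flags 3 has two bits set, consistent with the two "invalid" flags.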

Unfortunately, rebuilding the object map on a running VM isn't quite possible (the image is mapped via krbd and being written to, so the rebuild loses the lock faster than it can make progress).

# rbd -v
ceph version 16.2.7 (f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Actions #1

Updated by Ilya Dryomov about 2 years ago

Hi Peter,

What is the output of "rbd snap ls --all"? It looks like the snapshot in question is in the RBD trash bin.

Actions #2

Updated by Peter Gervai about 2 years ago

Ilya Dryomov wrote:

What is the output of "rbd snap ls --all"? It looks like the snapshot in question is in the RBD trash bin.

Indeed:

# rbd --cluster ceph --pool triple snap ls --all vm-7010-disk-1
SNAPID  NAME                                  SIZE     PROTECTED  TIMESTAMP                 NAMESPACE     
113829  ea3492eb-0f8e-4c3e-8a3f-d75c0c091bbc  400 GiB             Wed Mar  2 09:42:01 2022  trash (vzdump)
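
The SNAPID in this listing ties back to the header dump in the description: the omap key `snapshot_000000000001bca5` appears to carry the snapshot ID as zero-padded hexadecimal. Here is a small sketch of the conversion; `snap_key_to_id` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: strip the "snapshot_" prefix from an rbd_header
# omap key and print the remaining hex digits as a decimal snapshot ID.
snap_key_to_id() {
    hex="${1#snapshot_}"
    echo "$(( 0x$hex ))"
}

snap_key_to_id snapshot_000000000001bca5   # prints 113829
```

The result matches the SNAPID column, so the leftover omap entry and the trashed snapshot are the same object.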

I'm not sure what the moral of the story is; probably that there ought to be some suggestion about possible directions to look at. I wasn't even aware that snapshots can be separately trashed, and it was probably created by a runaway background process.

The export error message could probably describe the [possible] problem in a little more detail.

For this specific case I thank you for your help!

Actions #3

Updated by Peter Gervai about 2 years ago

I tried to follow up on a mailing list post (https://www.mail-archive.com/ceph-users@lists.ceph.com/msg53551.html) that seemed to be related, since I wanted to figure out how to get rid of that snapshot, but got this:

# rbd -p triple children --snap-id 113829 vm-7010-disk-1
2022-04-08T22:56:34.896+0200 7f3a35667700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.b149126b8b4567.000000000001bca5
2022-04-08T22:56:34.908+0200 7f3a35667700 -1 librbd::object_map::InvalidateRequest: 0x7f3a18015280 should_complete: r=0

Actions #4

Updated by Ilya Dryomov about 2 years ago

The clone of that snapshot might itself be in the trash bin. What is the output of "rbd -p triple children vm-7010-disk-1 --all --descendants"? What is the output of "rbd -p triple trash ls --all --long"?

Actions #5

Updated by Peter Gervai about 2 years ago

# rbd -p triple children vm-7010-disk-1 --all --descendants
(empty)
# rbd -p triple trash ls --all --long
ID              NAME            SOURCE  DELETED_AT               STATUS                              PARENT
025cdf331ef784  vm-2003-disk-0  USER    Fri Apr 8 12:02:44 2022  expired at Fri Apr 8 12:02:44 2022
(unrelated)
Actions #6

Updated by Ilya Dryomov about 2 years ago

You should be able to get rid of that snapshot with "rbd -p triple snap rm --snap-id 113829 vm-7010-disk-1".
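
To spot images in the same state, the `snap ls --all` output can be filtered for rows in the trash namespace before removing them by ID. This is a rough sketch that assumes the plain-text table layout shown earlier in this thread; `trashed_snap_ids` is a hypothetical helper, not an rbd subcommand.

```shell
# Hypothetical helper: read `rbd snap ls --all` table output on stdin and
# print the SNAPID of every row whose NAMESPACE column starts with "trash".
# The snapshot name is field 2, so the search for the "trash" token starts
# at field 3; the header row is skipped.
trashed_snap_ids() {
    awk 'NR > 1 {
        for (i = 3; i <= NF; i++) {
            if ($i == "trash") { print $1; break }
        }
    }'
}

# Self-contained check against the row quoted earlier in the thread.
trashed_snap_ids <<'EOF'
SNAPID  NAME                                  SIZE     PROTECTED  TIMESTAMP                 NAMESPACE
113829  ea3492eb-0f8e-4c3e-8a3f-d75c0c091bbc  400 GiB             Wed Mar  2 09:42:01 2022  trash (vzdump)
EOF

# Against a live cluster, using only the commands from this thread:
#   rbd -p triple snap ls --all vm-7010-disk-1 | trashed_snap_ids
#   rbd -p triple snap rm --snap-id 113829 vm-7010-disk-1
```

Matching on whitespace-separated fields is fragile (a snapshot name containing spaces would break it), so this is meant for inspection rather than for unattended cleanup.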

Actions #7

Updated by Peter Gervai about 2 years ago

Yes, in the meantime I have found another similar image to test on (wanted to keep this if you happened to want any further info), and the by-id removal worked fine, and I'm sure it will work on this one, too.

I guess rbd could be a bit more helpful here as it seems to be a bit hard to see why it doesn't work.

Also, I asked around on IRC and nobody seemed to know what one is supposed to do with trashed images with snapshots, since (according to the rbd docs) they cannot be removed or purged, so it's not clear what the use of trashing non-removable objects is. (I can, and indeed should, restore them, and that seems to be all.)

For me this is resolved, for now, thank you; I am not sure people could find this solution without asking for help from someone who is already familiar with the signs.

Actions #8

Updated by Ilya Dryomov about 2 years ago

Peter Gervai wrote:

Also, I asked around on IRC and nobody seemed to know what one is supposed to do with trashed images with snapshots, since (according to the rbd docs) they cannot be removed or purged, so it's not clear what the use of trashing non-removable objects is. (I can, and indeed should, restore them, and that seems to be all.)

This is not a trashed image with snapshots, rather this is an image with a trashed snapshot. A trashed snapshot on an otherwise "normal" image occurs when one attempts to remove a snapshot that still has clones that depend on it. In that case "rbd snap rm" moves that snapshot to the trash bin, pending either the flatten or the removal of all clones that are based off of that snapshot. The snapshot is supposed to be removed automatically when that happens, but it seems like that did not happen here for some reason.
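
The lifecycle described here can be sketched as a command sequence. This is an untested outline, not a confirmed reproducer for this bug. The pool and image names are hypothetical, it assumes clone v2 is usable on the cluster (selected here through the `rbd_default_clone_format` option), and by default it only prints the commands instead of running them.

```shell
# Dry run by default: each step is echoed, not executed.
# Set RBD=rbd to run against a scratch pool (assumption: clone v2 is enabled).
RBD="${RBD:-echo rbd}"

trashed_snapshot_lifecycle() {
    pool="$1"; parent="$2"; child="$3"
    $RBD create --size 1G "$pool/$parent"
    $RBD snap create "$pool/$parent@base"
    # A v2 clone does not require the snapshot to be protected first.
    $RBD clone --rbd-default-clone-format=2 "$pool/$parent@base" "$pool/$child"
    # With a dependent clone, removal should move the snapshot to the trash.
    $RBD snap rm "$pool/$parent@base"
    $RBD snap ls --all "$pool/$parent"
    # Flattening the last clone is expected to let the trashed snapshot go away.
    $RBD flatten "$pool/$child"
    $RBD snap ls --all "$pool/$parent"
}

trashed_snapshot_lifecycle scratch parent-img child-img
```

Comparing the two `snap ls --all` listings on a test cluster would show whether the automatic cleanup happens after the flatten; in the case reported here, it apparently did not.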

For me this is resolved, for now, thank you; I am not sure people could find this solution without asking for help from someone who is already familiar with the signs.

You mentioned a runaway background process earlier. Could you share some details on the automation you are using to create snapshots/clones and to remove them?

How many images in this state did you have? Was there something in common between them?

Do you recall if there were clone(s) based off of the "vzdump" snapshot -- when were they created, how/when were they removed, etc?

Actions #9

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from New to Need More Info