Bug #52810

closed

RefreshRequest can fail with ENOENT if raced with "rbd flatten"

Added by Ilya Dryomov over 2 years ago. Updated 4 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This appears to be fairly easy to hit when combined with auto-deletion of trashed snapshots:

$ rbd snap create a@snap
Creating snap: 100% complete...done.
$ rbd clone --rbd-default-clone-format 2 a@snap b
$ rbd snap rm a@snap
Removing snap: 100% complete...done.
$ rbd flatten b
$ rbd info b
rbd image 'b':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 47904bf44ef0
        block_name_prefix: rbd_data.47904bf44ef0
        format: 2
        features: layering, deep-flatten, operations
        op_features: clone-child
        flags: 
        create_timestamp: Thu Sep 30 07:28:38 2021
        access_timestamp: Thu Sep 30 07:28:38 2021
        modify_timestamp: Thu Sep 30 07:28:38 2021
        parent: rbd/a@5272fce1-e70c-44f9-a7b5-ba52828ab6e7
        overlap: 1 GiB
$ rbd info b
2021-09-30T07:28:45.707-0400 7f9709aa4700 -1 librbd::image::OpenRequest: failed to set image snapshot: (2) No such file or directory
2021-09-30T07:28:45.711-0400 7f970a2a5700 -1 librbd::image::RefreshParentRequest: failed to open parent image: (2) No such file or directory
2021-09-30T07:28:45.711-0400 7f970a2a5700 -1 librbd::image::RefreshRequest: failed to refresh parent image: (2) No such file or directory
2021-09-30T07:28:45.711-0400 7f970a2a5700 -1 librbd::image::OpenRequest: failed to refresh image: (2) No such file or directory
rbd: error opening image b: (2) No such file or directory
$ rbd info b
rbd image 'b':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 47904bf44ef0
        block_name_prefix: rbd_data.47904bf44ef0
        format: 2
        features: layering, deep-flatten
        op_features: 
        flags: 
        create_timestamp: Thu Sep 30 07:28:38 2021
        access_timestamp: Thu Sep 30 07:28:38 2021
        modify_timestamp: Thu Sep 30 07:28:38 2021

ceph-csi calls "rbd info" in a number of places and sporadic ErrImageNotFound failures result in stale images being left behind (i.e. a slowly-growing space leak).
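
For callers that cannot wait for the fix, one possible stopgap is to retry the command when it reports ENOENT. This is a minimal sketch, not something taken from ceph-csi or librbd: the helper name `retry_enoent`, the attempt count, and the delay are all hypothetical. It also assumes the failure can be recognized by the "(2) No such file or directory" text on stderr, as in the transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rbd or ceph-csi): rerun a command when its
# stderr contains the ENOENT message seen in the transcript.
# Usage: retry_enoent <max_attempts> <delay_seconds> <command> [args...]
retry_enoent() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1 rc errfile
    errfile=$(mktemp) || return 1
    while :; do
        # stdout passes through unchanged; stderr is captured for inspection.
        rc=0
        "$@" 2>"$errfile" || rc=$?
        if [ "$rc" -eq 0 ]; then
            rm -f "$errfile"
            return 0
        fi
        # Stop on any error other than ENOENT, or once attempts are used up,
        # and replay the captured stderr of the last attempt.
        if ! grep -q '(2) No such file or directory' "$errfile" ||
           [ "$attempt" -ge "$max_attempts" ]; then
            cat "$errfile" >&2
            rm -f "$errfile"
            return "$rc"
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

It could be invoked as `retry_enoent 3 1 rbd info b`. A retry only papers over the race: an image that is really gone still fails with ENOENT after the last attempt, so the caller has to treat that final result as authoritative.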


Related issues 3 (1 open, 2 closed)

Related to rbd - Bug #52910: rbd du | rbd ls -l | ... rbd: xxx failed: (2) No such file or directory (New)
Copied to rbd - Backport #57452: pacific: RefreshRequest can fail with ENOENT if raced with "rbd flatten" (Resolved, Ilya Dryomov)
Copied to rbd - Backport #57453: quincy: RefreshRequest can fail with ENOENT if raced with "rbd flatten" (Resolved, Ilya Dryomov)

Actions #1

Updated by Ilya Dryomov over 2 years ago

  • Backport set to octopus,pacific
Actions #2

Updated by Ilya Dryomov about 2 years ago

  • Related to Bug #52910: rbd du | rbd ls -l | ... rbd: xxx failed: (2) No such file or directory added
Actions #3

Updated by Ilya Dryomov over 1 year ago

  • Priority changed from High to Urgent
  • Backport changed from octopus,pacific to pacific,quincy
Actions #4

Updated by Ilya Dryomov over 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 47987
Actions #5

Updated by Ilya Dryomov over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57452: pacific: RefreshRequest can fail with ENOENT if raced with "rbd flatten" added
Actions #7

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57453: quincy: RefreshRequest can fail with ENOENT if raced with "rbd flatten" added
Actions #8

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #9

Updated by Backport Bot about 1 year ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Actions #10

Updated by yite gu 4 months ago

In my case (https://tracker.ceph.com/issues/63888), can I remove the parent key with the command `rados rmomapkey rbd_header.xxxxxx parent` to restore access to image b?

Actions #11

Updated by Ilya Dryomov 4 months ago

I don't think there is any relation to https://tracker.ceph.com/issues/63888 here.
