Bug #38184

osd: recovery does not preserve copy-on-write allocations between object clones after 'rbd revert'

Added by Vitaliy Filippov about 1 month ago. Updated 21 days ago.

Status:
Verified
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
02/05/2019
Due date:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

Hi. I've already reported this in issue #36614, but here is a more concrete reproduction.

- Start with a BlueStore Ceph cluster
- Create an RBD image
- Fill it with data
- Note the disk space used by the image as X
- Create a snapshot of the image
- Immediately roll back to it (rbd snap rollback)
- After the rollback finishes, disk usage is still X, but the object count in the cluster has doubled
- Trigger a massive rebalance in the cluster
- After the rebalance finishes, the image's objects residing in moved PGs use 2*X disk space. This is because virtual clones stop being virtual once their data is moved
- Now run rbd snap rollback again
- Space usage drops back, because the clones become "virtual" again
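The steps above can be sketched as a shell transcript. The pool name, image name, size, and the particular reweight used to trigger a rebalance are assumptions for illustration; run this only against a disposable test cluster:

```shell
# Hypothetical pool/image names; assumes a test cluster with BlueStore OSDs.
rbd create --size 10G rbd/testimg
dev=$(rbd map rbd/testimg)
dd if=/dev/urandom of="$dev" bs=4M count=2560 oflag=direct  # fill the image
rados -p rbd df                      # note USED (= X) and OBJECTS
rbd snap create rbd/testimg@snap1
rbd snap rollback rbd/testimg@snap1  # revert immediately
rados -p rbd df                      # USED is still ~X, OBJECTS has doubled
ceph osd crush reweight osd.0 0.9    # one way to trigger a rebalance
# ...wait for the rebalance to finish...
rados -p rbd df                      # objects in moved PGs now use up to 2*X
rbd snap rollback rbd/testimg@snap1  # roll back again: usage drops back toward X
```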

I think this is a bug and should be fixed. It once led to a bad situation in our cluster, described in issue #36614.

History

#1 Updated by Vitaliy Filippov 22 days ago

Anyone?

#2 Updated by Sage Weil 21 days ago

  • Project changed from bluestore to RADOS
  • Subject changed from Virtual clones break and begin to eat space after rebalancing to osd: recovery does not preserve copy-on-write allocations between object clones after 'rbd revert'
  • Status changed from New to Verified

This is indeed the current behavior. The OSD isn't clever enough to preserve the shared allocations across recovery. It is a large effort to change this.
