Project

General

Profile

Actions

Bug #64735

open

OSD/MON: rollback_to snap the latest overlap is not right

Added by dian xing about 2 months ago. Updated 29 days ago.

Status:
Fix Under Review
Priority:
Normal
Category:
Snapshots
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
quincy,reef,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

when rollback_to snap, we use the latest clone's current overlap to intersection_of older snapshot's clone overlap.
we should update the latest clone's current overlap to [0, obs.oi.size] before intersection_of other clone's overlap.

How to reproduce:

rbd create ceph-vm-pool-1/test_rollback -s 4M
fio -ioengine=rbd -bs=4M -size=4M -rw=write -pool=ceph-vm-pool-1 -rbdname=test_rollback --group_report -name=test-direct=1 -numjobs=1 -iodepth=1
rbd snap create ceph-vm-pool-1/test_rollback@base
fio -ioengine=rbd -bs=4M -size=4M -rw=write -pool=ceph-vm-pool-1 -rbdname=test_rollback --group_report -name=test-direct=1 -numjobs=1 -iodepth=1
[root@test xingd]# rbd du --exact ceph-vm-pool-1/test_rollback
NAME               PROVISIONED USED
test_rollback@base       4 MiB 4 MiB
test_rollback            4 MiB 4 MiB
<TOTAL>                  4 MiB 8 MiB
rbd snap rollback ceph-vm-pool-1/test_rollback@base
[root@test xingd]# rbd du --exact ceph-vm-pool-1/test_rollback
NAME               PROVISIONED USED
test_rollback@base       4 MiB 4 MiB
test_rollback            4 MiB 4 MiB       ### this is not correct after rollback to snap
<TOTAL>                  4 MiB 8 MiB

Related issues 2 (2 open0 closed)

Related to crimson - Bug #65697: fix _do_rollback_to clone_overlapsNew

Actions
Copied to rbd - Bug #64874: post-rollback "rbd du" output is incorrectNew

Actions
Actions #1

Updated by Radoslaw Zarzynski about 2 months ago

  • Description updated (diff)
  • Assignee set to Matan Breizman

Hi Matan! Would you mind taking a look?

Actions #3

Updated by Matan Breizman about 2 months ago

  • Status changed from New to Need More Info

We should first understand whether this is a bug or intentional behavior, given the following order of operations:

1) create 4M image
2) write 4M [A]
3) take snap_1
4) write 4M [B]
5) rollback to snap

The disk usage is:

NAME                PROVISIONED  USED                                                                                                                                                                                                 
test_rollback@snap_1      4 MiB  4 MiB                                                                                                                                                                                                
test_rollback             4 MiB  4 MiB                                                                                                                                                                                                  
<TOTAL>                   8 MiB  8 MiB

Since the snapshot was taken before any writes [B] to the image,
the suggested change here suggests that the disk usage should actually be:

NAME                PROVISIONED  USED                                                                                                                                                                                                 
test_rollback@snap_1      4 MiB  4 MiB                                                                                                                                                                                                
test_rollback             4 MiB    0 B                                                                                                                                                                                                
<TOTAL>                   4 MiB  4 MiB 

The latter would not allow us to remove "test_rollback@snap_1" and return to "test_rollback" image.
---

To clarify with another example:

1) create 4M image
2) write 4M [A]
3) take snap_1
4) write 4M [B]
5) take snap_2
6) write 4M [C]

The disk usage is:
NAME                 PROVISIONED   USED 
test_rollback@snap_1       4 MiB  4 MiB                                                                                                                                                                                                                                                                                                                                                                                                
test_rollback@snap_2       4 MiB  4 MiB                                                                                                                                                                                                
test_rollback              4 MiB  4 MiB                                                                                                                                                                                                  
<TOTAL>                    4 MiB 12 MiB

Given the existing implementation, we are able to rollback to "test_rollback@snap_1" and
then to "test_rollback@snap_2" (back and forth) which wouldn't have been possible with the suggested change.
I'm not sure we want to disallow this option.

Actions #4

Updated by Ilya Dryomov about 2 months ago

Hi Matan,

We are able to roll back back and forth between arbitrary snapshots and the suggested change in https://github.com/ceph/ceph/pull/55991 (even if isn't quite right as is -- see Sam's comments in the PR) doesn't make that impossible because it's just an accounting fix.

There are two separate areas here:

- rollback op handler in the OSD which produces incorrect post-rollback overlap ranges -- visible in "rados listsnaps" output and not specific to RBD. This is what https://github.com/ceph/ceph/pull/55991 is trying to address.
- rbd CLI tool which produces "rbd du" output which is similarly incorrect but most likely (I can't claim this with absolute certainty because I wasn't involved) on purpose. It appears that the OSD bug was basically replicated there to make the numbers line up.

Once the OSD bug is fixed, I'll work on "rbd du". I'm going to clone this ticket to RBD for that purpose.

Actions #5

Updated by Ilya Dryomov about 2 months ago

Matan Breizman wrote:

the suggested change here suggests that the disk usage should actually be:
NAME PROVISIONED USED
test_rollback@snap_1 4 MiB 4 MiB
test_rollback 4 MiB 0 B
<TOTAL> 4 MiB 4 MiB

Yes, and the suggestion is correct.

The latter would not allow us to remove "test_rollback@snap_1" and return to "test_rollback" image.

I think you might be taking these numbers too literally. The output in the quote is what you get right after the snapshot is created. At that point one is certainly capable of removing the snapshot and post-removal output would be:

NAME                PROVISIONED  USED                                                                                                                                                                                                                                                                                                                                                                                                 
test_rollback             4 MiB  4 MiB                                                                                                                    
<TOTAL>                   4 MiB  4 MiB

After rolling back to test_rollback@snap_1, because it's the most recent snapshot, the situation (and the expected output) should be exactly the same. This is because rollback discards all changes made to image HEAD and makes it identical to the moment right after test_rollback@snap_1 was taken.

If one wanted to remove test_rollback@snap_1 after rolling back to it, they could and the expected output after removal would again be:

NAME                PROVISIONED  USED                                                                                                                                                                                                                                                                                                                                                                                                 
test_rollback             4 MiB  4 MiB                                                                                                                    
<TOTAL>                   4 MiB  4 MiB

Actions #6

Updated by Ilya Dryomov about 2 months ago

  • Copied to Bug #64874: post-rollback "rbd du" output is incorrect added
Actions #7

Updated by Ilya Dryomov about 2 months ago

Ilya Dryomov wrote:

This is because rollback discards all changes made to image HEAD and makes it identical to the moment right after test_rollback@snap_1 was taken.

Put another way: rollback is a destructive operation. One isn't expected to be able to go back to test_rollback (image HEAD prior to rollback) after running "rbd snap rollback test_rollback@snap_1".

Actions #8

Updated by Matan Breizman about 2 months ago

Ilya Dryomov wrote:

Put another way: rollback is a destructive operation. One isn't expected to be able to go back to test_rollback (image HEAD prior to rollback) after running "rbd snap rollback test_rollback@snap_1".

IIUC, that means that once we rollback to snap_1, snap_2 should no longer exist, right? Since it was taken after snap_1.
This requires a broader change that the one discussed here if so.

Actions #9

Updated by Ilya Dryomov about 2 months ago

Matan Breizman wrote:

Ilya Dryomov wrote:

Put another way: rollback is a destructive operation. One isn't expected to be able to go back to test_rollback (image HEAD prior to rollback) after running "rbd snap rollback test_rollback@snap_1".

IIUC, that means that once we rollback to snap_1, snap_2 should no longer exist, right? Since it was taken after snap_1.

No, snap2 would continue to exist and one should be able to "rollback" to it. Rollback is really a "make image HEAD look like this snap" operation -- it's destructive with respect to the image HEAD but not with respect to any of the snapshots.

Actions #10

Updated by Matan Breizman about 2 months ago

Ilya Dryomov wrote:

No, snap2 would continue to exist and one should be able to "rollback" to it. Rollback is really a "make image HEAD look like this snap" operation -- it's destructive with respect to the image HEAD but not with respect to any of the snapshots.

Ack, got it now. Thanks, will update once rados behavior is sorted.
Added some documentation here: https://github.com/ceph/ceph/pull/56173
I'll add the rollback case to it once ready.

Actions #11

Updated by Matan Breizman 29 days ago

  • Status changed from Need More Info to Fix Under Review
  • Backport set to quincy,reef,squid
  • Pull request ID set to 56696
Actions #12

Updated by Matan Breizman 3 days ago

  • Related to Bug #65697: fix _do_rollback_to clone_overlaps added
Actions

Also available in: Atom PDF