Bug #56181

closed

mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images

Added by Ilya Dryomov almost 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus,pacific,quincy
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

$ rbd create --size 2G data/testimg
$ rbd mirror image enable data/testimg snapshot
Mirroring enabled
$ sudo rbd device map data/testimg
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M oflag=direct
dd: error writing '/dev/rbd0': No space left on device
513+0 records in
512+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 37.0197 s, 58.0 MB/s
$ sudo rbd device unmap data/testimg
$ rbd mirror image snapshot data/testimg
Snapshot ID: 8

Monitor the last_copied_object_number and complete fields of the newest snapshot on the secondary cluster:

$ rbd snap ls --all --format=json data/testimg | jq '.[-1] | .namespace.last_copied_object_number,.namespace.complete'
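The "---" separated listings in this report suggest that the command above was run repeatedly. A minimal sketch of such a polling loop follows; the function name poll_sync_progress, its arguments, the polling interval and the "---" separator are assumptions made for illustration and are not taken from the original reproducer.

```shell
# Poll the newest snapshot's sync progress on the secondary cluster.
# poll_sync_progress and its three arguments are hypothetical helpers.
poll_sync_progress() {
    image=$1       # image spec, e.g. data/testimg
    polls=$2       # number of samples to take
    interval=$3    # seconds to wait between samples
    i=0
    while [ "$i" -lt "$polls" ]; do
        # Same rbd/jq pipeline as in the description above.
        rbd snap ls --all --format=json "$image" |
            jq '.[-1] | .namespace.last_copied_object_number,.namespace.complete'
        echo "---"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, "poll_sync_progress data/testimg 100 1" would take 100 samples one second apart.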

For a fully allocated image, last_copied_object_number increases gradually from 1 to 512 (image size / object size), and the snapshot is marked complete once it reaches 512, as expected:

1
false
---
19
false
---
42
false
---
50
false
---
72
false
---
92
false
---
107
false
---
124
false
---
142
false
---
161
false
---
181
false
---
200
false
---
214
false
---
235
false
---
255
false
---
270
false
---
288
false
---
305
false
---
320
false
---
338
false
---
355
false
---
370
false
---
386
false
---
402
false
---
418
false
---
432
false
---
451
false
---
465
false
---
484
false
---
502
false
---
512
true
---
512
true
---
512
true
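
The final value of 512 in the listing above is the image size divided by the object size. As a quick check of that arithmetic, assuming the default 4 MiB RBD object size (the report does not state the object size explicitly):

```shell
# Expected object count = image size / object size.
image_size=$((2 * 1024 * 1024 * 1024))   # 2G, as passed to "rbd create --size 2G"
object_size=$((4 * 1024 * 1024))         # assumption: default 4 MiB object size
object_count=$((image_size / object_size))
echo "$object_count"                     # prints 512
```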

For an image with holes in it, last_copied_object_number trips over the first hole. The sync continues, but last_copied_object_number is no longer updated. Eventually the snapshot is marked complete with last_copied_object_number still stuck near the first hole (100 instead of 512 in the run below):

$ rbd create --size 2G data/testimg
$ rbd mirror image enable data/testimg snapshot
Mirroring enabled
$ sudo rbd device map data/testimg
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M count=100 oflag=direct
100+0 records in
100+0 records out
419430400 bytes (419 MB, 400 MiB) copied, 15.9774 s, 26.3 MB/s
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M seek=110 oflag=direct
dd: error writing '/dev/rbd0': No space left on device
403+0 records in
402+0 records out
1686110208 bytes (1.7 GB, 1.6 GiB) copied, 65.5053 s, 25.7 MB/s
$ sudo rbd device unmap data/testimg
$ rbd mirror image snapshot data/testimg
Snapshot ID: 16
1
false
---
5
false
---
29
false
---
40
false
---
51
false
---
64
false
---
73
false
---
85
false
---
87
false
---
99
false
---
99
false
---
99
false
---
---

[...]

---
99
false
---
99
false
---
99
false
---
100
true
---
100
true
---
100
true
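
To see where the hole sits in the sparse reproducer above, the two dd invocations can be translated into object indexes. This is a sketch under the assumption that the image uses the default 4 MiB object size, so that each 4M dd block covers exactly one object; the variable names are illustrative only.

```shell
# Locate the hole left between the two dd runs, in units of 4M blocks.
bs=$((4 * 1024 * 1024))                  # assumption: dd block size == object size
first_run_blocks=100                     # count=100: objects 0..99 are written
second_run_start=110                     # seek=110: writing resumes at object 110
second_run_blocks=$((1686110208 / bs))   # byte count reported by the second dd

hole_first=$first_run_blocks
hole_last=$((second_run_start - 1))
hole_blocks=$((second_run_start - first_run_blocks))
end_block=$((second_run_start + second_run_blocks))

echo "hole: objects ${hole_first}..${hole_last} (${hole_blocks} objects)"
echo "second run ends at block ${end_block}"
```

Under that assumption the hole spans objects 100 through 109 and the second run ends at block 512, the end of the image, which lines up with the counter stalling at 99 in the listing above.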

The same behavior can be observed by monitoring the syncing_percent field in the "rbd mirror image status" output.


Related issues 3 (0 open, 3 closed)

Copied to rbd - Backport #56430: quincy: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images (Resolved, Ilya Dryomov)
Copied to rbd - Backport #56431: octopus: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images (Resolved, Ilya Dryomov)
Copied to rbd - Backport #56432: pacific: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images (Resolved, Ilya Dryomov)

Actions #1

Updated by Ilya Dryomov almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by Ilya Dryomov almost 2 years ago

  • Description updated (diff)
Actions #3

Updated by Ilya Dryomov almost 2 years ago

  • Description updated (diff)
Actions #4

Updated by Ilya Dryomov almost 2 years ago

  • Backport set to octopus,pacific,quincy
  • Pull request ID set to 46858
Actions #5

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from In Progress to Fix Under Review
Actions #6

Updated by Ilya Dryomov almost 2 years ago

  • Regression changed from No to Yes
Actions #7

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56430: quincy: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images added
Actions #9

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56431: octopus: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images added
Actions #10

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56432: pacific: mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images added
Actions #11

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Pending Backport to Resolved