Cleanup #16130
closed
Proxied operations shouldn't result in error messages if replayed
Added by Jason Dillaman almost 8 years ago.
Updated over 7 years ago.
Description
For example, if the "snap rename" operation is replayed, the second SnapshotRenameRequest state machine may encounter '-EEXIST' when testing for the destination. The same applies to snap create, snap rename, snap remove, snap protect, snap unprotect, and rename.
# rbd bench-write -p cephfs_data --image testing2 --io-size 10240 &
# for i in {1..100}; do rbd snap rename cephfs_data/testing2@snap$i cephfs_data/testing2@snappey$i; done
2016-05-30 08:52:11.847718 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
2016-05-30 08:52:12.293367 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
2016-05-30 08:52:13.731278 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
2016-05-30 08:52:16.385266 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
2016-05-30 08:52:20.514545 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
2016-05-30 08:52:24.469216 7f8329ffb700 -1 librbd::SnapshotRenameRequest: encountered error: (17) File exists
- Status changed from New to In Progress
- Assignee set to Vikhyat Umrao
@Jason Borden, I take it you mean that the Snapshot<Type>Request state machines should not report errors when an operation is replayed, for these types: snap create, snap rename, snap remove, snap protect, snap unprotect, and rename.
- I tested on the latest master and do not get these error messages from Snapshot<Type>Request when an operation is replayed:
./rbd snap rename rbd/testrbd@snap1 rbd/testrbd@snap1
rbd: renaming snap failed: (17) File exists
./rbd -p rbd ls -l
NAME SIZE PARENT FMT PROT LOCK
testrbd 1024k 2
testrbd@snap1 1024k 2 yes
testrbd@testsnap 1024k 2
./rbd snap protect rbd/testrbd@snap1
rbd: protecting snap failed: (16) Device or resource busy
$ ./rbd snap unprotect rbd/testrbd@snap1
$ ./rbd snap unprotect rbd/testrbd@snap1
rbd: unprotecting snap failed: (22) Invalid argument
- Is my understanding correct, or am I missing something?
@Vikhyat Umrao: yes, we should suppress the lderr log messages from the state machines if it's possible to hit the error from a duplicated op (e.g. ldout but don't lderr in SnapshotRemoveRequest if you hit -ENOENT attempting to remove a snapshot).
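A minimal sketch of that rule, not the actual librbd change: it picks a log level from the operation type and the negative errno result. The `Op` and `Level` enums and both function names are hypothetical. The error codes for snap rename, snap remove, snap protect, and snap unprotect are the ones seen in this thread; `-EEXIST` for snap create and rename is an assumption.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical operation kinds; these names are illustrative, not librbd's.
enum class Op { SnapCreate, SnapRename, SnapRemove, SnapProtect, SnapUnprotect, Rename };

// Hypothetical stand-ins for the ldout (debug) and lderr (error) channels.
enum class Level { Debug, Error };

// True if `r` is the error a duplicated (replayed) request of kind `op`
// would be expected to hit because the first request already succeeded.
bool is_expected_on_replay(Op op, int r) {
  switch (op) {
    case Op::SnapCreate:     // assumption: target already exists
    case Op::Rename:         // assumption: target already exists
    case Op::SnapRename:     // seen in this thread: (17) File exists
      return r == -EEXIST;
    case Op::SnapRemove:     // per the comment above: -ENOENT
      return r == -ENOENT;
    case Op::SnapProtect:    // seen in this thread: (16) Device or resource busy
      return r == -EBUSY;
    case Op::SnapUnprotect:  // seen in this thread: (22) Invalid argument
      return r == -EINVAL;
  }
  return false;
}

// Choose the log level for a failed request (r must be a negative errno).
Level log_level_for(Op op, int r) {
  assert(r < 0);
  return is_expected_on_replay(op, r) ? Level::Debug : Level::Error;
}
```

A state machine's error handler could then log through ldout when `log_level_for` returns `Level::Debug` and keep lderr for every other failure.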
- Status changed from In Progress to Fix Under Review
- Status changed from Fix Under Review to Resolved
- Status changed from Resolved to Pending Backport
- Backport set to jewel
- Copied to Backport #17481: jewel: Proxied operations shouldn't result in error messages if replayed added
- Status changed from Pending Backport to Resolved