Bug #49939
closed
cephfs-mirror: be resilient to recreated snapshot during synchronization
100%
Description
The mirror daemon works with snapshot paths. It does rely on snap-id to infer deleted and renamed snapshots, but once a snapshot is picked up for synchronization, access to the snapshot's data is path-based.
This can lead to nasty races, such as transferring incorrect snapshot data to the remote file system when a snapshot that has just been picked for synchronization is deleted and a new snapshot with the same name (but possibly different contents) is created.
Patrick suggested introducing snapshot paths based on snap-ids, such as /path/to/dir/.snap/<snap-id>/..., and resolving the snap-id to the snapshot name (or maybe /path/to/dir/.snap/_<snap-id>_<snap-name>/..., making use of the fact that snapshot names beginning with "_" are disallowed).
Alternatively, we could introduce the *at() family of APIs in libcephfs to mitigate this issue.
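To illustrate why *at()-style access helps, here is a minimal sketch using the POSIX equivalents exposed by Python's os module on a local file system. It is only an analogy: it does not use libcephfs, and the helper names (open_dir, read_by_path, read_by_dirfd, same_directory) are made up for this example. The idea is that a directory fd pins the directory that was originally opened, so a directory recreated under the same name is never silently read through it.

```python
import os


def open_dir(path):
    """Open a directory and return an fd that pins that exact directory."""
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


def read_by_path(dir_path, name):
    """Path-based read: resolves whatever currently lives at dir_path."""
    with open(os.path.join(dir_path, name), "rb") as f:
        return f.read()


def read_by_dirfd(dir_fd, name):
    """openat()-style read: resolves name relative to the pinned directory."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as f:  # fdopen takes ownership of fd
        return f.read()


def same_directory(dir_fd, dir_path):
    """True if dir_path still names the directory that dir_fd refers to."""
    pinned = os.fstat(dir_fd)
    current = os.stat(dir_path)
    return (pinned.st_dev, pinned.st_ino) == (current.st_dev, current.st_ino)
```

If the directory is removed and recreated with the same name while dir_fd is held, read_by_path returns the new contents, whereas read_by_dirfd keeps working against the original directory (on a local file system it would normally fail once the original entries are gone), and same_directory reports the mismatch. How libcephfs and the MDS behave for a deleted snapshot is a separate question that this local sketch does not answer.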
Updated by Patrick Donnelly about 3 years ago
- Status changed from New to In Progress
- Assignee set to Venky Shankar
Updated by Venky Shankar about 3 years ago
So, I am experimenting with how the MDS handles path traversals when given just an inode number rather than inode number + dname, especially when the inode is deleted by one client while another client has an open fd on it. We already have `fstatx()` implemented in the Client (libcephfs), so I forced a getattr on the deleted inode and the MDS seems to handle it (i.e., it resolves the inode and returns the inode attrs).
It seems pretty straightforward to implement the *at() family of calls. I went ahead and added those that would be required for the mirror daemon to safely walk snapshot contents (fstatxat, openat, readlinkat, etc.). I haven't tested it with the mirror daemon yet, but I'll get to it soon.
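As a rough picture of what walking a tree purely through directory fds looks like, the sketch below uses the POSIX counterparts available in Python (os.stat, os.open and os.readlink with dir_fd) on a local file system. It is not the libcephfs API from the PR, and walk_by_fd is a hypothetical helper name; it only assumes a platform where these dir_fd operations are supported.

```python
import os
import stat


def walk_by_fd(dir_fd, prefix=""):
    """Yield (relative_path, kind, payload) for every entry below dir_fd.

    Every lookup is relative to an open directory fd, so the walk never
    re-resolves the path of the top-level directory.
    """
    for name in sorted(os.listdir(dir_fd)):
        rel = os.path.join(prefix, name)
        # Analogous to fstatat(): stat the entry without following symlinks.
        st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
        if stat.S_ISLNK(st.st_mode):
            # Analogous to readlinkat().
            yield rel, "symlink", os.readlink(name, dir_fd=dir_fd)
        elif stat.S_ISDIR(st.st_mode):
            # Analogous to openat() on a subdirectory.
            child = os.open(
                name, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW, dir_fd=dir_fd
            )
            try:
                yield rel, "dir", None
                yield from walk_by_fd(child, rel)
            finally:
                os.close(child)
        elif stat.S_ISREG(st.st_mode):
            # Analogous to openat() on a regular file.
            fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
            with os.fdopen(fd, "rb") as f:  # fdopen takes ownership of fd
                yield rel, "file", f.read()
```

A caller would open the top-level directory once, for example with os.open(path, os.O_RDONLY | os.O_DIRECTORY), pass that fd to walk_by_fd, and close it when done. Reading whole files into memory keeps the sketch short; a real synchronizer would stream the data instead.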
However, the library changes should be out as a PR really soon.
@Patrick ^^
Updated by Venky Shankar about 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 40831
Updated by Venky Shankar almost 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot almost 3 years ago
- Copied to Backport #50994: pacific: cephfs-mirror: be resilient to recreated snapshot during synchronization added
Updated by Loïc Dachary almost 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".