Feature #1576
open
Deduplication support in RADOS
0%
Description
Hi,
Ceph and RADOS are wonderful and allow us to scale our storage. One great addition would be a deduplication feature, to reduce the footprint of similar data stored multiple times by users in the cluster.
Of course, one possibility would be to use the underlying FS, for example BTRFS with the dedup patches from Josef Bacik (http://lwn.net/Articles/422331/), but:
1) they don't seem to be included/supported by the btrfs team
2) it would be more generic to have it embedded in the Ceph core, allowing us to use this feature with other filesystems.
Does this sound feasible to you, or like a good feature to bring in?
Regards,
Wilfrid Allembrand
Updated by Greg Farnum over 12 years ago
- Priority changed from Normal to Low
This is a blue-sky feature. It's been discussed in the past and will probably happen someday, but isn't an easy feature or something that we're going to devote resources to for a long while.
Updated by Wilfrid Allembrand over 12 years ago
Thanks Greg. Yep, for sure I'm aware you have other priorities first.
Until it is implemented, are you aware of, or can you recommend, any third-party solution that we could install to bring dedupe to a Ceph environment?
Updated by Greg Farnum over 12 years ago
Not aware of anything like that, sorry!
Updated by Jon Skarpeteig about 9 years ago
Have there been any recent discussions about this feature?
Updated by Greg Farnum about 9 years ago
Only enough to make sure we won't be implementing a dedup feature for the foreseeable future. Doing one locally on the node can be done at layers below Ceph, and doing a distributed one across nodes is still a research project.
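For readers unfamiliar with the idea, the node-local kind of deduplication mentioned above can be sketched as a content-addressed chunk store: data is split into chunks, each chunk is keyed by its hash, and identical chunks are stored only once. This is a toy illustration only. The `DedupStore` class, its methods, and the fixed chunk size are hypothetical and are not part of Ceph or RADOS; a real system would also need reference counting, deletion, and collision handling, all omitted here.

```python
import hashlib


class DedupStore:
    """Toy in-memory content-addressed store (hypothetical, not a Ceph API)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes, stored once
        self.objects = {}  # object name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store each unique chunk once."""
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault keeps the existing copy if this chunk was seen before
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.objects[name] = digests

    def get(self, name):
        """Reassemble an object from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.objects[name])

    def stored_bytes(self):
        """Physical bytes held, counting each unique chunk once."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.put("a", b"AAAABBBBCCCC")
store.put("b", b"AAAABBBBDDDD")  # shares its first two chunks with "a"
```

Here the two objects hold 24 logical bytes, but only four unique 4-byte chunks are kept. The sketch works because a single process sees every chunk; across nodes, the hash index itself has to be distributed and kept consistent, which is part of what makes the cluster-wide case harder.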