Feature #40285

Updated by Patrick Donnelly almost 5 years ago

The main goal of this feature is to support moving whole trees to cheaper storage hardware. Today this can be done manually by the client: open the file, set a file layout that uses another lower-cost data pool, and copy the file data over. This is laborious and presents consistency issues if the file is opened in parallel by another application.
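The manual client-side procedure can be sketched as below. The `ceph.file.layout.pool` xattr is the existing CephFS interface for pinning a file's data pool; the pool name, paths, and helper function are illustrative. This also makes the race concrete: nothing stops another client from writing to the old inode mid-copy.

```python
import os
import shutil

def migrate_file_manually(path, pool, setxattr=os.setxattr):
    """The laborious client-side migration: write the data into a new
    file whose layout points at the cheaper pool, then rename it over
    the original. Racy: a writer holding the original file open keeps
    updating the old inode, and its changes are lost."""
    tmp = path + ".migrating"
    # Layout xattrs only apply to empty files, so create the
    # destination and set its layout before any data lands.
    with open(tmp, "wb"):
        pass
    setxattr(tmp, b"ceph.file.layout.pool", pool.encode())
    shutil.copyfile(path, tmp)   # the slow, unprotected data copy
    os.replace(tmp, path)        # rename keeps the path stable
```

The `setxattr` parameter is only there so the sketch can be exercised outside a CephFS mount; on a real mount the default `os.setxattr` applies.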

So the idea of this feature is to support an extended attribute like:

<pre>
ceph.[dir|file].migrate.<layout>
</pre>

This would update the inode for the dir/file with the new *migration layout* along with an epoch (globally incremented). The MDS would then insert this inode into a migration queue, similar to the existing purge queue.
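A minimal in-memory sketch of such a queue, pairing each request with a monotonically increasing epoch, might look as follows. The real queue would be journaled like the purge queue rather than held in a deque; all names here are illustrative.

```python
import itertools
from collections import deque
from dataclasses import dataclass

_epoch = itertools.count(1)  # stand-in for the globally incremented epoch

@dataclass
class MigrationItem:
    ino: int      # inode number
    layout: str   # target migration layout (e.g. data pool)
    epoch: int    # orders competing migrate requests for one inode

class MigrationQueue:
    """Toy model of the proposed migration queue (FIFO)."""
    def __init__(self):
        self._q = deque()

    def submit(self, ino, layout):
        item = MigrationItem(ino, layout, next(_epoch))
        self._q.append(item)
        return item

    def pop(self):
        return self._q.popleft() if self._q else None
```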

The MDS would be responsible for pulling inodes off of the migration queue and updating the actual layout. For directories, this means queueing the directory's dentries (inodes) for migration as well. For files, the MDS needs to obtain all locks on the inode's contents and perform the actual migration. This prevents other clients from using the file until the migration completes. Once the migration finishes, the inode's file layout can be updated.
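The worker described above can be modeled with a toy inode table; this is a sketch of the control flow only, with the locking and object copy elided, and the data shapes are assumptions for illustration.

```python
from collections import deque

def migrate_pass(queue, inodes):
    """Drain the migration queue once. `inodes` maps ino -> {'type':
    'file'|'dir', 'layout': str, 'children': [ino, ...]} (dirs only)."""
    while queue:
        ino, layout = queue.popleft()
        node = inodes[ino]
        if node["type"] == "dir":
            # Directories: queue the dentries' inodes for migration too.
            queue.extend((child, layout) for child in node["children"])
        else:
            # Files: here the real MDS would take all locks on the
            # inode's contents, copy the objects to the new pool, and
            # block clients until the copy completes.
            pass
        node["layout"] = layout  # only updated once migration is done
```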

Tricky implementation points:

* How should migration layout updates on an inode be handled while a migration is already in progress? This is part of the reason for the epoch: it lets us ignore older migration layout updates (e.g. for files hard linked into multiple directories). One reasonable approach is to re-insert the inode at the back of the migration queue repeatedly until the current migration completes.

* How to improve performance: we can trivially multithread the migration process, but we need to consider the ramifications of the MDS using many more cores.
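The epoch rule from the first point above can be sketched as a small decision function: requests older than the latest accepted epoch are dropped, and requests that arrive while a migration is running are parked at the back of the queue to be retried. All structures and names here are hypothetical.

```python
def handle(queue, inodes, in_progress, ino, layout, epoch):
    """Decide what to do with one migrate request.
    `queue` is a deque of (ino, layout, epoch) retries; `in_progress`
    is the set of inodes currently being migrated."""
    node = inodes[ino]
    if epoch < node.get("migrate_epoch", 0):
        return "ignored"            # stale update, e.g. via a hard link
    node["migrate_epoch"] = epoch   # newest request wins
    if ino in in_progress:
        # Can't start yet: re-insert at the back of the queue and
        # retry until the current migration completes.
        queue.append((ino, layout, epoch))
        return "requeued"
    in_progress.add(ino)
    return "started"
```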
