
Bug #62596

Updated by Matan Breizman 9 months ago

Clusters affected by the SnapMapper malformed key conversion [1] (which was fixed) may still suffer from a space leak caused by stale clone objects.
 The leak may occur in the following scenario: 

 A cluster which had snapshots taken and was updated from N (and earlier) to O (up to 16.2.11 - *before the fix [2] was merged*). 
 If one of the snapshots which were taken before the update is removed, the clone objects of this snapshot will become stale. 
Note: Even if none of the snapshots have been removed yet, the key is still malformed, and any future removal of such a snapshot will cause the same effect (unless the SnapMapper key is fixed first).

 *** 

The fix for the affected clusters includes a 2-step procedure:

 h3. 1) Fixing the key 

 This can be achieved in 2 ways: 
*Q and later releases*: Scrub will fix the corrupted keys to the correct structure [3].

Note: Currently, no Pacific backport is planned for this step, since an alternative solution is available. This may be changed and will be finally decided before the P final release.

*P and later releases*: Re-deploying the affected OSDs; one possible flow is sketched below. Once an OSD is redeployed, its keys will be recreated correctly during recovery.
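
As an illustration only, a minimal redeploy sketch, assuming a ceph-volume (LVM) deployment with systemd units; the OSD id (3) and the device (/dev/sdb) are placeholders, and the exact flow depends on how the OSDs were deployed (cephadm, Rook, etc.):
<pre><code class="text">
# Drain the OSD and wait until all PGs are active+clean again
ceph osd out 3
# ... wait for data migration to complete ...
systemctl stop ceph-osd@3
ceph osd purge 3 --yes-i-really-mean-it
# Wipe the device and create a fresh OSD on it; the SnapMapper keys
# will be recreated with the correct structure during recovery
ceph-volume lvm zap /dev/sdb --destroy
ceph-volume lvm create --data /dev/sdb
</code></pre>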

h3. 2) Removing the stale clone objects

In order to remove the stale clone objects, the removed (purged) snapshot should be re-removed.

This fix is valid: a _purged_snaps_scrub_ occurs in the background every deep scrub interval and will handle the snapshot re-removal.
The _scrub_purged_snaps_ command can also be called through the OSD admin socket (asok), without waiting for the next deep scrub interval:
 <pre><code class="text"> 
 ceph daemon osd.<id> scrub_purged_snaps 
 </code></pre> 
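
For example, to trigger it on every OSD running on a host (assuming the default admin socket location under /var/run/ceph):
<pre><code class="text">
# Run on each OSD host; iterates over the local OSD admin sockets
for sock in /var/run/ceph/ceph-osd.*.asok; do
    ceph daemon "$sock" scrub_purged_snaps
done
</code></pre>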

 *** 

The _last_purged_snaps_scrub_ timestamp is part of the OSDSuperblock and can be obtained using the ceph-objectstore-tool:
<pre><code class="text">
ceph-objectstore-tool --data-path <store_path> --op dump-super | grep last_purged_snaps_scrub
</code></pre>
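
For example, for osd.3 with the default data path (note that ceph-objectstore-tool requires the OSD to be stopped):
<pre><code class="text">
systemctl stop ceph-osd@3
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 --op dump-super | grep last_purged_snaps_scrub
systemctl start ceph-osd@3
</code></pre>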

 *** 

To verify whether a cluster was affected, malformed keys can be identified using the `ceph-kvstore-tool`.
The following command can only be run offline (with the OSD stopped):

 <pre><code class="text"> 
ceph-kvstore-tool bluestore-kv <store-path> list p | grep 'SNA.*_$'
 </code></pre> 
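
As a sketch, counting the malformed keys (the SNA entries with a trailing underscore) on a single OSD, assuming the default data path and systemd units; a count of 0 means this OSD holds no malformed keys:
<pre><code class="text">
OSD_ID=3
systemctl stop ceph-osd@${OSD_ID}
# Count SnapMapper keys that still have the malformed (trailing '_') form
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-${OSD_ID} list p | grep -c 'SNA.*_$'
systemctl start ceph-osd@${OSD_ID}
</code></pre>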

 *** 

Disclaimer regarding the scrub "verify SnapMapper consistency" PR [3], which won't be backported to P:
The scrub enhancements are only responsible for fixing a corrupted key IFF the snapshot still exists (i.e., has not been removed).
If the snapshot was already removed, the key won't be fixed by scrub and the clone objects will remain stale.

 *** 

 [1] https://tracker.ceph.com/issues/56147 
 [2] https://github.com/ceph/ceph/pull/46908 
 [3] https://github.com/ceph/ceph/pull/47388
