


Bug #62983

Updated by Matan Breizman 7 months ago

A purged_snap_ key is stored in the monitor for each removed snapshot. 
 The keys are merged on contiguous snap id intervals. See OSDMonitor::insert_purged_snap_update(). 
 Similarly, the OSD stores SnapMapper::PURGED_SNAP_PREFIX (PSN) keys. See SnapMapper::record_purged_snaps(). 
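 The merging described above can be illustrated with a minimal sketch. It is a simplified model and not the actual code of OSDMonitor::insert_purged_snap_update(); the function name `insert_purged` and the inclusive `(first, last)` tuple representation are assumptions made for illustration only.

```python
def insert_purged(intervals, snap):
    """Add one purged snap id to a list of inclusive (first, last) intervals.

    Hypothetical helper: intervals that contain or touch the new id are
    merged into a single entry, all other intervals are kept unchanged.
    """
    first = last = snap
    kept = []
    for lo, hi in intervals:
        if hi + 1 < first or lo > last + 1:
            kept.append((lo, hi))          # not adjacent: stays a separate key
        else:
            first, last = min(first, lo), max(last, hi)  # adjacent: merge
    kept.append((first, last))
    return sorted(kept)


if __name__ == "__main__":
    keys = []
    for snap in (1, 3, 5):                 # ids with gaps: nothing can merge
        keys = insert_purged(keys, snap)
    print(keys)
    keys = insert_purged(keys, 2)          # filling a gap joins two entries
    print(keys)
```

 In this model a single missing id between two purged snaps is enough to keep two separate entries, which is the situation the rest of this ticket describes.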

 On snap removal, snap_seq is incremented, which leaves gaps between the purged snap ids. 
 Self-managed snapshots use `pending_pseudo_purged_snaps` on deletion (POOL_OP_DELETE_UNMANAGED_SNAP), which helps to avoid the discontinuity when removing snapshots. 

 <pre><code class="text"> 
 // also record the new seq as purged: this avoids a discontinuity 
 // after all of the snaps have been purged, since the seq assigned 
 // during removal lives in the same namespace as the actual snaps. 

 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rbd snap create pl1/img@snp$i && sleep 1 && rbd snap rm pl1/img@snp$i && sleep 1; done 

 Mon db: 
 osd_snap / purged_snap_2_0000000000000019 

 However, SnapMapper::PURGED_SNAP_PREFIX (PSN) has no pseudo_purged_snaps concept, which results in entries that never get merged: 

 <pre><code class="text"> 
 OSD db: 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000002 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000004 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000006 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000008 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000a 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000c 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000e 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000010 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000012 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000014 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000016 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000018 

 For pool snaps, the pseudo_purged_snaps concept is missing in both the monitor and the OSD: 

 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rados -p pl mksnap snp$i && rados -p pl rmsnap snp$i; done 
 osd_snap / purged_snap_2_0000000000000001 
 osd_snap / purged_snap_2_0000000000000003 
 osd_snap / purged_snap_2_0000000000000005 
 osd_snap / purged_snap_2_0000000000000007 
 osd_snap / purged_snap_2_0000000000000009 
 osd_snap / purged_snap_2_000000000000000b 
 osd_snap / purged_snap_2_000000000000000d 
 osd_snap / purged_snap_2_000000000000000f 
 osd_snap / purged_snap_2_0000000000000011 
 osd_snap / purged_snap_2_0000000000000013 
 osd_snap / purged_snap_2_0000000000000015 

 Note: if the snapshots are first created and only then deleted, they will be merged: 

 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rados -p pl2 mksnap snp$i; done 
 for i in `seq 0 10`; do echo $i && rados -p pl2 rmsnap snp$i; done 
 osd_snap / purged_snap_3_000000000000000b 

 pseudo_purged_snaps should also be applied when writing PSN keys, and when recording the monitor's `purged_snaps` keys on pool snap removal. 
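 The effect of recording the pseudo-purged seq can be sketched with a toy model of the pool snap reproducer. This is an assumption-laden simplification, not Ceph code: `churn` and `to_intervals` are hypothetical names, snap_seq is assumed to start at 0, and both creation and removal are assumed to increment it by one.

```python
def to_intervals(ids):
    """Collapse a set of snap ids into sorted inclusive (first, last) intervals."""
    out = []
    for s in sorted(ids):
        if out and out[-1][1] + 1 == s:
            out[-1] = (out[-1][0], s)      # extends the previous interval
        else:
            out.append((s, s))             # gap: start a new interval
    return out


def churn(rounds, pseudo_purge):
    """Model 'create a snap, then remove it' repeated `rounds` times."""
    seq = 0
    purged = set()
    for _ in range(rounds):
        seq += 1                           # creation takes the next seq as snap id
        snap = seq
        seq += 1                           # removal bumps snap_seq again
        purged.add(snap)
        if pseudo_purge:
            purged.add(seq)                # also record the new seq as purged
    return to_intervals(purged)


if __name__ == "__main__":
    print(len(churn(11, pseudo_purge=False)))  # one entry per removed snap
    print(churn(11, pseudo_purge=True))        # a single merged entry
```

 Under these assumptions, the run without pseudo-purging yields one entry per removed snapshot, while the run with pseudo-purging collapses to a single interval.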

 *       Apply pseudo_purged_snaps when recording PSN keys (pool/self-managed snaps). 
 *       Apply pseudo_purged_snaps when removing pool snaps. 

 *       Fix the gaps (after removal has taken place) to allow entry merging in affected clusters. 


 Is this out of scope from the reported mail? 
 <pre><code class="text"> 
 A similar report had "almost 40 million" `purged_snaps` keys in the monitor db that affected startup times. 
 However, it seems that it is not related to the issue described above, as rbd's mirror snapshots do track 
 and merge the purged snaps correctly. This may be caused by the workload of the reporter, which interrupts 
 the mirror snapshot creation "very frequently" (requires a separate tracker). 

