Bug #62983

Updated by Matan Breizman 7 months ago

purged_snap_ keys are stored in the monitor for each removed snapshot. 
 The keys are merged over contiguous snap id intervals. See OSDMonitor::insert_purged_snap_update(). 
 Similarly, the OSD stores SnapMapper::PURGED_SNAP_PREFIX (PSN) keys. See SnapMapper::record_purged_snaps(). 
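
 The merging described above can be illustrated with a minimal sketch. This is not the code of OSDMonitor::insert_purged_snap_update(); the function name `insert_purged_snap` and the half-open `(start, end)` tuple representation are assumptions made for illustration only.

```python
def insert_purged_snap(intervals, snap_id):
    """Insert snap_id into a list of half-open (start, end) intervals.

    Any existing interval that overlaps or touches the new id is folded
    into a single interval, so contiguous snap ids end up as one entry.
    """
    start, end = snap_id, snap_id + 1
    merged = []
    for s, e in intervals:
        if e < start or s > end:
            # Not adjacent to the new id: keep the interval unchanged.
            merged.append((s, e))
        else:
            # Touching or overlapping: grow the interval being inserted.
            start, end = min(s, start), max(e, end)
    merged.append((start, end))
    merged.sort()
    return merged


intervals = []
for snap in (1, 2, 3):
    intervals = insert_purged_snap(intervals, snap)
print(intervals)  # one entry covering ids 1..3
```

 With ids 1, 3, 5 instead, nothing touches and each id stays a separate entry, which is the fragmentation this ticket is about.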

 On snap removal, the snap_seq is incremented, which leaves gaps behind. 
 Self-managed snapshots use `pending_pseudo_purged_snaps` on deletion (POOL_OP_DELETE_UNMANAGED_SNAP), which helps avoid the discontinuity when removing snapshots. 

 <pre><code class="text"> 
 pending_inc.new_removed_snaps[m->pool].insert(m->snapid); 
 // also record the new seq as purged: this avoids a discontinuity 
 // after all of the snaps have been purged, since the seq assigned 
 // during removal lives in the same namespace as the actual snaps. 
 pending_pseudo_purged_snaps[m->pool].insert(pp.get_snap_seq()); 
 </code></pre> 
 See: https://github.com/ceph/ceph/pull/28330/commits/d831abeae1688a18eb446dd1a63eb6ed94f45d81 
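
 A small simulation may help show why recording the removal seq closes the gaps. It assumes a simplified model: creating a snapshot bumps snap_seq and uses the new value as the snap id, and removing a snapshot bumps snap_seq once more. `purge_cycle` and `count_runs` are hypothetical helper names, not Ceph functions.

```python
def purge_cycle(rounds, record_pseudo):
    """Create and immediately remove a snapshot `rounds` times.

    Returns the sorted list of ids recorded as purged.
    """
    snap_seq = 0
    purged = set()
    for _ in range(rounds):
        snap_seq += 1              # create: the new seq becomes the snap id
        snap_id = snap_seq
        snap_seq += 1              # remove: seq is incremented again
        purged.add(snap_id)
        if record_pseudo:
            purged.add(snap_seq)   # also record the seq taken by the removal
    return sorted(purged)


def count_runs(ids):
    """Count maximal runs of consecutive ids (one merged entry per run)."""
    return sum(1 for i, v in enumerate(ids) if i == 0 or v != ids[i - 1] + 1)


print(count_runs(purge_cycle(11, record_pseudo=False)))  # every id is isolated
print(count_runs(purge_cycle(11, record_pseudo=True)))   # ids are contiguous
```

 Under this model, skipping the pseudo-purged seq leaves only odd ids, so no two entries are adjacent and nothing can be merged.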

 Self-managed: 
 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rbd snap create pl1/img@snp$i && sleep 1 && rbd snap rm pl1/img@snp$i && sleep 1; done 
 .. 

 Mon db: 
 osd_snap / purged_snap_2_0000000000000019 
 </code></pre> 

 However, the SnapMapper::PURGED_SNAP_PREFIX (PSN) handling lacks the pseudo_purged_snaps concept, which results in entries that won't get merged: 

 <pre><code class="text"> 
 OSD db: 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000002 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000004 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000006 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000008 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000a 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000c 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000e 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000010 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000012 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000014 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000016 
 p         %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000018 
 </code></pre> 

 In pool snaps, the pseudo_purged_snaps concept is missing in both the monitor and the OSD: 

 Pool-snaps: 
 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rados -p pl mksnap snp$i && rados -p pl rmsnap snp$i; done 
 .. 
 osd_snap / purged_snap_2_0000000000000001 
 osd_snap / purged_snap_2_0000000000000003 
 osd_snap / purged_snap_2_0000000000000005 
 osd_snap / purged_snap_2_0000000000000007 
 osd_snap / purged_snap_2_0000000000000009 
 osd_snap / purged_snap_2_000000000000000b 
 osd_snap / purged_snap_2_000000000000000d 
 osd_snap / purged_snap_2_000000000000000f 
 osd_snap / purged_snap_2_0000000000000011 
 osd_snap / purged_snap_2_0000000000000013 
 osd_snap / purged_snap_2_0000000000000015 
 </code></pre> 

 Note: if all the snapshots are created first and only then deleted, the entries are merged: 

 <pre><code class="text"> 
 for i in `seq 0 10`; do echo $i && rados -p pl2 mksnap snp$i; done 
 for i in `seq 0 10`; do echo $i && rados -p pl2 rmsnap snp$i; done 
 .. 
 osd_snap / purged_snap_3_000000000000000b 
 </code></pre> 
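
 To compare listings like the ones above, a short script can group the monitor keys by pool and count them. This is only a sketch: it assumes the key layout seen in the output, `purged_snap_<pool id>_<16 hex digits>`, and `keys_per_pool` is a hypothetical helper, not a Ceph tool.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the listings: purged_snap_<pool>_<16 hex digits>
KEY_RE = re.compile(r"purged_snap_(\d+)_([0-9a-f]{16})$")


def keys_per_pool(lines):
    """Map pool id -> list of the hex suffixes (as ints) found in `lines`."""
    pools = defaultdict(list)
    for line in lines:
        match = KEY_RE.search(line.strip())
        if match:
            pools[int(match.group(1))].append(int(match.group(2), 16))
    return dict(pools)


listing = [
    "osd_snap / purged_snap_2_0000000000000001",
    "osd_snap / purged_snap_2_0000000000000003",
    "osd_snap / purged_snap_2_0000000000000005",
    "osd_snap / purged_snap_3_000000000000000b",
]
for pool, values in keys_per_pool(listing).items():
    print(pool, len(values), values)
```

 A pool with many keys whose suffixes differ by small steps, as pool 2 does here, is a candidate for the unmerged-entry pattern described in this ticket.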

 TODO: 
 pseudo_purged_snaps should also be applied when writing PSN keys and when recording the monitor's `purged_snaps` keys on pool snap removal. 

 *       Apply pseudo_purged_snaps when recording PSN keys (pool/self-managed snaps). 
 *       Apply pseudo_purged_snaps when removing pool snaps. 
 *       Fix the gaps (after removal has taken place) to allow entry merging in affected clusters. Is this out of scope? 

 *** 

 reported mail? 
 <pre><code class="text"> 
 A similar report had "almost 40 million" `purged_snaps` keys in the monitor db that affected startup times. 
 However, it seems that it is not related to the issue described above, as rbd's mirror snapshots do track 
 and merge the purged snaps correctly. This may be caused by the reporter's workload, which interrupts 
 the mirror snapshot creation "very frequently" (requires a separate tracker). 
 Reported: https://lists.ceph.io/hyperkitty/list/dev@ceph.io/message/UOJG46YXTIPOXJUSELIN42ATAD5FPMDY/ 

 
 </code></pre> 
