Bug #62983
Updated by Matan Breizman 7 months ago
A purged_snap_ key is stored in the monitor for each removed snapshot. The keys are merged over contiguous snap id intervals. See OSDMonitor::insert_purged_snap_update().
Similarly, the OSD stores SnapMapper::PURGED_SNAP_PREFIX (PSN) keys. See SnapMapper::record_purged_snaps().

The snap_seq is incremented on snap creation and again on snap removal. The id allocated on removal is not tracked by the client (for example RBD) and is never marked as purged, so it leaves a gap that prevents adjacent purged ids from merging.

Self-managed snapshots use `pending_pseudo_purged_snaps` on deletion (POOL_OP_DELETE_UNMANAGED_SNAP), which helps to avoid the discontinuity when removing snapshots.

<pre><code class="text">
pending_inc.new_removed_snaps[m->pool].insert(m->snapid);
// also record the new seq as purged: this avoids a discontinuity
// after all of the snaps have been purged, since the seq assigned
// during removal lives in the same namespace as the actual snaps.
pending_pseudo_purged_snaps[m->pool].insert(pp.get_snap_seq());
</code></pre>

See: https://github.com/ceph/ceph/pull/28330/commits/d831abeae1688a18eb446dd1a63eb6ed94f45d81

Self-managed:

<pre><code class="text">
for i in `seq 0 10`; do echo $i && rbd snap create pl1/img@snp$i && sleep 1 && rbd snap rm pl1/img@snp$i && sleep 1; done
..
Mon db:
osd_snap / purged_snap_2_0000000000000019
</code></pre>

However, SnapMapper::PURGED_SNAP_PREFIX (PSN) lacks the pseudo_purged_snaps concept, which results in entries that won't get merged:

<pre><code class="text">
OSD db:
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000002
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000004
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000006
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000008
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000a
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000c
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_000000000000000e
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000010
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000012
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000014
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000016
p %00%00%00%00%00%00%00%00%c1%a3%fcn%00%00%00%00%00%00%04%03.PSN__2_0000000000000018
</code></pre>

For pool snaps, the pseudo_purged_snaps concept is missing in both the monitor and the OSD.

Pool-snaps:

<pre><code class="text">
for i in `seq 0 10`; do echo $i && rados -p pl mksnap snp$i && rados -p pl rmsnap snp$i; done
..
osd_snap / purged_snap_2_0000000000000001
osd_snap / purged_snap_2_0000000000000003
osd_snap / purged_snap_2_0000000000000005
osd_snap / purged_snap_2_0000000000000007
osd_snap / purged_snap_2_0000000000000009
osd_snap / purged_snap_2_000000000000000b
osd_snap / purged_snap_2_000000000000000d
osd_snap / purged_snap_2_000000000000000f
osd_snap / purged_snap_2_0000000000000011
osd_snap / purged_snap_2_0000000000000013
osd_snap / purged_snap_2_0000000000000015
</code></pre>

Note: if the snapshots are all created first and only then deleted, the keys will be merged:

<pre><code class="text">
for i in `seq 0 10`; do echo $i && rados -p pl2 mksnap snp$i; done
for i in `seq 0 10`; do echo $i && rados -p pl2 rmsnap snp$i; done
..
osd_snap / purged_snap_3_000000000000000b
</code></pre>

TODO: pseudo_purged_snaps should also be applied when writing PSN keys (pool and self-managed snaps) and when recording the monitor's `purged_snaps` keys when removing pool snaps.

Open question: is this out of scope for the reported mail?

A reported issue relates to rbd's mirror snapshots: this can cause the monitor db to grow large with `purged_snaps` keys and affect the monitor's startup times.
Reported: https://lists.ceph.io/hyperkitty/list/dev@ceph.io/message/UOJG46YXTIPOXJUSELIN42ATAD5FPMDY/
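The merging behaviour described in this issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ceph code: `PurgedIntervals`, `insert_purged()` and `simulate()` are hypothetical names, the model assumes snap_seq is bumped once on create and once on removal, and it does not reproduce the real key layout or id numbering of the monitor or OSD stores.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <limits>
#include <map>

// Hypothetical toy model: interval start -> interval end (exclusive).
using PurgedIntervals = std::map<std::uint64_t, std::uint64_t>;

// Record `id` as purged, merging with any directly adjacent interval.
void insert_purged(PurgedIntervals& iv, std::uint64_t id) {
  assert(id < std::numeric_limits<std::uint64_t>::max());  // id + 1 must not wrap
  std::uint64_t begin = id;
  std::uint64_t end = id + 1;

  auto next = iv.upper_bound(id);  // first interval starting after id
  if (next != iv.begin()) {
    auto prev = std::prev(next);   // interval starting at or before id
    if (prev->second > id) {
      return;                      // id is already covered
    }
    if (prev->second == id) {      // prev ends exactly where id starts
      begin = prev->first;
      iv.erase(prev);              // does not invalidate `next`
    }
  }
  if (next != iv.end() && next->first == end) {  // next starts right after id
    end = next->second;
    iv.erase(next);
  }
  iv[begin] = end;
}

// Simulate `cycles` rounds of "create a snap, then remove it".
// Assumption: both creation and removal increment snap_seq.
PurgedIntervals simulate(int cycles, bool record_pseudo_purged) {
  PurgedIntervals iv;
  std::uint64_t snap_seq = 0;
  for (int i = 0; i < cycles; ++i) {
    std::uint64_t snap_id = ++snap_seq;  // id allocated on create
    ++snap_seq;                          // seq bumped again on removal
    insert_purged(iv, snap_id);
    if (record_pseudo_purged) {
      insert_purged(iv, snap_seq);       // also mark the removal seq as purged
    }
  }
  return iv;
}
```

In this model, `simulate(11, false)` ends up with one interval per removed snapshot, while `simulate(11, true)` collapses to a single interval, which is the effect the pseudo-purged ids are meant to have.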