Bug #18937 (closed)
cache/tiering flush bug with head delete
Description
base: 77=[77,76,74,71,6f,6d,62,61]:[]+head
promoted at 77, then deleted in cache
cache: 7a=[7a,76,74,6f,6d,62,61]:[7a]+head (whiteout)
trim 7a,
cache: 7a=[74,6d,62,61]:[]+head (whiteout)
copy-from,
cache: 94=[94,91,8f,8e,8c,8b,88,84,80,62,61]:[]+head
base: 77=[77,76,74,71,6f,6d,62,61]:[]+head
flush generates
1:ff87f37b:::smithi01026326-90:head [delete] snapc 60=[]
ORDERSNAP flag set and snapc seq 60 < snapset seq 77
1:ff87f37b:::smithi01026326-90:head [copy-from ver 23] snapc 94=[91,8c,8b,88,80,61]
cloning 94 snaps=[91,8c,8b,88,80]
final snapset 94=[91,8c,8b,88,80,61]:[94]+head
base: 94=[91,8c,8b,88,80,61]:[94]+head
but should be (I think)
base: 94=[91,8c,8b,88,80,61]:[77(61)]+head
then we read at snap 88 and expect ENOENT, but get the clone 94 version instead
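
To make the difference between the two base snapsets concrete, here is a rough model of how a read at a given snap might resolve. It is only a sketch: it assumes a snap read goes to the first clone whose id is >= the requested snap and succeeds only if that clone's snap list reaches down to the requested snap. The function name and the tuple layout are hypothetical, not the OSD's actual code.

```python
ENOENT = "ENOENT"

def resolve_snap_read(seq, clones, snap, head_exists=True):
    """Simplified model of a snap read against a snapset.

    seq    -- snapset seq (e.g. 0x94)
    clones -- list of (clone_id, snaps) sorted ascending by clone_id
    snap   -- snap id being read
    """
    # Assumption: snaps newer than the snapset seq are served from head.
    if snap > seq:
        return "head" if head_exists else ENOENT
    for clone_id, snaps in clones:
        if clone_id >= snap:
            # The first clone at or above the snap must also cover the snap.
            if snaps and min(snaps) <= snap:
                return f"clone {clone_id:x}"
            return ENOENT
    # No clone at or above the requested snap: object did not exist then.
    return ENOENT

# Snapset the base tier ended up with: 94=[...]:[94]+head
actual = [(0x94, [0x91, 0x8c, 0x8b, 0x88, 0x80])]
# Snapset the report suggests it should have: 94=[...]:[77(61)]+head
expected = [(0x77, [0x61])]

print(resolve_snap_read(0x94, actual, 0x88))    # clone 94
print(resolve_snap_read(0x94, expected, 0x88))  # ENOENT
```

Under these assumptions the clone 94 created by the copy-from claims snaps 80 through 91, so the read at 88 lands on it; with a clone 77 covering only snap 61 the same read finds no clone and returns ENOENT.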
I think the cache tier needs to keep track of the oldest (dirty) clone it
ever had and issue the delete based on that?
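
For reference, the comparison behind the "ORDERSNAP flag set and snapc seq 60 < snapset seq 77" message can be modeled as below. This is a minimal sketch that assumes a write carrying the ordersnap flag is refused when its snapc seq is older than the object's snapset seq; `ordersnap_allows` is a hypothetical helper, and the second call only illustrates the comparison, not a proposed fix.

```python
def ordersnap_allows(snapc_seq, snapset_seq, ordersnap=True):
    """Return True if the write passes the (assumed) ORDERSNAP check."""
    # Assumption: with ordersnap set, a snapc older than the snapset is refused.
    return not (ordersnap and snapc_seq < snapset_seq)

# The flush's delete: snapc seq 60 against base snapset seq 77.
print(ordersnap_allows(0x60, 0x77))  # False
# Any snapc seq at or above the snapset seq passes the comparison.
print(ordersnap_allows(0x77, 0x77))  # True
```

In this model the delete never takes effect on the base tier, which is consistent with the copy-from then cloning the stale base head as clone 94.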
2017-02-14 18:43:56.364142 7f9a0f086700 10 osd.2 pg_epoch: 195 pg[2.3( v 195'699 (0'0,195'699] local-les=183 n=50 ec=8 les/c/f 183/183/0 182/182/182) [2,5,0] r=0 lpr=182 luod=195'698 lua=195'698 crt=195'699 lcod 195'697 mlcod 195'697 active+clean] start_flush 2:ff87f37b:::smithi01026326-90:head v179'586 uv23 non-blocking/best-effort
2017-02-14 18:43:56.364193 7f9a0f086700 1 -- 172.21.15.10:0/23112 --> 172.21.15.10:6808/23110 -- osd_op(osd.2.5:66 1.3 1:ff87f37b:::smithi01026326-90:head [delete] snapc 60=[] write+ordersnap+ignore_overlay+enforce_snapc+known_if_redirected e195) v8 -- ?+0 0x7f9a4432d440 con 0x7f9a43ddb600
2017-02-14 18:43:56.364223 7f9a0f086700 1 -- 172.21.15.10:0/23112 --> 172.21.15.10:6808/23110 -- osd_op(osd.2.5:67 1.3 1:ff87f37b:::smithi01026326-90:head [copy-from ver 23] snapc 94=[91,8c,8b,88,80,61] ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e195) v8 -- ?+0 0x7f9a4432dc80 con 0x7f9a43ddb600
/a/sage-2017-02-14_14:45:43-rados-wip-pg-split-interval---basic-smithi/815195
1:ff87f37b:::smithi01026326-90:head