mds: fix race when fetching large dirfrag
When a dirfrag contains more than 'mds_dir_keys_per_op' items, MDS needs to send multiple 'omap-get-vals' requests to fetch the dirfrag completely.
There is a race if MDS commits the dirfrag in the middle of these 'omap-get-vals' requests. For example:
- MDS starts fetching a dirfrag, sending the first 'omap-get-vals' request to the OSD.
- MDS commits the dirfrag, removing a key that corresponds to null dentry 'X'.
- MDS gets the 'omap-get-vals' reply. The returned omap is not complete, but it contains the kv pair that corresponds to dentry 'X'. MDS sends another 'omap-get-vals' request to fetch the rest of the omap.
- The dirfrag commit completes. MDS marks null dentry 'X' clean and removes it from its cache.
- MDS gets the second 'omap-get-vals' reply. Now the returned omap is complete. MDS calls CDir::omap_fetched(), which re-adds dentry 'X' to its cache.
The fix is to re-fetch from the beginning if the dirfrag gets committed in the middle of the 'omap-get-vals' requests.
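
The race and the restart idea can be illustrated with a small standalone sketch. This is not the actual MDS code: DirFrag, commit_seq, omap_get_vals(), fetch_dirfrag() and the after_reply hook are hypothetical names, and an in-memory std::map stands in for the omap object on the OSD. The sketch assumes a commit can be detected by comparing a counter sampled when the fetch starts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for the dirfrag's omap object stored on the OSD.
using Omap = std::map<std::string, std::string>;

struct PagedReply {
  Omap vals;  // at most max_keys entries
  bool more;  // true if keys remain past the last one returned
};

// Simulates one 'omap-get-vals' request: returns up to max_keys entries
// whose keys sort strictly after start_after.
PagedReply omap_get_vals(const Omap& store, const std::string& start_after,
                         std::size_t max_keys) {
  PagedReply r{{}, false};
  auto it = store.upper_bound(start_after);
  for (; it != store.end() && r.vals.size() < max_keys; ++it)
    r.vals.insert(*it);
  r.more = (it != store.end());
  return r;
}

struct DirFrag {
  Omap store;                   // simulated on-disk state
  std::uint64_t commit_seq = 0; // assumption: bumped on every commit

  void commit_remove(const std::string& key) {
    store.erase(key);
    ++commit_seq;
  }
};

// Fetches the whole dirfrag in pages of keys_per_op entries. after_reply is
// called once per reply so a caller can inject a concurrent commit. With
// restart_on_commit set, partial results are dropped and the fetch starts
// over whenever a commit happened since the current pass began.
Omap fetch_dirfrag(DirFrag& dir, std::size_t keys_per_op,
                   const std::function<void(int)>& after_reply,
                   bool restart_on_commit = true) {
  assert(keys_per_op > 0);
  int replies = 0;
  for (;;) {
    const std::uint64_t seq_at_start = dir.commit_seq;
    Omap fetched;
    std::string last_key;
    bool more = true;
    bool restarted = false;
    while (more) {
      PagedReply r = omap_get_vals(dir.store, last_key, keys_per_op);
      after_reply(++replies);
      if (restart_on_commit && dir.commit_seq != seq_at_start) {
        restarted = true;  // dirfrag was committed mid-fetch: start over
        break;
      }
      if (!r.vals.empty())
        last_key = r.vals.rbegin()->first;
      fetched.insert(r.vals.begin(), r.vals.end());
      more = r.more;
    }
    if (!restarted)
      return fetched;
  }
}
```

With keys_per_op = 2, keys A, X, Y, Z, and a commit that removes 'X' injected after the first reply, the non-restarting variant still returns the stale 'X' entry, while the restarting variant returns only A, Y and Z.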