https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2021-03-22T13:41:41ZCeph CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1882942021-03-22T13:41:41ZPatrick Donnellypdonnell@redhat.com
<ul><li><strong>Subject</strong> changed from <i>dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"</i> to <i>client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>Triaged</i></li><li><strong>Assignee</strong> set to <i>Xiubo Li</i></li><li><strong>Target version</strong> set to <i>v17.0.0</i></li><li><strong>Component(FS)</strong> <i>Client</i> added</li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1882982021-03-22T14:02:33ZJeff Laytonjlayton@redhat.com
<ul></ul><p>I think that after the mv, the directory should no longer be considered ORDERED. We probably <em>can</em> consider it complete in that we know that the dentry should no longer exist at that point.</p>
<p>One question: When you see this occur on a client, was the previous mv done on that client or on a different one? If you can reproduce this all with a single client, then we probably need to fix up the rename codepaths. If the mv occurs on a different client completely, then we may need to look at lease/cap handling.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1883692021-03-23T03:29:20ZXiubo Lixiubli@redhat.com
<ul></ul><p>@Xiaoxi</p>
<p>Is this reproducible for you? If so, how often? Locally I ran a loop renaming two files for hours and didn't hit any issue. Did you use any other settings? Or do you have more logs?</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1883842021-03-23T09:13:54ZXiubo Lixiubli@redhat.com
<ul></ul><p>I suspect that two tasks are doing the rename:</p>
<p>Task1 does _lookup(_INPROGRESS) and then sleeps, waiting for the client_lock, before doing _lookup(_COMPLETE).</p>
<p>If task2 then runs and finishes the rename before task1 is woken up, task1's _lookup(_COMPLETE) will succeed afterwards, and task1 will get errors like this.</p>
<p>I am not sure whether the scenario above is similar to yours.</p>
<p>Running the following command in two terminals, I can hit it at random:</p>
<pre>
# while [ 1 ]; do date; mv file file_tmp; mv file_tmp file; done
mv: cannot move 'file' to 'file_tmp': No such file or directory
Tue Mar 23 17:25:51 CST 2021
Tue Mar 23 17:25:51 CST 2021
mv: overwrite 'file'?
</pre> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1894512021-04-03T16:21:03ZXiaoxi Chenxiaoxchen@ebay.com
<ul></ul><p>Jeff Layton wrote:</p>
<blockquote>
<p>I think that after the mv, the directory should no longer be considered ORDERED. We probably <em>can</em> consider it complete in that we know that the dentry should no longer exist at that point.</p>
<p>One question: When you see this occur on a client, was the previous mv done on that client or on a different one? If you can reproduce this all with a single client, then we probably need to fix up the rename codepaths. If the mv occurs on a different client completely, then we may need to look at lease/cap handling.</p>
</blockquote>
<p>Several clients mount the share, but the mv should only run on one node (according to the application owner). At the least, the node that complains "are the same file" is always the same node.</p>
<p>The application does thousands of mv operations each day and fails about 2-3 times per day, so it is consistently reproducible.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1894522021-04-03T16:35:53ZXiaoxi Chenxiaoxchen@ebay.com
<ul></ul><p>Xiubo Li wrote:</p>
<blockquote>
<p>I suspect that two tasks are doing the rename:</p>
<p>Task1 does _lookup(_INPROGRESS) and then sleeps, waiting for the client_lock, before doing _lookup(_COMPLETE).</p>
<p>If task2 then runs and finishes the rename before task1 is woken up, task1's _lookup(_COMPLETE) will succeed afterwards, and task1 will get errors like this.</p>
<p>I am not sure whether the scenario above is similar to yours.</p>
<p>Running the following command in two terminals, I can hit it at random:</p>
<p>[...]</p>
</blockquote>
<p>This is similar to my guess, though I believe it is two tasks both trying to "mv INPROGRESS COMPLETE". <br />Task 1 looks up INPROGRESS and sleeps; task 2 looks up INPROGRESS, looks up COMPLETE, and finishes the mv; task 1 wakes up, looks up COMPLETE, and then complains "are the same file". However, the application team doesn't believe this, and given that only one client has the cap, it is hard for us to find the other competitor.</p>
<p>Regardless, the dirty dentries in readdir_cache are an issue, because they prevent the application from retrying and recovering by itself.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1897562021-04-08T07:58:08ZXiubo Lixiubli@redhat.com
<ul></ul><p>Xiaoxi Chen wrote:</p>
<blockquote>
<p>Jeff Layton wrote:</p>
<blockquote>
<p>I think that after the mv, the directory should no longer be considered ORDERED. We probably <em>can</em> consider it complete in that we know that the dentry should no longer exist at that point.</p>
<p>One question: When you see this occur on a client, was the previous mv done on that client or on a different one? If you can reproduce this all with a single client, then we probably need to fix up the rename codepaths. If the mv occurs on a different client completely, then we may need to look at lease/cap handling.</p>
</blockquote>
<p>Several clients mount the share, but the mv should only run on one node (according to the application owner). At the least, the node that complains "are the same file" is always the same node.</p>
<p>The application does thousands of mv operations each day and fails about 2-3 times per day, so it is consistently reproducible.</p>
</blockquote>
<p>I suspect that multiple clients and the lease/cap handling caused this, as Jeff mentioned above.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1898222021-04-08T14:29:28ZXiubo Lixiubli@redhat.com
<ul></ul><p>Since the `ls` command output was correct, the ORDERED flag must have been cleared as expected; otherwise it would show '.dw_gem2_cmn_sd_COMPLETE' instead of '.dw_gem2_cmn_sd_INPROGRESS'.</p>
<p>I still can't figure out why the old dentry wasn't removed from 'dir->dentries' after `mv`. In the code, removing the old dentry and inserting the new dentry both happen within a single client_lock scope, with no unlock/lock in between. So far, the only possible case I can think of is that `request->old_dentry()` is NULL for some reason.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1899292021-04-09T06:21:15ZXiubo Lixiubli@redhat.com
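<p>The hypothesis above (an old dentry left behind in 'dir->dentries' because the old-dentry reference is missing) can be illustrated with a toy model. This is only a sketch in Python, not the Ceph client code: the class `ToyDir`, its `rename` method, and the `old_dentry_known` flag are hypothetical names invented for illustration, and the lock merely stands in for client_lock.</p>

```python
import threading


class ToyDir:
    """Toy stand-in for a cached directory: maps dentry name -> inode number."""

    def __init__(self):
        self.lock = threading.Lock()  # hypothetical stand-in for client_lock
        self.dentries = {}

    def rename(self, old_name, new_name, old_dentry_known=True):
        # Removal of the old name and insertion of the new name happen
        # inside one lock scope, as described in the comment above.
        with self.lock:
            ino = self.dentries[old_name]
            if old_dentry_known:
                # Normal path: the old dentry is unlinked from the map.
                del self.dentries[old_name]
            # If the old dentry reference is missing (the NULL case that is
            # only a hypothesis in this thread), the old name is left behind.
            self.dentries[new_name] = ino


# Normal rename: only the new name remains.
good = ToyDir()
good.dentries["job_INPROGRESS"] = 100
good.rename("job_INPROGRESS", "job_COMPLETE")
print(sorted(good.dentries))

# Hypothesized failure: both names now point to the same inode.
stale = ToyDir()
stale.dentries["job_INPROGRESS"] = 100
stale.rename("job_INPROGRESS", "job_COMPLETE", old_dentry_known=False)
print(stale.dentries["job_INPROGRESS"] == stale.dentries["job_COMPLETE"])
```

<p>In the second case a later stat of both names would return the same inode, which matches the symptom in the bug title. Whether the real client can reach this state is exactly the open question here.</p>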
<ul></ul><p>I have figured out one case that could reproduce it in theory:</p>
<p>1. I have checked the `mv` source code: before doing the `rename`, it stats both the '_INPROGRESS' and '_COMPLETE' files, one after the other.</p>
<p>2. If two threads are doing the 'mv' and thread_A is in the middle of it, then in thread_B both stats can succeed and return the same info. That's because thread_A could have finished the 'mv' successfully in the gap between thread_B's two stats. I can reproduce this locally.</p>
<p>3. Then, in thread_B, 'mv' itself compares the two stats' info before doing the rename and fails because they are the same file.</p>
<p>4. In the Description, you mentioned that stating the two files showed the same info. Possibly the 'mv' was still in progress in the app when you did this? If so, the next time you do the stat it may fail, whether or not you dropped the caches.</p>
<p>5. I have checked the drop-caches code; it doesn't do anything related to dir->dentries.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1903042021-04-14T04:26:28ZXiubo Lixiubli@redhat.com
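<p>The interleaving in steps 1 to 3 above can be sketched deterministically on any local POSIX filesystem. This is a minimal Python sketch, assuming that the "same file" check compares device and inode numbers from the two stats. The file names and the helpers `same_file` and `simulate_interleaving` are made up for illustration, and the two "threads" are simply run in the problematic order rather than actually raced.</p>

```python
import os
import tempfile


def same_file(st_a, st_b):
    """Compare two stat results by (device, inode)."""
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)


def simulate_interleaving(workdir):
    src = os.path.join(workdir, "job_INPROGRESS")
    dst = os.path.join(workdir, "job_COMPLETE")
    with open(src, "w") as f:
        f.write("payload")

    # thread_B: first stat, on the source name.
    st_src = os.stat(src)

    # thread_A: completes its whole rename in the gap between B's two stats.
    os.rename(src, dst)

    # thread_B: second stat, on the destination name. It succeeds because
    # thread_A has just created that name, and it refers to the same inode.
    st_dst = os.stat(dst)

    # thread_B now believes source and destination are the same file.
    return same_file(st_src, st_dst)


with tempfile.TemporaryDirectory() as d:
    result = simulate_interleaving(d)
    print("thread_B sees source and destination as the same file:", result)
```

<p>No cache inconsistency is needed for this outcome: a rename keeps the inode, so two stats that straddle another thread's rename legitimately return identical info.</p>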
<ul><li><strong>Pull request ID</strong> set to <i>40787</i></li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=1971102021-06-15T04:39:13ZXiubo Lixiubli@redhat.com
<ul><li><strong>Status</strong> changed from <i>Triaged</i> to <i>Fix Under Review</i></li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422762023-07-13T18:48:04ZRishabh Dave
<ul></ul><p>The PR has been merged. Should this PR be backported?</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422832023-07-14T00:35:11ZXiubo Lixiubli@redhat.com
<ul><li><strong>Backport</strong> set to <i>pacific, quincy, reef</i></li></ul><p>Rishabh Dave wrote:</p>
<blockquote>
<p>The PR has been merged. Should this PR be backported?</p>
</blockquote>
<p>Yeah, it should be.</p> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422842023-07-14T00:35:27ZXiubo Lixiubli@redhat.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422852023-07-14T00:56:50ZBackport Bot
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-5 priority-high3 closed" href="/issues/62010">Backport #62010</a>: quincy: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"</i> added</li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422872023-07-14T00:56:57ZBackport Bot
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-5 priority-high3 closed" href="/issues/62011">Backport #62011</a>: reef: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"</i> added</li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422892023-07-14T00:57:04ZBackport Bot
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-5 priority-high3 closed" href="/issues/62012">Backport #62012</a>: pacific: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"</i> added</li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2422912023-07-14T00:57:05ZBackport Bot
<ul><li><strong>Tags</strong> set to <i>backport_processed</i></li></ul> CephFS - Bug #49912: client: dir->dentries inconsistent, both newname and oldname points to same inode, mv complains "are the same file"https://tracker.ceph.com/issues/49912?journal_id=2447622023-08-21T02:11:47ZXiubo Lixiubli@redhat.com
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>