Bug #56524
xfstest-dev: generic/467 failed with "open_by_handle(/mnt/kcephfs.A/467-dir/file000006) opened an unlinked file!"
Description
I hit this only once, using the 'testing' branch, and have not been able to reproduce it yet:
generic/467 73s ... - output mismatch (see /data/xfstests-dev/results//generic/467.out.bad)
--- tests/generic/467.out 2022-04-07 21:42:10.690048179 +0800
+++ /data/xfstests-dev/results//generic/467.out.bad 2022-07-07 16:34:34.682222265 +0800
@@ -1,5 +1,6 @@
QA output created by 467
test_file_handles TEST_DIR/467-dir -dp
+open_by_handle(/mnt/kcephfs.A/467-dir/file000006) opened an unlinked file!
test_file_handles TEST_DIR/467-dir -rp
test_file_handles TEST_DIR/467-dir -dkr
test_file_handles TEST_DIR/467-dir -lr
...
(Run 'diff -u /data/xfstests-dev/tests/generic/467.out /data/xfstests-dev/results//generic/467.out.bad' to see the entire diff)
Updated by Xiubo Li almost 2 years ago
Hit it again:
generic/467 73s ... - output mismatch (see /data/xfstests-dev/results//generic/467.out.bad)
--- tests/generic/467.out 2022-04-07 21:42:10.690048179 +0800
+++ /data/xfstests-dev/results//generic/467.out.bad 2022-07-12 16:52:45.692472984 +0800
@@ -1,5 +1,6 @@
QA output created by 467
test_file_handles TEST_DIR/467-dir -dp
+open_by_handle(/mnt/kcephfs.A/467-dir/file000002) opened an unlinked file!
test_file_handles TEST_DIR/467-dir -rp
test_file_handles TEST_DIR/467-dir -dkr
test_file_handles TEST_DIR/467-dir -lr
...
(Run 'diff -u /data/xfstests-dev/tests/generic/467.out /data/xfstests-dev/results//generic/467.out.bad' to see the entire diff)
Ran: generic/467
Failures: generic/467
Failed 1 of 1 tests
Updated by Xiubo Li over 1 year ago
I have reproduced it locally. The root cause is that right after a file is deleted, when only the unsafe reply has been received, the dentry is still held by the unlink request's request->r_dentry.
So when the xfstests-dev test later tries to open the file with open_by_handle_at(), the kernel succeeds:
static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
{
        struct inode *inode = __lookup_inode(sb, ino);
        int err;

        if (IS_ERR(inode))
                return ERR_CAST(inode);
        /* We need LINK caps to reliably check i_nlink */
        err = ceph_do_getattr(inode, CEPH_CAP_LINK_SHARED, false);
        if (err) {
                iput(inode);
                return ERR_PTR(err);
        }
        /* -ESTALE if inode as been unlinked and no file is open */
        if ((inode->i_nlink == 0) && (atomic_read(&inode->i_count) == 1)) {
                iput(inode);
                return ERR_PTR(-ESTALE);
        }
        return d_obtain_alias(inode);
}
To fix it we need to make sure the dentry is not unhashed.
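The race can be illustrated with a small user-space model of the -ESTALE condition quoted above. This is only a sketch, not kernel code: struct model_inode and model_fh_check() are hypothetical names, and the reference counts assume that the lookup itself holds one reference and that the dentry pinned by the pending unlink request contributes a second one.

```c
#include <errno.h>

/* Hypothetical stand-in for the two inode fields the check reads. */
struct model_inode {
        unsigned int i_nlink;   /* link count after the LINK caps refresh */
        int i_count;            /* in-memory reference count */
};

/*
 * Models only the -ESTALE condition from __fh_to_dentry().
 * Returns 0 when the open by handle would be allowed to proceed.
 */
int model_fh_check(const struct model_inode *inode)
{
        /* Unlinked and referenced only by this lookup: report a stale handle. */
        if (inode->i_nlink == 0 && inode->i_count == 1)
                return -ESTALE;
        return 0;
}
```

Under these assumptions, an inode with i_nlink == 0 and i_count == 1 (safe reply received, request reference dropped) yields -ESTALE, while the same inode with i_count == 2 (only the unsafe reply received) passes the check, which matches the "opened an unlinked file" failure in the test output.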
Updated by Xiubo Li over 1 year ago
- Status changed from In Progress to Fix Under Review
The patchwork link: https://patchwork.kernel.org/project/ceph-devel/patch/20220804080624.14768-1-xiubli@redhat.com/
commit 575c8218fa3e9399ca03c9e3840c6eab59749532 (HEAD -> lxb-testing37)
Author: Xiubo Li <xiubli@redhat.com>
Date:   Thu Aug 4 15:21:44 2022 +0800

    ceph: fail the open_by_handle_at() if the dentry is being unlinked

    When unlinking a file the kclient will send a unlink request to MDS
    by holding the dentry reference, and then the MDS will return 2 replies,
    which are unsafe reply and a deferred safe reply. After the unsafe
    reply received the kernel will return and succeed the unlink request
    to user space apps. Only when the safe reply received the dentry's
    reference will be released. Or the dentry will only be unhashed from
    dcache. But when the open_by_handle_at() begins to open the unlinked
    files it will succeed.

    URL: https://tracker.ceph.com/issues/56524
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
Updated by Xiubo Li over 1 year ago
- Status changed from Fix Under Review to Resolved