Bug #56524
closed
xfstest-dev: generic/467 failed with "open_by_handle(/mnt/kcephfs.A/467-dir/file000006) opened an unlinked file!"
Added by Xiubo Li almost 2 years ago.
Updated over 1 year ago.
Description
I hit it only once by using the 'testing' branch and haven't reproduced it yet:
generic/467 73s ... - output mismatch (see /data/xfstests-dev/results//generic/467.out.bad)
--- tests/generic/467.out 2022-04-07 21:42:10.690048179 +0800
+++ /data/xfstests-dev/results//generic/467.out.bad 2022-07-07 16:34:34.682222265 +0800
@@ -1,5 +1,6 @@
QA output created by 467
test_file_handles TEST_DIR/467-dir -dp
+open_by_handle(/mnt/kcephfs.A/467-dir/file000006) opened an unlinked file!
test_file_handles TEST_DIR/467-dir -rp
test_file_handles TEST_DIR/467-dir -dkr
test_file_handles TEST_DIR/467-dir -lr
...
(Run 'diff -u /data/xfstests-dev/tests/generic/467.out /data/xfstests-dev/results//generic/467.out.bad' to see the entire diff)
Hit it again:
generic/467 73s ... - output mismatch (see /data/xfstests-dev/results//generic/467.out.bad)
--- tests/generic/467.out 2022-04-07 21:42:10.690048179 +0800
+++ /data/xfstests-dev/results//generic/467.out.bad 2022-07-12 16:52:45.692472984 +0800
@@ -1,5 +1,6 @@
QA output created by 467
test_file_handles TEST_DIR/467-dir -dp
+open_by_handle(/mnt/kcephfs.A/467-dir/file000002) opened an unlinked file!
test_file_handles TEST_DIR/467-dir -rp
test_file_handles TEST_DIR/467-dir -dkr
test_file_handles TEST_DIR/467-dir -lr
...
(Run 'diff -u /data/xfstests-dev/tests/generic/467.out /data/xfstests-dev/results//generic/467.out.bad' to see the entire diff)
Ran: generic/467
Failures: generic/467
Failed 1 of 1 tests
- Status changed from New to In Progress
I have reproduced it locally. The root cause is that right after a file is deleted, when only the unsafe reply has been received, the dentry is still held by the unlink request (request->r_dentry).
So when the xfstests-dev test later tries to open the file with open_by_handle_at(), the kernel call succeeds:
static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
{
	struct inode *inode = __lookup_inode(sb, ino);
	int err;

	if (IS_ERR(inode))
		return ERR_CAST(inode);
	/* We need LINK caps to reliably check i_nlink */
	err = ceph_do_getattr(inode, CEPH_CAP_LINK_SHARED, false);
	if (err) {
		iput(inode);
		return ERR_PTR(err);
	}
	/* -ESTALE if inode as been unlinked and no file is open */
	if ((inode->i_nlink == 0) && (atomic_read(&inode->i_count) == 1)) {
		iput(inode);
		return ERR_PTR(-ESTALE);
	}
	return d_obtain_alias(inode);
}
To fix it we need to make sure the dentry is not unhashed.
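The -ESTALE condition quoted above can be modeled in user space to see why the open slips through. This is only a sketch, not kernel code: `struct model_inode` and the `model_*` helpers are hypothetical names, and it assumes that the dentry pinned by the in-flight unlink request keeps one extra inode reference until the safe reply arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two inode fields the check reads (hypothetical). */
struct model_inode {
	unsigned int nlink;	/* models inode->i_nlink */
	int count;		/* models atomic_read(&inode->i_count) */
};

/* Take a reference, as the handle lookup does. */
static void model_get(struct model_inode *inode)
{
	inode->count++;
}

/* Drop a reference, as iput() or releasing the pinned dentry would. */
static void model_put(struct model_inode *inode)
{
	assert(inode->count > 0);
	inode->count--;
}

/* Same condition as the -ESTALE test in __fh_to_dentry(). */
static bool model_handle_is_stale(const struct model_inode *inode)
{
	return inode->nlink == 0 && inode->count == 1;
}

/*
 * Models one open_by_handle_at() lookup: take the lookup reference,
 * evaluate the check, then drop the reference again.
 * Returns true where the kernel would return -ESTALE.
 */
static bool model_open_by_handle(struct model_inode *inode)
{
	bool stale;

	model_get(inode);
	stale = model_handle_is_stale(inode);
	model_put(inode);
	return stale;
}
```

Starting from `{ .nlink = 1, .count = 1 }` (the count being the dentry's reference), setting `nlink` to 0 models the unsafe reply: `model_open_by_handle()` still returns false because the lookup sees a count of 2, so the open is allowed. Only after one `model_put()`, modeling the safe reply releasing the dentry, does the same call return true.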
- Status changed from In Progress to Fix Under Review
The patchwork link: https://patchwork.kernel.org/project/ceph-devel/patch/20220804080624.14768-1-xiubli@redhat.com/
commit 575c8218fa3e9399ca03c9e3840c6eab59749532 (HEAD -> lxb-testing37)
Author: Xiubo Li <xiubli@redhat.com>
Date: Thu Aug 4 15:21:44 2022 +0800

    ceph: fail the open_by_handle_at() if the dentry is being unlinked

    When unlinking a file the kclient will send a unlink request to MDS
    by holding the dentry reference, and then the MDS will return 2 replies,
    which are unsafe reply and a deferred safe reply.

    After the unsafe reply received the kernel will return and succeed
    the unlink request to user space apps.

    Only when the safe reply received the dentry's reference will be
    released. Or the dentry will only be unhashed from dcache. But when
    the open_by_handle_at() begins to open the unlinked files it will
    succeed.

    URL: https://tracker.ceph.com/issues/56524
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
- Status changed from Fix Under Review to Resolved