<p>Ceph (https://tracker.ceph.com/) - Linux kernel client - Bug #54013: centos stream 8 kernel 358: async dirops causes Cannot write: Operation not permitted</p>
<p>Dan van der Ster, 2022-01-25T15:43:31Z (https://tracker.ceph.com/issues/54013?journal_id=209176):</p>
<p>I believe the issue is related to either the MDS path restriction or the OSD namespace restriction (both are created by the fs volumes driver, which Manila uses).</p>
<p>If I mount the same cluster with kernel 358 using the admin keyring and root path /, I don't have any problems.</p>
<p>Jeff Layton (jlayton@redhat.com), 2022-01-25T16:41:54Z (https://tracker.ceph.com/issues/54013?journal_id=209192):</p>
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>Jeff Layton</i></li></ul>
<p>Thanks for the bug report. Do you know whether this also happens with more recent mainline kernels, or is the problem only seen with centos8 stream kernels? In the meantime, I'll see if I can set up a reproducer using the path restricted caps.</p>
<p>Dan van der Ster, 2022-01-25T17:58:39Z (https://tracker.ceph.com/issues/54013?journal_id=209199):</p>
<p>Jeff Layton wrote:</p>
<blockquote>
<p>Thanks for the bug report. Do you know whether this also happens with more recent mainline kernels, or is the problem only seen with centos8 stream kernels? In the meantime, I'll see if I can set up a reproducer using the path restricted caps.</p>
</blockquote>
<p>Thanks for the quick reply!</p>
<p>5.16.2 has the same issue:</p>
<pre>
# uname -a
Linux xx.cern.ch 5.16.2-1.el8.elrepo.x86_64 #1 SMP PREEMPT Tue Jan 18 15:26:58 EST 2022 x86_64 x86_64 x86_64 GNU/Linux
# mount redacted.cern.ch:6789:/volumes/_nogroup/xxx -t ceph -oname=cephprojectspace,secret=xxxx /mnt/
# cd /mnt/test/
# ls -l
total 197056
drwx------. 3 root root 9 Jan 25 18:52 linux-5.17-rc1
-rw-r--r--. 1 root root 201780465 Jan 25 16:41 linux-5.17-rc1.tar.gz
-rw-r--r--. 1 root root 3795 Jul 3 2019 out.dat.bak
-rwxr-xr-x. 1 root root 105 Jul 3 2019 test.py
# rm -rf linux-5.17-rc1
# tar xf linux-5.17-rc1.tar.gz 2>&1 | head
tar: linux-5.17-rc1/.get_maintainer.ignore: Cannot write: Operation not permitted
tar: linux-5.17-rc1/.gitattributes: Cannot write: Operation not permitted
tar: linux-5.17-rc1/.gitignore: Cannot write: Operation not permitted
tar: linux-5.17-rc1/.mailmap: Cannot write: Operation not permitted
tar: linux-5.17-rc1/COPYING: Cannot write: Operation not permitted
tar: linux-5.17-rc1/CREDITS: Cannot write: Operation not permitted
tar: linux-5.17-rc1/Documentation/ABI/obsolete/sysfs-bus-iio: Cannot write: Operation not permitted
tar: linux-5.17-rc1/Documentation/ABI/obsolete/sysfs-bus-usb: Cannot write: Operation not permitted
tar: linux-5.17-rc1/Documentation/ABI/obsolete/sysfs-class-typec: Cannot write: Operation not permitted
tar: linux-5.17-rc1/Documentation/ABI/obsolete/sysfs-cpuidle: Cannot write: Operation not permitted
</pre>
<p>Remounting with wsync:<br /><pre>
# umount /mnt/
# mount redacted.cern.ch:6789:/volumes/_nogroup/xxx -t ceph -oname=cephprojectspace,secret=xxx=,wsync /mnt/
[root@cephfs-testcs8ml-bcdc59edd4 ~]# cd /mnt/test/
[root@cephfs-testcs8ml-bcdc59edd4 test]# ls
linux-5.17-rc1 linux-5.17-rc1.tar.gz out.dat.bak test.py
[root@cephfs-testcs8ml-bcdc59edd4 test]# rm -rf linux-5.17-rc1
[root@cephfs-testcs8ml-bcdc59edd4 test]# tar xvf linux-5.17-rc1.tar.gz 2>&1 | head
linux-5.17-rc1/
linux-5.17-rc1/.clang-format
linux-5.17-rc1/.cocciconfig
linux-5.17-rc1/.get_maintainer.ignore
linux-5.17-rc1/.gitattributes
linux-5.17-rc1/.gitignore
linux-5.17-rc1/.mailmap
linux-5.17-rc1/COPYING
linux-5.17-rc1/CREDITS
linux-5.17-rc1/Documentation/
...
</pre></p>
<p>The caps for the client used above are:<br /><pre>
[client.cephprojectspace]
key = xx==
caps mds = "allow rw path=/volumes/_nogroup/xxx"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs_data namespace=fsvolumens_xxx"
</pre></p>
<p>Jeff Layton (jlayton@redhat.com), 2022-01-25T18:57:46Z (https://tracker.ceph.com/issues/54013?journal_id=209201):</p>
<p>Thanks. I was able to reproduce this too with the restricted caps. Here's what we see at the syscall level:</p>
<pre>
1834 openat(AT_FDCWD, "linux-5.17-rc1/.gitattributes", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0664) = 4
1834 write(4, "*.c diff=cpp\n*.h diff=cpp\n*."..., 62) = -1 EPERM (Operation not permitted)
</pre>
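<p>The two syscalls in the trace above can be replayed without tar. The following is a minimal sketch, not part of the original report: the function name <code>create_and_write</code> and the file name <code>probe.txt</code> are hypothetical, and the extra open flags from the trace (O_NOCTTY, O_NONBLOCK, O_CLOEXEC) are left out for brevity.</p>

```python
import os


def create_and_write(directory, name="probe.txt", data=b"hello\n"):
    """Create a new file exclusively, then write to it.

    Returns None if the write succeeds, or the errno of the failed write.
    Mirrors the openat(O_WRONLY|O_CREAT|O_EXCL) + write() pair in the trace.
    """
    path = os.path.join(directory, name)
    # The create succeeded in the trace; only the following write failed.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o664)
    try:
        os.write(fd, data)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
```

<p>Calling <code>create_and_write("/mnt/test")</code> on an affected mount would be expected to return <code>errno.EPERM</code>, matching the trace; on a mount with wsync it should return None.</p>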
<p>The problem is the call from ceph_write_iter to ceph_pool_perm_check, which returns -EPERM in some cases.</p>
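<p>To see why a namespace-restricted OSD cap can turn into -EPERM on write, here is a deliberately simplified model. It is not the kernel's ceph_pool_perm_check code: <code>OsdCap</code> and <code>check_write</code> are hypothetical names, and the model covers only a single grant that names exactly one pool and one namespace, like the <code>caps osd</code> line shown earlier in this thread.</p>

```python
import errno
from dataclasses import dataclass


@dataclass(frozen=True)
class OsdCap:
    """Hypothetical stand-in for one OSD cap grant naming a pool and a namespace."""
    pool: str
    namespace: str
    perms: str  # e.g. "rw"


def check_write(cap, pool, namespace):
    """Return 0 if the grant covers a write to (pool, namespace), else -EPERM."""
    if "w" in cap.perms and cap.pool == pool and cap.namespace == namespace:
        return 0
    return -errno.EPERM


cap = OsdCap(pool="cephfs_data", namespace="fsvolumens_xxx", perms="rw")

# A file layout that carries the subvolume's namespace is covered by the grant.
allowed = check_write(cap, "cephfs_data", "fsvolumens_xxx")

# A layout whose namespace is empty (assumed here to mean the pool's default
# namespace) does not match the grant, so the write is refused.
denied = check_write(cap, "cephfs_data", "")
```

<p>In this model, any inode whose layout does not carry the expected namespace fails the check even though the pool itself is correct.</p>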
<p>I haven't tracked it down fully yet, but I suspect the problem is that we're not getting the inherited layout correct on an async create. Still trying to confirm that, however.</p>
<p>Jeff Layton (jlayton@redhat.com), 2022-01-25T21:37:32Z (https://tracker.ceph.com/issues/54013?journal_id=209209):</p>
<p>Ok, the problem is that we weren't filling out the pool_ns in the inode info for new inodes. Patch posted to the ceph-devel mailing list:</p>
<pre><code><a class="external" href="https://lore.kernel.org/ceph-devel/20220125211022.114286-1-jlayton@kernel.org/T/#u">https://lore.kernel.org/ceph-devel/20220125211022.114286-1-jlayton@kernel.org/T/#u</a></code></pre>
<p>Dan van der Ster, 2022-01-26T08:27:43Z (https://tracker.ceph.com/issues/54013?journal_id=209214):</p>
<p>Filed a Stream 8 bug: <a class="external" href="https://bugzilla.redhat.com/show_bug.cgi?id=2046021">https://bugzilla.redhat.com/show_bug.cgi?id=2046021</a></p>
<p>Jeff Layton (jlayton@redhat.com), 2022-01-27T14:38:45Z (https://tracker.ceph.com/issues/54013?journal_id=209298):</p>
<p>Thanks. We did a bit of investigation into why our QA didn't catch this. Ceph has a bajillion different options and config knobs, and we simply can't run every possible test with every possible permutation. We currently rely on random selections for certain options (like wsync/nowsync).</p>
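<p>A rough illustration of how random option selection can leave a combination untested, compared with enumerating the combinations outright. This is a sketch only: the option names and values in <code>options</code> are illustrative assumptions, not the actual QA suite configuration, and <code>miss_probability</code> assumes each run picks its options independently.</p>

```python
from itertools import product


def miss_probability(p_select, runs):
    """Chance that a combination picked with probability p_select per run
    never appears across `runs` independent runs."""
    return (1.0 - p_select) ** runs


def all_combinations(options):
    """Enumerate every combination of option values so none is left to chance."""
    names = sorted(options)
    return [
        dict(zip(names, values))
        for values in product(*(options[n] for n in names))
    ]


# Hypothetical option matrix, for illustration only.
options = {
    "dirops": ["wsync", "nowsync"],
    "subvolume": ["default", "namespace-isolated"],
}

combos = all_combinations(options)
```

<p>With two binary options there are only four combinations to enumerate, but the count multiplies with every added option, which is why exhaustive runs stop being practical.</p>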
<p>As far as we can tell, the specific test that exercises --namespace-isolated subvolumes just never got run with async dirops, due to pure dumb luck. We're planning to discuss how we can identify these sorts of coverage gaps and close them, but that may be tough given the limits of our testing infrastructure.</p>
<p>Dan van der Ster, 2022-02-09T09:54:35Z (https://tracker.ceph.com/issues/54013?journal_id=210100):</p>
<p>FTR, I re-tested with kernel 5.16.7 and it looks fixed (the fix landed in 5.16.5). Thanks, Jeff!</p>
<p>Jeff Layton (jlayton@redhat.com), 2022-06-01T17:09:39Z (https://tracker.ceph.com/issues/54013?journal_id=217020):</p>
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul>