Project

General

Profile

Bug #56067

Cephfs data loss with root_squash enabled

Added by Vladimir Brik 6 months ago. Updated 4 months ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
quincy,pacific
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With root_squash client capability enabled, a file is created as a non-root user on one host, appears empty when read on another host (0 size, cat returns nothing)

Steps to reproduce:
$ ssh vbrik@c9-1 'date > /mnt/ceph1-npx/stress-test/vbrik/XXX'
$ ssh vbrik@c9-2 'cat /mnt/ceph1-npx/stress-test/vbrik/XXX'
(no output)

This is on Ceph 16.2.9 with clients running mainline 4.19.246 and EL8 4.18.0-372.9.1 kernels.

The created files are read correctly on the host that created them, but if the file system is remounted the files become empty.

On occasion, the first read on the host that didn't create the file succeeds, but subsequent reads return nothing.

I am attaching mds log with debug_mds=25 for the commands above: c9-1 (writer) client id is 1039563, c9-2 (reader) client id is 1041297, file name is XXX.

ceph-mds.default.ceph1-mon-1.nglojk.log.zst (116 KB) Vladimir Brik, 06/15/2022 04:06 PM


Related issues

Related to CephFS - Feature #42451: mds: add root_squash Resolved
Related to CephFS - Bug #57154: kernel/fuse client using ceph ID with uid restricted MDS caps cannot update caps Fix Under Review

History

#1 Updated by Patrick Donnelly 6 months ago

Try running `sync` after creating the file. It should report errors.

#2 Updated by Vladimir Brik 6 months ago

Running sync after creating the file did not report errors.

#3 Updated by Patrick Donnelly 6 months ago

  • Assignee set to Ramana Raja
  • Target version set to v18.0.0
  • Source set to Community (user)
  • Backport set to quincy,pacific
  • Component(FS) MDS added

Odd. From the log:

2022-06-15T15:52:50.284+0000 7f26c9d35700 10 MDSAuthCap is_capable inode(path /npx/stress-test/vbrik owner 27317:47700 mode 040755) by caller 27317:47700 mask 2 new 33188:0 cap: MDSAuthCaps[allow rw fsname=default path="/npx" root_squash]
...
2022-06-15T15:52:51.312+0000 7f26c9d35700 10 MDSAuthCap is_capable inode(path /npx/stress-test/vbrik/XXX owner 27317:47700 mode 0100644) by caller 0:0 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=default path="/npx" root_squash]
2022-06-15T15:52:51.312+0000 7f26c9d35700 10 mds.0.locker check_access failed, dropping cap update on [inode 0x100458ce0fb [4,head] /npx/stress-test/vbrik/XXX auth v202 ap=2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={1068936=0-4194304@3} caps={1068936=pAsLsXsxFsxcrwb/pAsxXsxFxwb@3},l=1068936 | request=1 lock=1 caps=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x5640cc045700]

Ramana, please take a look.

#4 Updated by Patrick Donnelly 6 months ago

#5 Updated by Ramana Raja 5 months ago

With vstart cluster (ceph main branch), I was able to reproduce the issue with a kernel client (5.17.11-200.fc35.x86_64). The FUSE client works as expected.

$ ./bin/ceph -v
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
ceph version 17.0.0-11507-gbd82d21943f (bd82d21943f9be8e4948542d9cd09ba54735a49a) quincy (dev)

$ uname -r
5.17.11-200.fc35.x86_64

$ ./bin/ceph fs volume ls
[
    {
        "name": "a" 
    }
]

$ ./bin/ceph fs authorize a client.test_a / rw root_squash

$ ./bin/ceph auth get client.test_a
[client.test_a]
    caps mds = "allow rw fsname=a root_squash" 
    caps mon = "allow r fsname=a" 
    caps osd = "allow rw tag cephfs data=a" 
exported keyring for client.test_a

$ sudo ./bin/mount.ceph test_a@.a=/ /mnt/cephfs/

$ pushd /mnt/cephfs/

$ date > file-kc.txt; sync file-kc.txt; cat file-kc.txt
Wed Jul 20 09:26:36 PM EDT 2022

$ ls -lah file-kc.txt
-rw-rw-r--. 1 rraja rraja 0 Jul 20 21:26 file-kc.txt

$ su -c 'echo 3 > /proc/sys/vm/drop_caches'

$ cat file-kc.txt

$ # file-kc.txt is empty

In the MDS logs,

2022-07-20T21:26:36.760-0400 7fb790ee5640  7 mds.0.locker handle_client_caps  on 0x100000001fd tid 21 follows 1 op flush flags 0x1
2022-07-20T21:26:36.760-0400 7fb790ee5640 20 mds.0.3 get_session have 0x565004a3d900 client.4193 10.0.0.148:0/3034637914 state open
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.locker  head inode [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.locker  follows 1 retains pAsxLsXsxFsxcrwb dirty Fw on [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640  7 mds.0.locker  flush client.4193 dirty Fw seq 1 on [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.locker  wanted pAsxXsxFxwb -> pAsxXsxFsxcrwb
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.locker _do_cap_update dirty Fw issued pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb on [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640 20 mds.0.locker inode is file
2022-07-20T21:26:36.760-0400 7fb790ee5640 20 mds.0.locker client has write caps; m->get_max_size=0; old_max=4194304
2022-07-20T21:26:36.760-0400 7fb790ee5640 20 mds.0.3 get_session have 0x565004a3d900 client.4193 10.0.0.148:0/3034637914 state open
2022-07-20T21:26:36.760-0400 7fb790ee5640 20 Session check_access path /file-kc.txt
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 MDSAuthCap is_capable inode(path /file-kc.txt owner 1000:1000 mode 0100664) by caller 0:0 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=a root_squash]
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.locker check_access failed, dropping cap update on [inode 0x100000001fd [2,head] /file-kc.txt auth v248 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4193=0-4194304@1} caps={4193=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@1},l=4193 | request=1 caps=1 0x565004bc3080]
2022-07-20T21:26:36.760-0400 7fb790ee5640 10 mds.0.3 send_message_client_counted client.4193 seq 139 client_caps(flush_ack ino 0x100000001fd 1 seq 1 tid 21 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime 0.000000) v12

In Locker::_do_cap_update, there is the following code

Session *session = mds->get_session(m);
if (session->check_access(in, MAY_WRITE,
m->caller_uid, m->caller_gid, NULL, 0, 0) < 0) {
dout(10) << "check_access failed, dropping cap update on " << *in << dendl;
return false;
}

From the logs, the m->caller_uid and m->caller_gid are 0 and 0. The "root_squash" MDS caps disallows MAY_WRITE for caller_uid and caller_gid 0. So the check_access failed and dropped the cap update. I'm not sure why the m->caller_uid and m->caller_gid are 0 and 0. I would have expected them to be not root.

#6 Updated by Ramana Raja 5 months ago

  • Status changed from New to In Progress

#7 Updated by Greg Farnum 5 months ago

Hmm. Is the kernel client just losing track of root_squash when flushing caps? That is a different path than the more typical mds lookup/setattr paths.

#8 Updated by Ramana Raja 5 months ago

Greg Farnum wrote:

Hmm. Is the kernel client just losing track of root_squash when flushing caps? That is a different path than the more typical mds lookup/setattr paths.

The PR https://github.com/ceph/ceph/pull/36457 introduced root_squash MDS auth caps primarily to prevent actions such as accidental `sudo rm -rf /` . It made changes in MDSAuthCaps to prevent write operations if root_squash is enabled and caller uid/gid is 0. See,
https://github.com/ceph/ceph/pull/36457/commits/e1c52207809cbd6408311b01460d4fcc2f987b4c#diff-f09c3821c9bcbf6583492f18d669d0e23c73420a06e13f3b05b5b5c6526c476aR244

The kernel client when flusing 'Fw' caps sets the message's caller_uid and caller_gid as 0. I think the related kclient code is here, https://github.com/ceph/ceph-client/blob/for-linus/fs/ceph/caps.c#L1292 within static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)

    /*
     * caller_uid/caller_gid (version 7)
     *
     * Currently, we don't properly track which caller dirtied the caps
     * last, and force a flush of them when there is a conflict. For now,
     * just set this to 0:0, to emulate how the MDS has worked up to now.
     */
    ceph_encode_32(&p, 0);
    ceph_encode_32(&p, 0);

As mentioned in Comment#5, on the MDS server side, Locker:: _do_cap_update() drops the cap update as it checks whether MAY_WRITE is possible on the inode with caller_uid and caller_gid as zero.

#9 Updated by Patrick Donnelly 5 months ago

Good work tracking that down Ramana! I don't think it's reasonable to try to require the client mount to keep track of which application uid/gid dirtied data. Perhaps the locker check should be changed?

#10 Updated by Ramana Raja 4 months ago

Patrick Donnelly wrote:

I don't think it's reasonable to try to require the client mount to keep track of which application uid/gid dirtied data. Perhaps the locker check should be changed?

Yeah, Patrick. Maybe I can always pass caller_uid and caller_gid as -1 in the session->check_access() call in Locker::_do_cap_update? It seems like that's what fuse client does,

2022-07-24T13:43:15.672-0400 7eff3f267640  7 mds.0.locker handle_client_caps  on 0x10000000000 tid 2 follows 1 op update flags 0x0
2022-07-24T13:43:15.672-0400 7eff3f267640 20 mds.0.3 get_session have 0x5638a1f09900 client.4169 10.0.0.148:0/2931922343 state open
2022-07-24T13:43:15.672-0400 7eff3f267640 10 mds.0.locker  head inode [inode 0x10000000000 [2,head] /foo-fuse-root-squash.txt auth v2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4169=0-4194304@1} caps={4169=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4169 | request=0 caps=1 dirtyparent=1 dirty=1 0x5638a1ee3700]
2022-07-24T13:43:15.672-0400 7eff3f267640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x10000000000 [2,head] /foo-fuse-root-squash.txt auth v2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4169=0-4194304@1} caps={4169=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4169 | request=0 caps=1 dirtyparent=1 dirty=1 0x5638a1ee3700]
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.locker  follows 1 retains pAsLsXsFsc dirty Fw on [inode 0x10000000000 [2,head] /foo-fuse-root-squash.txt auth v2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4169=0-4194304@1} caps={4169=pAsLsXsFsc/pAsxXsxFxwb@1},l=4169 | request=0 caps=1 dirtyparent=1 dirty=1 0x5638a1ee3700]
2022-07-24T13:43:15.673-0400 7eff3f267640  7 mds.0.locker  flush client.4169 dirty Fw seq 1 on [inode 0x10000000000 [2,head] /foo-fuse-root-squash.txt auth v2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4169=0-4194304@1} caps={4169=pAsLsXsFsc/pAsxXsxFxwb@1},l=4169 | request=0 caps=1 dirtyparent=1 dirty=1 0x5638a1ee3700]
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.locker  wanted pAsxXsxFxwb -> -
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.locker _do_cap_update dirty Fw issued pAsLsXsFsc wanted - on [inode 0x10000000000 [2,head] /foo-fuse-root-squash.txt auth v2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4169=0-4194304@1} caps={4169=pAsLsXsFsc/-@1},l=4169 | request=0 caps=1 dirtyparent=1 dirty=1 0x5638a1ee3700]
2022-07-24T13:43:15.673-0400 7eff3f267640 20 mds.0.locker inode is file
2022-07-24T13:43:15.673-0400 7eff3f267640 20 mds.0.3 get_session have 0x5638a1f09900 client.4169 10.0.0.148:0/2931922343 state open
2022-07-24T13:43:15.673-0400 7eff3f267640 20 Session check_access path /foo-fuse-root-squash.txt
2022-07-24T13:43:15.673-0400 7eff3f267640 10 MDSAuthCap is_capable inode(path /foo-fuse-root-squash.txt owner 1000:1000 mode 0100664) by caller 4294967295:4294967295 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=a root_squash]
2022-07-24T13:43:15.673-0400 7eff3f267640 15 mds.0.cache.ino(0x10000000000) project_inode 0x10000000000
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.cache.dir(0x1) pre_dirty 4
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.cache.den(0x1 foo-fuse-root-squash.txt) pre_dirty [dentry #0x1/foo-fuse-root-squash.txt [2,head] auth (dn sync l) (dversion lock) pv=4 v=2 ino=0x10000000000 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 clientlease=1 0x5638a1e8bb80]
2022-07-24T13:43:15.673-0400 7eff3f267640 10 mds.0.cache.ino(0x10000000000) pre_dirty 4 (current v 2)

Let me check if that works.

#11 Updated by Greg Farnum 4 months ago

Patrick Donnelly wrote:

Good work tracking that down Ramana! I don't think it's reasonable to try to require the client mount to keep track of which application uid/gid dirtied data. Perhaps the locker check should be changed?

It's not really practical for a client to track which uid dirties data, but having the MDS just allow root access for cap updates on a root_squashed cephx cap also doesn't make much sense. Cap updates include operations like truncates!

Despite our documentation, root_squash has pretty well-defined semantics (namely, any uid-0 operation gets mapped to the NOBODY user on the server and is checked on that basis). If we're going to name a feature that way, we should follow those semantics (and indeed, it seems like it would be pretty simple to have the clients do that).

But now I have another question -- does this mean that a kclient which only has access as uid=27 will send cap updates as uid=0 and fail to make those commits as well?

#12 Updated by Ramana Raja 4 months ago

I made the following change to the Locker code, and then checked how kclient and fuse client behaved with root_squash MDS caps enabled.
```
diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
index 102d13ca3c3..01f660bc8be 100644
--- a/src/mds/Locker.cc
+++ b/src/mds/Locker.cc
@ -3877,7 +3877,7 @ bool Locker::_do_cap_update(CInode *in, Capability *cap,

Session *session = mds->get_session(m);
if (session->check_access(in, MAY_WRITE,
- m->caller_uid, m->caller_gid, NULL, 0, 0) < 0) {
+ 4294967295, 4294967295, NULL, 0, 0) < 0) {
dout(10) << "check_access failed, dropping cap update on " << *in << dendl;
return false;
}

```

The fuse client worked as expected as it did before the change.

When using the kclient, the data written by a non-root user wasn't lost after remounting the client or clearing the client's cache. So the change solved this tracker issue. A root user wasn't able to create or remove a file, but was able to write to the file (`sudo vim foo.txt`), and occasionally change the file stamps (`sudo touch foo.txt`).

#13 Updated by Patrick Donnelly 4 months ago

Please open a PR for discussion.

#14 Updated by Ramana Raja 4 months ago

Ramana Raja wrote:

I made the following change to the Locker code, and then checked how kclient and fuse client behaved with root_squash MDS caps enabled.
```
diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
index 102d13ca3c3..01f660bc8be 100644
--- a/src/mds/Locker.cc
+++ b/src/mds/Locker.cc
@ -3877,7 +3877,7 @ bool Locker::_do_cap_update(CInode *in, Capability *cap,

Session *session = mds->get_session(m);
if (session->check_access(in, MAY_WRITE,
- m->caller_uid, m->caller_gid, NULL, 0, 0) < 0) {
+ 4294967295, 4294967295, NULL, 0, 0) < 0) {
dout(10) << "check_access failed, dropping cap update on " << *in << dendl;
return false;
}

```

The fuse client worked as expected as it did before the change.

When using the kclient, the data written by a non-root user wasn't lost after remounting the client or clearing the client's cache. So the change solved this tracker issue. A root user wasn't able to create or remove a file, but was able to write to the file (`sudo vim foo.txt`), and occasionally change the file stamps (`sudo touch foo.txt`).

$ ./bin/ceph fs authorize a client.test_a / rw root_squash
$ sudo ./bin/mount.ceph test_a@.a=/ /mnt/cephfs1/
$ cd /mnt/cephfs1/

$ date > kernel-root-squash-100.txt
$ su -c 'echo 3 > /proc/sys/vm/drop_caches'
$ cat kernel-root-squash-100.txt
Wed Jul 27 08:26:47 PM EDT 2022
$ # non-root user writes are not lost anymore when using the kclient with root_squash MDS caps

$ sudo touch kernel-root-squash-100.txt
touch: setting times of 'kernel-root-squash-100.txt': Permission denied

$ sudo rm kernel-root-squash-100.txt
rm: cannot remove 'kernel-root-squash-100.txt': Permission denied

$ sudo truncate -s 0 kernel-root-squash-100.txt
truncate: failed to truncate 'kernel-root-squash-100.txt' at 0 bytes: Permission denied
$ # truncate as root failed too!

$ sudo vim kernel-root-squash-100.txt
$ # Able edit the file as root user, which I didn't expect to happen. I don't see this behavior with FUSE client.
$ su -c 'echo 3 > /proc/sys/vm/drop_caches'
$ cat kernel-root-squash-100.txt
Hello World! Wed Jul 27 08:26:47 PM EDT 2022

In the mds.log I see,

2022-07-27T20:28:24.100-0400 7fd0192a2640  1 -- [v2:10.0.0.148:6810/3875245833,v1:10.0.0.148:6811/3875245833] <== osd.0 v2:10.0.0.148:6802/2003141 18 ==== osd_op_reply(181 200.00000000 [writefull 0~90 [fadvise_dontneed]] v31'129 uv129 ondisk = 0) v8 ==== 156+0+0 (crc 0 0 0) 0x5600b617ab40 con 0x5600b6395000
2022-07-27T20:28:24.102-0400 7fd015a9b640  1 -- [v2:10.0.0.148:6810/3875245833,v1:10.0.0.148:6811/3875245833] <== client.4163 10.0.0.148:0/3662841394 34266 ==== client_caps(flush ino 0x100000001fb 12 seq 8 tid 46 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=pAsxXsxFsxcrwb follows 1 size 45/0 mtime 2022-07-27T20:28:24.099525-0400 tws 1 xattrs(v=18446637042079226624 l=0)) v10 ==== 236+0+0 (crc 0 0 0) 0x5600b6b00000 con 0x5600b625d400
2022-07-27T20:28:24.102-0400 7fd015a9b640  7 mds.0.locker handle_client_caps  on 0x100000001fb tid 46 follows 1 op flush flags 0x1
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.3 get_session have 0x5600b62f0000 client.4163 10.0.0.148:0/3662841394 state open
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker  head inode [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker  follows 1 retains pAsxLsXsxFsxcrwb dirty Fw on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640  7 mds.0.locker  flush client.4163 dirty Fw seq 8 on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker _do_cap_update dirty Fw issued pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.locker inode is file
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.locker client has write caps; m->get_max_size=0; old_max=4194304
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.3 get_session have 0x5600b62f0000 client.4163 10.0.0.148:0/3662841394 state open
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 Session check_access path /kernel-root-squash-100.txt
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 MDSAuthCap is_capable inode(path /kernel-root-squash-100.txt owner 1000:1000 mode 0100664) by caller 4294967295:4294967295 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=a root_squash]
2022-07-27T20:28:24.102-0400 7fd015a9b640 15 mds.0.cache.ino(0x100000001fb) project_inode 0x100000001fb
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.dir(0x1) pre_dirty 142
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.den(0x1 kernel-root-squash-100.txt) pre_dirty [dentry #0x1/kernel-root-squash-100.txt [2,head] auth (dn sync l) (dversion lock) pv=142 v=140 ino=0x100000001fb state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 clientlease=1 0x5600b6414000]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.ino(0x100000001fb) pre_dirty 142 (current v 140)

#15 Updated by Ramana Raja 4 months ago

Greg Farnum wrote:

But now I have another question -- does this mean that a kclient which only has access as uid=27 will send cap updates as uid=0 and fail to make those commits as well?

Yeah, looks like it. I restricted the kclient's MDS caps to specific uid, and the cap updates were sent as uid=0 and dropped by the MDS.

$ ./bin/ceph auth get-or-create client.uid_1000 mon 'allow r' mds 'allow r, allow rw path=/testdir00 uid=1000' osd 'allow rw'

$ sudo ./bin/mount.ceph uid_1000@.a=/ /mnt/cephfs1/

$ cd /mnt/cephfs1/

$ cd testdir00

$ date > testfile.txt

$ cat testfile.txt
Fri Aug  5 04:35:34 PM EDT 2022

$ stat testfile.txt
  File: testfile.txt
  Size: 32              Blocks: 1          IO Block: 4194304 regular file
Device: 3eh/62d Inode: 1099511628778  Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   rraja)   Gid: ( 1000/   rraja)

$ su -c 'echo 3 > /proc/sys/vm/drop_caches'

$ cat testfile.txt
$ # file is empty

In the MDS logs, I see that the cap updates were dropped

2022-08-05T16:35:36.447-0400 7fbbb7791640  1 -- [v2:10.0.0.148:6810/1949999487,v1:10.0.0.148:6811/1949999487] <== client.4155 10.0.0.148:0/779216324 422 ==== client_caps(update ino 0x100000003ea 1 seq 1 tid 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=pAsxXsxFxwb follows 1 size 32/0 mtime 2022-08-05T16:35:34.681028-0400 xattrs(v=18446630786367633444 l=0)) v10 ==== 236+0+0 (crc 0 0 0) 0x56078e6afc00 con 0x56078e60a400
2022-08-05T16:35:36.447-0400 7fbbb7791640  7 mds.0.locker handle_client_caps  on 0x100000003ea tid 3 follows 1 op update flags 0x1
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 mds.0.3 get_session have 0x56078e5d8000 client.4155 10.0.0.148:0/779216324 state open
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker  head inode [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker  follows 1 retains pAsxLsXsxFsxcrwb dirty Fw on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640  7 mds.0.locker  flush client.4155 dirty Fw seq 1 on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker _do_cap_update dirty Fw issued pAsxLsXsxFsxcrwb wanted pAsxXsxFxwb on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 mds.0.locker inode is file
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 mds.0.locker client has write caps; m->get_max_size=0; old_max=4194304
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 mds.0.3 get_session have 0x56078e5d8000 client.4155 10.0.0.148:0/779216324 state open
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 Session check_access path /testdir00/testfile.txt
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 MDSAuthCap is_capable inode(path /testdir00/testfile.txt owner 1000:1000 mode 0100664) by caller 0:0 mask 2 new 0:0 cap: MDSAuthCaps[allow r, allow rw path="/testdir00" uid=1000]
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker check_access failed, dropping cap update on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.3 send_message_client_counted client.4155 seq 13 client_caps(flush_ack ino 0x100000003ea 1 seq 1 tid 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime 0.000000) v12
2022-08-05T16:35:36.447-0400 7fbbb7791640  1 -- [v2:10.0.0.148:6810/1949999487,v1:10.0.0.148:6811/1949999487] --> 10.0.0.148:0/779216324 -- client_caps(flush_ack ino 0x100000003ea 1 seq 1 tid 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime 0.000000) v12 -- 0x56078e6ae000 con 0x56078e60a400
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker eval 3648 [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640  7 mds.0.locker file_eval wanted=xwb loner_wanted=xwb other_wanted=  filelock=(ifile excl) on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.447-0400 7fbbb7791640 20 mds.0.locker  is excl
2022-08-05T16:35:36.447-0400 7fbbb7791640  7 mds.0.locker file_eval loner_issued=sxcrwb other_issued= xlocker_issued=
2022-08-05T16:35:36.447-0400 7fbbb7791640 10 mds.0.locker simple_eval (iauth excl) on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.448-0400 7fbbb7791640 10 mds.0.locker simple_eval (ilink sync) on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.448-0400 7fbbb7791640 10 mds.0.locker simple_eval (ixattr excl) on [inode 0x100000003ea [2,head] /testdir00/testfile.txt auth v6 s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4155=0-4194304@1} caps={4155=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@1},l=4155 | request=1 caps=1 0x56078e5a1700]
2022-08-05T16:35:36.448-0400 7fbbb7791640 10 mds.0.locker eval done
2022-08-05T16:35:36.452-0400 7fbbbaf98640  1 -- [v2:10.0.0.148:6810/1949999487,v1:10.0.0.148:6811/1949999487] <== osd.0 v2:10.0.0.148:6802/245705 60 ==== osd_op_reply(60 200.00000001 [write 18875~3552 [fadvise_dontneed]] v32'52 uv52 ondisk = 0) v8 ==== 156+0+0 (crc 0 0 0) 0x56078e46d200 con 0x56078e552400
2022-08-05T16:35:36.452-0400 7fbbb1785640 10 MDSIOContextBase::complete: 18C_MDS_openc_finish
2022-08-05T16:35:36.452-0400 7fbbb1785640 10 MDSContext::complete: 18C_MDS_openc_finish

Same issue is hit with a FUSE client with uid restricted MDS caps, where cap updates were sent as uid -1 and dropped by the MDS.

$ ./bin/ceph auth get client.uid_1000
[client.uid_1000]
    key = AQBkfO1iLX+bFRAARQOVtwR7/fnvgOlvNS5qkw==
    caps mds = "allow r, allow rw path=/testdir00 uid=1000" 
    caps mon = "allow r" 
    caps osd = "allow rw" 

$ sudo ./bin/ceph-fuse --id=uid_1000 /mnt/ceph-fuse/

$ cd /mnt/ceph-fuse/testdir00/

$ date > testfuse01.txt

$ cat testfuse01.txt
Fri Aug  5 05:05:21 PM EDT 2022

$ cd /mnt

$ sudo umount /mnt/ceph-fuse/

$ sudo ./bin/ceph-fuse --id=uid_1000 /mnt/ceph-fuse/

$ cd /mnt/ceph-fuse/testdir00/

$ cat testfuse01.txt
$ # file is empty

In the MDS logs, I see the caps being dropped

022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker  head inode [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFscr/pFscr@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker  follows 0 retains pAsLsXsFsc dirty - on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/pFscr@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker  wanted pFscr -> -
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker _do_cap_update dirty - issued pAsLsXsFsc wanted - on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.locker inode is file
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker  i want to change file_max, but lock won't allow it (yet)
2022-08-05T17:06:15.223-0400 7fbbb7791640  7 mds.0.locker file_excl (ifile sync) on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640  7 mds.0.locker get_allowed_caps loner client.4160 allowed=pAsLsXsFsxcrwb, xlocker allowed=pAsLsXsFsxcrwb, others allowed=pAsLsXs on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (ifile excl) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.locker  client.4160 pending pAsLsXsFsc allowed pAsLsXsFsxcrwb wanted -
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.locker   !revoke and new|suppressed|stale, skipping client.4160
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.3 get_session have 0x56078e5d8000 client.4160 10.0.0.148:0/972546133 state open
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 Session check_access path /testdir00/testfuse01.txt
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 MDSAuthCap is_capable inode(path /testdir00/testfuse01.txt owner 1000:1000 mode 0100664) by caller 4294967295:4294967295 mask 2 new 0:0 cap: MDSAuthCaps[allow r, allow rw path="/testdir00" uid=1000]
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker check_access failed, dropping cap update on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (ifile excl) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker eval 3648 [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (ifile excl) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160 | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 10 mds.0.locker eval want loner: client.-1 but failed to set it
2022-08-05T17:06:15.223-0400 7fbbb7791640  7 mds.0.locker file_eval wanted= loner_wanted= other_wanted=  filelock=(ifile excl) on [inode 0x10000000004 [2,head] /testdir00/testfuse01.txt auth v20 DIRTYPARENT s=0 n(v0 1=1+0) (ifile excl) (iversion lock) cr={4160=0-4194304@1} caps={4160=pAsLsXsFsc/-@4},l=4160(-1) | request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x56078e661b80]
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.locker  is excl
2022-08-05T17:06:15.223-0400 7fbbb7791640  7 mds.0.locker file_eval loner_issued=sc other_issued= xlocker_issued=
2022-08-05T17:06:15.223-0400 7fbbb7791640 20 mds.0.locker  should lose it

#16 Updated by Ramana Raja 4 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 47506

#17 Updated by Ramana Raja 4 months ago

Patrick Donnelly wrote:

Please open a PR for discussion.

https://github.com/ceph/ceph/pull/47506 . Please take a look.

#18 Updated by Ramana Raja 4 months ago

Created a separate tracker ticket for the cap updates being dropped for clients using ceph IDs with uid restricted MDS caps,
https://tracker.ceph.com/issues/57154

The issue was described in https://tracker.ceph.com/issues/56067#note-15

#19 Updated by Patrick Donnelly 3 months ago

  • Related to Bug #57154: kernel/fuse client using ceph ID with uid restricted MDS caps cannot update caps added

Also available in: Atom PDF