CephFS - multitenancy features


Make CephFS more secure in multi-tenant environments


  • Sage Weil (Red Hat)

Interested Parties

  • Danny Al-Gaaf (Deutsche Telekom)

Current Status

CephFS has a capability defined that is intended to describe what a client is allowed to do. Currently it parses a bunch of stuff, but we only interpret it as a binary value: is this client allowed to mount the file system? If it mounts, it can do anything.
Ceph has similar capabilities that let you lock a client into different rados pools or rados namespaces.
CephFS lets you set a policy on a directory that puts new files in specific rados pools.

Detailed Description

Read-only mount
Make the capability specify that a given client can only mount read-only and cannot make any changes.
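As a sketch of the enforcement side, the MDS would consult the parsed cap before servicing each request and reject anything mutating when the write bit is absent. The names here (MDSCapSpec, MDSOp, allows) are illustrative, not the real MDSAuthCaps API:

```cpp
#include <cassert>

// Hypothetical parsed cap: whether the client may read and/or write.
struct MDSCapSpec {
  bool read = true;
  bool write = true;   // false for a read-only mount
};

enum class MDSOp { Lookup, Read, Write, Mkdir, Unlink, Setattr };

// Gate every operation on the cap: anything that mutates the namespace
// or file data requires the write bit; lookups and reads only need read.
bool allows(const MDSCapSpec &cap, MDSOp op) {
  switch (op) {
    case MDSOp::Lookup:
    case MDSOp::Read:
      return cap.read;
    default:
      return cap.write;
  }
}
```

The point of routing everything through one check is that a read-only cap then covers data writes and metadata mutations (mkdir, unlink, setattr) uniformly.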

Root squash
Make the capability specify that a given client cannot act as root. This should mimic the NFS semantics. I went hunting for a detailed technical description of what that is, but didn't find anything. Reading the source is probably the best way to establish that. I think it means:

cannot read/write/etc a file that is only readable/writeable/etc by root

Since the cephfs client is trusted to map ops to users, anything non-root is still fair game since the client can claim to be whatever uid/gid own the file. I don't think it is worth trying anything else clever there... at least not until we specify in the capability that we can only be a specific uid.
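A minimal sketch of the NFS-style behavior, assuming it amounts to remapping a claimed uid/gid of 0 to the anonymous id before the usual mode-bit check (NOBODY_UID, squash, and check_access are illustrative names, not MDS code):

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t NOBODY_UID = 65534;
constexpr uint32_t NOBODY_GID = 65534;

struct Cred { uint32_t uid, gid; };

// With root squash active, a client claiming root becomes "nobody",
// so files accessible only to root fail the subsequent check.
Cred squash(Cred c, bool root_squash) {
  if (root_squash && c.uid == 0) c.uid = NOBODY_UID;
  if (root_squash && c.gid == 0) c.gid = NOBODY_GID;
  return c;
}

// Classic owner/group/other mode-bit check; 'want' is a mask of the
// usual permission bits (4 = read, 2 = write, 1 = execute).
bool check_access(Cred c, uint32_t fuid, uint32_t fgid,
                  unsigned mode, unsigned want) {
  unsigned shift = (c.uid == fuid) ? 6 : (c.gid == fgid) ? 3 : 0;
  return ((mode >> shift) & want) == want;
}
```

Under this reading, a 0600 root-owned file is readable by a real root client but not by a squashed one, while a 0644 file stays readable via the "other" bits.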

Path-based mount restriction
This is parsed, but not enforced. Here we should only allow the client to mount a subdirectory (or something beneath it). That means that any open-by-ino needs to ensure that the ino is part of that subtree. I think this can happen in the request path, somewhere in the neighborhood of rdlock_path_pin_ref() or rdlock_path_xlock_dentry().
  • verify that any lookup-by-ino (targeti?) lives in the subtree
  • possibly special-case inodes where we have a remote dentry inside the subtree pointing to something outside of it? At the very least we need to handle inodes that are in a straydir if their previous owner was in the correct tree. This may be tricky to establish :(
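The path check itself is a component-boundary-aware prefix test; a sketch of what the MDS would run in the request path (function name is illustrative):

```cpp
#include <cassert>
#include <string>

// True if `path` is `root` itself or lies beneath it.  Comparing on a
// component boundary so that a client restricted to /tenant1 cannot
// reach /tenant10.
bool in_subtree(const std::string &root, const std::string &path) {
  if (root == "/" || path == root) return true;
  if (path.size() <= root.size()) return false;
  return path.compare(0, root.size(), root) == 0 &&
         path[root.size()] == '/';
}
```

For the open-by-ino case this check cannot run on a client-supplied path; the MDS would instead have to walk the inode's ancestry (or backtrace) to a path and test that, which is where the stray-dir complication above comes in.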

Extend the file layout (ceph_file_layout) to specify a namespace id (uint64_t?)
Right now this layout specifies a data pool number. It could also specify a namespace id, or 0 for none. The actual rados namespaces are described by strings, but we could make the cephfs ones be decimal string forms of the namespace numbers.
We have two unused fields in this struct we could reclaim:

__le32 fl_cas_hash; // never used
__le32 fl_unused; // used to be the preferred pg, removed years ago

Perhaps 2^32 namespaces is enough and we just use one of them.
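A sketch of the 2^32 option, reclaiming the old preferred-pg field as the namespace id (the struct layout here is abbreviated and the field name fl_namespace is an assumption; real on-wire fields are __le32 and need endian handling, which a plain uint32_t stands in for):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using le32 = uint32_t;  // placeholder for the on-wire __le32

struct ceph_file_layout_sketch {
  // ... striping fields elided ...
  le32 fl_cas_hash;    // never used
  le32 fl_namespace;   // was fl_unused (old preferred pg); 0 = no namespace
  le32 fl_pg_pool;     // data pool number
};

// Per the idea above, the rados namespace string is just the decimal
// form of the id; empty string means the default namespace.
std::string layout_namespace(const ceph_file_layout_sketch &l) {
  return l.fl_namespace ? std::to_string(l.fl_namespace) : std::string();
}
```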
This would require changes in
  • mds: parse virtual xattrs for this field
  • mds: use this namespace for all backtrace operations
  • mds: do any file probe or cleanup in this namespace
  • cephfs-journal-tool: use proper namespace for any scavenge/repair operations
  • Client: use this namespace for io, both sync and ObjectCacher-initiated
  • libceph.ko: generic namespace support
  • ceph.ko: use correct namespace for any IO

Work items

Coding tasks

  1. expand MDSAuthCap grammar to capture any/all of the above
  2. root squash
  3. path-based mount restrictions
  4. namespaces for ceph_file_layout
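As a rough shape for item 1, the expanded grammar might parse into a grant structure covering the features above. Field names and the cap-string form (e.g. something like `allow r path=/tenant1 root_squash`) are assumptions here, not a settled syntax:

```cpp
#include <cassert>
#include <string>

// Hypothetical result of parsing one MDS cap grant.
struct MDSCapGrant {
  bool read = true;
  bool write = true;         // read-only mount => write = false
  bool root_squash = false;  // NFS-style root squash
  std::string path = "/";    // subtree the client may mount
};
```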