Bug #42688


Standard CephFS caps do not allow certain dot files to be written

Added by Markus Kienast over 4 years ago. Updated almost 2 years ago.

Status:
Triaged
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
pacific,octopus,nautilus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
fs
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have repeatedly set up a Ceph Nautilus cluster via MAAS/Juju (openstack-charmers charms), using the latest Ubuntu cloud-archive bionic-train repos.
ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)

My client runs from the same bionic-train cloud-archive and uses the latest Ubuntu 18.04 HWE edge kernel, i.e. kernel version 5.3.0.

I have set up ceph-fs using the openstack-charmers/ceph-fs charm and set up the client auth as follows:
ceph fs authorize ceph-fs client.ltsp-home / rw

I should therefore have general rw access to all of ceph-fs.
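
For completeness, I mounted CephFS roughly like this (the monitor address and secret file path below are placeholders, not copied from the actual deployment):

# kernel client
sudo mount -t ceph MON_HOST:6789:/ /mnt/cephfs -o name=ltsp-home,secretfile=/etc/ceph/client.ltsp-home.secret
# FUSE client
sudo ceph-fuse -n client.ltsp-home /mnt/cephfs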

ceph auth get client.ltsp-home returns:
[client.ltsp-home]
key = OBSCURED==
caps mds = "allow rw"
caps mon = "allow r"
caps osd = "allow rw tag cephfs data=ceph-fs"

I am using a pure HDD + BlueStore DB on NVMe setup with standard 3x replication - no cache tiering, no erasure coding. You will find the specifics at the very end.

I have tested other variants as well, but always with the same result:

The standard caps don't allow me to write certain files. The one thing these files all have in common is that their names begin with a ".":

elias@maas:~/ltsp-scripts$ sudo rsync -axv /home/ubuntu/ /mnt/cephfs/ubuntu/
sending incremental file list
.bash_logout
.bashrc
.profile
rsync: write failed on "/mnt/cephfs/ubuntu/.bash_logout": Operation not permitted (1)
rsync error: error in file IO (code 11) at receiver.c(393) [receiver=3.1.2]

"cp -a" produces the same problem.

It also does not make a difference whether I am using ceph-fuse or the kernel driver, so I presume the problem lies in the MDS caps handling code.
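
The problem can also be checked without rsync; a direct write to a dot file through the mount should show whether it is really the caps and not something rsync-specific (the paths are just the ones from my example above):

echo test > /mnt/cephfs/ubuntu/.testfile   # plain data write to a new dot file
touch /mnt/cephfs/ubuntu/.bash_logout      # create/update without writing data
dmesg | tail                               # look for related messages from the kernel client, if any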

I have tried different ways to set up the caps. Below are the ones that produced the same problem, followed at the very end by the setup that finally worked.

So here are the non-working configs:

client.ltsp-home
key: AQApPMRd4nzxGRAAOzgjaEwEEFAJsC9C7M15dQ==
caps: [mds] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs data=ceph-fs
client.ltsp-home2
key: AQApPMRd4nzxGRAAOzgjaEwEEFAJsC9C7M15dQ==
caps: [mds] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs data=ceph-fs_data
client.ltsp-home3
key: AQApPMRd4nzxGRAAOzgjaEwEEFAJsC9C7M15dQ==
caps: [mds] allow rw, allow rw path=/
caps: [mon] allow r
caps: [osd] allow rw tag cephfs data=ceph-fs
client.ltsp-home4
key: AQApPMRd4nzxGRAAOzgjaEwEEFAJsC9C7M15dQ==
caps: [mds] allow rw, allow rw path=/
caps: [mon] allow r
caps: [osd] allow rw tag cephfs data=ceph-fs_data

And finally the working config:
client.ltsp-home5
key: AQApPMRd4nzxGRAAOzgjaEwEEFAJsC9C7M15dQ==
caps: [mds] allow rw, allow rw path=/
caps: [mon] allow r
caps: [osd] allow rw tag cephfs data=ceph-fs, allow rw pool=ceph-fs_data
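
For anyone hitting the same problem, the working caps above can be applied to an existing client with ceph auth caps (the client name is of course just my example):

ceph auth caps client.ltsp-home5 \
    mds 'allow rw, allow rw path=/' \
    mon 'allow r' \
    osd 'allow rw tag cephfs data=ceph-fs, allow rw pool=ceph-fs_data'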

It is also interesting to mention that, one time, things kept working even after I re-imported the initial caps setup (the first one, derived from your docs) over the now-working client.ltsp-home5. So something must change deep down and stay that way, even when I change the caps back later. This might be a hint that helps you track down the problem. However, I just tried again and could not reproduce this behavior. There is a difference in the setup though, as this time I was using ceph-fuse instead of the kernel driver to mount CephFS.

Anyhow, probably not worth pursuing this part of the problem.

The other part, however, renders the auth setup method recommended in your docs rather useless, so this is probably of some importance to you. By the way, these docs start with the special case of restricting access to just one directory, while general access to the whole filesystem is not covered at all. You might want to change that and document the least restrictive method first, before going into more advanced scenarios.
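
To illustrate what I mean, the docs could first show the general case and only afterwards the restricted one (filesystem and client names here are just examples):

# general rw access to the whole filesystem
ceph fs authorize ceph-fs client.foo / rw
# more restrictive variant, limited to one directory
ceph fs authorize ceph-fs client.bar /somedir rw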

With great appreciation for your work,
I remain with best regards,

Markus Kienast

root@juju-f18b61-1-lxd-0:~# ceph osd lspools
1 ceph-fs_data
2 ceph-fs_metadata
3 rbd
4 squashfs

root@juju-f18b61-1-lxd-0:~# ceph fs ls
name: ceph-fs, metadata pool: ceph-fs_metadata, data pools: [ceph-fs_data ]

root@juju-f18b61-1-lxd-0:~# ceph osd dump | grep 'replicated size'
pool 1 'ceph-fs_data' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 28 flags hashpspool stripe_width 0 target_size_ratio 0.1 application cephfs
pool 2 'ceph-fs_metadata' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 27 flags hashpspool stripe_width 0 target_size_ratio 0.1 application cephfs
pool 3 'rbd' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 2 pgp_num 2 autoscale_mode on last_change 33 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'squashfs' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 2 pgp_num 2 autoscale_mode on last_change 39 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
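
One more thought: as far as I understand it, the tag-based OSD cap ("allow rw tag cephfs data=ceph-fs") only matches pools whose cephfs application metadata carries exactly that data=<fs name> key, so it may be worth comparing the cap against the pool's tags (just a diagnostic idea, I have not verified that this is the cause):

ceph osd pool application get ceph-fs_data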

Actions #1

Updated by Patrick Donnelly over 4 years ago

  • Assignee set to Rishabh Dave
  • Target version changed from v14.2.2 to v15.0.0
  • Start date deleted (11/07/2019)
  • Tags deleted (ceph-fs, cephfs, )
  • Component(FS) MDS added
Actions #2

Updated by Patrick Donnelly over 4 years ago

  • Target version deleted (v15.0.0)
Actions #3

Updated by Markus Kienast almost 4 years ago

The problem persists in Octopus!

Just did a fresh MAAS/Juju CephFS install.
I believe at least the documentation should be updated accordingly. If somebody points me to the right place, I'll submit a patch.

Actions #4

Updated by Patrick Donnelly over 3 years ago

  • Status changed from New to Triaged
  • Target version set to v16.0.0
  • Backport set to octopus,nautilus
Actions #5

Updated by Patrick Donnelly over 3 years ago

  • Target version changed from v16.0.0 to v17.0.0
  • Backport changed from octopus,nautilus to pacific,octopus,nautilus
Actions #6

Updated by Patrick Donnelly almost 2 years ago

  • Target version deleted (v17.0.0)