Bug #43761
mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore
Description
Hello,
I noticed a regression in the "ceph fs authorize" command: the caps it grants are no longer sufficient to write on the CephFS volume.
dslab2020@icitsrv5:~$ ceph -s
  cluster:
    id:     fded5bb5-62c5-4a88-b62c-0986d7c7ac09
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster039,iccluster041,iccluster042 (age 46h)
    mgr: iccluster039(active, since 45h), standbys: iccluster041, iccluster042
    mds: cephfs:3 {0=iccluster043=up:active,1=iccluster041=up:active,2=iccluster042=up:active}
    osd: 24 osds: 24 up (since 46h), 24 in (since 46h)
    rgw: 1 daemon active (iccluster043.rgw0)

  data:
    pools:   9 pools, 568 pgs
    objects: 800 objects, 431 KiB
    usage:   24 GiB used, 87 TiB / 87 TiB avail
    pgs:     568 active+clean

dslab2020@icitsrv5:~$ ceph fs status
cephfs - 3 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster043 | Reqs:    0 /s |   44  |   22  |
|  1   | active | iccluster041 | Reqs:    0 /s |   12  |   16  |
|  2   | active | iccluster042 | Reqs:    0 /s |   11  |   14  |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 5193k | 27.6T |
|   cephfs_data   |   data   |   0   | 27.6T |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)

dslab2020@icitsrv5:~$ ceph osd pool ls detail
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 83 lfor 0/0/81 flags hashpspool stripe_width 0 expected_num_objects 1 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 48 flags hashpspool stripe_width 0 expected_num_objects 1 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 51 flags hashpspool stripe_width 0 application rgw
pool 4 'defaults.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 78 lfor 0/0/76 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 53 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 55 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 57 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 87 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 90 flags hashpspool stripe_width 0 application rgw

dslab2020@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX

dslab2020@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test"
    caps mon = "allow r"
    caps osd = "allow rw tag cephfs data=cephfs"

root@icitsrv5:~# ceph --cluster dslab2020 auth get-key client.test > /etc/ceph/dslab2020.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/dslab2020/test ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/dslab2020.client.test.secret iccluster039.iccluster.epfl.ch,iccluster041.iccluster.epfl.ch,iccluster042.iccluster.epfl.ch:/test /mnt/dslab2020/test
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
root@icitsrv5:~# echo "test" > /mnt/dslab2020/test/foo
-bash: echo: write error: Operation not permitted
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
-rw-r--r-- 1 root root    0 Jan 23 07:21 foo
To be able to write on the CephFS volume, I must change the caps with:
$ ceph auth caps client.test mds "allow rw path=/test" mon "allow r" osd "allow class-read object_prefix rbd_children, allow rw pool=cephfs_data"
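For comparison, the two OSD cap styles match pools differently (a sketch based on the keyrings above; only the comments are mine):

caps osd = "allow rw tag cephfs data=cephfs"   # tag-based: matches any pool whose application metadata contains data=cephfs
caps osd = "allow rw pool=cephfs_data"         # pool-based: matches the pool by name, regardless of application metadata

The tag-based cap that "ceph fs authorize" writes therefore grants nothing when the pool's application metadata lacks the data=cephfs key, which is what happened here.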
I have a similar cluster, installed one week earlier with the same version, that does not show this behaviour. Both were deployed with the ceph-ansible playbook stable-4.0.
The other cluster (artemis), where the "ceph fs authorize" command works:
artemis@icitsrv5:~$ ceph -s
  cluster:
    id:     815ea021-7839-4a63-9dc1-14f8c5feecc6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster003,iccluster005,iccluster007 (age 6d)
    mgr: iccluster021(active, since 5d), standbys: iccluster023
    mds: cephfs:1 {0=iccluster013=up:active} 2 up:standby
    osd: 80 osds: 80 up (since 6d), 80 in (since 6d); 68 remapped pgs
    rgw: 8 daemons active (iccluster003.rgw0, iccluster005.rgw0, iccluster007.rgw0, iccluster013.rgw0, iccluster015.rgw0, iccluster019.rgw0, iccluster021.rgw0, iccluster023.rgw0)

  data:
    pools:   9 pools, 1592 pgs
    objects: 41.82M objects, 103 TiB
    usage:   149 TiB used, 292 TiB / 442 TiB avail
    pgs:     22951249/457247397 objects misplaced (5.019%)
             1524 active+clean
             55   active+remapped+backfill_wait
             13   active+remapped+backfilling

  io:
    client:   0 B/s rd, 7.6 MiB/s wr, 0 op/s rd, 201 op/s wr
    recovery: 340 MiB/s, 121 objects/s

artemis@icitsrv5:~$ ceph fs status
cephfs - 4 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster013 | Reqs:   10 /s |  346k |  337k |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata |  751M | 81.1T |
|   cephfs_data   |   data   | 14.2T |  176T |
+-----------------+----------+-------+-------+
+--------------+
| Standby MDS  |
+--------------+
| iccluster019 |
| iccluster015 |
+--------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)

artemis@icitsrv5:~$ ceph osd pool ls detail
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 125 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 128 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 130 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 131 flags hashpspool stripe_width 0 application rgw
pool 7 'cephfs_data' erasure size 11 min_size 9 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 204 lfor 0/0/199 flags hashpspool,ec_overwrites stripe_width 32768 application cephfs
pool 8 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 144 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 9 'default.rgw.buckets.data' erasure size 11 min_size 9 crush_rule 2 object_hash rjenkins pg_num 1024 pgp_num 808 pgp_num_target 1024 autoscale_mode warn last_change 2982 lfor 0/0/180 flags hashpspool stripe_width 32768 application rgw
pool 10 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 171 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 176 flags hashpspool stripe_width 0 application rgw

artemis@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX

artemis@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test"
    caps mon = "allow r"
    caps osd = "allow rw tag cephfs data=cephfs"

root@icitsrv5:~# ceph --cluster artemis auth get-key client.test > /etc/ceph/artemis.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/artemis/test/ ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/artemis.client.test.secret iccluster003.iccluster.epfl.ch,iccluster005.iccluster.epfl.ch,iccluster007.iccluster.epfl.ch:/test /mnt/artemis/test/
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
root@icitsrv5:~# echo "test" > /mnt/artemis/test/foo
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
-rw-r--r-- 1 root root    5 Jan 23 07:21 foo
What I did to get an EC pool for cephfs_data:
# must stop mds servers
ansible -i ~/iccluster/ceph-config/cluster-artemis/inventory mdss -m shell -a "systemctl stop ceph-mds.target"

# must allow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=true'

ceph --cluster artemis fs rm cephfs --yes-i-really-mean-it

# delete cephfs pools
ceph --cluster artemis osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it --yes-i-really-really-mean-it
ceph --cluster artemis osd pool rm cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it --yes-i-really-really-mean-it

# disallow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=false'

# create erasure coding profile
# ecpool-8-3 for apollo cluster
ceph --cluster artemis osd erasure-code-profile set ecpool-8-3 k=8 m=3 crush-failure-domain=host

# re-create pools for cephfs
# cephfs_data in erasure coding
ceph --cluster artemis osd pool create cephfs_data 64 64 erasure ecpool-8-3
# cephfs_metadata must be replicated!
ceph --cluster artemis osd pool create cephfs_metadata 8 8

# must set allow_ec_overwrites to be able to create a cephfs over an EC pool
ceph --cluster artemis osd pool set cephfs_data allow_ec_overwrites true

# create the cephfs filesystem named "cephfs"
ceph --cluster artemis fs new cephfs cephfs_metadata cephfs_data
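After recreating the filesystem, a couple of read-only checks could have confirmed the setup (a sketch, assuming the pool names above):

# confirm EC overwrites are enabled on the data pool
ceph --cluster artemis osd pool get cephfs_data allow_ec_overwrites
# confirm the filesystem uses the expected pools
ceph --cluster artemis fs ls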
Best regards,
Yoann
Related issues
History
#1 Updated by Greg Farnum about 4 years ago
- Project changed from Ceph to CephFS
#2 Updated by Yoann Moulin about 4 years ago
Hello,
From the mailing list: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDVMYGCUTALACPFAJYITLOHJ/
I got an answer that fixes this issue.
Previously, I had:
$ ceph osd pool ls detail --format=json | jq '.[] | select(.pool_name| startswith("cephfs")) | .pool_name, .application_metadata'
"cephfs_data"
{
  "cephfs": {}
}
"cephfs_metadata"
{
  "cephfs": {}
}
Then Frank Schilder and Ilya Dryomov gave me the solution; I ran these commands:
ceph osd pool application set cephfs_data cephfs data cephfs
ceph osd pool application set cephfs_metadata cephfs metadata cephfs
and now I have this:
"cephfs_data" { "cephfs": { "data": "cephfs" } } "cephfs_metadata" { "cephfs": { "metadata": "cephfs" } }
and it works.
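The same metadata can also be read back without jq, using the application subcommand directly (a side note, not from the original report):

ceph osd pool application get cephfs_data
ceph osd pool application get cephfs_metadata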
I don't know where those settings should be made during the install: in the ceph-ansible playbook or in Ceph's install scripts themselves.
Thanks,
Best regards,
#3 Updated by Patrick Donnelly about 4 years ago
- Subject changed from "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore to mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore
- Status changed from New to Triaged
- Assignee set to Ramana Raja
- Priority changed from Normal to Urgent
- Target version changed from v14.2.6 to v15.0.0
- Tags deleted (cephfs)
- Backport set to nautilus,mimic
- Component(FS) MDSMonitor added
Ramana, I'm assigning this to you. The bug is arguably in ceph-ansible because it's enabling the application but not setting application pool metadata. We can fix this by having the MDSMonitor set the value if it doesn't already exist?
#4 Updated by Patrick Donnelly about 4 years ago
- Target version changed from v15.0.0 to v16.0.0
- Backport changed from nautilus,mimic to octopus,nautilus
#5 Updated by Ramana Raja almost 4 years ago
- Status changed from Triaged to In Progress
- Pull request ID set to 34534
Patrick Donnelly wrote:
Ramana, I'm assigning this to you. The bug is arguably in ceph-ansible because it's enabling the application but not setting application pool metadata. We can fix this by having the MDSMonitor set the value if it doesn't already exist?
Yes, it's in ceph-ansible and in the `fs new` command handling code. I've filed the ceph-ansible bug,
https://github.com/ceph/ceph-ansible/issues/5278
I've submitted a PR to fix the `fs new` command handling code. The issue was that incorrect application metadata was applied on the pools by the `fs new` command if the application `cephfs` was already enabled on the pools passed to the `fs new` command.
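For reference, the bad sequence can presumably be reproduced on a scratch cluster along these lines (a sketch; pool and filesystem names are illustrative):

# enable the cephfs application on the pools before "fs new", as ceph-ansible does
ceph osd pool application enable cephfs_data cephfs
ceph osd pool application enable cephfs_metadata cephfs
# before the fix, "fs new" then left the data/metadata keys unset,
# because the application was already enabled on the pools
ceph fs new cephfs cephfs_metadata cephfs_data
# the tag-based OSD cap from "ceph fs authorize" has nothing to match:
ceph osd pool application get cephfs_data    # shows {"cephfs": {}}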
#6 Updated by Greg Farnum almost 4 years ago
- Status changed from In Progress to Pending Backport
#7 Updated by Nathan Cutler almost 4 years ago
- Copied to Backport #45225: nautilus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore added
#8 Updated by Nathan Cutler almost 4 years ago
- Copied to Backport #45226: octopus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore added
#9 Updated by Nathan Cutler almost 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".