
Bug #43761

mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore

Added by Yoann Moulin 9 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDSMonitor
Labels (FS):
Pull request ID:
Crash signature:

Description

Hello,

I noticed a regression in the "ceph fs authorize" command: the caps it grants are no longer sufficient to write to the CephFS volume.

dslab2020@icitsrv5:~$ ceph -s
  cluster:
    id:     fded5bb5-62c5-4a88-b62c-0986d7c7ac09
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster039,iccluster041,iccluster042 (age 46h)
    mgr: iccluster039(active, since 45h), standbys: iccluster041, iccluster042
    mds: cephfs:3 {0=iccluster043=up:active,1=iccluster041=up:active,2=iccluster042=up:active}
    osd: 24 osds: 24 up (since 46h), 24 in (since 46h)
    rgw: 1 daemon active (iccluster043.rgw0)

  data:
    pools:   9 pools, 568 pgs
    objects: 800 objects, 431 KiB
    usage:   24 GiB used, 87 TiB / 87 TiB avail
    pgs:     568 active+clean

dslab2020@icitsrv5:~$ ceph fs status
cephfs - 3 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster043 | Reqs:    0 /s |   44  |   22  |
|  1   | active | iccluster041 | Reqs:    0 /s |   12  |   16  |
|  2   | active | iccluster042 | Reqs:    0 /s |   11  |   14  |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 5193k | 27.6T |
|   cephfs_data   |   data   |    0  | 27.6T |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
dslab2020@icitsrv5:~$ ceph osd pool ls detail
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 83 lfor 0/0/81 flags hashpspool stripe_width 0 expected_num_objects 1 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 48 flags hashpspool stripe_width 0 expected_num_objects 1 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 51 flags hashpspool stripe_width 0 application rgw
pool 4 'defaults.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 78 lfor 0/0/76 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 53 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 55 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 57 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 87 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 90 flags hashpspool stripe_width 0 application rgw

dslab2020@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX
dslab2020@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test" 
    caps mon = "allow r" 
    caps osd = "allow rw tag cephfs data=cephfs" 
root@icitsrv5:~# ceph --cluster dslab2020 auth get-key client.test > /etc/ceph/dslab2020.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/dslab2020/test ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/dslab2020.client.test.secret  iccluster039.iccluster.epfl.ch,iccluster041.iccluster.epfl.ch,iccluster042.iccluster.epfl.ch:/test /mnt/dslab2020/test
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
root@icitsrv5:~# echo "test" > /mnt/dslab2020/test/foo
-bash: echo: write error: Operation not permitted
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
-rw-r--r-- 1 root root    0 Jan 23 07:21 foo

To be able to write to the CephFS volume, I had to change the caps as follows:

$ ceph auth caps client.test mds "allow rw path=/test" mon "allow r" osd "allow class-read object_prefix rbd_children, allow rw pool=cephfs_data" 

I have a similar cluster, installed one week earlier with the same version, that does not show this behaviour. Both were deployed with the ceph-ansible playbook (stable-4.0).

On the other cluster (artemis), the "ceph fs authorize" command works:

artemis@icitsrv5:~$ ceph -s
  cluster:
    id:     815ea021-7839-4a63-9dc1-14f8c5feecc6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster003,iccluster005,iccluster007 (age 6d)
    mgr: iccluster021(active, since 5d), standbys: iccluster023
    mds: cephfs:1 {0=iccluster013=up:active} 2 up:standby
    osd: 80 osds: 80 up (since 6d), 80 in (since 6d); 68 remapped pgs
    rgw: 8 daemons active (iccluster003.rgw0, iccluster005.rgw0, iccluster007.rgw0, iccluster013.rgw0, iccluster015.rgw0, iccluster019.rgw0, iccluster021.rgw0, iccluster023.rgw0)

  data:
    pools:   9 pools, 1592 pgs
    objects: 41.82M objects, 103 TiB
    usage:   149 TiB used, 292 TiB / 442 TiB avail
    pgs:     22951249/457247397 objects misplaced (5.019%)
             1524 active+clean
             55   active+remapped+backfill_wait
             13   active+remapped+backfilling

  io:
    client:   0 B/s rd, 7.6 MiB/s wr, 0 op/s rd, 201 op/s wr
    recovery: 340 MiB/s, 121 objects/s

artemis@icitsrv5:~$ ceph fs status
cephfs - 4 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster013 | Reqs:   10 /s |  346k |  337k |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata |  751M | 81.1T |
|   cephfs_data   |   data   | 14.2T |  176T |
+-----------------+----------+-------+-------+
+--------------+
| Standby MDS  |
+--------------+
| iccluster019 |
| iccluster015 |
+--------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
artemis@icitsrv5:~$ ceph osd pool ls detail
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 125 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 128 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 130 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 131 flags hashpspool stripe_width 0 application rgw
pool 7 'cephfs_data' erasure size 11 min_size 9 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 204 lfor 0/0/199 flags hashpspool,ec_overwrites stripe_width 32768 application cephfs
pool 8 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 144 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 9 'default.rgw.buckets.data' erasure size 11 min_size 9 crush_rule 2 object_hash rjenkins pg_num 1024 pgp_num 808 pgp_num_target 1024 autoscale_mode warn last_change 2982 lfor 0/0/180 flags hashpspool stripe_width 32768 application rgw
pool 10 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 171 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 176 flags hashpspool stripe_width 0 application rgw

artemis@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX 
artemis@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test" 
    caps mon = "allow r" 
    caps osd = "allow rw tag cephfs data=cephfs" 
root@icitsrv5:~# ceph --cluster artemis auth get-key client.test > /etc/ceph/artemis.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/artemis/test/ ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/artemis.client.test.secret iccluster003.iccluster.epfl.ch,iccluster005.iccluster.epfl.ch,iccluster007.iccluster.epfl.ch:/test /mnt/artemis/test/
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
root@icitsrv5:~# echo "test" > /mnt/artemis/test/foo 
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
-rw-r--r-- 1 root root    5 Jan 23 07:21 foo

What I did to use an EC pool for cephfs_data:

# must stop the mds servers
ansible -i ~/iccluster/ceph-config/cluster-artemis/inventory mdss -m shell -a "systemctl stop ceph-mds.target"

# must allow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=true'

ceph --cluster artemis fs rm cephfs --yes-i-really-mean-it

# delete the cephfs pools
ceph --cluster artemis osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
ceph --cluster artemis osd pool rm cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it

# disallow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=false'

# create the erasure coding profile
# ecpool-8-3 for the apollo cluster
ceph --cluster artemis osd erasure-code-profile set ecpool-8-3 k=8 m=3 crush-failure-domain=host

# re-create the pools for cephfs
# cephfs_data in erasure coding
ceph --cluster artemis osd pool create cephfs_data 64 64 erasure ecpool-8-3
# cephfs_metadata must be replicated!
ceph --cluster artemis osd pool create cephfs_metadata 8 8

# must set allow_ec_overwrites to be able to create a cephfs over an EC pool
ceph --cluster artemis osd pool set cephfs_data allow_ec_overwrites true

# create the cephfs filesystem named "cephfs"
ceph --cluster artemis fs new cephfs cephfs_metadata cephfs_data

Best regards,

Yoann


Related issues

Copied to fs - Backport #45225: nautilus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore Resolved
Copied to fs - Backport #45226: octopus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore Resolved

History

#1 Updated by Greg Farnum 9 months ago

  • Project changed from Ceph to fs

#2 Updated by Yoann Moulin 9 months ago

Hello,

From the mailing list thread https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDVMYGCUTALACPFAJYITLOHJ/ I got an answer that fixes this issue.

Previously, I had:

$ ceph osd pool ls detail --format=json | jq '.[] | select(.pool_name| startswith("cephfs")) | .pool_name, .application_metadata'
"cephfs_data" 
{
   "cephfs": {}
}
"cephfs_metadata" 
{
    "cephfs": {}
}

Frank Schilder and Ilya Dryomov then gave me the solution; I ran these commands:

  ceph osd pool application set cephfs_data cephfs data cephfs
  ceph osd pool application set cephfs_metadata cephfs metadata cephfs

and now I have this:

  "cephfs_data" 
  {
    "cephfs": {
      "data": "cephfs" 
    }
  }
  "cephfs_metadata" 
  {
    "cephfs": {
      "metadata": "cephfs" 
    }
  }

and it works.
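For reference, the reason the original caps failed: the OSD cap `allow rw tag cephfs data=cephfs` only matches pools whose `cephfs` application metadata contains the key/value pair `data=cephfs`, and the empty `{"cephfs": {}}` metadata shown above contains no such pair. A minimal Python sketch of that matching rule (illustrative only, not the actual Ceph cap-matching code; the function name is hypothetical):

```python
# Illustrative sketch of how an OSD cap like
#   "allow rw tag cephfs data=cephfs"
# matches a pool: the pool's application metadata must contain the
# tag's key/value pair under the "cephfs" application.

def tag_cap_matches(pool_app_metadata, key="data", value="cephfs"):
    # Look up the "cephfs" application dict, then the tag key inside it.
    return pool_app_metadata.get("cephfs", {}).get(key) == value

# Before the fix: application enabled, but no key/value pair set.
broken = {"cephfs": {}}
# After `ceph osd pool application set cephfs_data cephfs data cephfs`:
fixed = {"cephfs": {"data": "cephfs"}}

print(tag_cap_matches(broken))  # False -> the OSDs deny writes
print(tag_cap_matches(fixed))   # True  -> the OSDs allow writes
```

This also explains why the pool-based workaround cap (`allow rw pool=cephfs_data`) worked: it names the pool directly and never consults the application metadata.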

I don't know where those settings should be applied during installation: in the ceph-ansible playbook or in Ceph's install scripts themselves.

Thanks,

Best regards,

#3 Updated by Patrick Donnelly 9 months ago

  • Subject changed from "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore to mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore
  • Status changed from New to Triaged
  • Assignee set to Ramana Raja
  • Priority changed from Normal to Urgent
  • Target version changed from v14.2.6 to v15.0.0
  • Tags deleted (cephfs)
  • Backport set to nautilus,mimic
  • Component(FS) MDSMonitor added

Ramana, I'm assigning this to you. The bug is arguably in ceph-ansible because it's enabling the application but not setting application pool metadata. We can fix this by having the MDSMonitor set the value if it doesn't already exist?

#4 Updated by Patrick Donnelly 8 months ago

  • Target version changed from v15.0.0 to v16.0.0
  • Backport changed from nautilus,mimic to octopus,nautilus

#5 Updated by Ramana Raja 6 months ago

  • Status changed from Triaged to In Progress
  • Pull request ID set to 34534

Patrick Donnelly wrote:

Ramana, I'm assigning this to you. The bug is arguably in ceph-ansible because it's enabling the application but not setting application pool metadata. We can fix this by having the MDSMonitor set the value if it doesn't already exist?

Yes, it's in ceph-ansible and in the `fs new` command handling code. I've filed the ceph-ansible bug:
https://github.com/ceph/ceph-ansible/issues/5278

I've submitted a PR to fix the `fs new` command handling code. The issue was that `fs new` applied incorrect application metadata to the pools when the `cephfs` application was already enabled on the pools passed to it.
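The intended behaviour can be sketched like this (a hedged Python illustration of the corrected logic, not the actual C++ MDSMonitor code; the function and variable names are hypothetical):

```python
# Hypothetical sketch of the corrected `fs new` behaviour: always set
# the per-pool application metadata key (data=<fs> or metadata=<fs>),
# even when the "cephfs" application was already enabled on the pool
# (e.g. by ceph-ansible). The buggy path left the metadata as
# {"cephfs": {}}, which defeats tag-based OSD caps.

def enable_cephfs_application(pool_apps, role, fs_name):
    """pool_apps: the pool's application metadata dict.
    role: "data" or "metadata"."""
    cephfs_md = pool_apps.setdefault("cephfs", {})  # enable app if absent
    cephfs_md[role] = fs_name                       # always set the tag
    return pool_apps

# A pool where ceph-ansible already enabled the application:
apps = {"cephfs": {}}
enable_cephfs_application(apps, "data", "cephfs")
print(apps)  # {'cephfs': {'data': 'cephfs'}}
```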

#6 Updated by Greg Farnum 6 months ago

  • Status changed from In Progress to Pending Backport

#7 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #45225: nautilus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore added

#8 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #45226: octopus: mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore added

#9 Updated by Nathan Cutler 6 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
