Bug #43761

mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore

Added by Yoann Moulin about 1 month ago. Updated about 15 hours ago.

Status:
Triaged
Priority:
Urgent
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDSMonitor
Labels (FS):
Pull request ID:
Crash signature:

Description

Hello,

I noticed a regression with the "ceph fs authorize" command: it no longer grants sufficient access rights to write on the CephFS volume.

dslab2020@icitsrv5:~$ ceph -s
  cluster:
    id:     fded5bb5-62c5-4a88-b62c-0986d7c7ac09
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster039,iccluster041,iccluster042 (age 46h)
    mgr: iccluster039(active, since 45h), standbys: iccluster041, iccluster042
    mds: cephfs:3 {0=iccluster043=up:active,1=iccluster041=up:active,2=iccluster042=up:active}
    osd: 24 osds: 24 up (since 46h), 24 in (since 46h)
    rgw: 1 daemon active (iccluster043.rgw0)

  data:
    pools:   9 pools, 568 pgs
    objects: 800 objects, 431 KiB
    usage:   24 GiB used, 87 TiB / 87 TiB avail
    pgs:     568 active+clean

dslab2020@icitsrv5:~$ ceph fs status
cephfs - 3 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster043 | Reqs:    0 /s |   44  |   22  |
|  1   | active | iccluster041 | Reqs:    0 /s |   12  |   16  |
|  2   | active | iccluster042 | Reqs:    0 /s |   11  |   14  |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 5193k | 27.6T |
|   cephfs_data   |   data   |    0  | 27.6T |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
dslab2020@icitsrv5:~$ ceph osd pool ls detail
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 83 lfor 0/0/81 flags hashpspool stripe_width 0 expected_num_objects 1 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 48 flags hashpspool stripe_width 0 expected_num_objects 1 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 51 flags hashpspool stripe_width 0 application rgw
pool 4 'defaults.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 78 lfor 0/0/76 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 53 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 55 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 57 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 87 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 90 flags hashpspool stripe_width 0 application rgw

dslab2020@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX
dslab2020@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test" 
    caps mon = "allow r" 
    caps osd = "allow rw tag cephfs data=cephfs" 
root@icitsrv5:~# ceph --cluster dslab2020 auth get-key client.test > /etc/ceph/dslab2020.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/dslab2020/test ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/dslab2020.client.test.secret  iccluster039.iccluster.epfl.ch,iccluster041.iccluster.epfl.ch,iccluster042.iccluster.epfl.ch:/test /mnt/dslab2020/test
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
root@icitsrv5:~# echo "test" > /mnt/dslab2020/test/foo
-bash: echo: write error: Operation not permitted
root@icitsrv5:~# ls -al /mnt/dslab2020/test
total 4
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:14 ..
-rw-r--r-- 1 root root    0 Jan 23 07:21 foo

To be able to write on the CephFS volume, I must change the caps with:

$ ceph auth caps client.test mds "allow rw path=/test" mon "allow r" osd "allow class-read object_prefix rbd_children, allow rw pool=cephfs_data" 
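
To double-check the workaround (not part of the original report), one can re-read the client's caps and retry the write that failed above, assuming the same client.test mount at /mnt/dslab2020/test:

# confirm the updated caps
ceph auth get client.test
# retry the write that previously failed with "Operation not permitted"
echo "test" > /mnt/dslab2020/test/foo && cat /mnt/dslab2020/test/foo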

I have a similar cluster, installed one week earlier with the same version, that does not show this behaviour. Both were deployed with the ceph-ansible playbook stable-4.0.

The other cluster (artemis), where the "ceph fs authorize" command works:

artemis@icitsrv5:~$ ceph -s
  cluster:
    id:     815ea021-7839-4a63-9dc1-14f8c5feecc6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum iccluster003,iccluster005,iccluster007 (age 6d)
    mgr: iccluster021(active, since 5d), standbys: iccluster023
    mds: cephfs:1 {0=iccluster013=up:active} 2 up:standby
    osd: 80 osds: 80 up (since 6d), 80 in (since 6d); 68 remapped pgs
    rgw: 8 daemons active (iccluster003.rgw0, iccluster005.rgw0, iccluster007.rgw0, iccluster013.rgw0, iccluster015.rgw0, iccluster019.rgw0, iccluster021.rgw0, iccluster023.rgw0)

  data:
    pools:   9 pools, 1592 pgs
    objects: 41.82M objects, 103 TiB
    usage:   149 TiB used, 292 TiB / 442 TiB avail
    pgs:     22951249/457247397 objects misplaced (5.019%)
             1524 active+clean
             55   active+remapped+backfill_wait
             13   active+remapped+backfilling

  io:
    client:   0 B/s rd, 7.6 MiB/s wr, 0 op/s rd, 201 op/s wr
    recovery: 340 MiB/s, 121 objects/s

artemis@icitsrv5:~$ ceph fs status
cephfs - 4 clients
======
+------+--------+--------------+---------------+-------+-------+
| Rank | State  |     MDS      |    Activity   |  dns  |  inos |
+------+--------+--------------+---------------+-------+-------+
|  0   | active | iccluster013 | Reqs:   10 /s |  346k |  337k |
+------+--------+--------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata |  751M | 81.1T |
|   cephfs_data   |   data   | 14.2T |  176T |
+-----------------+----------+-------+-------+
+--------------+
| Standby MDS  |
+--------------+
| iccluster019 |
| iccluster015 |
+--------------+
MDS version: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
artemis@icitsrv5:~$ ceph osd pool ls detail
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 125 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 128 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 130 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 131 flags hashpspool stripe_width 0 application rgw
pool 7 'cephfs_data' erasure size 11 min_size 9 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 204 lfor 0/0/199 flags hashpspool,ec_overwrites stripe_width 32768 application cephfs
pool 8 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 144 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 9 'default.rgw.buckets.data' erasure size 11 min_size 9 crush_rule 2 object_hash rjenkins pg_num 1024 pgp_num 808 pgp_num_target 1024 autoscale_mode warn last_change 2982 lfor 0/0/180 flags hashpspool stripe_width 32768 application rgw
pool 10 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 171 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 176 flags hashpspool stripe_width 0 application rgw

artemis@icitsrv5:~$ ceph fs authorize cephfs client.test /test rw
[client.test]
    key = XXX 
artemis@icitsrv5:~$ ceph auth get client.test
exported keyring for client.test
[client.test]
    key = XXX
    caps mds = "allow rw path=/test" 
    caps mon = "allow r" 
    caps osd = "allow rw tag cephfs data=cephfs" 
root@icitsrv5:~# ceph --cluster artemis auth get-key client.test > /etc/ceph/artemis.client.test.secret
root@icitsrv5:~# mkdir -p /mnt/artemis/test/ ; mount -t ceph -o rw,relatime,name=test,secretfile=/etc/ceph/artemis.client.test.secret iccluster003.iccluster.epfl.ch,iccluster005.iccluster.epfl.ch,iccluster007.iccluster.epfl.ch:/test /mnt/artemis/test/
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
root@icitsrv5:~# echo "test" > /mnt/artemis/test/foo 
root@icitsrv5:~# ls -la /mnt/artemis/test/
total 5
drwxr-xr-x 1 root root    1 Jan 23 07:21 .
drwxr-xr-x 3 root root 4096 Jan 23 07:15 ..
-rw-r--r-- 1 root root    5 Jan 23 07:21 foo

What I did to set up an EC pool for cephfs_data:

# must stop mds servers
ansible -i ~/iccluster/ceph-config/cluster-artemis/inventory mdss -m shell -a " systemctl stop ceph-mds.target" 

# must allow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=true'

ceph --cluster artemis fs rm cephfs --yes-i-really-mean-it

#delete cephfs pool
ceph --cluster artemis osd pool rm cephfs_data  cephfs_data --yes-i-really-really-mean-it --yes-i-really-really-mean-it
ceph --cluster artemis osd pool rm cephfs_metadata  cephfs_metadata --yes-i-really-really-mean-it --yes-i-really-really-mean-it

# disallow pool deletion
ceph --cluster artemis tell mon.\* injectargs '--mon-allow-pool-delete=false'

# create erasure coding profile
# ecpool-8-3 for apollo cluster
ceph --cluster artemis osd erasure-code-profile set ecpool-8-3 k=8 m=3 crush-failure-domain=host

# re-create the pools for cephfs
# cephfs_data in erasure coding
ceph --cluster artemis osd pool create cephfs_data 64 64 erasure ecpool-8-3
# cephfs_metadata must be a replicated pool!
ceph --cluster artemis osd pool create cephfs_metadata 8 8

# must set allow_ec_overwrites to be able to create a cephfs over EC pool
ceph --cluster artemis osd pool set cephfs_data allow_ec_overwrites true

# create the cephfs filesystem named "cephfs" 
ceph --cluster artemis fs new cephfs cephfs_metadata cephfs_data
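
As an optional sanity check (not part of the original recipe), the EC overwrite flag and the pool application metadata that tag-based OSD caps rely on can be verified afterwards:

# the data pool must have EC overwrites enabled for CephFS
ceph --cluster artemis osd pool get cephfs_data allow_ec_overwrites
# the application metadata is what "allow rw tag cephfs data=cephfs" matches on
ceph --cluster artemis osd pool application get cephfs_data
ceph --cluster artemis osd pool application get cephfs_metadata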

Best regards,

Yoann

History

#1 Updated by Greg Farnum 29 days ago

  • Project changed from Ceph to fs

#2 Updated by Yoann Moulin 29 days ago

Hello,

From the mailing list: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDVMYGCUTALACPFAJYITLOHJ/
I got the answer to fix this issue.

Previously, I had:

$ ceph osd pool ls detail --format=json | jq '.[] | select(.pool_name| startswith("cephfs")) | .pool_name, .application_metadata'
"cephfs_data" 
{
   "cephfs": {}
}
"cephfs_metadata" 
{
    "cephfs": {}
}

Then Frank Schilder and Ilya Dryomov gave me the solution; I ran these commands:

  ceph osd pool application set cephfs_data cephfs data cephfs
  ceph osd pool application set cephfs_metadata cephfs metadata cephfs

and now I have this:

  "cephfs_data" 
  {
    "cephfs": {
      "data": "cephfs" 
    }
  }
  "cephfs_metadata" 
  {
    "cephfs": {
      "metadata": "cephfs" 
    }
  }

and it works.
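
For context, the OSD cap "allow rw tag cephfs data=cephfs" only matches pools whose application metadata carries that data=cephfs key, which is why the empty "cephfs": {} shown above left the client unable to write to the data pool. Below is a small sketch (not from the original thread) that re-applies the two commands above only where the key is missing, assuming the pool and filesystem names used throughout this ticket:

# set the missing data/metadata key on the cephfs pools, if absent
for pool in cephfs_data cephfs_metadata; do
    key=data; [ "$pool" = "cephfs_metadata" ] && key=metadata
    cur=$(ceph osd pool ls detail --format=json | \
          jq -r --arg p "$pool" --arg k "$key" \
             '.[] | select(.pool_name == $p) | .application_metadata.cephfs[$k] // empty')
    if [ -z "$cur" ]; then
        ceph osd pool application set "$pool" cephfs "$key" cephfs
    fi
done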

I don't know where these settings should be applied during installation, whether in the ceph-ansible playbook or in Ceph's own install scripts.

Thanks,

Best regards,

#3 Updated by Patrick Donnelly 28 days ago

  • Subject changed from "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore to mon/MDSMonitor: "ceph fs authorize cephfs client.test /test rw" does not give the necessary right anymore
  • Status changed from New to Triaged
  • Assignee set to Ramana Raja
  • Priority changed from Normal to Urgent
  • Target version changed from v14.2.6 to v15.0.0
  • Tags deleted (cephfs)
  • Backport set to nautilus,mimic
  • Component(FS) MDSMonitor added

Ramana, I'm assigning this to you. The bug is arguably in ceph-ansible because it's enabling the application but not setting application pool metadata. We can fix this by having the MDSMonitor set the value if it doesn't already exist?
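
Expressed in CLI terms, a rough sketch of what that monitor-side change would do when a filesystem is created (a hypothetical illustration only; the real fix would live in MDSMonitor, and this assumes "osd pool application get" with an explicit key exits non-zero when the key is missing):

# for a filesystem "cephfs" with pools cephfs_metadata/cephfs_data,
# ensure the tag keys exist so "ceph fs authorize" caps can match
ceph osd pool application get cephfs_data cephfs data >/dev/null 2>&1 || \
    ceph osd pool application set cephfs_data cephfs data cephfs
ceph osd pool application get cephfs_metadata cephfs metadata >/dev/null 2>&1 || \
    ceph osd pool application set cephfs_metadata cephfs metadata cephfs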

#4 Updated by Patrick Donnelly about 15 hours ago

  • Target version changed from v15.0.0 to v16.0.0
  • Backport changed from nautilus,mimic to octopus,nautilus
