From remote/smithi029/log/ceph-mon.a.log.gz:
2022-07-24 15:47:30.344 7f52d6623700 10 mon.a@0(leader) e1 handle_auth_request con 0x5568e49bfa80 (start) method 2 payload 34
2022-07-24 15:47:30.344 7f52d6623700 10 mon.a@0(leader).auth v3 _assign_global_id 4560 (max 14096)
2022-07-24 15:47:30.344 7f52d6623700 10 start_session entity_name=client.vol_data_isolated global_id=4560 is_new_global_id=1
2022-07-24 15:47:30.344 7f52d7625700 10 mon.a@0(leader) e1 ms_handle_accept con 0x5568e49be880 no session
2022-07-24 15:47:30.344 7f52d6623700 10 cephx server client.vol_data_isolated: start_session server_challenge aef60bfe7b4573b9
2022-07-24 15:47:30.344 7f52d6623700 10 mon.a@0(leader) e1 handle_auth_request con 0x5568e49bfa80 (more) method 2 payload 2
2022-07-24 15:47:30.344 7f52d6623700 10 cephx server client.vol_data_isolated: handle_request get_auth_session_key for client.vol_data_isolated
2022-07-24 15:47:30.344 7f52d7625700 1 -- [v2:172.21.15.29:3300/0,v1:172.21.15.29:6789/0] <== client.? v1:172.21.15.137:0/2195577994 1 ==== auth(proto 0 42 bytes epoch 0) v1 ==== 72+0+0 (unknown 479712082 0 0) 0x5568e4736d80 con 0x5568e49be880
2022-07-24 15:47:30.344 7f52d6623700 0 cephx server client.vol_data_isolated: handle_request failed to decode CephXAuthenticate: buffer::end_of_buffer
2022-07-24 15:47:30.344 7f52d7625700 10 mon.a@0(leader) e1 _ms_dispatch new session 0x5568e4738400 MonSession(client.? v1:172.21.15.137:0/2195577994 is open , features 0x3ffddff8ffecffff (luminous)) features 0x3ffddff8ffecffff
2022-07-24 15:47:30.344 7f52d7625700 20 mon.a@0(leader) e1 entity_name global_id 0 (none) caps
2022-07-24 15:47:30.344 7f52d7625700 10 mon.a@0(leader).paxosservice(auth 1..3) dispatch 0x5568e4736d80 auth(proto 0 42 bytes epoch 0) v1 from client.? v1:172.21.15.137:0/2195577994 con 0x5568e49be880
2022-07-24 15:47:30.344 7f52d6623700 1 --2- [v2:172.21.15.29:3300/0,v1:172.21.15.29:6789/0] >> conn(0x5568e49bfa80 0x5568e4693100 secure :-1 s=AUTH_ACCEPTING_MORE pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0)._auth_bad_method auth_method 2 r (1) Operation not permitted, allowed_methods [2], allowed_modes [2,1]
2022-07-24 15:47:30.344 7f52d7625700 5 mon.a@0(leader).paxos(paxos active c 1..100) is_readable = 1 - now=2022-07-24 15:47:30.347662 lease_expire=2022-07-24 15:47:34.440663 has v0 lc 100
The monitor logs said it failed when doing the authentication while decoding the auth info, which the payload length was 2 and the method was 2:
2022-07-24 15:47:30.344 7f52d6623700 10 mon.a@0(leader) e1 handle_auth_request con 0x5568e49bfa80 (more) method 2 payload 2
By going through the ceph source code, found that the length of the CephXRequestHeader is 2, that means the client only sent the header, no extra data, it should fail in Line#49 below:
38 int CephxClientHandler::build_request(bufferlist& bl) const
39 {
40 ldout(cct, 10) << "build_request" << dendl;
41
42 if (need & CEPH_ENTITY_TYPE_AUTH) {
43 /* authenticate */
44 CephXRequestHeader header;
45 header.request_type = CEPHX_GET_AUTH_SESSION_KEY;
46 encode(header, bl);
47
48 CryptoKey secret;
49 const bool got = keyring->get_secret(cct->_conf->name, secret);
50 if (!got) {
51 ldout(cct, 20) << "no secret found for entity: " << cct->_conf->name << dendl;
52 return -ENOENT;
53 }
54
The keyring->get_secret(cct->_conf->name, secret); will get the authkey and caps from /etc/ceph/ceph.client.vol_data_isolated.keyring, but it failed.
From the teuthology.log:
2022-07-24T15:47:25.336 INFO:tasks.workunit.client.0.smithi137.stderr:+ sudo python3
2022-07-24T15:47:25.337 INFO:tasks.workunit.client.0.smithi137.stderr:+ import_key vol_data_isolated
2022-07-24T15:47:25.337 INFO:tasks.workunit.client.0.smithi137.stderr:+ local client=vol_data_isolated
2022-07-24T15:47:25.337 INFO:tasks.workunit.client.0.smithi137.stderr:+ '[' -n '' ']'
2022-07-24T15:47:25.400 INFO:tasks.workunit.client.0.smithi137.stderr:+ local keyring=/etc/ceph/ceph.client.vol_data_isolated.keyring
2022-07-24T15:47:25.401 INFO:tasks.workunit.client.0.smithi137.stderr:++ mktemp
2022-07-24T15:47:25.443 INFO:tasks.workunit.client.0.smithi137.stderr:+ local T=/tmp/tmp.t0CieZarQn
2022-07-24T15:47:25.444 INFO:tasks.workunit.client.0.smithi137.stderr:+ tee /tmp/tmp.t0CieZarQn
2022-07-24T15:47:25.444 INFO:tasks.workunit.client.0.smithi137.stderr:Traceback (most recent call last):
2022-07-24T15:47:25.444 INFO:tasks.workunit.client.0.smithi137.stderr: File "<stdin>", line 2, in <module>
2022-07-24T15:47:25.445 INFO:tasks.workunit.client.0.smithi137.stderr:ModuleNotFoundError: No module named 'ceph_volume_client'
2022-07-24T15:47:25.445 INFO:tasks.workunit.client.0.smithi137.stderr:+ sudo touch -- /etc/ceph/ceph.client.vol_data_isolated.keyring
2022-07-24T15:47:25.497 INFO:tasks.workunit.client.0.smithi137.stderr:+ sudo ceph-authtool /etc/ceph/ceph.client.vol_data_isolated.keyring --import-keyring /tmp/tmp.t0CieZarQn
We can see that it failed when generating the keyring contents for client.vol_data_isolated. The ceph_volume_client python module was missing.