Bug #18968
closed
mon changing client global_id on restart (failure in TestVolumeClient.test_data_isolated)
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-deploy
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This particular FS test restarts a mon while a client is mounted. The test then hangs because that client appears to be failing to revoke caps. On inspecting the logs, however, the client is fine: the mon has changed its global_id, and the MDS is confused as a result.
Here's an example on master:
http://pulpito.ceph.com/jspray-2017-02-17_09:45:21-fs-master-distro-basic-smithi/824691/
Updated by John Spray about 7 years ago
Here it is: client.4219 sends an auth message and gets its global_id changed to 14100:
2017-02-17 09:52:47.924246 7fcdfd92a700 1 -- 172.21.15.1:6789/0 <== client.4219 172.21.15.27:0/2054482199 3 ==== auth(proto 2 165 bytes epoch 0) v1 ==== 195+0+0 (3567574672 0 0) 0x7fce0f207180 con 0x7fce0f245900
2017-02-17 09:52:47.924266 7fcdfd92a700 20 mon.a@0(leader) e2 _ms_dispatch existing session 0x7fce0f0f7c80 for client.4219 172.21.15.27:0/2054482199
2017-02-17 09:52:47.924272 7fcdfd92a700 20 mon.a@0(leader) e2 caps allow r
2017-02-17 09:52:47.924275 7fcdfd92a700 10 mon.a@0(leader).paxosservice(auth 1..11) dispatch 0x7fce0f207180 auth(proto 2 165 bytes epoch 0) v1 from client.4219 172.21.15.27:0/2054482199 con 0x7fce0f245900
2017-02-17 09:52:47.924282 7fcdfd92a700 5 mon.a@0(leader).paxos(paxos active c 1..169) is_readable = 1 - now=2017-02-17 09:52:47.924283 lease_expire=0.000000 has v0 lc 169
2017-02-17 09:52:47.924290 7fcdfd92a700 10 mon.a@0(leader).auth v11 preprocess_query auth(proto 2 165 bytes epoch 0) v1 from client.4219 172.21.15.27:0/2054482199
2017-02-17 09:52:47.924295 7fcdfd92a700 10 mon.a@0(leader).auth v11 prep_auth() blob_size=165
2017-02-17 09:52:47.924298 7fcdfd92a700 10 cephx server client.0: handle_request get_principal_session_key
2017-02-17 09:52:47.924301 7fcdfd92a700 10 cephx: verify_authorizer decrypted service auth secret_id=2
2017-02-17 09:52:47.924344 7fcdfd92a700 10 cephx: verify_authorizer global_id=14100
2017-02-17 09:52:47.924363 7fcdfd92a700 10 cephx: verify_authorizer ok nonce 2ae8944a625558ec reply_bl.length()=36
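For anyone hunting for the same symptom in other mon logs, here is a minimal sketch of a scan for it. It is not part of Ceph or the QA suite: the function name `find_global_id_changes` and both regular expressions are made up for illustration, inferred only from the two log lines quoted above. It also assumes the auth request and the `verify_authorizer global_id=` line appear in order without unrelated auth traffic interleaved, which may not hold in a busy log.

```python
import re

# Two lines copied from the mon log excerpt above.
LOG_LINES = [
    "2017-02-17 09:52:47.924246 7fcdfd92a700 1 -- 172.21.15.1:6789/0 "
    "<== client.4219 172.21.15.27:0/2054482199 3 ==== "
    "auth(proto 2 165 bytes epoch 0) v1 ==== 195+0+0 "
    "(3567574672 0 0) 0x7fce0f207180 con 0x7fce0f245900",
    "2017-02-17 09:52:47.924344 7fcdfd92a700 10 cephx: "
    "verify_authorizer global_id=14100",
]

# Hypothetical patterns, derived only from the excerpt's line shapes.
SENDER_RE = re.compile(r"<== client\.(\d+) .*==== auth\(")
GLOBAL_ID_RE = re.compile(r"verify_authorizer global_id=(\d+)")


def find_global_id_changes(lines):
    """Return (old_id, new_id) pairs where an incoming auth message from
    client.<old_id> is followed by a verify_authorizer line that reports
    a different global_id."""
    changes = []
    pending = None  # id of the most recent auth sender, if any
    for line in lines:
        sender = SENDER_RE.search(line)
        if sender:
            pending = int(sender.group(1))
            continue
        verified = GLOBAL_ID_RE.search(line)
        if verified and pending is not None:
            new_id = int(verified.group(1))
            if new_id != pending:
                changes.append((pending, new_id))
            pending = None
    return changes


if __name__ == "__main__":
    for old_id, new_id in find_global_id_changes(LOG_LINES):
        print(f"client.{old_id} was handed global_id {new_id}")
```

A more careful version would probably pair lines by the thread id column (`7fcdfd92a700` here) rather than by adjacency.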
Updated by Kefu Chai about 7 years ago
- Project changed from CephFS to Ceph
- Category set to MonClient
- Assignee set to Kefu Chai
- Priority changed from Normal to Immediate
Updated by Kefu Chai about 7 years ago
- Status changed from New to 7
Updated by Kefu Chai about 7 years ago
- Status changed from 7 to Resolved
- ceph-qa-suite ceph-deploy added
Updated by Kefu Chai about 7 years ago
- Has duplicate Bug #19028: LibRadosLockECPP.BreakLockPP and LibRadosLockECPP.ListLockersPP failure added