Bug #18968

closed

mon changing client global_id on restart (failure in TestVolumeClient.test_data_isolated)

Added by John Spray about 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: MonClient
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-deploy
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This FS test restarts a mon while a client is mounted. The test then hangs because that client appears to be failing to revoke caps. Inspecting the logs, however, shows that the client is fine: the mon has changed its global_id, and so the MDS is confused.

Here's an example on master:
http://pulpito.ceph.com/jspray-2017-02-17_09:45:21-fs-master-distro-basic-smithi/824691/


Related issues 1 (0 open, 1 closed)

Has duplicate Ceph - Bug #19028: LibRadosLockECPP.BreakLockPP and LibRadosLockECPP.ListLockersPP failure (Duplicate, Kefu Chai, 02/21/2017)

Actions #1

Updated by John Spray about 7 years ago

Here it is: client.4219 sends an auth message and gets its global_id changed to 14100:

2017-02-17 09:52:47.924246 7fcdfd92a700  1 -- 172.21.15.1:6789/0 <== client.4219 172.21.15.27:0/2054482199 3 ==== auth(proto 2 165 bytes epoch 0) v1 ==== 195+0+0 (3567574672 0 0) 0x7fce0f207180 con 0x7fce0f245900
2017-02-17 09:52:47.924266 7fcdfd92a700 20 mon.a@0(leader) e2 _ms_dispatch existing session 0x7fce0f0f7c80 for client.4219 172.21.15.27:0/2054482199
2017-02-17 09:52:47.924272 7fcdfd92a700 20 mon.a@0(leader) e2  caps allow r
2017-02-17 09:52:47.924275 7fcdfd92a700 10 mon.a@0(leader).paxosservice(auth 1..11) dispatch 0x7fce0f207180 auth(proto 2 165 bytes epoch 0) v1 from client.4219 172.21.15.27:0/2054482199 con 0x7fce0f245900
2017-02-17 09:52:47.924282 7fcdfd92a700  5 mon.a@0(leader).paxos(paxos active c 1..169) is_readable = 1 - now=2017-02-17 09:52:47.924283 lease_expire=0.000000 has v0 lc 169
2017-02-17 09:52:47.924290 7fcdfd92a700 10 mon.a@0(leader).auth v11 preprocess_query auth(proto 2 165 bytes epoch 0) v1 from client.4219 172.21.15.27:0/2054482199
2017-02-17 09:52:47.924295 7fcdfd92a700 10 mon.a@0(leader).auth v11 prep_auth() blob_size=165
2017-02-17 09:52:47.924298 7fcdfd92a700 10 cephx server client.0: handle_request get_principal_session_key
2017-02-17 09:52:47.924301 7fcdfd92a700 10 cephx: verify_authorizer decrypted service auth secret_id=2
2017-02-17 09:52:47.924344 7fcdfd92a700 10 cephx: verify_authorizer global_id=14100
2017-02-17 09:52:47.924363 7fcdfd92a700 10 cephx: verify_authorizer ok nonce 2ae8944a625558ec reply_bl.length()=36
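
To spot the same pattern in other mon logs, something like the following Python sketch might help. It is not part of Ceph: the helper name find_global_id_changes and both regular expressions are assumptions based only on the line format in the excerpt above, and it naively pairs each "verify_authorizer global_id=" line with the most recent incoming auth message, which only holds if the log is not interleaved across sessions.

```python
import re

# Assumed format: client entity name on an incoming auth message line.
AUTH_MSG = re.compile(r"<== client\.(\d+) \S+ \d+ ==== auth\(")
# Assumed format: global_id reported by cephx after verifying the authorizer.
VERIFIED_ID = re.compile(r"verify_authorizer global_id=(\d+)")


def find_global_id_changes(lines):
    """Yield (old_id, new_id) pairs where the verified global_id differs
    from the client id seen on the most recent incoming auth message."""
    pending = None  # id from the most recent auth message, if any
    for line in lines:
        m = AUTH_MSG.search(line)
        if m:
            pending = int(m.group(1))
            continue
        m = VERIFIED_ID.search(line)
        if m and pending is not None:
            new_id = int(m.group(1))
            if new_id != pending:
                yield pending, new_id
            pending = None


# Two lines trimmed from the excerpt in this ticket.
sample = [
    "2017-02-17 09:52:47.924246 7fcdfd92a700  1 -- 172.21.15.1:6789/0 "
    "<== client.4219 172.21.15.27:0/2054482199 3 ==== "
    "auth(proto 2 165 bytes epoch 0) v1 ====",
    "2017-02-17 09:52:47.924344 7fcdfd92a700 10 cephx: "
    "verify_authorizer global_id=14100",
]

for old_id, new_id in find_global_id_changes(sample):
    print(f"client.{old_id} was given global_id {new_id}")
```

A stricter version would also key on the thread id or connection pointer in each line before pairing the two messages.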
Actions #2

Updated by Kefu Chai about 7 years ago

  • Project changed from CephFS to Ceph
  • Category set to MonClient
  • Assignee set to Kefu Chai
  • Priority changed from Normal to Immediate
Actions #3

Updated by Kefu Chai about 7 years ago

  • Status changed from New to 7
Actions #4

Updated by Kefu Chai about 7 years ago

  • Status changed from 7 to Resolved
  • ceph-qa-suite ceph-deploy added
Actions #5

Updated by Kefu Chai about 7 years ago

  • Has duplicate Bug #19028: LibRadosLockECPP.BreakLockPP and LibRadosLockECPP.ListLockersPP failure added