Bug #43553
mon: client mon_status fails
Status: Closed
% Done: 0%
Source: Q/A
Tags:
Backport: octopus
Regression: No
Severity: 4 - irritation
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-01-10T07:21:42.355 INFO:teuthology.orchestra.run.smithi027:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mon.b mon_status
...
2020-01-10T07:23:42.370 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-01-10T07:23:42.370 ERROR:tasks.mon_thrash.mon_thrasher:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20200109.233534/qa/tasks/mon_thrash.py", line 232, in do_thrash
    self._do_thrash()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20200109.233534/qa/tasks/mon_thrash.py", line 264, in _do_thrash
    s = self.manager.get_mon_status(m)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20200109.233534/qa/tasks/ceph_manager.py", line 2764, in get_mon_status
    out = self.raw_cluster_cmd('tell', 'mon.%s' % mon, 'mon_status')
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20200109.233534/qa/tasks/ceph_manager.py", line 1276, in raw_cluster_cmd
    stdout=StringIO(),
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 198, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 433, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 158, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 180, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed on smithi027 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mon.b mon_status'
...
2020-01-10T07:23:48.133 INFO:tasks.daemonwatchdog.daemon_watchdog:MonitorThrasher failed
2020-01-10T07:23:48.133 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons
From: /ceph/teuthology-archive/pdonnell-2020-01-10_02:39:18-fs-wip-pdonnell-testing-20200109.233534-distro-basic-smithi/4651189/teuthology.log
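The exit status 124 above comes from the GNU coreutils `timeout 120` wrapper, not from the `ceph` CLI itself: `timeout` returns 124 when it kills a command for exceeding its limit, and teuthology then raises CommandFailedError on the nonzero status. A minimal sketch (the helper name `run_with_timeout` is hypothetical, and `sleep 2` stands in for the `ceph tell mon.b mon_status` call from the log) showing how a timeout can be distinguished from an ordinary command failure:

```python
import subprocess

# GNU coreutils `timeout` exits with 124 when the wrapped command is
# killed for running past the limit; any other nonzero status is the
# wrapped command's own exit code.
TIMEOUT_STATUS = 124

def run_with_timeout(cmd, limit_s):
    """Run cmd under `timeout limit_s`; report whether it timed out.

    Hypothetical helper for illustration -- not teuthology's actual API.
    """
    proc = subprocess.run(["timeout", str(limit_s)] + cmd,
                          capture_output=True, text=True)
    return proc.returncode, proc.returncode == TIMEOUT_STATUS

# `sleep 2` under a 1-second limit reproduces the status-124 pattern;
# in the log the wrapped command was `ceph --cluster ceph tell mon.b mon_status`.
rc, timed_out = run_with_timeout(["sleep", "2"], 1)
```

This matches the log: `got remote process result: 124` means the `ceph tell` never returned within its 120-second budget, rather than returning an error of its own.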
Monitor log shows:
2020-01-10T07:21:42.464+0000 7f977475a700 1 --1- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] >> conn(0x561eb0eb9a80 0x561eb3197800 :6789 s=ACCEPTING pgs=0 cs=0 l=0).send_server_banner sd=37 legacy v1:172.21.15.168:6789/0 socket_addr v1:172.21.15.168:6789/0 target_addr v1:172.21.15.27:33884/0
2020-01-10T07:21:42.464+0000 7f977475a700 10 mon.b@1(peon) e1 handle_auth_request con 0x561eb0eb9a80 (start) method 0 payload 0
2020-01-10T07:21:42.464+0000 7f977475a700 10 In get_auth_session_handler for protocol 0
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 ms_handle_accept con 0x561eb0eb9a80 no session
2020-01-10T07:21:42.464+0000 7f977575c700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] <== client.? v1:172.21.15.27:0/4192507940 1 ==== auth(proto 0 30 bytes epoch 0) v1 ==== 60+0+0 (unknown 673663173 0 0) 0x561eb1053f80 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 _ms_dispatch new session 0x561eb1054ac0 MonSession(client.? v1:172.21.15.27:0/4192507940 is open , features 0x3f01cfb8ffadffff (luminous)) features 0x3f01cfb8ffadffff
2020-01-10T07:21:42.464+0000 7f977575c700 20 mon.b@1(peon) e1 entity caps
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).paxosservice(auth 1..110) dispatch 0x561eb1053f80 auth(proto 0 30 bytes epoch 0) v1 from client.? v1:172.21.15.27:0/4192507940 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 5 mon.b@1(peon).paxos(paxos active c 3013..3574) is_readable = 1 - now=2020-01-10T07:21:42.468324+0000 lease_expire=2020-01-10T07:21:47.214219+0000 has v0 lc 3574
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).auth v110 preprocess_query auth(proto 0 30 bytes epoch 0) v1 from client.? v1:172.21.15.27:0/4192507940
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).auth v110 prep_auth() blob_size=30
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).auth v110 _assign_global_id 794284 (max 814096)
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx server client.admin: start_session server_challenge e9971bdbd8f321b0
2020-01-10T07:21:42.464+0000 7f977575c700 2 mon.b@1(peon) e1 send_reply 0x561eb00e2a30 0x561eb311bd40 auth_reply(proto 2 0 (0) Success) v1
2020-01-10T07:21:42.464+0000 7f977575c700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] --> v1:172.21.15.27:0/4192507940 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x561eb311bd40 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] <== client.? v1:172.21.15.27:0/4192507940 2 ==== auth(proto 2 36 bytes epoch 0) v1 ==== 66+0+0 (unknown 2018028300 0 0) 0x561eb311bd40 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x561eb1054ac0 for client.?
2020-01-10T07:21:42.464+0000 7f977575c700 20 mon.b@1(peon) e1 entity client.admin caps
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).paxosservice(auth 1..110) dispatch 0x561eb311bd40 auth(proto 2 36 bytes epoch 0) v1 from client.? v1:172.21.15.27:0/4192507940 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 5 mon.b@1(peon).paxos(paxos active c 3013..3574) is_readable = 1 - now=2020-01-10T07:21:42.468649+0000 lease_expire=2020-01-10T07:21:47.214219+0000 has v0 lc 3574
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).auth v110 preprocess_query auth(proto 2 36 bytes epoch 0) v1 from client.? v1:172.21.15.27:0/4192507940
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon).auth v110 prep_auth() blob_size=36
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx server client.admin: handle_request get_auth_session_key for client.admin
2020-01-10T07:21:42.464+0000 7f977575c700 20 cephx server client.admin: checking key: req.key=c3a9158496aea5b5 expected_key=c3a9158496aea5b5
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx server client.admin: handle_request auth ticket global_id 794284
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx: build_service_ticket_reply encoding 1 tickets with secret AQCSExheOoCKDRAAfTqSvgpxnoX8nfiz6eO8aA==
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx: build_service_ticket service auth secret_id 2 ticket_info.ticket.name=client.admin ticket.global_id 794284
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx keyserverdata: get_caps: name=client.admin
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx keyserverdata: get_secret: num of caps=4
2020-01-10T07:21:42.464+0000 7f977475a700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] >> v1:172.21.15.27:0/4192507940 conn(0x561eb0eb9a80 legacy=0x561eb3197800 unknown :6789 s=STATE_CONNECTION_ESTABLISHED l=1).read_bulk peer close file descriptor 37
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx server client.admin: adding key for service auth
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx keyserverdata: get_caps: name=client.admin
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx keyserverdata: get_secret: num of caps=4
2020-01-10T07:21:42.464+0000 7f977475a700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] >> v1:172.21.15.27:0/4192507940 conn(0x561eb0eb9a80 legacy=0x561eb3197800 unknown :6789 s=STATE_CONNECTION_ESTABLISHED l=1).read_until read failed
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx: build_service_ticket_reply encoding 1 tickets with secret AQAGJhhe56XvGxAAybRxRpL8wgVWQLJF3g1f8A==
2020-01-10T07:21:42.464+0000 7f977575c700 10 cephx: build_service_ticket service auth secret_id 2 ticket_info.ticket.name=client.admin ticket.global_id 794284
2020-01-10T07:21:42.464+0000 7f977475a700 1 --1- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] >> v1:172.21.15.27:0/4192507940 conn(0x561eb0eb9a80 0x561eb3197800 :6789 s=OPENED pgs=2 cs=1 l=1).handle_message read tag failed
2020-01-10T07:21:42.464+0000 7f977475a700 1 --1- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] >> v1:172.21.15.27:0/4192507940 conn(0x561eb0eb9a80 0x561eb3197800 :6789 s=OPENED pgs=2 cs=1 l=1).fault on lossy channel, failing
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 ms_handle_authentication session 0x561eb1054ac0 con 0x561eb0eb9a80 addr v1:172.21.15.27:0/4192507940 MonSession(client.? v1:172.21.15.27:0/4192507940 is open , features 0x3f01cfb8ffadffff (luminous))
2020-01-10T07:21:42.464+0000 7f977575c700 2 mon.b@1(peon) e1 send_reply 0x561eb00e3390 0x561eb311bf80 auth_reply(proto 2 0 (0) Success) v1
2020-01-10T07:21:42.464+0000 7f977575c700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] --> v1:172.21.15.27:0/4192507940 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x561eb311bf80 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 1 -- [v2:172.21.15.168:3300/0,v1:172.21.15.168:6789/0] --> v1:172.21.15.27:0/4192507940 -- mon_map magic: 0 v1 -- 0x561eb0f71000 con 0x561eb0eb9a80
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 ms_handle_reset 0x561eb0eb9a80 v1:172.21.15.27:0/4192507940
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 reset/close on session client.? v1:172.21.15.27:0/4192507940
2020-01-10T07:21:42.464+0000 7f977575c700 10 mon.b@1(peon) e1 remove_session 0x561eb1054ac0 client.? v1:172.21.15.27:0/4192507940 features 0x3f01cfb8ffadffff
From: /ceph/teuthology-archive/pdonnell-2020-01-10_02:39:18-fs-wip-pdonnell-testing-20200109.233534-distro-basic-smithi/4651189/remote/smithi168/log/ceph-mon.b.log.gz
Updated by Patrick Donnelly over 4 years ago
- Project changed from CephFS to RADOS
- Component(RADOS) Monitor added
Updated by Patrick Donnelly almost 4 years ago
- Target version changed from v15.0.0 to v16.0.0
- Backport set to octopus
/ceph/teuthology-archive/yuriw-2020-06-25_22:31:00-fs-octopus-distro-basic-smithi/5180260/teuthology.log
Updated by Neha Ojha about 2 years ago
- Status changed from New to Can't reproduce