Bug #43640

nautilus: qa: test_async_subvolume_rm failure

Added by Ramana Raja 3 months ago. Updated 3 months ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
mgr/volumes
Labels (FS):
Pull request ID:
Crash signature:

Description

2020-01-10T10:23:10.135 INFO:tasks.cephfs_test_runner:======================================================================
2020-01-10T10:23:10.135 INFO:tasks.cephfs_test_runner:ERROR: test_async_subvolume_rm (tasks.cephfs.test_volumes.TestVolumes)
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-yuri6-testing-2020-01-09-1744-nautilus/qa/tasks/cephfs/test_volumes.py", line 787, in test_async_subvolume_rm
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:    self._fs_cmd("subvolume", "rm", self.volname, subvolume)
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-yuri6-testing-2020-01-09-1744-nautilus/qa/tasks/cephfs/test_volumes.py", line 28, in _fs_cmd
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:    return self.mgr_cluster.mon_manager.raw_cluster_cmd("fs", *args)
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-yuri6-testing-2020-01-09-1744-nautilus/qa/tasks/ceph_manager.py", line 1157, in raw_cluster_cmd
2020-01-10T10:23:10.136 INFO:tasks.cephfs_test_runner:    stdout=StringIO(),
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 198, in run
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 433, in run
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:    r.wait()
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 158, in wait
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 180, in _raise_for_status
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:    node=self.hostname, label=self.label
2020-01-10T10:23:10.137 INFO:tasks.cephfs_test_runner:CommandFailedError: Command failed on smithi079 with status 124: u'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs subvolume rm cephfs subvolume_3278'

From /a/yuriw-2020-01-09_22:23:54-fs-wip-yuri6-testing-2020-01-09-1744-nautilus-distro-basic-smithi/4650220/teuthology.log
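The failing command exits with status 124, which is the exit status GNU coreutils `timeout` reports when the wrapped command runs past its limit — here the `timeout 120` wrapper around `ceph fs subvolume rm`, meaning the command hung for over two minutes rather than failing outright. A minimal sketch of that exit-code convention (assuming a Linux host with coreutils `timeout` available):

```python
import subprocess

# GNU coreutils `timeout` exits with status 124 when the command it wraps
# exceeds the time limit. The teuthology wrapper uses `timeout 120`, so
# status 124 here means `ceph fs subvolume rm` never completed in 120s.
result = subprocess.run(["timeout", "1", "sleep", "5"])
print(result.returncode)  # 124: the wrapped `sleep 5` was killed at 1s
```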

History

#1 Updated by Patrick Donnelly 3 months ago

Just the lines from the teuthology log for the mgr connection:

2020-01-10T10:20:27.980 INFO:teuthology.orchestra.run.smithi079.stderr:2020-01-10 10:20:27.979 7efe227fc700  1 --2- 172.21.15.79:0/1827019631 >> [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] conn(0x7efe140167a0 0x7efe14018ca0 unknown :-1 s=NONE pgs=0 cs=0 l=1 rx=0 tx=0).connect
2020-01-10T10:20:27.980 INFO:teuthology.orchestra.run.smithi079.stderr:2020-01-10 10:20:27.979 7efe31f37700  1 --2- 172.21.15.79:0/1827019631 >> [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] conn(0x7efe140167a0 0x7efe14018ca0 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=1 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-01-10T10:20:27.980 INFO:teuthology.orchestra.run.smithi079.stderr:2020-01-10 10:20:27.979 7efe31f37700  1 --2- 172.21.15.79:0/1827019631 >> [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] conn(0x7efe140167a0 0x7efe14018ca0 crc :-1 s=READY pgs=283 cs=0 l=1 rx=0 tx=0).ready entity=mgr.4105 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2020-01-10T10:20:28.266 INFO:teuthology.orchestra.run.smithi079.stderr:2020-01-10 10:20:28.273 7efe34116700  1 -- 172.21.15.79:0/1827019631 --> [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] -- command(tid 0: {"prefix": "fs subvolume rm", "vol_name": "cephfs", "sub_name": "subvolume_3278", "target": ["mgr", ""]}) v1 -- 0x7efe2c110680 con 0x7efe140167a0

And from the mgr log:

2020-01-10 10:20:27.978 7f3a251f6700  1 --2- [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] >>  conn(0x55daff0e3400 0x55db0048a000 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).accept
2020-01-10 10:20:27.978 7f3a249f5700  1 --2- [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] >>  conn(0x55daff0e3400 0x55db0048a000 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-01-10 10:20:27.978 7f3a249f5700 20 AuthRegistry(0x7ffd2b269028) get_handler peer_type 8 method 2 cluster_methods [2] service_methods [2] client_methods [2]
2020-01-10 10:20:27.978 7f3a249f5700 10 cephx: verify_authorizer decrypted service mgr secret_id=2
2020-01-10 10:20:27.978 7f3a249f5700 10 cephx: verify_authorizer global_id=4679
2020-01-10 10:20:27.978 7f3a249f5700 10 cephx: verify_authorizer ok nonce 60ec9846cf7ef634 reply_bl.length()=36
2020-01-10 10:20:27.978 7f3a249f5700 10 mgr.server ms_handle_authentication ms_handle_authentication new session 0x55db008b44e0 con 0x55daff0e3400 entity client.admin addr
2020-01-10 10:20:27.978 7f3a249f5700 10 mgr.server ms_handle_authentication  session 0x55db008b44e0 client.admin has caps allow * 'allow *'
2020-01-10 10:20:27.978 7f3a249f5700  1 --2- [v2:172.21.15.102:6800/33538,v1:172.21.15.102:6801/33538] >> 172.21.15.79:0/1827019631 conn(0x55daff0e3400 0x55db0048a000 crc :-1 s=READY pgs=4 cs=0 l=1 rx=0 tx=0).ready entity=client.4679 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0

From: /ceph/teuthology-archive/yuriw-2020-01-09_22:23:54-fs-wip-yuri6-testing-2020-01-09-1744-nautilus-distro-basic-smithi/4650220/remote/smithi102/log/ceph-mgr.x.log.gz

Looks like the connection stalled, but I don't see how or why from these logs.
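One way to narrow this down when it reproduces is to check whether the `fs subvolume rm` request ever reached the mgr's command handling, or stalled on the connection. A hypothetical helper (the function name and needle string are illustrative, not part of the test suite) that scans a possibly gzipped mgr log for the command payload seen in the client-side `command(tid 0: ...)` line above:

```python
import gzip

def find_command_lines(log_path, needle='"prefix": "fs subvolume rm"'):
    """Return log lines mentioning the command payload.

    Hypothetical diagnostic helper: `needle` matches the JSON prefix the
    client sent. If the mgr log contains no such line, the request stalled
    on the wire; if it does, the stall is inside the mgr's handling.
    """
    opener = gzip.open if log_path.endswith(".gz") else open
    with opener(log_path, "rt", errors="replace") as f:
        return [line.rstrip("\n") for line in f if needle in line]
```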

#2 Updated by Patrick Donnelly 3 months ago

  • Subject changed from nautilus qa: test_async_subvolume_rm failure to nautilus: qa: test_async_subvolume_rm failure
  • Status changed from New to Need More Info

Waiting for this to be reproduced again.

Also available in: Atom PDF