Bug #51964

qa: test_cephfs_mirror_restart_sync_on_blocklist failure

Added by Patrick Donnelly over 1 year ago. Updated 5 months ago.

Status: New
Priority: Low
Assignee:
Category: Testing
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): cephfs-mirror, qa-suite
Labels (FS): qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-07-29T07:47:34.804 INFO:tasks.cephfs_test_runner:======================================================================
2021-07-29T07:47:34.804 INFO:tasks.cephfs_test_runner:ERROR: test_cephfs_mirror_restart_sync_on_blocklist (tasks.cephfs.test_mirroring.TestMirroring)
2021-07-29T07:47:34.805 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-07-29T07:47:34.805 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-07-29T07:47:34.805 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_c098ff00fff39aa861a531e01a5abb127a724622/qa/tasks/cephfs/test_mirroring.py", line 627, in test_cephfs_mirror_restart_sync_on_blocklist
2021-07-29T07:47:34.806 INFO:tasks.cephfs_test_runner:    "client.mirror_remote@ceph", '/d0', 'snap0', expected_snap_count=1)
2021-07-29T07:47:34.806 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_c098ff00fff39aa861a531e01a5abb127a724622/qa/tasks/cephfs/test_mirroring.py", line 142, in check_peer_status
2021-07-29T07:47:34.806 INFO:tasks.cephfs_test_runner:    f'{fs_name}@{fs_id}', peer_uuid)
2021-07-29T07:47:34.807 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_c098ff00fff39aa861a531e01a5abb127a724622/qa/tasks/cephfs/test_mirroring.py", line 231, in mirror_daemon_command
2021-07-29T07:47:34.807 INFO:tasks.cephfs_test_runner:    check_status=True, label=cmd_label)
2021-07-29T07:47:34.808 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/remote.py", line 509, in run
2021-07-29T07:47:34.808 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2021-07-29T07:47:34.809 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/run.py", line 455, in run
2021-07-29T07:47:34.809 INFO:tasks.cephfs_test_runner:    r.wait()
2021-07-29T07:47:34.810 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/run.py", line 161, in wait
2021-07-29T07:47:34.811 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2021-07-29T07:47:34.811 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/run.py", line 183, in _raise_for_status
2021-07-29T07:47:34.812 INFO:tasks.cephfs_test_runner:    node=self.hostname, label=self.label
2021-07-29T07:47:34.812 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed (peer status for fs: cephfs) on smithi042 with status 22: 'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror peer status cephfs@30 96c863b2-6936-4726-a484-146c9382a70b'

From: /ceph/teuthology-archive/pdonnell-2021-07-29_05:54:43-fs-wip-pdonnell-testing-20210729.025313-distro-basic-smithi/6300469/teuthology.log
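
The failing command above exited with status 22. If that exit status is an errno value (an assumption; the traceback does not say how the exit code is derived), it can be decoded with Python's standard library. A minimal sketch, where `describe_errno` is a hypothetical helper name and not part of teuthology:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return a readable description of an errno value.

    Assumes the numeric status is an errno; that is not confirmed
    by the traceback itself.
    """
    # errno.errorcode maps numeric values to symbolic names such as "EINVAL".
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} ({name}): {os.strerror(code)}"


if __name__ == "__main__":
    # Status reported in the CommandFailedError above.
    print(describe_errno(22))
```

Under that assumption, status 22 maps to EINVAL, which would mean the admin socket command rejected its arguments rather than crashing.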

History

#1 Updated by Patrick Donnelly over 1 year ago

/ceph/teuthology-archive/pdonnell-2021-11-04_15:43:53-fs-wip-pdonnell-testing-20211103.023355-distro-basic-smithi/6485100/teuthology.log

#2 Updated by Kotresh Hiremath Ravishankar about 1 year ago

Also seen in this run

http://pulpito.front.sepia.ceph.com/yuriw-2021-11-08_15:19:37-fs-wip-yuri2-testing-2021-11-06-1322-pacific-distro-basic-smithi/6491186/

/ceph/teuthology-archive/yuriw-2021-11-08_15:19:37-fs-wip-yuri2-testing-2021-11-06-1322-pacific-distro-basic-smithi/6491186/teuthology.log

#3 Updated by Venky Shankar about 1 year ago

I'll take a look at the failure soon.

#4 Updated by Venky Shankar about 1 year ago

The mirror daemon is stuck at mounting the file system:

2021-12-22T16:35:57.556+0000 10697700 20 cephfs::mirror::Utils connect: connecting to cluster=ceph, client=client.mirror, mon_host=
2021-12-22T16:35:57.894+0000 10697700 10 cephfs::mirror::Utils connect: using mon addr=172.21.15.164
2021-12-22T16:35:58.029+0000 10697700 -1 asok(0xa0c6050) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/cephfs-mirror.asok': (17) File exists
2021-12-22T16:35:58.202+0000 10697700 10 cephfs::mirror::Utils connect: connected to cluster=ceph using client=client.mirror
2021-12-22T16:35:58.224+0000 10697700 20 cephfs::mirror::Utils mount: filesystem={fscid=30, fs_name=cephfs}
2021-12-22T16:35:58.558+0000 14ea0700 20 cephfs::mirror::ServiceDaemon: 0xfc08d10 update_status: 1 filesystem(s)
2021-12-22T16:35:58.610+0000 10e98700 20 cephfs::mirror::Mirror update_fs_mirrors

... ceph_mount() never returned - will rerun with "debug client = 20".
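
The excerpt above also records an admin socket bind failure ("(17) File exists"). Whether it is related to the stuck mount is not established here. To check whether other runs show the same line, a log can be scanned with a small script. This is only a sketch: `find_bind_failures` is a hypothetical helper, and the pattern is taken from the single log line quoted above, so it may not match other variants of the message.

```python
import re

# Pattern derived from the one bind-failure line quoted in this ticket.
BIND_FAILURE = re.compile(
    r"failed to bind the UNIX domain socket to '(?P<path>[^']+)': "
    r"\((?P<errno>\d+)\) (?P<reason>.+)$"
)


def find_bind_failures(lines):
    """Return (socket_path, errno, reason) for each bind-failure log line."""
    hits = []
    for line in lines:
        match = BIND_FAILURE.search(line)
        if match:
            hits.append(
                (match["path"], int(match["errno"]), match["reason"].strip())
            )
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_bind_failures.py <path to a cephfs-mirror log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for path, code, reason in find_bind_failures(log):
            print(f"{path}: errno {code} ({reason})")
```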

#5 Updated by Venky Shankar about 1 year ago

Unable to hit this consistently: https://pulpito.ceph.com/vshankar-2022-01-05_05:43:45-fs-wip-vshankar-testing-20220103-142738-testing-default-smithi/

We should probably include "debug client: 20" for cephfs-mirror tests so that we have debug logs to look at when we hit this again.
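
For reference, the setting in ceph.conf form can be rendered as below. This is a sketch under assumptions: it places the option in the `[client]` section, and `render_client_debug_conf` is a hypothetical helper. How the qa suite would actually apply the setting to the cephfs-mirror tests is not covered here.

```python
import configparser
import io


def render_client_debug_conf(level: int = 20) -> str:
    """Render a ceph.conf-style fragment that raises client debug logging.

    Assumes the option belongs in the [client] section.
    """
    parser = configparser.ConfigParser()
    parser["client"] = {"debug client": str(level)}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_client_debug_conf())
```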

#6 Updated by Patrick Donnelly 7 months ago

  • Target version deleted (v17.0.0)

#10 Updated by Venky Shankar 5 months ago

  • Category set to Testing
  • Priority changed from High to Low
  • Target version set to v18.0.0
  • Backport changed from pacific to pacific,quincy
  • Labels (FS) qa added

Lowering priority -- this is an issue with the test case rather than a bug in cephfs-mirror.
