Bug #62221

open

Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)

Added by Venky Shankar 9 months ago. Updated about 1 month ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags:
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): cephfs-mirror
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2023-07-26_14:34:38-fs-reef-release-distro-default-smithi/7353194

2023-07-26T23:01:19.423 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_407880c6d3fb77318fff01c863715090f9c2de69/teuthology/orchestra/run.py", line 161, in wait
2023-07-26T23:01:19.424 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2023-07-26T23:01:19.424 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_407880c6d3fb77318fff01c863715090f9c2de69/teuthology/orchestra/run.py", line 181, in _raise_for_status
2023-07-26T23:01:19.424 INFO:tasks.cephfs_test_runner:    raise CommandFailedError(
2023-07-26T23:01:19.424 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed (mirror status for fs: cephfs) on smithi053 with status 22: 'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@2'

The test runs the mirror daemon under valgrind. It seems the daemon didn't start at all. The valgrind report is at: ./remote/smithi053/log/valgrind/cephfs-mirror-client.mirror.log.gz
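
For triage, the failing command, host, and exit status can be pulled out of the CommandFailedError line. The following is a minimal sketch, assuming the message format shown in the log above; parse_command_failed is a hypothetical helper and not part of teuthology. Reading the exit status as an errno value (22 being EINVAL) is also an assumption about how the ceph CLI reports the failure.

```python
import errno
import re

# Matches the CommandFailedError text as it appears in the log above.
FAILED_RE = re.compile(
    r"Command failed \((?P<label>.*?)\) on (?P<host>\S+) "
    r"with status (?P<status>\d+): '(?P<command>.*)'"
)

def parse_command_failed(line):
    """Hypothetical helper: extract label, host, status and command from a log line."""
    m = FAILED_RE.search(line)
    if m is None:
        return None
    status = int(m.group("status"))
    return {
        "label": m.group("label"),
        "host": m.group("host"),
        "status": status,
        # Assumption: the exit status mirrors an errno value.
        "errno_name": errno.errorcode.get(status, "unknown"),
        "command": m.group("command"),
    }

line = (
    "2023-07-26T23:01:19.424 INFO:tasks.cephfs_test_runner:"
    "teuthology.exceptions.CommandFailedError: Command failed "
    "(mirror status for fs: cephfs) on smithi053 with status 22: "
    "'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@2'"
)
info = parse_command_failed(line)
print(info)
```

If the daemon never started, any command sent to its admin socket would be expected to fail, so the parsed fields only identify where the failure surfaced, not why.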

Actions #2

Updated by Venky Shankar 9 months ago

  • Assignee set to Jos Collin

Jos, please take this one.

Actions #3

Updated by Jos Collin 9 months ago

  • Status changed from New to In Progress

Actions #8

Updated by Patrick Donnelly about 2 months ago

/teuthology/yuriw-2024-03-12_14:59:27-fs-wip-yuri11-testing-2024-03-11-0838-reef-distro-default-smithi/7593782/teuthology.log

Actions #9

Updated by Rishabh Dave about 2 months ago

Patrick, following is the traceback from the job you reported in the previous comment (/teuthology/yuriw-2024-03-12_14:59:27-fs-wip-yuri11-testing-2024-03-11-0838-reef-distro-default-smithi/7593782/teuthology.log). Copying the traceback here from the log -

2024-03-12T17:59:40.399 INFO:tasks.cephfs_test_runner:======================================================================
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:ERROR: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_49e4a6794d5502bd66908faf714fc54896c51712/qa/tasks/cephfs/test_mirroring.py", line 459, in test_add_ancestor_and_child_directory
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    self.enable_mirroring(self.primary_fs_name, self.primary_fs_id)
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_49e4a6794d5502bd66908faf714fc54896c51712/qa/tasks/cephfs/test_mirroring.py", line 47, in enable_mirroring
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    res = self.mirror_daemon_command(f'counter dump for fs: {fs_name}', 'counter', 'dump')
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_49e4a6794d5502bd66908faf714fc54896c51712/qa/tasks/cephfs/test_mirroring.py", line 283, in mirror_daemon_command
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    p = self.mount_a.client_remote.run(args=
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/remote.py", line 523, in run
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/run.py", line 455, in run
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    r.wait()
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/run.py", line 161, in wait
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2024-03-12T17:59:40.400 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/run.py", line 181, in _raise_for_status
2024-03-12T17:59:40.401 INFO:tasks.cephfs_test_runner:    raise CommandFailedError(
2024-03-12T17:59:40.401 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed (counter dump for fs: cephfs) on smithi159 with status 22: 'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok counter dump'
2024-03-12T17:59:40.401 INFO:tasks.cephfs_test_runner:

The failing command in your job is ceph counter dump, but the failing command on this ticket is ceph fs mirror status. Do both failures share the same underlying issue? I am asking because I found the exact same job failure as you - https://pulpito.ceph.com/yuriw-2024-03-14_15:28:28-fs-wip-yuri4-testing-2024-03-13-0733-reef-distro-default-smithi/7600370

If the underlying issues are different, then it's better to open a new ticket for it. I've talked with Jos about this. He'll dig deeper to find out.
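
The two failures can be compared side by side. The following is a minimal sketch using the two command strings quoted in this ticket; split_admin_command is a hypothetical helper. Both commands target the same admin socket and both exit with status 22, differing only in the subcommand, which by itself does not establish a shared root cause.

```python
import shlex

def split_admin_command(command):
    """Hypothetical helper: return (socket_path, subcommand) for a
    'ceph --admin-daemon <socket> <subcommand...>' command line."""
    argv = shlex.split(command)
    idx = argv.index("--admin-daemon")
    return argv[idx + 1], " ".join(argv[idx + 2:])

# Command strings copied from the two tracebacks in this ticket.
original = "ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@2"
reported = "ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok counter dump"

sock_a, sub_a = split_admin_command(original)
sock_b, sub_b = split_admin_command(reported)

same_socket = sock_a == sock_b
same_subcommand = sub_a == sub_b
print(f"same socket: {same_socket}, same subcommand: {same_subcommand}")
```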

Actions #10

Updated by Patrick Donnelly about 2 months ago

It may be different as you observed. I didn't look closely.

Actions #11

Updated by Venky Shankar about 1 month ago

The failed test that Patrick linked is a valgrind run, so it's likely the same underlying issue even if the failure backtrace is different. Jos, please RCA asap.
