Bug #58745

open

quincy: qa: cephadm failed to stop mon

Added by Jos Collin about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The cephadm task failed to stop "ceph.mon.b" in [1]:

[1] http://qa-proxy.ceph.com/teuthology/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/teuthology.log


2023-02-14T04:55:17.726 INFO:tasks.cephadm.mon.a:Stopped mon.a
2023-02-14T04:55:17.727 INFO:tasks.cephadm.mon.c:Stopping mon.b...
2023-02-14T04:55:17.727 DEBUG:teuthology.orchestra.run.smithi103:> sudo systemctl stop ceph-62bef692-ac23-11ed-9ae5-001a4aab830c@mon.b
2023-02-14T04:55:17.759 INFO:teuthology.orchestra.run.smithi103.stderr:Failed to stop ceph-62bef692-ac23-11ed-9ae5-001a4aab830c@mon.b.service: Unit ceph-62bef692-ac23-11ed-9ae5-001a4aab830c@mon.b.service not loaded.
2023-02-14T04:55:17.760 DEBUG:teuthology.orchestra.run:got remote process result: 5
2023-02-14T04:55:17.761 ERROR:tasks.cephadm:Failed to stop "ceph.mon.b" 
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_4b684637f31df0c5f7e5100233e62cd73fa883b2/qa/tasks/cephadm.py", line 542, in ceph_bootstrap
    yield
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/contextutil.py", line 31, in nested
    vars.append(enter())
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_4b684637f31df0c5f7e5100233e62cd73fa883b2/qa/tasks/cephadm.py", line 670, in ceph_mons
    r = _shell(
  File "/home/teuthworker/src/github.com_ceph_ceph-c_4b684637f31df0c5f7e5100233e62cd73fa883b2/qa/tasks/cephadm.py", line 37, in _shell
    return remote.run(
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/remote.py", line 525, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on smithi157 with status 127: 'sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:4b684637f31df0c5f7e5100233e62cd73fa883b2 shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 62bef692-ac23-11ed-9ae5-001a4aab830c -- ceph mon dump -f json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_4b684637f31df0c5f7e5100233e62cd73fa883b2/qa/tasks/cephadm.py", line 561, in ceph_bootstrap
    ctx.daemons.get_daemon(type_, id_, cluster).stop()
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/daemon/cephadmunit.py", line 154, in stop
    self.remote.sh(self.stop_cmd)
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/remote.py", line 96, in sh
    proc = self.run(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/remote.py", line 525, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_a5875b2da3506f26286d023ce2de3e75c0eb806d/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on smithi103 with status 5: 'sudo systemctl stop ceph-62bef692-ac23-11ed-9ae5-001a4aab830c@mon.b'
2023-02-14T04:55:17.762 INFO:tasks.cephadm:Archiving crash dumps...
2023-02-14T04:55:17.766 DEBUG:teuthology.misc:Transferring archived files from smithi016:/var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi016/crash
2023-02-14T04:55:17.767 DEBUG:teuthology.orchestra.run.smithi016:> sudo tar cz -f - -C /var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash -- .
2023-02-14T04:55:17.824 DEBUG:teuthology.misc:Transferring archived files from smithi103:/var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi103/crash
2023-02-14T04:55:17.825 DEBUG:teuthology.orchestra.run.smithi103:> sudo tar cz -f - -C /var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash -- .
2023-02-14T04:55:17.866 INFO:teuthology.orchestra.run.smithi103.stderr:tar: /var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash: Cannot open: No such file or directory
2023-02-14T04:55:17.867 INFO:teuthology.orchestra.run.smithi103.stderr:tar: Error is not recoverable: exiting now
2023-02-14T04:55:17.869 DEBUG:teuthology.misc:Transferring archived files from smithi157:/var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi157/crash
2023-02-14T04:55:17.870 DEBUG:teuthology.orchestra.run.smithi157:> sudo tar cz -f - -C /var/lib/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/crash -- .
2023-02-14T04:55:17.925 INFO:tasks.cephadm:Checking cluster log for badness...
2023-02-14T04:55:17.926 DEBUG:teuthology.orchestra.run.smithi016:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v 'overall HEALTH_' | egrep -v '\(FS_DEGRADED\)' | egrep -v '\(MDS_FAILED\)' | egrep -v '\(MDS_DEGRADED\)' | egrep -v '\(FS_WITH_FAILED_MDS\)' | egrep -v '\(MDS_DAMAGE\)' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(FS_INLINE_DATA_DEPRECATED\)' | egrep -v 'overall HEALTH_' | egrep -v '\(OSD_DOWN\)' | egrep -v '\(OSD_' | egrep -v 'but it is still running' | egrep -v 'is not responding' | egrep -v 'slow metadata IO' | egrep -v SLOW_OPS | egrep -v 'slow request' | head -n 1
2023-02-14T04:55:17.969 INFO:teuthology.orchestra.run.smithi016.stderr:grep: /var/log/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/ceph.log: No such file or directory
2023-02-14T04:55:17.971 INFO:tasks.cephadm:Compressing logs...
2023-02-14T04:55:17.972 DEBUG:teuthology.orchestra.run.smithi016:> sudo find /var/log/ceph /var/log/rbd-target-api -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2023-02-14T04:55:18.015 DEBUG:teuthology.orchestra.run.smithi103:> sudo find /var/log/ceph /var/log/rbd-target-api -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2023-02-14T04:55:18.017 DEBUG:teuthology.orchestra.run.smithi157:> sudo find /var/log/ceph /var/log/rbd-target-api -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2023-02-14T04:55:18.041 INFO:teuthology.orchestra.run.smithi103.stderr:find: ‘/var/log/rbd-target-api’: No such file or directory
2023-02-14T04:55:18.043 INFO:teuthology.orchestra.run.smithi157.stderr:find: ‘/var/log/rbd-target-api’: No such file or directory
2023-02-14T04:55:18.049 INFO:teuthology.orchestra.run.smithi016.stderr:find: ‘/var/log/rbd-target-api’: No such file or directory
2023-02-14T04:55:18.088 INFO:teuthology.orchestra.run.smithi016.stderr:gzip: /var/log/ceph/62bef692-ac23-11ed-9ae5-001a4aab830c/ceph-mgr.x.log: file size changed while zipping
2023-02-14T04:55:18.134 INFO:tasks.cephadm:Archiving logs...
2023-02-14T04:55:18.135 DEBUG:teuthology.misc:Transferring archived files from smithi016:/var/log/ceph to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi016/log
2023-02-14T04:55:18.136 DEBUG:teuthology.orchestra.run.smithi016:> sudo tar cz -f - -C /var/log/ceph -- .
2023-02-14T04:55:18.239 DEBUG:teuthology.misc:Transferring archived files from smithi103:/var/log/ceph to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi103/log
2023-02-14T04:55:18.240 DEBUG:teuthology.orchestra.run.smithi103:> sudo tar cz -f - -C /var/log/ceph -- .
2023-02-14T04:55:18.272 DEBUG:teuthology.misc:Transferring archived files from smithi157:/var/log/ceph to /home/teuthworker/archive/yuriw-2023-02-13_20:44:19-fs-wip-yuri8-testing-2023-02-07-0753-quincy-distro-default-smithi/7171644/remote/smithi157/log
2023-02-14T04:55:18.274 DEBUG:teuthology.orchestra.run.smithi157:> sudo tar cz -f - -C /var/log/ceph -- .
2023-02-14T04:55:18.305 INFO:tasks.cephadm:Removing cluster...
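
The traceback above chains two failures: `ceph mon dump` exits with status 127 on smithi157, and while that is being handled, `systemctl stop` exits with status 5 on smithi103. Below is a minimal sketch for pulling those summary lines out of a teuthology.log, assuming the log has been saved locally. `command_failures` and the regular expression are illustrative helpers written for this note, not part of teuthology.

```python
import re

# Pattern for the CommandFailedError summary line as it appears in the log above.
# This is an assumption based on the two lines in this ticket, not a teuthology API.
FAILED_RE = re.compile(
    r"CommandFailedError: Command failed on (?P<host>\S+) "
    r"with status (?P<status>\d+): '(?P<command>.*)'"
)


def command_failures(lines):
    """Yield (host, status, command) for each CommandFailedError line."""
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            yield match["host"], int(match["status"]), match["command"]


# The two summary lines copied from the traceback in this ticket.
sample = [
    "teuthology.exceptions.CommandFailedError: Command failed on smithi157 "
    "with status 127: 'sudo /home/ubuntu/cephtest/cephadm --image "
    "quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:"
    "4b684637f31df0c5f7e5100233e62cd73fa883b2 shell -c /etc/ceph/ceph.conf "
    "-k /etc/ceph/ceph.client.admin.keyring "
    "--fsid 62bef692-ac23-11ed-9ae5-001a4aab830c -- ceph mon dump -f json'",
    "teuthology.exceptions.CommandFailedError: Command failed on smithi103 "
    "with status 5: 'sudo systemctl stop "
    "ceph-62bef692-ac23-11ed-9ae5-001a4aab830c@mon.b'",
]

failures = list(command_failures(sample))
for host, status, command in failures:
    print(f"{host}: status {status}: {command}")
```

To run it against a downloaded log, pass an open file instead of `sample`, for example `command_failures(open("teuthology.log"))`. Listing failures in log order makes it easier to see that the failed stop of mon.b follows the earlier `ceph mon dump` failure.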
#1

Updated by Jos Collin about 1 year ago

  • Subject changed from qa: cephadm failed to stop mon to quincy: qa: cephadm failed to stop mon
#2

Updated by Venky Shankar about 1 year ago

  • Project changed from CephFS to RADOS

Probably not related to CephFS.

#3

Updated by Radoslaw Zarzynski about 1 year ago

  • Project changed from RADOS to Orchestrator

Going over the teuthology.log shows:

Error: OCI runtime error: container_linux.go

which suggests https://tracker.ceph.com/issues/44587#note-12 as a related ticket.
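
To locate where that error appears in a saved copy of teuthology.log, a small scan such as the following can help. This is only a sketch: `find_marker` is a hypothetical helper, and the sample list stands in for the real log, reusing one log line from the description and the error text exactly as quoted in this comment.

```python
def find_marker(lines, marker="OCI runtime error"):
    """Return (line_number, text) pairs for every line containing marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]


# Stand-in for the real log: one line from the description plus the quoted error.
sample_log = [
    "2023-02-14T04:55:17.726 INFO:tasks.cephadm.mon.a:Stopped mon.a\n",
    "Error: OCI runtime error: container_linux.go\n",
]

hits = find_marker(sample_log)
for number, text in hits:
    print(f"line {number}: {text}")
```

With the real file, `find_marker(open("teuthology.log"))` would give the line numbers to compare against the timestamps of the failed `ceph mon dump` and `systemctl stop` commands.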
