Bug #56024 (closed): cephadm: removes ceph.conf during qa run causing command failure

Added by Patrick Donnelly almost 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: High
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2022-06-12T06:33:37.163 INFO:journalctl@ceph.mon.a.smithi008.stdout:Jun 12 06:33:36 smithi008 ceph-mon[31232]: Removing smithi008:/etc/ceph/ceph.conf
2022-06-12T06:33:37.164 INFO:journalctl@ceph.mon.a.smithi008.stdout:Jun 12 06:33:36 smithi008 ceph-mon[31232]: pgmap v516: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 174 KiB/s rd, 100 MiB/s wr, 4.93k op/s
2022-06-12T06:33:37.166 INFO:journalctl@ceph.mon.c.smithi185.stdout:Jun 12 06:33:36 smithi185 ceph-mon[38700]: Removing smithi008:/etc/ceph/ceph.conf
2022-06-12T06:33:37.166 INFO:journalctl@ceph.mon.c.smithi185.stdout:Jun 12 06:33:36 smithi185 ceph-mon[38700]: pgmap v516: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 174 KiB/s rd, 100 MiB/s wr, 4.93k op/s
2022-06-12T06:33:37.279 INFO:journalctl@ceph.mon.b.smithi157.stdout:Jun 12 06:33:36 smithi157 ceph-mon[37057]: Removing smithi008:/etc/ceph/ceph.conf
2022-06-12T06:33:37.280 INFO:journalctl@ceph.mon.b.smithi157.stdout:Jun 12 06:33:36 smithi157 ceph-mon[37057]: pgmap v516: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 174 KiB/s rd, 100 MiB/s wr, 4.93k op/s
2022-06-12T06:33:39.077 DEBUG:teuthology.orchestra.run.smithi008:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mds.1:0 scrub status
2022-06-12T06:33:39.165 INFO:journalctl@ceph.mon.a.smithi008.stdout:Jun 12 06:33:39 smithi008 ceph-mon[31232]: pgmap v517: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 279 KiB/s rd, 96 MiB/s wr, 4.66k op/s
2022-06-12T06:33:39.416 INFO:journalctl@ceph.mon.c.smithi185.stdout:Jun 12 06:33:39 smithi185 ceph-mon[38700]: pgmap v517: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 279 KiB/s rd, 96 MiB/s wr, 4.66k op/s
2022-06-12T06:33:39.527 INFO:journalctl@ceph.mon.b.smithi157.stdout:Jun 12 06:33:39 smithi157 ceph-mon[37057]: pgmap v517: 129 pgs: 129 active+clean; 13 GiB data, 28 GiB used, 1.0 TiB / 1.0 TiB avail; 279 KiB/s rd, 96 MiB/s wr, 4.66k op/s
2022-06-12T06:33:39.801 INFO:teuthology.orchestra.run.smithi008.stderr:Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)',)
2022-06-12T06:33:39.815 DEBUG:teuthology.orchestra.run:got remote process result: 1
2022-06-12T06:33:39.816 ERROR:tasks.fwd_scrub.fs.[cephfs]:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/fwd_scrub.py", line 38, in _run
    self.do_scrub()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/fwd_scrub.py", line 55, in do_scrub
    self._scrub()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/fwd_scrub.py", line 77, in _scrub
    timeout=self.scrub_timeout)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/cephfs/filesystem.py", line 1583, in wait_until_scrub_complete
    out_json = self.rank_tell(["scrub", "status"], rank=rank)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/cephfs/filesystem.py", line 1161, in rank_tell
    out = self.mon_manager.raw_cluster_cmd("tell", f"mds.{self.id}:{rank}", *command)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/ceph_manager.py", line 1597, in raw_cluster_cmd
    return self.run_cluster_cmd(**kwargs).stdout.getvalue()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_36d24a7f39b7e565955f208f512d14b9d7e923ee/qa/tasks/ceph_manager.py", line 1588, in run_cluster_cmd
    return self.controller.run(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_2290146eac7577b8500f128a53856c3ea4a00e3c/teuthology/orchestra/remote.py", line 510, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_2290146eac7577b8500f128a53856c3ea4a00e3c/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_2290146eac7577b8500f128a53856c3ea4a00e3c/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_2290146eac7577b8500f128a53856c3ea4a00e3c/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi008 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mds.1:0 scrub status'

From: /ceph/teuthology-archive/pdonnell-2022-06-12_05:08:12-fs:workload-wip-pdonnell-testing-20220612.004943-distro-default-smithi/6875276/teuthology.log

See also: /ceph/teuthology-archive/pdonnell-2022-06-12_05:08:12-fs:workload-wip-pdonnell-testing-20220612.004943-distro-default-smithi/6875321/teuthology.log
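
For context, the "Error initializing cluster client: ObjectNotFound(...)" line above is what the ceph CLI prints when librados cannot read the conf file it was pointed at. Below is a minimal reproduction sketch, not taken from the qa run; it assumes a host with python-rados installed and no /etc/ceph/ceph.conf, i.e. the state smithi008 was left in after cephadm removed the file:

    #!/usr/bin/env python3
    # Minimal sketch of the failure mode (assumption: python-rados is
    # installed; /etc/ceph/ceph.conf is the file cephadm removed above).
    import rados

    CONF = "/etc/ceph/ceph.conf"

    try:
        # Rados() reads the conffile while building the client handle; if
        # the file is gone, librados raises ObjectNotFound with
        # "error calling conf_read_file" before any MON is contacted.
        cluster = rados.Rados(conffile=CONF)
        cluster.connect()  # the success path also needs a client keyring
        print("connected, fsid:", cluster.get_fsid())
        cluster.shutdown()
    except rados.ObjectNotFound as exc:
        # The ceph CLI wraps this same exception as
        # "Error initializing cluster client: ObjectNotFound(...)".
        print("Error initializing cluster client:", repr(exc))

That matches the timeline in the excerpt: cephadm logs "Removing smithi008:/etc/ceph/ceph.conf" at 06:33:36, and the "ceph --cluster ceph tell mds.1:0 scrub status" issued on the same host roughly three seconds later fails with status 1, killing the fwd_scrub task.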


Related issues: 3 (0 open, 3 closed)

Related to Orchestrator - Bug #57449: qa: removal of host during QA (Resolved, Adam King)
Copied to Orchestrator - Backport #56473: pacific: cephadm: removes ceph.conf during qa run causing command failure (Resolved, Adam King)
Copied to Orchestrator - Backport #56474: quincy: cephadm: removes ceph.conf during qa run causing command failure (Resolved, Adam King)