Bug #56696
admin keyring disappears during qa run
% Done: 0%
Description
2022-07-22T21:02:39.723 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:39 smithi192 ceph-mon[122238]: pgmap v1133: 129 pgs: 129 active+clean; 959 MiB data, 7.7 GiB used, 1.0 TiB / 1.0 TiB avail; 4.7 KiB/s rd, 682 B/s wr, 8 op/s
2022-07-22T21:02:40.130 WARNING:tasks.check_counter:Counter 'mds.exported' not found on daemon mds.d
2022-07-22T21:02:40.130 WARNING:tasks.check_counter:Counter 'mds.imported' not found on daemon mds.d
2022-07-22T21:02:40.131 DEBUG:tasks.check_counter:Getting stats from g
2022-07-22T21:02:40.131 DEBUG:teuthology.orchestra.run.smithi153:> sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:89768db311950607682ea2bb29f56edc324f86ac shell --fsid d77283b6-09fb-11ed-842f-001a4aab830c -- ceph daemon mds.g perf dump
2022-07-22T21:02:40.992 INFO:journalctl@ceph.mon.a.smithi153.stdout:Jul 22 21:02:40 smithi153 ceph-mon[85483]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:40.993 INFO:journalctl@ceph.mon.a.smithi153.stdout:Jul 22 21:02:40 smithi153 ceph-mon[85483]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:40.994 INFO:journalctl@ceph.mon.a.smithi153.stdout:Jul 22 21:02:40 smithi153 ceph-mon[85483]: pgmap v1134: 129 pgs: 129 active+clean; 959 MiB data, 7.7 GiB used, 1.0 TiB / 1.0 TiB avail; 5.2 KiB/s rd, 767 B/s wr, 9 op/s
2022-07-22T21:02:40.994 INFO:journalctl@ceph.mon.a.smithi153.stdout:Jul 22 21:02:40 smithi153 ceph-mon[85483]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "c"}]: dispatch
2022-07-22T21:02:40.995 INFO:journalctl@ceph.mon.a.smithi153.stdout:Jul 22 21:02:40 smithi153 ceph-mon[85483]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "b"}]: dispatch
2022-07-22T21:02:41.046 INFO:journalctl@ceph.mon.c.smithi201.stdout:Jul 22 21:02:40 smithi201 ceph-mon[122768]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:41.047 INFO:journalctl@ceph.mon.c.smithi201.stdout:Jul 22 21:02:40 smithi201 ceph-mon[122768]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:41.047 INFO:journalctl@ceph.mon.c.smithi201.stdout:Jul 22 21:02:40 smithi201 ceph-mon[122768]: pgmap v1134: 129 pgs: 129 active+clean; 959 MiB data, 7.7 GiB used, 1.0 TiB / 1.0 TiB avail; 5.2 KiB/s rd, 767 B/s wr, 9 op/s
2022-07-22T21:02:41.047 INFO:journalctl@ceph.mon.c.smithi201.stdout:Jul 22 21:02:40 smithi201 ceph-mon[122768]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "c"}]: dispatch
2022-07-22T21:02:41.047 INFO:journalctl@ceph.mon.c.smithi201.stdout:Jul 22 21:02:40 smithi201 ceph-mon[122768]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "b"}]: dispatch
2022-07-22T21:02:41.051 INFO:teuthology.orchestra.run.smithi153.stderr:Inferring config /var/lib/ceph/d77283b6-09fb-11ed-842f-001a4aab830c/mon.a/config
2022-07-22T21:02:41.184 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:40 smithi192 ceph-mon[122238]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:41.185 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:40 smithi192 ceph-mon[122238]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x'
2022-07-22T21:02:41.185 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:40 smithi192 ceph-mon[122238]: pgmap v1134: 129 pgs: 129 active+clean; 959 MiB data, 7.7 GiB used, 1.0 TiB / 1.0 TiB avail; 5.2 KiB/s rd, 767 B/s wr, 9 op/s
2022-07-22T21:02:41.185 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:40 smithi192 ceph-mon[122238]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "c"}]: dispatch
2022-07-22T21:02:41.185 INFO:journalctl@ceph.mon.b.smithi192.stdout:Jul 22 21:02:40 smithi192 ceph-mon[122238]: from='mgr.14154 172.21.15.153:0/3969930003' entity='mgr.x' cmd=[{"prefix": "mon metadata", "id": "b"}]: dispatch
2022-07-22T21:02:42.109 INFO:teuthology.orchestra.run.smithi153.stderr:Error: statfs /var/lib/ceph/d77283b6-09fb-11ed-842f-001a4aab830c/config/ceph.client.admin.keyring: no such file or directory
2022-07-22T21:02:42.157 DEBUG:teuthology.orchestra.run:got remote process result: 125
2022-07-22T21:02:42.158 ERROR:teuthology.run_tasks:Manager failed: check-counter
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/run_tasks.py", line 188, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/task/__init__.py", line 132, in __exit__
    self.end()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_89768db311950607682ea2bb29f56edc324f86ac/qa/tasks/check_counter.py", line 71, in end
    proc = manager.admin_socket(daemon_type, daemon_id, ["perf", "dump"])
  File "/home/teuthworker/src/git.ceph.com_ceph-c_89768db311950607682ea2bb29f56edc324f86ac/qa/tasks/ceph_manager.py", line 1849, in admin_socket
    check_status=check_status,
  File "/home/teuthworker/src/git.ceph.com_ceph-c_89768db311950607682ea2bb29f56edc324f86ac/qa/tasks/ceph_manager.py", line 52, in shell
    **kwargs
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/orchestra/remote.py", line 510, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_8d598431210977f8caccec83230b4bfec7bd5d3f/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi153 with status 125: 'sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:89768db311950607682ea2bb29f56edc324f86ac shell --fsid d77283b6-09fb-11ed-842f-001a4aab830c -- ceph daemon mds.g perf dump'
From: /ceph/teuthology-archive/pdonnell-2022-07-22_19:42:58-fs-wip-pdonnell-testing-20220721.235756-distro-default-smithi/6945820/teuthology.log
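The "Error: statfs ... no such file or directory" appears to come from the container runtime when cephadm tries to mount /var/lib/ceph/&lt;fsid&gt;/config/ceph.client.admin.keyring into the shell container and the file is gone. A minimal pre-flight check along these lines (a hypothetical helper, not part of cephadm; the fsid is the one from this run) makes the failure mode visible before the shell command is even attempted:

```shell
# Hypothetical pre-flight check before running `cephadm shell`:
# verify the files cephadm will mount into the container still exist.
check_cluster_files() {
    # $1 = fsid of the cluster
    dir="/var/lib/ceph/$1/config"
    missing=0
    for f in ceph.conf ceph.client.admin.keyring; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return $missing
}

# On a healthy host both files exist and the shell can start; in this bug
# the admin keyring vanished mid-run, producing the statfs error above.
check_cluster_files "d77283b6-09fb-11ed-842f-001a4aab830c" \
    || echo "cephadm shell would fail here with a statfs ENOENT error"
```

If the keyring really is missing, it can in principle be re-fetched from the monitors (for example via `ceph auth get client.admin` from a node that still has working credentials), though how it disappears is the actual bug here.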
See also:
Failure: Command failed on smithi153 with status 125: 'sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:89768db311950607682ea2bb29f56edc324f86ac shell --fsid d77283b6-09fb-11ed-842f-001a4aab830c -- ceph daemon mds.g perf dump'
3 jobs: ['6945820', '6945835', '6945830']
suites intersection: ['1-cephadm', '2-logrotate}', 'begin/{0-install', 'clusters/1a11s-mds-1c-client-3node', 'conf/{client', 'fs/workload/{0-rhel_8', 'ignorelist_health', 'ignorelist_wrongly_marked_down', 'mds', 'mon', 'mount', 'mount/kclient/{base/{mount-syntax/{v2}', 'ms-die-on-skipped}}', 'osd-asserts', 'osd}', 'overrides/{frag', 'session_timeout}', 'standby-replay', 'tasks/{0-check-counter']
suites union: ['1-cephadm', '2-logrotate}', 'begin/{0-install', 'clusters/1a11s-mds-1c-client-3node', 'conf/{client', 'fs/workload/{0-rhel_8', 'ignorelist_health', 'ignorelist_wrongly_marked_down', 'mds', 'mon', 'mount', 'mount/kclient/{base/{mount-syntax/{v2}', 'ms-die-on-skipped}}', 'ms_mode/legacy', 'ms_mode/secure', 'n/5', 'objectstore-ec/bluestore-bitmap', 'objectstore-ec/bluestore-ec-root', 'omap_limit/10', 'omap_limit/10000', 'osd-asserts', 'osd}', 'overrides/{distro/stock/{k-stock', 'overrides/{distro/testing/k-testing', 'overrides/{frag', 'ranks/1', 'ranks/multi/{export-check', 'replication/default}', 'rhel_8}', 'scrub/no', 'scrub/yes', 'session_timeout}', 'standby-replay', 'subvolume/{no-subvolume}', 'subvolume/{with-namespace-isolated}', 'tasks/{0-check-counter', 'workunit/fs/misc}}', 'workunit/suites/dbench}}', 'workunit/suites/fsync-tester}}', 'wsync/no}', 'wsync/yes}']
Related issues
History
#1 Updated by Adam King over 1 year ago
- Assignee set to Adam King
#2 Updated by Adam King over 1 year ago
- Related to Bug #57462: cephadm removes config & keyring files in mid flight added
#3 Updated by Adam King over 1 year ago
- Related to Bug #57449: qa: removal of host during QA added
#4 Updated by Adam King over 1 year ago
- Pull request ID set to 48074
#5 Updated by Adam King over 1 year ago
- Status changed from New to Resolved
Marking this resolved even though it still needs backports; the backports will be tracked through https://tracker.ceph.com/issues/57462.
#6 Updated by Voja Molani 12 months ago
The PR has a comment:
"Keep in mind so far this issue has only been seen in long running tests"
However, I have seen this several times (maybe 5-6?) on two clusters that are not yet in production and have existed for about 6 months, initially installed as EL8 and later reinstalled as EL9. One cluster runs on VMs, the other on physical machines.
It will be interesting to see whether 17.2.6 finally fixes this.
The sequence that triggered it, run on each server in turn:

ceph orch host maintenance enter x
(reboot the server)
ceph orch host maintenance exit x

On the third server, "ceph orch host maintenance exit" failed with an error that the keyring was not found, and the keyring file was indeed missing.
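The rolling-reboot sequence above can be sketched as a small wrapper. The `ceph orch host maintenance enter/exit` commands are real orchestrator commands; the keyring backup/restore step and the DRY_RUN switch are this sketch's own workaround ideas, not an official cephadm feature, and the keyring path is the conventional default location:

```shell
# Sketch of the reboot workflow, with a guard for the disappearing keyring.
KEYRING=/etc/ceph/ceph.client.admin.keyring

run() {
    # DRY_RUN=1 (the default here) only prints the command instead of running it.
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

maintenance_reboot() {
    host="$1"
    # Keep a copy in case cephadm removes the keyring mid-flight (see #57462).
    if [ -e "$KEYRING" ]; then cp -p "$KEYRING" "$KEYRING.bak"; fi

    run ceph orch host maintenance enter "$host"
    run systemctl reboot
    run ceph orch host maintenance exit "$host"

    # If `maintenance exit` failed because the keyring vanished, restore it.
    if [ ! -e "$KEYRING" ] && [ -e "$KEYRING.bak" ]; then
        cp -p "$KEYRING.bak" "$KEYRING"
    fi
}

maintenance_reboot x
```

With DRY_RUN=1 this only prints the three commands, which makes the intended order easy to review before running it for real on a cluster node.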