Bug #61831
openqa: test_mirroring_init_failure_with_recovery failure
Status:
New
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
Backport:
reef,quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This failure was first reported here - https://tracker.ceph.com/issues/50224#note-13.
Seeing this failure again - https://pulpito.ceph.com/rishabh-2023-06-23_17:37:30-fs-wip-rishabh-improvements-authmon-distro-default-smithi/7313862
2023-06-23T20:20:27.098 INFO:tasks.cephfs_test_runner:======================================================================
2023-06-23T20:20:27.099 INFO:tasks.cephfs_test_runner:ERROR: test_mirroring_init_failure_with_recovery (tasks.cephfs.test_mirroring.TestMirroring)
2023-06-23T20:20:27.099 INFO:tasks.cephfs_test_runner:Test if the mirror daemon can recover from a init failure
2023-06-23T20:20:27.099 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2023-06-23T20:20:27.099 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2023-06-23T20:20:27.100 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_ceph-c_8627af3e0adcb765a3c249fcc209cba9f4873e1b/qa/tasks/cephfs/test_mirroring.py", line 742, in test_mirroring_init_failure_with_recovery
2023-06-23T20:20:27.100 INFO:tasks.cephfs_test_runner: while proceed():
2023-06-23T20:20:27.100 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_teuthology_076bbebc42a14f7d568aaa78eabb0038327bcb23/teuthology/contextutil.py", line 134, in __call__
2023-06-23T20:20:27.100 INFO:tasks.cephfs_test_runner: raise MaxWhileTries(error_msg)
2023-06-23T20:20:27.101 INFO:tasks.cephfs_test_runner:teuthology.exceptions.MaxWhileTries: 'wait for failed state' reached maximum tries (21) after waiting for 100 seconds
2023-06-23T20:20:27.101 INFO:tasks.cephfs_test_runner:
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.202 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.203 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.203 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.203 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.203 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.197+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.214 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.209+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.214 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.214 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac6474700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:29.215 INFO:tasks.ceph.mgr.x.smithi111.stderr:2023-06-23T18:30:29.210+0000 7f1ac2c6d700 -1 client.0 error registering admin socket command: (17) File exists
2023-06-23T18:30:31.138 INFO:teuthology.orchestra.run.smithi111.stdout:{}
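For readers unfamiliar with the MaxWhileTries error in the traceback above: it is raised when a bounded polling loop runs out of attempts before the awaited condition (here, 'wait for failed state') becomes true. The following is a minimal standalone sketch of that pattern, not teuthology's actual implementation. The names `wait_for` and `MaxTriesExceeded`, the default of 20 tries, and the 5-second interval are all assumptions made for illustration.

```python
import time
from typing import Callable


class MaxTriesExceeded(Exception):
    """Raised when the condition never became true (hypothetical name)."""


def wait_for(condition: Callable[[], bool], action: str, tries: int = 20,
             interval: float = 5.0,
             sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll condition() up to `tries` times, sleeping between attempts.

    Returns the number of attempts used on success and raises
    MaxTriesExceeded once every attempt has been used up.
    """
    for attempt in range(1, tries + 1):
        if condition():
            return attempt
        if attempt < tries:
            # No point sleeping after the final failed attempt.
            sleep(interval)
    waited = (tries - 1) * interval
    raise MaxTriesExceeded(
        f"'{action}' reached maximum tries ({tries}) "
        f"after waiting for {waited:g} seconds")
```

With a loop of this shape, a status query that errors out on every attempt is indistinguishable from a daemon that never reaches the expected state: both end in the same exception once the tries are exhausted.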
Updated by Venky Shankar 10 months ago
- Category set to Correctness/Safety
- Assignee set to Kotresh Hiremath Ravishankar
- Target version set to v19.0.0
- Backport set to reef,quincy,pacific
Updated by Venky Shankar 10 months ago
Kotresh said that he saw no active MDSs. Please RCA, Kotresh.
Updated by Kotresh Hiremath Ravishankar 10 months ago
Looks like mds were down
2023-06-23T20:20:22.560 DEBUG:teuthology.orchestra.run.smithi111:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs fail backup_fs
2023-06-23T20:20:23.062 INFO:tasks.ceph.mon.a.smithi111.stderr:2023-06-23T20:20:23.059+0000 7f1b5fe6f700 -1 log_channel(cluster) log [ERR] : Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
2023-06-23T20:20:23.064 INFO:tasks.ceph.mds.d.smithi111.stderr: -687> 2023-06-23T20:18:19.059+0000 7f0cc71da700 -1 mds.pinger is_rank_lagging: rank=0 was never sent ping request.
2023-06-23T20:20:23.067 INFO:teuthology.orchestra.run.smithi111.stderr:backup_fs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
Updated by Kotresh Hiremath Ravishankar 10 months ago
I think the MDS went down as part of cleanup. But the mirror status command is failing, as the log below shows. This needs further debugging.
2023-06-23T20:18:31.588 INFO:teuthology.orchestra.run:Running command with timeout 30
2023-06-23T20:18:31.589 DEBUG:teuthology.orchestra.run.smithi111:mirror status for fs: cephfs> ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@56
2023-06-23T20:18:31.692 INFO:teuthology.orchestra.run.smithi111.stderr:admin_socket: exception getting command descriptions: [Errno 111] Connection refused
2023-06-23T20:18:31.693 DEBUG:teuthology.orchestra.run:got remote process result: 22
2023-06-23T20:18:31.694 WARNING:tasks.cephfs.test_mirroring:mirror daemon command with label "mirror status for fs: cephfs" failed: Command failed (mirror status for fs: cephfs) on smithi111 with status 22: 'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@56'
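On Linux, "[Errno 111] Connection refused" on a Unix domain socket usually means the socket file exists but no process is accepting connections on it, whereas a missing file gives "No such file or directory" instead. When debugging, a small probe like the sketch below could help tell those cases apart for the admin socket path seen in the log. The helper name `probe_unix_socket` is hypothetical, and the sketch assumes a stream-type Unix socket. It only attempts a connection and sends no admin command.

```python
import errno
import socket


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Classify a Unix socket path as 'listening', 'refused', 'missing' or 'error: ...'."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "listening"
    except FileNotFoundError:
        # The socket file does not exist at all (ENOENT).
        return "missing"
    except ConnectionRefusedError:
        # The file exists but nothing is accepting on it (ECONNREFUSED).
        return "refused"
    except OSError as exc:
        # Anything else, e.g. permission problems or a timeout.
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        sock.close()
```

For example, `probe_unix_socket("/var/run/ceph/cephfs-mirror.asok")` returning "refused" would be consistent with the daemon having exited or not yet having re-created its socket, though it does not by itself say which.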
Updated by Venky Shankar 8 months ago
- Related to Bug #62936: Test failure: test_mirroring_init_failure_with_recovery (tasks.cephfs.test_mirroring.TestMirroring) added