Bug #43514
qa: test setUp may cause spurious MDS_INSUFFICIENT_STANDBY
Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-01-07T16:46:12.328 INFO:teuthology.orchestra.run:waiting for 300
2020-01-07T16:46:12.340 INFO:tasks.ceph.mds.a:Stopped
2020-01-07T16:46:12.340 DEBUG:teuthology.parallel:result is None
2020-01-07T16:46:12.373 INFO:tasks.ceph.mds.b:Stopped
2020-01-07T16:46:12.373 DEBUG:teuthology.parallel:result is None
2020-01-07T16:46:12.373 INFO:teuthology.orchestra.run.smithi093:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mds fail a
2020-01-07T16:46:12.374 INFO:teuthology.orchestra.run.smithi093:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mds fail b
...
2020-01-07T16:46:12.705 INFO:teuthology.orchestra.run.smithi093.stderr:2020-01-07T16:46:12.697+0000 7f40cf974700 1 -- 172.21.15.93:0/3162667050 --> [v2:172.21.15.93:3300/0,v1:172.21.15.93:6789/0] -- mon_command({"prefix": "mds fail", "role_or_gid": "b"} v 0) v1 -- 0x7f40c809be80 con 0x7f40c8136950
2020-01-07T16:46:12.706 INFO:teuthology.orchestra.run.smithi093.stderr:2020-01-07T16:46:12.698+0000 7fee475ec700 1 -- 172.21.15.93:0/3550511259 --> [v2:172.21.15.106:3300/0,v1:172.21.15.106:6789/0] -- mon_command({"prefix": "mds fail", "role_or_gid": "a"} v 0) v1 -- 0x7fee40098b80 con 0x7fee4007ebc0
2020-01-07T16:46:13.685 INFO:tasks.ceph.mon.a.smithi093.stderr:2020-01-07T16:46:13.682+0000 7fb42c1c8700 -1 log_channel(cluster) log [ERR] : Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
...
2020-01-07T16:52:08.538 INFO:tasks.ceph:Checking cluster log for badness...
2020-01-07T16:52:08.538 INFO:teuthology.orchestra.run.smithi093:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v 'overall HEALTH_' | egrep -v '\(FS_DEGRADED\)' | egrep -v '\(MDS_FAILED\)' | egrep -v '\(MDS_DEGRADED\)' | egrep -v '\(FS_WITH_FAILED_MDS\)' | egrep -v '\(MDS_DAMAGE\)' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(FS_INLINE_DATA_DEPRECATED\)' | egrep -v 'overall HEALTH_' | egrep -v '\(OSD_DOWN\)' | egrep -v '\(OSD_' | egrep -v 'but it is still running' | egrep -v 'is not responding' | head -n 1
2020-01-07T16:52:08.585 INFO:teuthology.orchestra.run.smithi093.stdout:2020-01-07T16:46:13.683438+0000 mon.a (mon.0) 805 : cluster [WRN] Health check failed: insufficient standby MDS daemons available (MDS_INSUFFICIENT_STANDBY)
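For context, a minimal Python sketch (not the actual qa code; daemon names taken from this run) of the setUp sequence visible in the log: every MDS is stopped and then explicitly marked failed, which leaves no standbys and trips MDS_INSUFFICIENT_STANDBY, a warning the cluster-log scrape above does not whitelist.

import subprocess

MDS_DAEMONS = ["a", "b"]  # daemon names from this run (mds.a, mds.b)

def fail_each_mds_individually():
    # mirrors the "ceph --cluster ceph mds fail <name>" calls in the log above;
    # after the second call there is no standby left, so the monitor raises
    # MDS_INSUFFICIENT_STANDBY
    for name in MDS_DAEMONS:
        subprocess.check_call(["ceph", "--cluster", "ceph", "mds", "fail", name])

if __name__ == "__main__":
    fail_each_mds_individually()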
It's not necessary to fail each MDS individually; the daemons will restart once they are removed from the MDSMap.
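A minimal sketch of that alternative, assuming the test only needs the ranks out of the MDSMap (the fs name "cephfs" is a placeholder): a single "ceph fs fail" fails all ranks of the filesystem at once, so no per-daemon "mds fail" loop is needed and no insufficient-standby window is created.

import subprocess

def fail_filesystem(fs_name: str = "cephfs"):  # placeholder fs name
    # "ceph fs fail <fs>" marks the filesystem not joinable and fails its ranks,
    # removing them from the MDSMap in one monitor command instead of one
    # "mds fail" call per daemon
    subprocess.check_call(["ceph", "--cluster", "ceph", "fs", "fail", fs_name])

if __name__ == "__main__":
    fail_filesystem()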