Bug #43514
qa: test setUp may cause spurious MDS_INSUFFICIENT_STANDBY
Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-01-07T16:46:12.328 INFO:teuthology.orchestra.run:waiting for 300
2020-01-07T16:46:12.340 INFO:tasks.ceph.mds.a:Stopped
2020-01-07T16:46:12.340 DEBUG:teuthology.parallel:result is None
2020-01-07T16:46:12.373 INFO:tasks.ceph.mds.b:Stopped
2020-01-07T16:46:12.373 DEBUG:teuthology.parallel:result is None
2020-01-07T16:46:12.373 INFO:teuthology.orchestra.run.smithi093:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mds fail a
2020-01-07T16:46:12.374 INFO:teuthology.orchestra.run.smithi093:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mds fail b
...
2020-01-07T16:46:12.705 INFO:teuthology.orchestra.run.smithi093.stderr:2020-01-07T16:46:12.697+0000 7f40cf974700 1 -- 172.21.15.93:0/3162667050 --> [v2:172.21.15.93:3300/0,v1:172.21.15.93:6789/0] -- mon_command({"prefix": "mds fail", "role_or_gid": "b"} v 0) v1 -- 0x7f40c809be80 con 0x7f40c8136950
2020-01-07T16:46:12.706 INFO:teuthology.orchestra.run.smithi093.stderr:2020-01-07T16:46:12.698+0000 7fee475ec700 1 -- 172.21.15.93:0/3550511259 --> [v2:172.21.15.106:3300/0,v1:172.21.15.106:6789/0] -- mon_command({"prefix": "mds fail", "role_or_gid": "a"} v 0) v1 -- 0x7fee40098b80 con 0x7fee4007ebc0
2020-01-07T16:46:13.685 INFO:tasks.ceph.mon.a.smithi093.stderr:2020-01-07T16:46:13.682+0000 7fb42c1c8700 -1 log_channel(cluster) log [ERR] : Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
...
2020-01-07T16:52:08.538 INFO:tasks.ceph:Checking cluster log for badness...
2020-01-07T16:52:08.538 INFO:teuthology.orchestra.run.smithi093:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v 'overall HEALTH_' | egrep -v '\(FS_DEGRADED\)' | egrep -v '\(MDS_FAILED\)' | egrep -v '\(MDS_DEGRADED\)' | egrep -v '\(FS_WITH_FAILED_MDS\)' | egrep -v '\(MDS_DAMAGE\)' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(FS_INLINE_DATA_DEPRECATED\)' | egrep -v 'overall HEALTH_' | egrep -v '\(OSD_DOWN\)' | egrep -v '\(OSD_' | egrep -v 'but it is still running' | egrep -v 'is not responding' | head -n 1
2020-01-07T16:52:08.585 INFO:teuthology.orchestra.run.smithi093.stdout:2020-01-07T16:46:13.683438+0000 mon.a (mon.0) 805 : cluster [WRN] Health check failed: insufficient standby MDS daemons available (MDS_INSUFFICIENT_STANDBY)
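For context, a minimal Python sketch (not the actual qa code; daemon names taken from this run) of the setUp sequence visible in the log: every MDS is stopped and then explicitly marked failed, which leaves no standbys and trips MDS_INSUFFICIENT_STANDBY, a warning the cluster-log scrape above does not whitelist.

import subprocess

MDS_DAEMONS = ["a", "b"]  # daemon names from this run (mds.a, mds.b)

def fail_each_mds_individually():
    # mirrors the "ceph --cluster ceph mds fail <name>" calls in the log above;
    # after the second call there is no standby left, so the monitor raises
    # MDS_INSUFFICIENT_STANDBY
    for name in MDS_DAEMONS:
        subprocess.check_call(["ceph", "--cluster", "ceph", "mds", "fail", name])

if __name__ == "__main__":
    fail_each_mds_individually()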
It's not necessary to fail each MDS individually; the daemons will restart once they are removed from the MDSMap.
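A minimal sketch of that alternative, assuming the test only needs the ranks out of the MDSMap (the fs name "cephfs" is a placeholder): a single "ceph fs fail" fails all ranks of the filesystem at once, so no per-daemon "mds fail" loop is needed and no insufficient-standby window is created.

import subprocess

def fail_filesystem(fs_name: str = "cephfs"):  # placeholder fs name
    # "ceph fs fail <fs>" marks the filesystem not joinable and fails its ranks,
    # removing them from the MDSMap in one monitor command instead of one
    # "mds fail" call per daemon
    subprocess.check_call(["ceph", "--cluster", "ceph", "fs", "fail", fs_name])

if __name__ == "__main__":
    fail_filesystem()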