Bug #40967: qa: race in test_standby_replay_singleton_fail

Added by Patrick Donnelly over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: High
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS): qa
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2019-07-25T08:27:09.803 INFO:teuthology.orchestra.run.smithi193:Running:
2019-07-25T08:27:09.804 INFO:teuthology.orchestra.run.smithi193:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 900 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.b.asok status
2019-07-25T08:27:09.943 INFO:teuthology.orchestra.run.smithi193.stdout:{
2019-07-25T08:27:09.943 INFO:teuthology.orchestra.run.smithi193.stdout:    "cluster_fsid": "af4c99e3-dbf1-4604-ab36-7fb20694ece5",
2019-07-25T08:27:09.943 INFO:teuthology.orchestra.run.smithi193.stdout:    "whoami": 0,
2019-07-25T08:27:09.943 INFO:teuthology.orchestra.run.smithi193.stdout:    "id": 14245,
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "want_state": "up:active",
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "state": "up:active",
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "rank_uptime": 9.4852699929999993,
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "mdsmap_epoch": 521,
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "osdmap_epoch": 282,
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "osdmap_epoch_barrier": 282,
2019-07-25T08:27:09.944 INFO:teuthology.orchestra.run.smithi193.stdout:    "uptime": 19.607215098000001
2019-07-25T08:27:09.945 INFO:teuthology.orchestra.run.smithi193.stdout:}
2019-07-25T08:27:09.949 INFO:tasks.cephfs.filesystem:_json_asok output: {
    "cluster_fsid": "af4c99e3-dbf1-4604-ab36-7fb20694ece5",
    "whoami": 0,
    "id": 14245,
    "want_state": "up:active",
    "state": "up:active",
    "rank_uptime": 9.4852699929999993,
    "mdsmap_epoch": 521,
    "osdmap_epoch": 282,
    "osdmap_epoch_barrier": 282,
    "uptime": 19.607215098000001
}

2019-07-25T08:27:10.353 INFO:tasks.cephfs.filesystem:are_daemons_healthy: mds map: {u'session_autoclose': 300, u'balancer': u'', u'up': {u'mds_0': 14245}, u'last_failure_osd_epoch': 282, u'in': [0], u'last_failure': 0, u'max_file_size': 1099511627776, u'explicitly_allowed_features': 32, u'damaged': [], u'tableserver': 0, u'metadata_pool': 48, u'failed': [], u'epoch': 520, u'flags': 50, u'max_mds': 1, u'compat': {u'compat': {}, u'ro_compat': {}, u'incompat': {u'feature_10': u'snaprealm v2', u'feature_8': u'no anchor table', u'feature_9': u'file layout v2', u'feature_2': u'client writeable ranges', u'feature_3': u'default file layouts on dirs', u'feature_1': u'base v0.20', u'feature_6': u'dirfrag is stored in omap', u'feature_4': u'dir inode in separate object', u'feature_5': u'mds uses versioned encoding'}}, u'min_compat_client': u'0 (unknown)', u'data_pools': [49], u'info': {u'gid_14245': {u'export_targets': [], u'name': u'b', u'incarnation': 517, u'state_seq': 8, u'state': u'up:active', u'gid': 14245, u'features': 4540138292836696063, u'rank': 0, u'flags': 0, u'addrs': {u'addrvec': [{u'nonce': 1150325967, u'type': u'v2', u'addr': u'172.21.15.193:6836'}, {u'nonce': 1150325967, u'type': u'v1', u'addr': u'172.21.15.193:6837'}]}, u'addr': u'172.21.15.193:6837/1150325967'}}, u'fs_name': u'cephfs', u'created': u'2019-07-25T08:24:27.742592+0000', u'standby_count_wanted': 0, u'enabled': True, u'modified': u'2019-07-25T08:27:09.402763+0000', u'session_timeout': 60, u'stopped': [], u'ever_allowed_features': 32, u'root': 0}
2019-07-25T08:27:10.353 INFO:tasks.cephfs.filesystem:are_daemons_healthy: 1/1
2019-07-25T08:27:10.353 INFO:teuthology.orchestra.run.smithi193:Running:
2019-07-25T08:27:10.353 INFO:teuthology.orchestra.run.smithi193:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 900 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.b.asok status
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:{
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:    "cluster_fsid": "af4c99e3-dbf1-4604-ab36-7fb20694ece5",
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:    "whoami": 0,
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:    "id": 14245,
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:    "want_state": "up:active",
2019-07-25T08:27:10.483 INFO:teuthology.orchestra.run.smithi193.stdout:    "state": "up:active",
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:    "rank_uptime": 10.024941639,
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:    "mdsmap_epoch": 521,
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:    "osdmap_epoch": 282,
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:    "osdmap_epoch_barrier": 282,
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:    "uptime": 20.146887712000002
2019-07-25T08:27:10.484 INFO:teuthology.orchestra.run.smithi193.stdout:}
2019-07-25T08:27:10.489 INFO:tasks.cephfs.filesystem:_json_asok output: {
    "cluster_fsid": "af4c99e3-dbf1-4604-ab36-7fb20694ece5",
    "whoami": 0,
    "id": 14245,
    "want_state": "up:active",
    "state": "up:active",
    "rank_uptime": 10.024941639,
    "mdsmap_epoch": 521,
    "osdmap_epoch": 282,
    "osdmap_epoch_barrier": 282,
    "uptime": 20.146887712000002
}

2019-07-25T08:27:10.516 INFO:tasks.cephfs_test_runner:test_standby_replay_singleton_fail (tasks.cephfs.test_failover.TestStandbyReplay) ... ERROR

From: /ceph/teuthology-archive/pdonnell-2019-07-25_06:25:06-fs-wip-pdonnell-testing-20190725.023305-distro-basic-smithi/4146998/teuthology.log

mds.a, which had been failed, had not yet been added back to the MDS map when the test checked. We need to add a sleep.
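
For illustration only, a minimal sketch of the kind of wait that would close this race: poll the MDS map until a daemon with the expected name shows up again, rather than checking immediately after failing it. The helper name and the get_mds_map() callable are hypothetical and do not reflect the actual change in the linked pull request; get_mds_map() is assumed to return the parsed map dict as logged by are_daemons_healthy above, whose "info" field maps gid_* keys to per-daemon records.

    import time

    def wait_for_mds_in_map(get_mds_map, name, timeout=60, interval=2):
        # Hypothetical helper: poll the MDS map until a daemon named
        # `name` appears in the "info" section, or give up after `timeout`
        # seconds. get_mds_map() is assumed to return the parsed map dict.
        waited = 0
        while True:
            mds_map = get_mds_map()
            names = [info["name"] for info in mds_map.get("info", {}).values()]
            if name in names:
                return mds_map
            if waited >= timeout:
                raise RuntimeError(
                    "mds.%s did not rejoin the map within %ds" % (name, timeout))
            time.sleep(interval)
            waited += interval

A plain time.sleep() before the check, as suggested above, would also mask the race, at the cost of a fixed delay.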


Related issues

Copied to CephFS - Backport #41095: nautilus: qa: race in test_standby_replay_singleton_fail (Resolved)

History

#1 Updated by Patrick Donnelly over 4 years ago

  • Description updated (diff)
  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 29336

#2 Updated by Patrick Donnelly over 4 years ago

  • Status changed from Fix Under Review to Pending Backport

#3 Updated by Patrick Donnelly over 4 years ago

  • Copied to Backport #41095: nautilus: qa: race in test_standby_replay_singleton_fail added

#4 Updated by Nathan Cutler over 4 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
