Bug #47654
closed
test_mon_pg: mon fails to join quorum due to election strategy mismatch
% Done: 0%
Source:
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-09-25T09:27:25.786 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2938: main: test_mon_pg
2020-09-25T09:27:25.786 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2037: test_mon_pg: wait_for_health_ok
2020-09-25T09:27:25.787 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1717: wait_for_health_ok: wait_for_health HEALTH_OK
2020-09-25T09:27:25.787 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1696: wait_for_health: local grepstr=HEALTH_OK
2020-09-25T09:27:25.787 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1697: wait_for_health: delays=($(get_timeout_delays $TIMEOUT .1))
2020-09-25T09:27:25.788 INFO:tasks.workunit.client.0.smithi158.stderr://home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1697: wait_for_health: get_timeout_delays 300 .1
. . .
2020-09-25T09:32:39.895 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1700: wait_for_health: ceph health detail
2020-09-25T09:32:39.895 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1700: wait_for_health: grep HEALTH_OK
2020-09-25T09:32:40.405 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1701: wait_for_health: (( 27 >= 27 ))
2020-09-25T09:32:40.406 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1702: wait_for_health: ceph health detail
2020-09-25T09:32:40.892 INFO:tasks.workunit.client.0.smithi158.stdout:HEALTH_WARN 1/3 mons down, quorum a,b; 47 slow ops, oldest one blocked for 707 sec, mon.c has slow ops
2020-09-25T09:32:40.892 INFO:tasks.workunit.client.0.smithi158.stdout:[WRN] MON_DOWN: 1/3 mons down, quorum a,b
2020-09-25T09:32:40.892 INFO:tasks.workunit.client.0.smithi158.stdout: mon.c (rank 2) addr [v2:172.21.15.158:3302/0,v1:172.21.15.158:6791/0] is down (out of quorum)
2020-09-25T09:32:40.893 INFO:tasks.workunit.client.0.smithi158.stdout:[WRN] SLOW_OPS: 47 slow ops, oldest one blocked for 707 sec, mon.c has slow ops
2020-09-25T09:32:40.903 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1703: wait_for_health: return 1
2020-09-25T09:32:40.903 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/../../standalone/ceph-helpers.sh:1717: wait_for_health_ok: return 1
Looks like mon.c stayed down (out of quorum), so the cluster never reached HEALTH_OK and wait_for_health returned 1 after exhausting its retries.
rados/singleton-bluestore/{all/cephtool mon_election/classic msgr-failures/few msgr/async-v2only objectstore/bluestore-bitmap rados supported-random-distro$/{rhel_8}}
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466864
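
When triaging a failure like this from a saved log, it can help to pull the down monitors out of the `ceph health detail` text mechanically. The following is only a minimal sketch: `down_mons` is a hypothetical helper, not a function in ceph-helpers.sh, and it assumes the detail line has the same shape as the MON_DOWN line quoted in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of ceph-helpers.sh): print the names of
# monitors that a `ceph health detail` report lists as down.
# It reads the report text on stdin, so it works on saved log output.
down_mons() {
    # Assumed line shape, taken from the output quoted in this report:
    #   mon.c (rank 2) addr [...] is down (out of quorum)
    sed -n 's/^[[:space:]]*\(mon\.[^[:space:]]*\) (rank [0-9]*) .* is down (out of quorum)$/\1/p'
}

# Sample input taken from the health output in this report
# (teuthology timestamp and stream prefixes removed).
sample='HEALTH_WARN 1/3 mons down, quorum a,b; 47 slow ops, oldest one blocked for 707 sec, mon.c has slow ops
[WRN] MON_DOWN: 1/3 mons down, quorum a,b
    mon.c (rank 2) addr [v2:172.21.15.158:3302/0,v1:172.21.15.158:6791/0] is down (out of quorum)
[WRN] SLOW_OPS: 47 slow ops, oldest one blocked for 707 sec, mon.c has slow ops'

printf '%s\n' "$sample" | down_mons
```

On the sample above this prints `mon.c`. Against a live cluster the same function could be fed from `ceph health detail | down_mons`. The summary line that merely mentions "mon.c has slow ops" is deliberately not matched, since only the "is down (out of quorum)" detail line identifies a monitor outside quorum.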