Bug #21121: test_health_warnings.sh can fail

Added by Sage Weil over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
Start date: 08/24/2017
Due date: -
% Done: 0%
Source: -
Tags: -
Backport: luminous,jewel
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Component(RADOS): -
Pull request ID: -

Description

- test_mark_all_but_last_osds_down marks all but one osd down
- clears noup
- osd.1 fails the is_healthy check because it is failing to respond on its old address
- meanwhile, all osds are back up
- eventually the mon marks osd.1 out
- test fails

/a/sage-2017-08-24_17:38:40-rados-wip-sage-testing2-luminous-20170824a-distro-basic-smithi/1560394


Related issues

Copied to RADOS - Backport #21238: luminous: test_health_warnings.sh can fail (Resolved)
Copied to RADOS - Backport #21239: jewel: test_health_warnings.sh can fail (Resolved)

History

#1 Updated by Sage Weil over 1 year ago

I believe the fix is to subscribe to osdmaps when in the waiting-for-healthy state. If we are unhealthy because we are failing to ping our "up" peers, we need to be sure that the cluster actually thinks they're up and that we're not just stuck on an old map.
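
A standalone sketch of that idea (hypothetical types and names, not the actual Ceph code or the eventual patch): while unhealthy, the OSD keeps adopting newer osdmaps, so is_healthy is evaluated against the cluster's current view of its peers rather than a stale map.

    // Hypothetical model in plain C++ (not Ceph source). A stale map can
    // make an OSD look unhealthy forever; refreshing the map while
    // waiting lets the health check use the cluster's current view. In
    // the real failure the fresh map would also carry peers' new
    // addresses, letting heartbeats succeed again.
    #include <cstdio>
    #include <map>
    #include <set>

    struct OsdMap {
      int epoch = 0;
      std::set<int> up;                        // osds this map says are up
      bool is_up(int osd) const { return up.count(osd) != 0; }
    };

    struct Osd {
      OsdMap map;                              // our possibly stale map
      std::map<int, bool> ping_ok;             // heartbeat result per peer

      // Stand-in for an osdmap subscription: adopt any newer epoch.
      void maybe_update_map(const OsdMap& latest) {
        if (latest.epoch > map.epoch)
          map = latest;
      }

      // Healthy iff every peer the *current* map says is up answers our
      // pings; peers the current map marks down cannot block us.
      bool is_healthy() const {
        for (const auto& [peer, ok] : ping_ok)
          if (map.is_up(peer) && !ok)
            return false;
        return true;
      }
    };

    int main() {
      Osd osd;
      osd.map = OsdMap{1, {2, 3}};             // stale: believes 2 and 3 up
      osd.ping_ok = {{2, false}, {3, true}};   // cannot reach 2 at old addr
      std::printf("stale map: healthy=%d\n", osd.is_healthy());   // 0

      // The proposed fix: keep subscribing to maps while unhealthy, so we
      // converge on the cluster's view instead of spinning on epoch 1.
      osd.maybe_update_map(OsdMap{2, {3}});    // newer map: osd 2 is down
      std::printf("fresh map: healthy=%d\n", osd.is_healthy());   // 1
    }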

#2 Updated by Sage Weil over 1 year ago

  • Status changed from Verified to Need Review
  • Backport set to luminous,jewel

#3 Updated by Sage Weil over 1 year ago

  • Status changed from Need Review to Pending Backport

#4 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #21238: luminous: test_health_warnings.sh can fail added

#5 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #21239: jewel: test_health_warnings.sh can fail added

#6 Updated by Nathan Cutler about 1 year ago

  • Status changed from Pending Backport to Resolved
