Bug #20693
monthrash has spurious PG_AVAILABILITY etc warnings
Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor
Description
/a/sage-2017-07-19_15:27:16-rados-wip-sage-testing2-distro-basic-smithi/1419393
There is no OSD thrashing in this run, but the cluster is not fully peered and healthy before the mons start thrashing, which makes peering slow enough to trigger a health warning. Saw this on another run today too.
Can probably fix by making wait_for_healthy also wait for unknown PGs, and/or also flush the PG stats?
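A rough sketch of the first idea, purely for illustration: treat the cluster as ready only when health is OK and no PGs are still unknown. This is not the actual qa-suite wait_for_healthy; the function name, the `get_status` callable, and the `health` / `unknown_pgs` keys are all hypothetical stand-ins for however the test harness reads cluster state.

```python
import time


def wait_for_healthy_and_known(get_status, timeout=300.0, interval=1.0,
                               sleep=time.sleep, clock=time.monotonic):
    """Poll until health is HEALTH_OK and no PG is reported as unknown.

    get_status is a hypothetical callable returning a dict such as
    {"health": "HEALTH_OK", "unknown_pgs": 0}; it is an assumption made
    for this sketch, not a real Ceph or teuthology API.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        healthy = status.get("health") == "HEALTH_OK"
        unknown = status.get("unknown_pgs", 0)
        # HEALTH_OK alone is not enough: unknown PGs may not have
        # reported stats yet, so keep waiting until none remain.
        if healthy and unknown == 0:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                "cluster not ready: health=%s unknown_pgs=%s"
                % (status.get("health"), unknown))
        sleep(interval)
```

The point of the extra condition is that HEALTH_OK can be reported while PGs are still unknown (see the related Bug #20690), so thrashing would only start once both checks pass.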
Updated by Nathan Cutler almost 7 years ago
- Related to Bug #20690: Cluster status is HEALTH_OK even though PGs are in unknown state added
Updated by Sage Weil almost 7 years ago
OK, I've addressed one source of this, but there is another; see
/a/sage-2017-07-24_03:44:49-rados-wip-sage-testing-distro-basic-smithi/1437231
The problem is that with at-end.yaml we set require-osd-release luminous, which, with the peering deletes, triggers a new peering interval and the PG_AVAILABILITY+PG_DEGRADED warning.
Updated by Sage Weil almost 7 years ago
- Status changed from 12 to Fix Under Review
Updated by Sage Weil almost 7 years ago
- Status changed from Fix Under Review to Resolved