Bug #20693

monthrash has spurious PG_AVAILABILITY etc warnings

Added by Sage Weil about 1 month ago. Updated 24 days ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
Start date: 07/19/2017
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No
Component(RADOS):

Description

/a/sage-2017-07-19_15:27:16-rados-wip-sage-testing2-distro-basic-smithi/1419393

There is no OSD thrashing in this run, but the cluster is not fully peered and healthy before the mons start thrashing, which makes peering slow enough to trigger a health warning. I saw this on another run today too.

This can probably be fixed by making wait_for_healthy also wait for unknown PGs, and/or by also flushing the PG stats.
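A minimal sketch of that idea, not the actual wait_for_healthy from the QA suite: the function name and the three callables (get_health, get_pg_state_counts, flush_pg_stats) are hypothetical stand-ins for however the test harness queries the cluster. It assumes health is reported as a string such as HEALTH_OK and that PG states are reported as "+"-joined strings with a count for each.

```python
import time
from typing import Callable, Dict


def wait_for_healthy_and_known(
    get_health: Callable[[], str],
    get_pg_state_counts: Callable[[], Dict[str, int]],
    flush_pg_stats: Callable[[], None],
    timeout: float = 300.0,
    interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until health is HEALTH_OK and no PG is in the unknown state.

    All three callables are hypothetical hooks supplied by the caller;
    this sketch does not talk to a real cluster.
    """
    deadline = clock() + timeout
    while True:
        # Push fresh PG stats first so the check below is not based on
        # stale information.
        flush_pg_stats()

        counts = get_pg_state_counts()
        # A state string may be compound, e.g. "active+clean".
        unknown = sum(
            n for state, n in counts.items() if "unknown" in state.split("+")
        )

        if get_health() == "HEALTH_OK" and unknown == 0:
            return

        if clock() >= deadline:
            raise TimeoutError(
                "cluster not healthy: %d PG(s) still unknown" % unknown
            )
        sleep(interval)
```

The point of checking both conditions is that HEALTH_OK alone is not sufficient here: the cluster can report healthy while PGs are still unknown (see the related Bug #20690), so thrashing would start before peering has finished.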


Related issues

Related to RADOS - Bug #20690: Cluster status is HEALTH_OK even though PGs are in unknown state (Need More Info, 07/19/2017)

History

#1 Updated by Nathan Cutler about 1 month ago

  • Related to Bug #20690: Cluster status is HEALTH_OK even though PGs are in unknown state added

#2 Updated by Sage Weil 27 days ago

OK, I've addressed one source of this, but there is another; see

/a/sage-2017-07-24_03:44:49-rados-wip-sage-testing-distro-basic-smithi/1437231

The problem is that with at-end.yaml we set require-osd-release luminous, which, together with the peering deletes, triggers a new peering interval and the PG_AVAILABILITY+PG_DEGRADED warnings.

#3 Updated by Sage Weil 26 days ago

  • Status changed from Verified to Need Review

#4 Updated by Sage Weil 24 days ago

  • Status changed from Need Review to Resolved
