Bug #20693

closed

monthrash has spurious PG_AVAILABILITY etc warnings

Added by Sage Weil almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/sage-2017-07-19_15:27:16-rados-wip-sage-testing2-distro-basic-smithi/1419393

There is no OSD thrashing, but the cluster is not fully peered and healthy before the mons start thrashing, which makes peering slow enough to trigger a health warning. Saw this on another run today too.

Can probably fix by making wait_for_healthy also wait for unknown PGs, and/or also flush the PG stats?
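A rough sketch of the first idea, for illustration only. This is not the actual qa-suite code: `wait_for_healthy_and_known`, the `get_status` callable, and the shape of the status dict (`health` plus a `pg_states` map of state name to PG count) are all hypothetical stand-ins. The point is just that the wait loop should not return on HEALTH_OK alone while any PG still reports `unknown`.

```python
import time
from typing import Callable, Dict


def wait_for_healthy_and_known(
    get_status: Callable[[], Dict],
    timeout: float = 300.0,
    interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until the cluster is HEALTH_OK and no PG is in an unknown state.

    get_status is a hypothetical callable returning a dict such as
    {"health": "HEALTH_OK", "pg_states": {"active+clean": 8}}.
    Raises TimeoutError if the condition is not met within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        healthy = status.get("health") == "HEALTH_OK"
        # PG state names are '+'-joined flags, e.g. "active+clean".
        unknown = sum(
            count
            for state, count in status.get("pg_states", {}).items()
            if "unknown" in state.split("+")
        )
        if healthy and unknown == 0:
            return
        if clock() >= deadline:
            raise TimeoutError(
                "cluster not ready: health=%s, unknown pgs=%d"
                % (status.get("health"), unknown)
            )
        sleep(interval)
```

The `sleep` and `clock` parameters are only there so the loop can be exercised without real delays. Flushing the PG stats before each poll would be a separate step and is not shown here.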


Related issues 1 (1 open, 0 closed)

Related to RADOS - Bug #20690: Cluster status is HEALTH_OK even though PGs are in unknown state (Need More Info, 07/19/2017)

Actions #1

Updated by Nathan Cutler over 6 years ago

  • Related to Bug #20690: Cluster status is HEALTH_OK even though PGs are in unknown state added
Actions #2

Updated by Sage Weil over 6 years ago

Ok, I've addressed one source of this, but there is another; see

/a/sage-2017-07-24_03:44:49-rados-wip-sage-testing-distro-basic-smithi/1437231

The problem is that with at-end.yaml we set require-osd-release luminous, which, with the peering deletes, triggers a new peering interval and the PG_AVAILABILITY+PG_DEGRADED warning.

Actions #3

Updated by Sage Weil over 6 years ago

  • Status changed from 12 to Fix Under Review
Actions #4

Updated by Sage Weil over 6 years ago

  • Status changed from Fix Under Review to Resolved