Bug #43637: nautilus: qa: Health check failed: Reduced data availability: 16 pgs inactive (PG_AVAILABILITY) - CephFS - Ceph

Added by Ramana Raja over 4 years ago. Updated about 4 years ago.

Status: Triaged
Priority: Normal
Assignee: Ramana Raja
Target version: Ceph - v14.2.7
Source: Q/A
Severity: 3 - minor
Component(FS): qa-suite

Description

2020-01-10T10:41:33.742 INFO:teuthology.orchestra.run.smithi166:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v 'overall HEALTH_' | egrep -v '\(FS_DEGRADED\)' | egrep -v '\(MDS_FAILED\)' | egrep -v '\(MDS_DEGRADED\)' | egrep -v '\(FS_WITH_FAILED_MDS\)' | egrep -v '\(MDS_DAMAGE\)' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v 'overall HEALTH_' | egrep -v '\(OSD_DOWN\)' | egrep -v '\(OSD_' | egrep -v 'but it is still running' | egrep -v 'is not responding' | egrep -v 'not responding, replacing' | egrep -v '\(MDS_INSUFFICIENT_STANDBY\)' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | head -n 1
2020-01-10T10:41:33.765 INFO:teuthology.orchestra.run.smithi166.stdout:2020-01-10 10:22:15.090135 mon.b (mon.0) 1750 : cluster [WRN] Health check failed: Reduced data availability: 16 pgs inactive (PG_AVAILABILITY)
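
For readers unfamiliar with the check, the egrep pipeline above can be approximated in Python as follows. This is only a sketch of the filtering logic, not teuthology's actual implementation: the function name `first_unexpected` is hypothetical, the exclusion patterns are copied from the command above with duplicates collapsed, and the OSD_DOWN sample line in the usage is invented for illustration.

```python
import re

# Severity tags selected by the first egrep in the pipeline.
SEVERITY = re.compile(r"\[ERR\]|\[WRN\]|\[SEC\]")

# Patterns from the `egrep -v` stages of the command (duplicates collapsed).
IGNORED = [re.compile(p) for p in (
    r"\(MDS_ALL_DOWN\)",
    r"\(MDS_UP_LESS_THAN_MAX\)",
    r"overall HEALTH_",
    r"\(FS_DEGRADED\)",
    r"\(MDS_FAILED\)",
    r"\(MDS_DEGRADED\)",
    r"\(FS_WITH_FAILED_MDS\)",
    r"\(MDS_DAMAGE\)",
    r"\(OSD_DOWN\)",
    r"\(OSD_",
    r"but it is still running",
    r"is not responding",
    r"not responding, replacing",
    r"\(MDS_INSUFFICIENT_STANDBY\)",
)]


def first_unexpected(lines):
    """Return the first ERR/WRN/SEC line that no ignore pattern matches.

    Mirrors `egrep ... | egrep -v ... | head -n 1`; returns None when
    every matching line is covered by an ignore pattern.
    """
    for line in lines:
        if SEVERITY.search(line) and not any(p.search(line) for p in IGNORED):
            return line
    return None


if __name__ == "__main__":
    sample = [
        # Invented line: matches an ignore pattern, so it is skipped.
        "mon.b (mon.0) 1700 : cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)",
        # Line reported in this bug: PG_AVAILABILITY is not in the ignore list.
        "2020-01-10 10:22:15.090135 mon.b (mon.0) 1750 : cluster [WRN] Health check failed: "
        "Reduced data availability: 16 pgs inactive (PG_AVAILABILITY)",
    ]
    print(first_unexpected(sample))
```

Because (PG_AVAILABILITY) is not among the ignored patterns, the warning survives the filter and the job is marked as failed.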

From /a/yuriw-2020-01-09_22:23:54-fs-wip-yuri6-testing-2020-01-09-1744-nautilus-distro-basic-smithi/4650211/

This failure was seen in 12 jobs across the fs and kcephfs suites.

In the fs suite, /a/yuriw-2020-01-09_22:23:54-fs-wip-yuri6-testing-2020-01-09-1744-nautilus-distro-basic-smithi/:

Failure: "2020-01-10 10:22:15.090135 mon.b (mon.0) 1750 : cluster [WRN] Health check failed: Reduced data availability: 16 pgs inactive (PG_AVAILABILITY)" in cluster log
5 jobs: ['4650211', '4650242', '4650177', '4650144', '4650276']
suites intersection: ['clusters/1a3s-mds-2c-client.yaml', 'conf/{client.yaml', 'fs/multifs/{begin.yaml', 'mds.yaml', 'mon-debug.yaml', 'mon.yaml', 'mount/fuse.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'tasks/failover.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/1a3s-mds-2c-client.yaml', 'conf/{client.yaml', 'fs/multifs/{begin.yaml', 'mds.yaml', 'mon-debug.yaml', 'mon.yaml', 'mount/fuse.yaml', 'objectstore-ec/bluestore-bitmap.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'supported-random-distros$/{centos_latest.yaml}', 'supported-random-distros$/{ubuntu_16.04.yaml}', 'supported-random-distros$/{ubuntu_latest.yaml}', 'tasks/failover.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']

In the kcephfs suite, /a/yuriw-2020-01-09_22:20:53-kcephfs-wip-yuri6-testing-2020-01-09-1744-nautilus-distro-basic-smithi:

Failure: "2020-01-10 01:06:47.884132 mon.a (mon.0) 2248 : cluster [WRN] Health check failed: Reduced data availability: 16 pgs inactive (PG_AVAILABILITY)" in cluster log
7 jobs: ['4649992', '4649994', '4650026', '4650028', '4650030', '4650048', '4649990']
suites intersection: ['clusters/1-mds-4-client.yaml', 'conf/{client.yaml', 'kcephfs/recovery/{begin.yaml', 'kclient/{mount.yaml', 'log-config.yaml', 'mds.yaml', 'mon.yaml', 'ms-die-on-skipped.yaml}}', 'osd-asserts.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/1-mds-4-client.yaml', 'conf/{client.yaml', 'kcephfs/recovery/{begin.yaml', 'kclient/{mount.yaml', 'log-config.yaml', 'mds.yaml', 'mon.yaml', 'ms-die-on-skipped.yaml}}', 'objectstore-ec/bluestore-bitmap.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd-asserts.yaml', 'osd.yaml}', 'overrides/{distro/random/{k-testing.yaml', 'overrides/{distro/rhel/{k-distro.yaml', 'overrides/{frag_enable.yaml', 'rhel_latest.yaml}', 'supported$/{rhel_latest.yaml}}', 'supported$/{ubuntu_latest.yaml}}', 'tasks/damage.yaml}', 'tasks/data-scan.yaml}', 'tasks/failover.yaml}', 'tasks/volume-client.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
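
The "suites intersection" and "suites union" lines above are plain set operations over the YAML facets of each failed job: the intersection shows what every failing job has in common, and the union shows everything any of them used. Below is a minimal sketch of that computation. The helper name `summarize_facets` is hypothetical, and since the report does not say which objectstore facet belongs to which job, the pairing in the usage is invented.

```python
def summarize_facets(job_facets):
    """Compute the facets shared by all jobs and the facets used by any job.

    job_facets maps a job id to the list of facet strings in its description.
    Returns (intersection, union) as sorted lists.
    """
    facet_sets = [set(facets) for facets in job_facets.values()]
    if not facet_sets:
        return [], []
    common = set.intersection(*facet_sets)
    anywhere = set.union(*facet_sets)
    return sorted(common), sorted(anywhere)


if __name__ == "__main__":
    # Invented pairing of facets to jobs, for illustration only.
    jobs = {
        "4650211": ["mount/fuse.yaml", "tasks/failover.yaml",
                    "objectstore-ec/bluestore-bitmap.yaml"],
        "4650242": ["mount/fuse.yaml", "tasks/failover.yaml",
                    "objectstore-ec/filestore-xfs.yaml"],
    }
    intersection, union = summarize_facets(jobs)
    print("suites intersection:", intersection)
    print("suites union:", union)
```

In the fs suite, tasks/failover.yaml appears in the intersection, which is why the bug was originally filed against that task.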

Updated by Ramana Raja over 4 years ago

Subject changed from qa: task_failover.yaml failure to nautilus qa: task_failover.yaml failure
Target version changed from v14.2.5 to v14.2.7
Component(FS) deleted (~~qa-suite~~)

Updated by Patrick Donnelly over 4 years ago

Subject changed from nautilus qa: task_failover.yaml failure to nautilus: qa: task_failover.yaml failure
Status changed from New to Triaged
Assignee set to Ramana Raja

Updated by Patrick Donnelly about 4 years ago

Subject changed from nautilus: qa: task_failover.yaml failure to nautilus: qa: Health check failed: Reduced data availability: 16 pgs inactive (PG_AVAILABILITY)
ceph-qa-suite deleted (~~kcephfs~~)
Component(FS) qa-suite added
