Cleanup #65521: Add expected warnings in cluster log to ignorelists - RADOS - Ceph

Cleanup #65521

open

Add expected warnings in cluster log to ignorelists

Added by Laura Flores about 1 month ago. Updated 3 days ago.

Status: New

Priority: Normal

Description

Relevant Slack conversation:

Hey all, as I brought up in today's RADOS call, there has been a surge of cluster warnings in the rados and upgrade suites due to the merge of https://github.com/ceph/ceph/pull/54312 to main and squid.

Here are recent main baselines, where we have a huge percentage of failures due to cluster warnings:

rados suite - https://pulpito.ceph.com/teuthology-2024-04-14_20:00:15-rados-main-distro-default-smithi/
upgrade suite - https://pulpito.ceph.com/teuthology-2024-04-13_03:08:05-upgrade-main-distro-default-smithi/

Squid doesn't look nearly as bad, but it still needs some attention, especially in the upgrade suite:

I've been making tracker issues to fix a lot of these warnings, but since there are so many and they are non-deterministic, I think this will need to be a group effort.
Here are some I've opened lately:

Any ideas on how we can effectively divide up the work and fix the suites are welcome. The idea is to go through each failure, identify whether the warning is expected (e.g. OSD_DOWN warnings are expected in thrash tests), and add it to the correct ignorelist in a PR like this: https://github.com/ceph/ceph/pull/56619
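To make the triage step concrete, here is a minimal Python sketch of the idea: treat each ignorelist entry as a regular expression and report only the cluster log warnings that match none of them. This is an illustration, not how teuthology implements its check. The function name, the sample log lines, and the pattern list are all hypothetical, and which warnings count as "expected" depends on the test being run.

```python
import re

# Hypothetical ignorelist for a thrash-style test, where OSDs going down
# and PGs degrading are expected. Each entry is a regular expression.
EXPECTED_WARNINGS = [
    r"\(OSD_DOWN\)",
    r"\(PG_DEGRADED\)",
]


def unexpected_warnings(log_lines, ignorelist):
    """Return the [WRN] lines that match no pattern in the ignorelist."""
    patterns = [re.compile(p) for p in ignorelist]
    return [
        line
        for line in log_lines
        if "[WRN]" in line and not any(p.search(line) for p in patterns)
    ]


# Made-up sample lines, loosely modeled on the warnings named in the
# related issues.
sample_log = [
    "cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)",
    "cluster [WRN] Health check failed: 1 pool(s) do not have an "
    "application enabled (POOL_APP_NOT_ENABLED)",
    "cluster [INF] osd.3 boot",
]

remaining = unexpected_warnings(sample_log, EXPECTED_WARNINGS)
for line in remaining:
    print(line)
```

Anything left in `remaining` is either a real bug to file or a warning that still needs a decision about which ignorelist, if any, it belongs in.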

The mon_cluster_log_to_file change has not yet been backported to Quincy or Reef, but the same work will need to be done for these. I think we should run all suites against these patches and merge them along with ignorelist changes, rather than merging first and fixing second.

Reef backport - https://github.com/ceph/ceph/pull/55431
Quincy backport - https://github.com/ceph/ceph/pull/55430

Related issues 9 (9 open — 0 closed)

Related to RADOS - Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log (New, Laura Flores)

Related to Orchestrator - Bug #64868: cephadm/osds, cephadm/workunits: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) in cluster log (New, Laura Flores)

Related to RADOS - Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log (New, Brad Hubbard)

Related to RADOS - Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled (New, Prashant D)

Related to Dashboard - Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log (New)

Related to Orchestrator - Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log (New)

Related to Orchestrator - Bug #65728: Daemon managed by cephadm in an unknown state (CEPHADM_FAILED_DAEMON) (New)

Related to RADOS - Bug #65768: rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log (New, Sridhar Seshasayee)

Related to Orchestrator - Bug #65824: rados/thrash-old-clients: cluster [WRN] Health detail: HEALTH_WARN noscrub flag(s) set" in cluster log (New)


History: 25 updates, from about 1 month ago to 3 days ago, by Laura Flores (21), Matan Breizman (2), Kamoltat (Junior) Sirivadhna (1), and Nitzan Mordechai (1).