Bug #59192 (closed): cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)

Added by Laura Flores about 1 year ago. Updated 4 months ago.

Status: Resolved
Priority: Normal
Assignee: Laura Flores
Category: Tests
Target version: v19.0.0
% Done: 0%
Source: Development
Tags: backport_processed
Backport: pacific,quincy,reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID: 51925
Crash signature (v1):
Crash signature (v2):

Description

/a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015

2023-03-27T07:50:01.978 DEBUG:teuthology.orchestra.run.smithi103:workunit test cls/test_cls_sdk.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=5e717292106ca2d310770101bfebb345837be8e1 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 CEPH_MNT=/home/ubuntu/cephtest/mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/cls/test_cls_sdk.sh

...

2023-03-27T07:50:48.129 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.162 INFO:teuthology.orchestra.run.smithi103.stdout:1679903171.7781157 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
2023-03-27T07:50:48.163 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2023-03-27T07:50:48.163 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.218 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.272 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[WRN\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
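
For context, POOL_APP_NOT_ENABLED is the health warning the monitor raises for pools that have no application tag set. A minimal sketch of reproducing and clearing it by hand on a test cluster (the pool name "foo" is arbitrary):

    # Creating a pool without tagging an application soon raises the warning
    # "1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)".
    ceph osd pool create foo 8
    ceph health detail
    # Tagging the pool with an application (e.g. rados) clears the warning.
    ceph osd pool application enable foo rados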


Related issues 5 (1 open, 4 closed)

Related to rgw - Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) (Pending Backport, Casey Bodley)
Related to crimson - Bug #62595: "Health check failed: (POOL_APP_NOT_ENABLED)" in cluster log (Resolved)
Copied to RADOS - Backport #61601: quincy: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) (Resolved, Laura Flores)
Copied to RADOS - Backport #61602: pacific: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) (Rejected, Laura Flores)
Copied to RADOS - Backport #61603: reef: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) (Resolved, Laura Flores)
#1

Updated by Laura Flores about 1 year ago

/a/lflores-2023-03-27_13:46:00-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221524
/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965

#2

Updated by Laura Flores about 1 year ago

  • Priority changed from Normal to High

I've seen this several times now in two different branches of unmerged PRs. Possible regression?

#3

Updated by Neha Ojha about 1 year ago

seen in the rgw test suite too /a/cbodley-2023-03-22_18:01:21-rgw-main-distro-default-smithi/7216444 - see the discussion in https://github.com/ceph/ceph/pull/47560#issuecomment-1487406107

#4

Updated by Neha Ojha about 1 year ago

Looking at a previous run that passed and is very similar to /a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all}), it appears that the warning existed there too, but the badness check just didn't catch it.

/a/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all})

2023-03-19T04:46:59.319 INFO:tasks.ceph:Checking cluster log for badness...
2023-03-19T04:46:59.319 DEBUG:teuthology.orchestra.run.smithi121:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-19T04:46:59.319 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mon.b is failed for ~0s
2023-03-19T04:46:59.339 INFO:tasks.ceph:Unmounting /var/lib/ceph/osd/ceph-0 on ubuntu@smithi121.front.sepia
nojha@teuthology:/ceph/teuthology-archive/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192/remote/smithi121/log$ zgrep "POOL_APP_NOT_ENABLED" ceph.log.gz 
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)
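
For reference, converting the epoch timestamps above (GNU date) shows the warning in that passing run was transient, clearing about five seconds after it was raised, well before the badness check ran at 04:46:59:

    date -u -d @1679200138   # Sun Mar 19 04:28:58 UTC 2023 (check failed)
    date -u -d @1679200143   # Sun Mar 19 04:29:03 UTC 2023 (check cleared)
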
#5

Updated by Matan Breizman about 1 year ago

Neha Ojha wrote:

Looking at a previous run very similar to ... that had passed, it appears that the warning existed there too but the badness check just didn't catch it.

Unless I'm missing something, the pools created in this test shouldn't be associated with any application. If that's the case, then we can simply add "POOL_APP_NOT_ENABLED" to the ignore list (as done in other rados suite tests).
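
A minimal sketch of that kind of change, assuming a hypothetical suite yaml path (the actual fix landed via the PR referenced in #31 below):

    # Sketch only: the yaml path is hypothetical; other rados suites carry
    # the same log-ignorelist entry, which tells teuthology's log scraper
    # to skip this warning.
    printf '%s\n' \
      'overrides:' \
      '  ceph:' \
      '    log-ignorelist:' \
      '      - \(POOL_APP_NOT_ENABLED\)' \
      >> qa/suites/rados/basic/tasks/rados_cls_all.yaml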

#6

Updated by Laura Flores about 1 year ago

@Matan Breizman that's probably right, although I wonder what changed to make this pop up so frequently in the rados/rgw suites.

#8

Updated by Laura Flores about 1 year ago

/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965

#9

Updated by Laura Flores about 1 year ago

  • Backport set to pacific

/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211186
/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211167

#10

Updated by Laura Flores about 1 year ago

/a/yuriw-2023-03-30_21:53:20-rados-wip-yuri7-testing-2023-03-29-1100-distro-default-smithi/7227986

#11

Updated by Radoslaw Zarzynski about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to Radoslaw Zarzynski
#12

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251186

#13

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-24_23:35:26-smoke-pacific-release-distro-default-smithi/7250661

#15

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253406

#16

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-25_18:56:08-rados-wip-yuri5-testing-2023-04-25-0837-pacific-distro-default-smithi/7252745

#17

Updated by Laura Flores 12 months ago

/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264188

#18

Updated by Radoslaw Zarzynski 12 months ago

  • Assignee changed from Radoslaw Zarzynski to Laura Flores
  • Priority changed from High to Normal

Laura, would you mind taking a look? Definitely not an urgent thing.

#19

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253751

Sure Radek, I will see if something needs to be whitelisted.

#20

Updated by Casey Bodley 12 months ago

  • Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#21

Updated by Laura Flores 11 months ago

  • Status changed from In Progress to Duplicate
#22

Updated by Laura Flores 11 months ago

  • Is duplicate of Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#23

Updated by Laura Flores 11 months ago

  • Related to deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))
#24

Updated by Laura Flores 11 months ago

  • Status changed from Duplicate to New

Hmm, found another instance that looks like this tracker in main:
/a/yuriw-2023-06-01_19:33:38-rados-wip-yuri-testing-2023-06-01-0746-distro-default-smithi/7294007

The test branch has the commit mentioned above, so perhaps there needs to be an additional fix.

#25

Updated by Laura Flores 11 months ago

  • Is duplicate of deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))
#26

Updated by Laura Flores 11 months ago

  • Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#27

Updated by Laura Flores 11 months ago

/a/yuriw-2023-05-30_21:40:46-rados-wip-yuri10-testing-2023-05-30-1244-distro-default-smithi/7290995

#28

Updated by Radoslaw Zarzynski 11 months ago

  • Backport changed from pacific to pacific,quincy,reef
#29

Updated by Radoslaw Zarzynski 11 months ago

Hi Laura! Do you have the bandwidth to take a deeper look?

#30

Updated by Laura Flores 11 months ago

Hey Radek, yes. Looking into it; it should be a quick whitelist fix. Trying out a fix now.

#31

Updated by Laura Flores 11 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 51925
#32

Updated by Laura Flores 11 months ago

  • Status changed from Fix Under Review to Pending Backport
#33

Updated by Backport Bot 11 months ago

  • Copied to Backport #61601: quincy: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#34

Updated by Backport Bot 11 months ago

  • Copied to Backport #61602: pacific: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#35

Updated by Backport Bot 11 months ago

  • Copied to Backport #61603: reef: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
#36

Updated by Backport Bot 11 months ago

  • Tags set to backport_processed
#37

Updated by Matan Breizman 8 months ago

  • Related to Bug #62595: Health check failed: (POOL_APP_NOT_ENABLED)" in cluster log added
#38

Updated by Konstantin Shalygin 4 months ago

  • Category set to Tests
  • Status changed from Pending Backport to Resolved
  • Target version set to v19.0.0
  • Source set to Development