
Bug #59192

cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)

Added by Laura Flores 11 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Tests
Target version:
% Done: 0%
Source: Development
Tags: backport_processed
Backport: pacific,quincy,reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015

2023-03-27T07:50:01.978 DEBUG:teuthology.orchestra.run.smithi103:workunit test cls/test_cls_sdk.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=5e717292106ca2d310770101bfebb345837be8e1 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 CEPH_MNT=/home/ubuntu/cephtest/mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/cls/test_cls_sdk.sh

...

2023-03-27T07:50:48.129 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.162 INFO:teuthology.orchestra.run.smithi103.stdout:1679903171.7781157 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
2023-03-27T07:50:48.163 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2023-03-27T07:50:48.163 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.218 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.272 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[WRN\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
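The badness check above is just a grep pipeline over the cluster log: flag any `[ERR]`/`[WRN]`/`[SEC]` line, then drop anything on the exclusion list. A minimal runnable sketch of the same shape (the sample log lines below are invented for illustration; `egrep` is `grep -E`):

```shell
# Invented sample of cluster-log lines, for illustration only
cat > /tmp/sample_ceph.log <<'EOF'
1679903171.77 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679903172.10 mon.a (mon.0) 590 : cluster [INF] osdmap e123: 8 osds: 8 up, 8 in
1679903173.40 mon.a (mon.0) 591 : cluster [WRN] Health check failed: Reduced data availability (PG_AVAILABILITY)
EOF

# Same shape as the teuthology check: keep ERR/WRN/SEC lines,
# filter out excluded warnings, report the first survivor
grep -E '\[ERR\]|\[WRN\]|\[SEC\]' /tmp/sample_ceph.log \
  | grep -Ev '\(PG_AVAILABILITY\)' \
  | head -n 1
```

Only the POOL_APP_NOT_ENABLED line survives the filter, which is why the run is marked as having "errors (ERR|WRN|SEC) in cluster log"; adding that warning to the exclusion list is the whitelist change discussed in the comments.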


Related issues

Related to rgw - Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) Pending Backport
Related to crimson - Bug #62595: Health check failed: (POOL_APP_NOT_ENABLED)" in cluster log Resolved
Copied to RADOS - Backport #61601: quincy: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) Resolved
Copied to RADOS - Backport #61602: pacific: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) Rejected
Copied to RADOS - Backport #61603: reef: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) Resolved

History

#1 Updated by Laura Flores 11 months ago

/a/lflores-2023-03-27_13:46:00-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221524
/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965

#2 Updated by Laura Flores 11 months ago

  • Priority changed from Normal to High

I've seen this several times now in two different branches of unmerged PRs. Possible regression?

#3 Updated by Neha Ojha 11 months ago

Seen in the rgw test suite too: /a/cbodley-2023-03-22_18:01:21-rgw-main-distro-default-smithi/7216444; see the discussion in https://github.com/ceph/ceph/pull/47560#issuecomment-1487406107

#4 Updated by Neha Ojha 11 months ago

Looking at a previous run very similar to /a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all}) that had passed, it appears that the warning existed there too but the badness check just didn't catch it.

/a/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all})

2023-03-19T04:46:59.319 INFO:tasks.ceph:Checking cluster log for badness...
2023-03-19T04:46:59.319 DEBUG:teuthology.orchestra.run.smithi121:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-19T04:46:59.319 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mon.b is failed for ~0s
2023-03-19T04:46:59.339 INFO:tasks.ceph:Unmounting /var/lib/ceph/osd/ceph-0 on ubuntu@smithi121.front.sepia
nojha@teuthology:/ceph/teuthology-archive/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192/remote/smithi121/log$ zgrep "POOL_APP_NOT_ENABLED" ceph.log.gz 
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)

#5 Updated by Matan Breizman 11 months ago

Neha Ojha wrote:

Looking at a previous run very similar to ... that had passed, it appears that the warning existed there too but the badness check just didn't catch it.

Unless I'm missing something, the pools created in this test shouldn't be associated with any application. If that's the case, then we can simply add "POOL_APP_NOT_ENABLED" to the ignore list (as done in other rados suite tests).
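The ignore list referred to here is the `log-ignorelist` key in teuthology suite YAML. A sketch of the kind of override involved (the exact file under `qa/suites/` that should carry it is an assumption, not taken from this tracker):

```yaml
# Hedged sketch: suppress the warning for suites whose pools
# intentionally carry no application tag
overrides:
  ceph:
    log-ignorelist:
      - \(POOL_APP_NOT_ENABLED\)
```

Entries are regular expressions matched against cluster-log lines, hence the escaped parentheses.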

#6 Updated by Laura Flores 11 months ago

@Matan that's probably right, although I wonder what changed to make this pop up so frequently in the rados/rgw suites.

#8 Updated by Laura Flores 11 months ago

/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965

#9 Updated by Laura Flores 11 months ago

  • Backport set to pacific

/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211186
/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211167

#10 Updated by Laura Flores 11 months ago

/a/yuriw-2023-03-30_21:53:20-rados-wip-yuri7-testing-2023-03-29-1100-distro-default-smithi/7227986

#11 Updated by Radoslaw Zarzynski 11 months ago

  • Status changed from New to In Progress
  • Assignee set to Radoslaw Zarzynski

#12 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251186

#13 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-24_23:35:26-smoke-pacific-release-distro-default-smithi/7250661

#15 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253406

#16 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-25_18:56:08-rados-wip-yuri5-testing-2023-04-25-0837-pacific-distro-default-smithi/7252745

#17 Updated by Laura Flores 10 months ago

/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264188

#18 Updated by Radoslaw Zarzynski 10 months ago

  • Assignee changed from Radoslaw Zarzynski to Laura Flores
  • Priority changed from High to Normal

Laura, would you mind taking a look? Definitely not an urgent thing.

#19 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253751

Sure Radek, I will see if something needs to be whitelisted.

#20 Updated by Casey Bodley 10 months ago

  • Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#21 Updated by Laura Flores 9 months ago

  • Status changed from In Progress to Duplicate

#22 Updated by Laura Flores 9 months ago

  • Duplicates Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#23 Updated by Laura Flores 9 months ago

  • Related to deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))

#24 Updated by Laura Flores 9 months ago

  • Status changed from Duplicate to New

Hmm, found another instance that looks like this tracker in main:
/a/yuriw-2023-06-01_19:33:38-rados-wip-yuri-testing-2023-06-01-0746-distro-default-smithi/7294007

The test branch has the commit mentioned above, so perhaps there needs to be an additional fix.

#25 Updated by Laura Flores 9 months ago

  • Duplicates deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))

#26 Updated by Laura Flores 9 months ago

  • Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#27 Updated by Laura Flores 9 months ago

/a/yuriw-2023-05-30_21:40:46-rados-wip-yuri10-testing-2023-05-30-1244-distro-default-smithi/7290995

#28 Updated by Radoslaw Zarzynski 9 months ago

  • Backport changed from pacific to pacific,quincy,reef

#29 Updated by Radoslaw Zarzynski 9 months ago

Hi Laura! Do you have the bandwidth to take a deeper look?

#30 Updated by Laura Flores 9 months ago

Hey Radek, yes. Looking into it now; it should be a quick whitelist fix, and I'm trying one out.

#31 Updated by Laura Flores 9 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 51925

#32 Updated by Laura Flores 9 months ago

  • Status changed from Fix Under Review to Pending Backport

#33 Updated by Backport Bot 9 months ago

  • Copied to Backport #61601: quincy: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#34 Updated by Backport Bot 9 months ago

  • Copied to Backport #61602: pacific: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#35 Updated by Backport Bot 9 months ago

  • Copied to Backport #61603: reef: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added

#36 Updated by Backport Bot 9 months ago

  • Tags set to backport_processed

#37 Updated by Matan Breizman 6 months ago

  • Related to Bug #62595: Health check failed: (POOL_APP_NOT_ENABLED)" in cluster log added

#38 Updated by Konstantin Shalygin 2 months ago

  • Category set to Tests
  • Status changed from Pending Backport to Resolved
  • Target version set to v19.0.0
  • Source set to Development
