Bug #59192
cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
Status: Closed
Description
/a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015
2023-03-27T07:50:01.978 DEBUG:teuthology.orchestra.run.smithi103:workunit test cls/test_cls_sdk.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=5e717292106ca2d310770101bfebb345837be8e1 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 CEPH_MNT=/home/ubuntu/cephtest/mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/cls/test_cls_sdk.sh
...
2023-03-27T07:50:48.129 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.162 INFO:teuthology.orchestra.run.smithi103.stdout:1679903171.7781157 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
2023-03-27T07:50:48.163 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2023-03-27T07:50:48.163 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.218 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-27T07:50:48.272 DEBUG:teuthology.orchestra.run.smithi103:> sudo egrep '\[WRN\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
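The badness check above is just a chain of `egrep -v` exclusions, so suppressing this warning amounts to adding one more pattern to the chain. A minimal sketch against a synthetic log file (the two log lines below are fabricated for illustration, not taken from a real run):

```shell
# Synthetic cluster log; both lines are fabricated for illustration.
cat > /tmp/ceph-badness-demo.log <<'EOF'
1679903171.77 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679903180.00 osd.3 (osd.3) 12 : cluster [ERR] something actually bad happened
EOF

# Same filter shape as the teuthology badness check, with one extra
# exclusion added for POOL_APP_NOT_ENABLED; only the real error survives.
result=$(egrep '\[ERR\]|\[WRN\]|\[SEC\]' /tmp/ceph-badness-demo.log \
  | egrep -v '\(MDS_ALL_DOWN\)' \
  | egrep -v '\(POOL_APP_NOT_ENABLED\)' \
  | head -n 1)
echo "$result"
```

With the extra exclusion in place, the POOL_APP_NOT_ENABLED warning no longer trips the "Found errors (ERR|WRN|SEC) in cluster log" failure, while genuine errors still do.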
Updated by Laura Flores about 1 year ago
/a/lflores-2023-03-27_13:46:00-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221524
/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965
Updated by Laura Flores about 1 year ago
- Priority changed from Normal to High
I've seen this several times now on two different branches with unmerged PRs. Possible regression?
Updated by Neha Ojha about 1 year ago
seen in the rgw test suite too /a/cbodley-2023-03-22_18:01:21-rgw-main-distro-default-smithi/7216444 - see the discussion in https://github.com/ceph/ceph/pull/47560#issuecomment-1487406107
Updated by Neha Ojha about 1 year ago
Looking at a previous run very similar to /a/lflores-2023-03-27_02:17:31-rados-wip-aclamk-bs-elastic-shared-blob-save-25.03.2023-a-distro-default-smithi/7221015 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all}) that had passed, it appears that the warning existed there too but the badness check just didn't catch it.
/a/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192 (rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/classic msgr-failures/many msgr/async objectstore/bluestore-low-osd-mem-target rados supported-random-distro$/{ubuntu_latest} tasks/rados_cls_all})
2023-03-19T04:46:59.319 INFO:tasks.ceph:Checking cluster log for badness...
2023-03-19T04:46:59.319 DEBUG:teuthology.orchestra.run.smithi121:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v '\(PG_AVAILABILITY\)' | head -n 1
2023-03-19T04:46:59.319 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mon.b is failed for ~0s
2023-03-19T04:46:59.339 INFO:tasks.ceph:Unmounting /var/lib/ceph/osd/ceph-0 on ubuntu@smithi121.front.sepia
nojha@teuthology:/ceph/teuthology-archive/yuriw-2023-03-17_23:38:21-rados-reef-distro-default-smithi/7212192/remote/smithi121/log$ zgrep "POOL_APP_NOT_ENABLED" ceph.log.gz
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200138.2431207 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)
1679200143.161677 mon.a (mon.0) 649 : cluster 1 Health check cleared: POOL_APP_NOT_ENABLED (was: 1 pool(s) do not have an application enabled)
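One possible explanation (an observation from the pasted output, not a confirmed root cause) for why the badness check missed the warning in the passing run: the zgrep hits above carry a numeric level (`cluster 3`) rather than the bracketed `[WRN]` tag that the badness check greps for. A small sketch with two synthetic lines modeled on the output above:

```shell
# Two synthetic log lines modeled on the zgrep output above: one with the
# numeric level form ("cluster 3"), one with the bracketed form ("[WRN]").
cat > /tmp/ceph-level-demo.log <<'EOF'
1679200138.24 mon.a (mon.0) 640 : cluster 3 Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
1679903171.77 mon.a (mon.0) 589 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
EOF

# The badness check only matches the bracketed form, so the numeric-level
# line slips through unflagged.
matches=$(egrep '\[ERR\]|\[WRN\]|\[SEC\]' /tmp/ceph-level-demo.log | wc -l)
echo "$matches"
```

Only one of the two lines matches the filter, even though both describe the same health warning.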
Updated by Matan Breizman about 1 year ago
Neha Ojha wrote:
Looking at a previous run very similar to ... that had passed, it appears that the warning existed there too but the badness check just didn't catch it.
Unless I'm missing something, the pools created in this test aren't expected to be associated with any application. If that's the case, we can simply add "POOL_APP_NOT_ENABLED" to the ignore list (as is done in other rados suite tests).
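For reference, the ignore-list mechanism referred to here is the teuthology `log-ignorelist` override in the suite yaml. A sketch of what such a fragment could look like; the exact file and placement are an assumption for illustration, not the merged fix:

```yaml
# Hypothetical suite override fragment: tells the ceph task's cluster-log
# scan to ignore this specific health warning.
overrides:
  ceph:
    log-ignorelist:
      - \(POOL_APP_NOT_ENABLED\)
```

The pattern is a regex matched against cluster log lines, which is why the parentheses are escaped.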
Updated by Laura Flores about 1 year ago
@Matan Breizman that's probably right, although I wonder what changed to make this pop up so frequently in the rados/rgw suites.
Updated by Casey Bodley about 1 year ago
this seems to happen exclusively against ubuntu 22.04:
https://pulpito.ceph.com/cbodley-2023-03-30_21:31:09-rgw:verify-wip-cbodley-testing-distro-default-smithi/
centos 8 runs were green:
https://pulpito.ceph.com/cbodley-2023-03-30_20:32:49-rgw:verify-wip-cbodley-testing-distro-default-smithi/
ubuntu 20.04 had other failures, but no cluster warnings:
https://pulpito.ceph.com/cbodley-2023-03-30_13:17:22-rgw:verify-wip-cbodley-testing-distro-default-smithi/
Updated by Laura Flores about 1 year ago
/a/yuriw-2023-03-27_23:05:54-rados-wip-yuri4-testing-2023-03-25-0714-distro-default-smithi/7221965
Updated by Laura Flores about 1 year ago
- Backport set to pacific
/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211186
/a/yuriw-2023-03-16_21:59:27-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7211167
Updated by Laura Flores about 1 year ago
/a/yuriw-2023-03-30_21:53:20-rados-wip-yuri7-testing-2023-03-29-1100-distro-default-smithi/7227986
Updated by Radoslaw Zarzynski about 1 year ago
- Status changed from New to In Progress
- Assignee set to Radoslaw Zarzynski
Updated by Laura Flores 12 months ago
/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251186
Updated by Laura Flores 12 months ago
/a/yuriw-2023-04-24_23:35:26-smoke-pacific-release-distro-default-smithi/7250661
Updated by Casey Bodley 12 months ago
still failing consistently in the rgw suite
on main: https://pulpito.ceph.com/cbodley-2023-04-26_00:39:50-rgw-wip-cbodley2-testing-distro-default-smithi/
and on reef: https://pulpito.ceph.com/yuriw-2023-04-28_19:03:15-rgw-reef-distro-default-smithi/
Updated by Laura Flores 12 months ago
/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253406
Updated by Laura Flores 12 months ago
/a/yuriw-2023-04-25_18:56:08-rados-wip-yuri5-testing-2023-04-25-0837-pacific-distro-default-smithi/7252745
Updated by Laura Flores 12 months ago
/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264188
Updated by Radoslaw Zarzynski 12 months ago
- Assignee changed from Radoslaw Zarzynski to Laura Flores
- Priority changed from High to Normal
Laura, would you mind taking a look? Definitely not an urgent thing.
Updated by Laura Flores 12 months ago
/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253751
Sure Radek, I will see if something needs to be whitelisted.
Updated by Casey Bodley 12 months ago
- Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Laura Flores 11 months ago
- Status changed from In Progress to Duplicate
Updated by Laura Flores 11 months ago
- Is duplicate of Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Laura Flores 11 months ago
- Related to deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))
Updated by Laura Flores 11 months ago
- Status changed from Duplicate to New
Hmm, found another instance that looks like this tracker in main:
/a/yuriw-2023-06-01_19:33:38-rados-wip-yuri-testing-2023-06-01-0746-distro-default-smithi/7294007
The test branch has the commit mentioned above, so perhaps there needs to be an additional fix.
Updated by Laura Flores 11 months ago
- Is duplicate of deleted (Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED))
Updated by Laura Flores 11 months ago
- Related to Bug #61168: cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Laura Flores 11 months ago
/a/yuriw-2023-05-30_21:40:46-rados-wip-yuri10-testing-2023-05-30-1244-distro-default-smithi/7290995
Updated by Radoslaw Zarzynski 11 months ago
- Backport changed from pacific to pacific,quincy,reef
Updated by Radoslaw Zarzynski 11 months ago
Hi Laura! Do you have the bandwidth to take a deeper look?
Updated by Laura Flores 11 months ago
Hey Radek, yes. Looking into it, it should be a quick whitelist fix. Trying out a fix now.
Updated by Laura Flores 11 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 51925
Updated by Laura Flores 11 months ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot 11 months ago
- Copied to Backport #61601: quincy: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Backport Bot 11 months ago
- Copied to Backport #61602: pacific: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Backport Bot 11 months ago
- Copied to Backport #61603: reef: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) added
Updated by Matan Breizman 8 months ago
- Related to Bug #62595: Health check failed: (POOL_APP_NOT_ENABLED)" in cluster log added
Updated by Konstantin Shalygin 4 months ago
- Category set to Tests
- Status changed from Pending Backport to Resolved
- Target version set to v19.0.0
- Source set to Development