Bug #42228
mgr/dashboard: backend API test failure "test_access_permissions"
Status: Closed
Description
I got this error on my local system (based on master) and it also failed on a PR test (https://jenkins.ceph.com/job/ceph-dashboard-pr-backend/30/console):
2019-10-08 11:05:57,510.510 INFO:__main__:Stopped test: test_access_permissions (tasks.mgr.dashboard.test_cephfs.CephfsTest) in 22.789247s
2019-10-08 11:05:57,510.510 INFO:__main__:
2019-10-08 11:05:57,510.510 INFO:__main__:======================================================================
2019-10-08 11:05:57,510.510 INFO:__main__:ERROR: test_access_permissions (tasks.mgr.dashboard.test_cephfs.CephfsTest)
2019-10-08 11:05:57,511.511 INFO:__main__:----------------------------------------------------------------------
2019-10-08 11:05:57,511.511 INFO:__main__:Traceback (most recent call last):
2019-10-08 11:05:57,511.511 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/helper.py", line 154, in setUp
2019-10-08 11:05:57,511.511 INFO:__main__: self.wait_for_health_clear(20)
2019-10-08 11:05:57,511.511 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/ceph_test_case.py", line 131, in wait_for_health_clear
2019-10-08 11:05:57,511.511 INFO:__main__: self.wait_until_true(is_clear, timeout)
2019-10-08 11:05:57,511.511 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/ceph_test_case.py", line 163, in wait_until_true
2019-10-08 11:05:57,511.511 INFO:__main__: raise RuntimeError("Timed out after {0}s".format(elapsed))
2019-10-08 11:05:57,511.511 INFO:__main__:RuntimeError: Timed out after 20s
2019-10-08 11:05:57,511.511 INFO:__main__:
2019-10-08 11:05:57,512.512 INFO:__main__:----------------------------------------------------------------------
2019-10-08 11:05:57,512.512 INFO:__main__:Ran 15 tests in 254.204s
2019-10-08 11:05:57,512.512 INFO:__main__:
2019-10-08 11:05:57,512.512 INFO:__main__:FAILED (errors=1)
2019-10-08 11:05:57,512.512 INFO:__main__:
2019-10-08 11:05:57,512.512 INFO:__main__:======================================================================
2019-10-08 11:05:57,512.512 INFO:__main__:ERROR: test_access_permissions (tasks.mgr.dashboard.test_cephfs.CephfsTest)
2019-10-08 11:05:57,512.512 INFO:__main__:----------------------------------------------------------------------
2019-10-08 11:05:57,513.513 INFO:__main__:Traceback (most recent call last):
2019-10-08 11:05:57,513.513 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/helper.py", line 154, in setUp
2019-10-08 11:05:57,513.513 INFO:__main__: self.wait_for_health_clear(20)
2019-10-08 11:05:57,513.513 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/ceph_test_case.py", line 131, in wait_for_health_clear
2019-10-08 11:05:57,513.513 INFO:__main__: self.wait_until_true(is_clear, timeout)
2019-10-08 11:05:57,513.513 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/ceph_test_case.py", line 163, in wait_until_true
2019-10-08 11:05:57,513.513 INFO:__main__: raise RuntimeError("Timed out after {0}s".format(elapsed))
2019-10-08 11:05:57,513.513 INFO:__main__:RuntimeError: Timed out after 20s
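For reference, the failing setUp waits for the cluster health to clear before each test. Below is a minimal sketch of that polling logic (simplified, not the verbatim qa/tasks/ceph_test_case.py code; get_health_status() is a hypothetical stand-in for querying the cluster):

import time


def wait_until_true(condition, timeout, period=5):
    # Poll `condition` until it returns True, raising once `timeout`
    # seconds have elapsed -- the RuntimeError seen in the traceback above.
    elapsed = 0
    while not condition():
        if elapsed >= timeout:
            raise RuntimeError("Timed out after {0}s".format(elapsed))
        time.sleep(period)
        elapsed += period


def get_health_status():
    # Hypothetical stand-in for asking the cluster ("ceph status").
    return "HEALTH_WARN"


def wait_for_health_clear(timeout):
    # setUp waits (here, 20s) for all health checks to clear before a test;
    # a cluster stuck in HEALTH_WARN makes this time out.
    wait_until_true(lambda: get_health_status() == "HEALTH_OK", timeout)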
Updated by Stephan Müller over 4 years ago
- Status changed from New to In Progress
- Assignee set to Stephan Müller
Updated by Stephan Müller over 4 years ago
I tested it on an older compiled cluster and it worked... Since it fails on newer builds, I assume a code change somewhere else is what makes this test fail :(
Updated by Stephan Müller over 4 years ago
On my new build I get this error, so the change that broke the test comes from outside the dashboard.
Updated by Stephan Müller over 4 years ago
I found out that the cluster somehow gets into an unhealthy state, which causes the problem.
  cluster:
    id:     50cd2934-64df-4f46-b868-154e688e6e42
    health: HEALTH_WARN
            2 pool(s) have non-power-of-two pg_num

  services:
    mon: 3 daemons, quorum a,b,c (age 26m)
    mgr: z(active, since 3m), standbys: x, y
    mds: cephfs:1 {0=a=up:active(laggy or crashed)}
    osd: 4 osds: 4 up (since 24m), 4 in (since 24m)
    rgw: 1 daemon active (8000)

  task status:

  data:
    pools:   6 pools, 50 pgs
    objects: 223 objects, 6.0 KiB
    usage:   8.0 GiB used, 4.0 TiB / 4.0 TiB avail
    pgs:     50 active+clean
The new power-of-two warning was merged on 2019-09-23 -> https://github.com/ceph/ceph/pull/30525
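The numbers fit: the fresh vstart cluster (see the HEALTH_OK status in the next update) has 4 pools and 32 PGs, while the unhealthy cluster above has 6 pools and 50 PGs, so the two CephFS pools account for the remaining 18 PGs and, per the warning, both got a non-power-of-two pg_num. A minimal sketch of the condition such a warning checks (my own illustration, not the code from the PR):

def is_power_of_two(n):
    # A pool's pg_num trips the warning when this returns False.
    return n > 0 and (n & (n - 1)) == 0

# Purely illustrative split of the 18 PGs across the two CephFS pools
# (the per-pool values are not shown in the status output above):
for pg_num in (12, 6):
    print(pg_num, "power of two?", is_power_of_two(pg_num))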
Updated by Stephan Müller over 4 years ago
Directly after vstart_runner.py is executed, the cluster seems to be fine:
  cluster:
    id:     760fd545-32e4-43d3-b7e5-c7588811e4c8
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 2m)
    mgr: x(active, since 2m), standbys: y, z
    osd: 4 osds: 4 up (since 16s), 4 in (since 16s)
    rgw: 1 daemon active (8000)

  task status:

  data:
    pools:   4 pools, 32 pgs
    objects: 12 objects, 1.2 KiB
    usage:   8.0 GiB used, 4.0 TiB / 4.0 TiB avail
    pgs:     32 active+clean
Updated by Stephan Müller over 4 years ago
The problem seems to be
CEPHFS = True
In our helper.py this flag is used to create a CephFS, and exactly that creation is what makes the cluster unhealthy.
If that line is commented out (along with all the lines that destroy a potential CephFS), no filesystem id will be found.
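A self-contained sketch of the pattern this describes (simplified and hypothetical, not the real qa/tasks/mgr/dashboard/helper.py; create_cephfs() is a stand-in): a CEPHFS class flag makes setUpClass create a filesystem, and it is exactly that step that puts the cluster into HEALTH_WARN before setUp's health-clear wait.

import unittest


def create_cephfs():
    # Hypothetical stand-in for the helper that creates the CephFS
    # data and metadata pools on the vstart cluster.
    print("creating cephfs data/metadata pools ...")


class DashboardTestCase(unittest.TestCase):
    CEPHFS = False  # subclasses such as CephfsTest set this to True

    @classmethod
    def setUpClass(cls):
        if cls.CEPHFS:
            # This is the step that introduces the pools with a
            # non-power-of-two pg_num, and thus the HEALTH_WARN above.
            create_cephfs()


class CephfsTest(DashboardTestCase):
    CEPHFS = True

    def test_access_permissions(self):
        self.assertTrue(True)  # placeholder body


if __name__ == '__main__':
    unittest.main()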
Updated by Stephan Müller over 4 years ago
Another possible cause is https://github.com/ceph/ceph/pull/30463, as it introduced a new method one could use to mount CephFS.
Updated by Patrick Donnelly over 4 years ago
- Project changed from mgr to CephFS
- Category deleted (151)
- Assignee changed from Stephan Müller to Patrick Donnelly
- Target version set to v15.0.0
- Start date deleted (10/08/2019)
- Component(FS) qa-suite added
The value of mon_pg_warn_min_per_osd is used for selecting the number of PGs. For vstart clusters, its value is 3. That's used here:
I was able to reproduce the issue with vstart_runner and a cephfs test. I'm going to move this ticket to the cephfs project and write up a patch.
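The snippet the comment points to is not quoted in the ticket. As a hedged illustration only (assuming pg_num is derived from mon_pg_warn_min_per_osd times the OSD count, which is my reading of the comment rather than the actual qa code; the real fix is in PR 30816), this shows how a non-power-of-two value can fall out of that derivation and how rounding up to the next power of two would avoid the warning:

MON_PG_WARN_MIN_PER_OSD = 3   # vstart default, per the comment above
NUM_OSDS = 4                  # the vstart cluster runs 4 OSDs


def next_power_of_two(n):
    # Smallest power of two greater than or equal to n.
    p = 1
    while p < n:
        p *= 2
    return p


pg_num = MON_PG_WARN_MIN_PER_OSD * NUM_OSDS                # hypothetical derivation -> 12
print(pg_num, "is a power of two:", (pg_num & (pg_num - 1)) == 0)  # False for 12
print("rounded up:", next_power_of_two(pg_num))            # 16, which would not warn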
Updated by Patrick Donnelly over 4 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 30816
Updated by Patrick Donnelly over 4 years ago
- Status changed from Fix Under Review to Resolved
Updated by Patrick Donnelly over 4 years ago
- Related to Bug #42434: qa: TOO_FEW_PGS in mimic during upgrade suite tests added
Updated by Laura Paduano about 4 years ago
- Related to Bug #44592: mgr/dashboard: ceph-api-nightly-nautilus-backend test failure added
Updated by Laura Paduano about 4 years ago
- Status changed from Resolved to Pending Backport
Updated by Laura Paduano about 4 years ago
- Copied to Backport #44668: nautilus: mgr/dashboard: backend API test failure "test_access_permissions" added
Updated by Lenz Grimmer about 4 years ago
- Status changed from Pending Backport to Resolved