Bug #53582

Updated by Alfonso Martínez about 1 year ago

An error has recently started to recur in all runs of the "https://jenkins.ceph.com/job/ceph-dashboard-pull-requests" job:
"waiting for mgr dashboard module to start"

Example of a failed E2E job:
https://jenkins.ceph.com/job/ceph-dashboard-pull-requests/9539/consoleText

I logged in to braggi02 to check the mgr.x.log file and found these errors:
<pre>
2021-12-10T11:27:01.531+0000 7fe36de73980 1 mgr[py] Loading python module 'prometheus'
2021-12-10T11:27:01.647+0000 7fe36de73980 10 mgr[py] Computed sys.path '/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages:/usr/lib/python3.8/dist-packages:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/build/lib/cython_modules/lib.3:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/python-common::/usr/lib/python38.zip:/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload'
2021-12-10T11:27:01.859+0000 7fe36de73980 -1 mgr[py] Module not found: 'prometheus'
2021-12-10T11:27:01.859+0000 7fe36de73980 -1 mgr[py] Traceback (most recent call last):
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/__init__.py", line 2, in <module>
from .module import Module, StandbyModule
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/module.py", line 451, in <module>
class Module(MgrModule):
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/module.py", line 1645, in Module
def _list_healthchecks(self, format: Format = Format.plain) -> HandleCommandResult:
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 387, in __call__
self.store_func_metadata(func)
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 384, in store_func_metadata
self.load_func_metadata(f)
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 376, in load_func_metadata
args.append(CephArgtype.to_argdesc(arg_spec[arg],
TypeError: to_argdesc() takes from 2 to 3 positional arguments but 4 were given

[...]

2021-12-10T11:26:59.791+0000 7fe36de73980 1 mgr[py] Loading python module 'dashboard'
2021-12-10T11:26:59.803+0000 7fe36de73980 10 mgr[py] Computed sys.path '/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages:/usr/lib/python3.8/dist-packages:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/build/lib/cython_modules/lib.3:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/python-common::/usr/lib/python38.zip:/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload'
2021-12-10T11:27:00.007+0000 7fe36de73980 -1 mgr[py] Module not found: 'dashboard'
2021-12-10T11:27:00.007+0000 7fe36de73980 -1 mgr[py] Traceback (most recent call last):
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/__init__.py", line 52, in <module>
from .module import Module, StandbyModule # noqa: F401
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/module.py", line 28, in <module>
from .controllers import Router, json_error_page
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/__init__.py", line 1, in <module>
from ._api_router import APIRouter
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_api_router.py", line 1, in <module>
from ._router import Router
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_router.py", line 7, in <module>
from ._base_controller import BaseController
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_base_controller.py", line 11, in <module>
from ..services.auth import AuthManager, JwtManager
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/services/auth.py", line 15, in <module>
from .access_control import LocalAuthenticator, UserDoesNotExist
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/services/access_control.py", line 579, in <module>
def set_login_credentials_cmd(_, username: str, inbuf: str):
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 387, in __call__
self.store_func_metadata(func)
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 384, in store_func_metadata
self.load_func_metadata(f)
File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 376, in load_func_metadata
args.append(CephArgtype.to_argdesc(arg_spec[arg],
TypeError: to_argdesc() takes from 2 to 3 positional arguments but 4 were given

[...]

2021-12-10T11:27:33.388+0000 7fe354318700 20 mgr.server operator() health checks:
{
"MGR_MODULE_DEPENDENCY": {
"severity": "HEALTH_WARN",
"summary": {
"message": "12 mgr modules have failed dependencies",
"count": 12
},
"detail": [
{
"message": "Module 'balancer' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'crash' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'dashboard' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'devicehealth' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'iostat' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'nfs' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)"
},
{
"message": "Module 'orchestrator' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)"
},
{
"message": "Module 'pg_autoscaler' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'rbd_support' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'status' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'telemetry' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given"
},
{
"message": "Module 'volumes' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)"
}
]
},
</pre>
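
The 'OSDMethod' messages above name /usr/lib/python3/dist-packages/ceph/deployment/drive_group.py as the file that was imported, and in the computed sys.path that directory comes before the workspace's src/pybind and src/python-common. One way to check whether an installed copy shadows the workspace sources is a small script like the one below. It is only a diagnostic sketch: the helper name `providers` is made up for illustration, and it assumes it is run with the same sys.path (or is given the same path list) as the mgr.

```python
import importlib
import importlib.machinery
import sys


def providers(module_name, search_path=None):
    """List (path_entry, origin) for every path entry that can provide
    the top-level module `module_name`, in import-priority order."""
    entries = sys.path if search_path is None else search_path
    importlib.invalidate_caches()  # make sure directory listings are fresh
    found = []
    for entry in entries:
        # Ask the standard path finder to look in this single entry only.
        spec = importlib.machinery.PathFinder.find_spec(module_name, [entry])
        if spec is not None and spec.origin is not None:
            found.append((entry, spec.origin))
    return found


if __name__ == "__main__":
    # Module names taken from the log above; 'ceph' is the package that
    # contains ceph.deployment.drive_group.
    for name in ("ceph_argparse", "ceph"):
        hits = providers(name)
        if not hits:
            print(f"{name}: not found on sys.path")
            continue
        print(f"{name}: imported from {hits[0][1]}")
        for entry, origin in hits[1:]:
            print(f"  shadowed copy: {origin}")
```

If the first hit is under /usr/lib/python3/dist-packages and a shadowed copy is listed under the workspace, the mgr is likely mixing an installed package with the sources being built.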

The failure seems related to ceph_argparse. I also noticed that the Ceph E2E nightly job for master was successful.
Example of a successful E2E nightly run:
https://jenkins.ceph.com/view/all/job/ceph-api-nightly-master-e2e/770/consoleText

The only difference I found is that the nightly job does not build the tests:
<pre>
export FOR_MAKE_CHECK=1; timeout 2h ./src/script/run-make.sh --cmake-args '-DWITH_TESTS=OFF -DENABLE_GIT_VERSION=OFF'
</pre>
But the E2E PR job does:
<pre>
export NPROC=$(nproc) CHECK_MAKEOPTS='-j$(nproc) -N -Q'; timeout 7200 ./run-make-check.sh
</pre>

After manually editing the PR E2E job in Jenkins to do the same as the nightly job, the dashboard module started correctly,
so it seems that building the tests somehow has a side effect that impacts ceph_argparse.
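
For reference, this kind of TypeError is what Python raises when a caller and the function it calls come from different revisions of the code. The snippet below is a toy reproduction with stand-in classes. The signatures are hypothetical and are not copied from ceph_argparse.py or mgr_module.py; it only shows how one extra positional argument produces the message seen in the log.

```python
class OldArgtype:
    """Stand-in for an older helper class (hypothetical signature)."""

    @classmethod
    def to_argdesc(cls, tp, attrs=None):
        # Accepts 2 to 3 positional arguments, counting cls.
        return f"type={tp.__name__}"


def newer_caller(argtype_cls):
    """Stand-in for a newer caller that passes one extra positional argument."""
    return argtype_cls.to_argdesc(str, {}, True)


def reproduce():
    """Return the TypeError message raised by the mismatched call, or None."""
    try:
        newer_caller(OldArgtype)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```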
