Bug #53582
Updated by Alfonso Martínez over 2 years ago
Observed a recently recurring error in all "https://jenkins.ceph.com/job/ceph-dashboard-pull-requests" job runs: "waiting for mgr dashboard module to start"

Example of a failed E2E job: https://jenkins.ceph.com/job/ceph-dashboard-pull-requests/9539/consoleText

I logged in to braggi02 to check the mgr.x.log file and found these errors:

<pre>
2021-12-10T11:27:01.531+0000 7fe36de73980  1 mgr[py] Loading python module 'prometheus'
2021-12-10T11:27:01.647+0000 7fe36de73980 10 mgr[py] Computed sys.path '/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages:/usr/lib/python3.8/dist-packages:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/build/lib/cython_modules/lib.3:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/python-common::/usr/lib/python38.zip:/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload'
2021-12-10T11:27:01.859+0000 7fe36de73980 -1 mgr[py] Module not found: 'prometheus'
2021-12-10T11:27:01.859+0000 7fe36de73980 -1 mgr[py] Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/__init__.py", line 2, in <module>
    from .module import Module, StandbyModule
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/module.py", line 451, in <module>
    class Module(MgrModule):
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/prometheus/module.py", line 1645, in Module
    def _list_healthchecks(self, format: Format = Format.plain) -> HandleCommandResult:
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 387, in __call__
    self.store_func_metadata(func)
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 384, in store_func_metadata
    self.load_func_metadata(f)
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 376, in load_func_metadata
    args.append(CephArgtype.to_argdesc(arg_spec[arg],
TypeError: to_argdesc() takes from 2 to 3 positional arguments but 4 were given

[...]

2021-12-10T11:26:59.791+0000 7fe36de73980  1 mgr[py] Loading python module 'dashboard'
2021-12-10T11:26:59.803+0000 7fe36de73980 10 mgr[py] Computed sys.path '/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages:/usr/lib/python3.8/dist-packages:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/build/lib/cython_modules/lib.3:/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/python-common::/usr/lib/python38.zip:/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload'
2021-12-10T11:27:00.007+0000 7fe36de73980 -1 mgr[py] Module not found: 'dashboard'
2021-12-10T11:27:00.007+0000 7fe36de73980 -1 mgr[py] Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/__init__.py", line 52, in <module>
    from .module import Module, StandbyModule  # noqa: F401
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/module.py", line 28, in <module>
    from .controllers import Router, json_error_page
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/__init__.py", line 1, in <module>
    from ._api_router import APIRouter
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_api_router.py", line 1, in <module>
    from ._router import Router
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_router.py", line 7, in <module>
    from ._base_controller import BaseController
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/controllers/_base_controller.py", line 11, in <module>
    from ..services.auth import AuthManager, JwtManager
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/services/auth.py", line 15, in <module>
    from .access_control import LocalAuthenticator, UserDoesNotExist
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/dashboard/services/access_control.py", line 579, in <module>
    def set_login_credentials_cmd(_, username: str, inbuf: str):
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 387, in __call__
    self.store_func_metadata(func)
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 384, in store_func_metadata
    self.load_func_metadata(f)
  File "/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/pybind/mgr/mgr_module.py", line 376, in load_func_metadata
    args.append(CephArgtype.to_argdesc(arg_spec[arg],
TypeError: to_argdesc() takes from 2 to 3 positional arguments but 4 were given

[...]

2021-12-10T11:27:33.388+0000 7fe354318700 20 mgr.server operator() health checks: {
    "MGR_MODULE_DEPENDENCY": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "12 mgr modules have failed dependencies",
            "count": 12
        },
        "detail": [
            { "message": "Module 'balancer' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'crash' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'dashboard' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'devicehealth' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'iostat' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'nfs' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)" },
            { "message": "Module 'orchestrator' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)" },
            { "message": "Module 'pg_autoscaler' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'rbd_support' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'status' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'telemetry' has failed dependency: to_argdesc() takes from 2 to 3 positional arguments but 4 were given" },
            { "message": "Module 'volumes' has failed dependency: cannot import name 'OSDMethod' from 'ceph.deployment.drive_group' (/usr/lib/python3/dist-packages/ceph/deployment/drive_group.py)" }
        ]
    },
</pre>

It seems related to ceph_argparse, and I noticed that the Ceph E2E nightly job for master was successful.

Example of a successful E2E nightly job: https://jenkins.ceph.com/view/all/job/ceph-api-nightly-master-e2e/770/consoleText

The only difference I found is that the nightly job does not build the tests:

<pre>
export FOR_MAKE_CHECK=1; timeout 2h ./src/script/run-make.sh --cmake-args '-DWITH_TESTS=OFF -DENABLE_GIT_VERSION=OFF'
</pre>

But the E2E PR job does:

<pre>
export NPROC=$(nproc) CHECK_MAKEOPTS='-j$(nproc) -N -Q'; timeout 7200 ./run-make-check.sh
</pre>

After manually editing the PR E2E job in Jenkins to build the same way as the nightly job, the dashboard module started correctly, so it seems that building the tests has some side effect that impacts ceph_argparse.
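
One way to narrow this down might be to check which copy of a module the mgr would actually import. The 'nfs', 'orchestrator' and 'volumes' errors above point at /usr/lib/python3/dist-packages, which comes before src/pybind and src/python-common in the computed sys.path, so a system-wide copy may be shadowing the source tree. This is only a hypothesis, not a confirmed root cause. The sketch below is a minimal diagnostic under that assumption: the helper names (split_logged_sys_path, resolve_origin, demo) and the module name example_mod are made up for illustration and are not part of Ceph.

```python
import importlib
import os
import tempfile
from importlib.machinery import PathFinder
from typing import List, Optional


def split_logged_sys_path(logged: str) -> List[str]:
    """Split the colon-separated 'Computed sys.path' value from mgr.x.log."""
    # Empty entries (the '::' in the log) stand for the current directory;
    # they are dropped here so the check does not depend on the cwd.
    return [entry for entry in logged.split(":") if entry]


def resolve_origin(module_name: str, search_path: List[str]) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None."""
    # Make sure files created after an earlier lookup are seen.
    importlib.invalidate_caches()
    spec = PathFinder.find_spec(module_name, search_path)
    return spec.origin if spec is not None else None


def demo() -> str:
    """Two copies of a hypothetical module: the earlier path entry wins."""
    with tempfile.TemporaryDirectory() as root:
        system_dir = os.path.join(root, "dist-packages")
        source_dir = os.path.join(root, "src", "pybind")
        for directory in (system_dir, source_dir):
            os.makedirs(directory)
            with open(os.path.join(directory, "example_mod.py"), "w") as fh:
                fh.write("# placeholder\n")
        # Same ordering as in the log: system directory first, source tree second.
        logged = system_dir + ":" + source_dir
        origin = resolve_origin("example_mod", split_logged_sys_path(logged))
        return os.path.relpath(origin, root)
```

On the build host, calling resolve_origin("ceph_argparse", split_logged_sys_path(...)) with the sys.path string copied from mgr.x.log should show whether the file comes from the workspace or from a dist-packages directory. The helper only handles top-level module names, not dotted ones such as ceph.deployment.drive_group.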