Bug #64374

open

Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)

Added by Prashant D 3 months ago. Updated 11 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://pulpito.ceph.com/yuriw-2024-02-07_22:15:19-rados-wip-yuri6-testing-2024-02-07-1000-distro-default-smithi

Tests are failing because the internal mgr_module fails to import:

2024-02-09T00:51:05.636 INFO:teuthology.orchestra.run.smithi002.stdout:Non-zero exit code 2 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /tmp/ceph-tmp20jrlflo:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpv7483kqa:/etc/ceph/ceph.conf:z quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 mgr module enable cephadm
2024-02-09T00:51:05.636 INFO:teuthology.orchestra.run.smithi002.stdout:/usr/bin/ceph: stderr Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)
2024-02-09T00:51:05.636 INFO:teuthology.orchestra.run.smithi002.stderr:RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /tmp/ceph-tmp20jrlflo:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpv7483kqa:/etc/ceph/ceph.conf:z quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 mgr module enable cephadm: Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)
2024-02-09T00:51:05.636 INFO:teuthology.orchestra.run.smithi002.stderr:
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:    ***************
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:    Cephadm hit an issue during cluster installation. Current cluster files will be deleted automatically.
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:    To disable this behaviour you can pass the --no-cleanup-on-failure flag. In case of any previous
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:    broken installation, users must use the following command to completely delete the broken cluster:
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:    > cephadm rm-cluster --force --zap-osds --fsid <fsid>
2024-02-09T00:51:05.637 INFO:teuthology.orchestra.run.smithi002.stdout:
2024-02-09T00:51:05.638 INFO:teuthology.orchestra.run.smithi002.stdout:    for more information please refer to https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
2024-02-09T00:51:05.638 INFO:teuthology.orchestra.run.smithi002.stdout:    ***************

Actions #1

Updated by Prashant D 3 months ago

  • Description updated (diff)
Actions #2

Updated by Prashant D 3 months ago

The mgr modules path does not appear to be appended to PYTHONPATH:

[user@testnode ~]$ ssh user@smithi002.front.sepia.ceph.com
The authenticity of host 'smithi002.front.sepia.ceph.com (172.21.15.2)' can't be established.
ECDSA key fingerprint is SHA256:Pt5zs9W9FXd4V2S3CgJQO/bCPS+i2m/6nzun1513cXE.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'smithi002.front.sepia.ceph.com' (ECDSA) to the list of known hosts.
[user@smithi002 ~]$ sudo su -
[root@smithi002 ~]# curl --silent -L https://1.chacra.ceph.com/binaries/ceph/wip-yuri6-testing-2024-02-07-1000/eb583889924ee88759f8e8d8bd48edeb78bfd3c0/centos/9/x86_64/flavors/default/cephadm > /home/ubuntu/cephtest/cephadm && ls -l /home/ubuntu/cephtest/cephadm
-rw-r--r--. 1 root root 766949 Feb  9 01:54 /home/ubuntu/cephtest/cephadm
[root@smithi002 ~]# chmod +x /home/ubuntu/cephtest/cephadm
[root@smithi002 ~]# sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 pull
Pulling container image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0...
{
    "ceph_version": "ceph version 19.0.0-1298-geb583889 (eb583889924ee88759f8e8d8bd48edeb78bfd3c0) squid (dev)",
    "image_id": "38c4039b0e98b53ff89657a16fec4c63d01c188f936580ce26e5e09a61ad1a2d",
    "repo_digests": [
        "quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:14bf4dbdc3d1ee48c9713ffb480ed1b59d64ddbe0851bf44ef27b7db227241b5" 
    ]
}
[root@smithi002 ~]# vim /home/ubuntu/cephtest/seed.ceph.conf
[root@smithi002 ~]#
[root@smithi002 ~]#
[root@smithi002 ~]# sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -v bootstrap --fsid 1c7ef528-c6e5-11ee-95b6-87774f69a715 --config /home/ubuntu/cephtest/seed.ceph.conf --output-config /etc/ceph/ceph.conf --output-keyring /etc/ceph/ceph.client.admin.keyring --output-pub-ssh-key /home/ubuntu/cephtest/ceph.pub --mon-ip 172.21.15.2 --skip-admin-label && sudo chmod +r /etc/ceph/ceph.client.admin.keyring
--------------------------------------------------------------------------------
cephadm ['--image', 'quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0', '-v', 'bootstrap', '--fsid', '1c7ef528-c6e5-11ee-95b6-87774f69a715', '--config', '/home/ubuntu/cephtest/seed.ceph.conf', '--output-config', '/etc/ceph/ceph.conf', '--output-keyring', '/etc/ceph/ceph.client.admin.keyring', '--output-pub-ssh-key', '/home/ubuntu/cephtest/ceph.pub', '--mon-ip', '172.21.15.2', '--skip-admin-label']
/bin/podman: stdout 4.8.1
Specifying an fsid for your cluster offers no advantages and may increase the likelihood of fsid conflicts.
...
Waiting for mon...
/usr/bin/ceph: stdout   cluster:
/usr/bin/ceph: stdout     id:     1c7ef528-c6e5-11ee-95b6-87774f69a715
/usr/bin/ceph: stdout     health: HEALTH_OK
/usr/bin/ceph: stdout
/usr/bin/ceph: stdout   services:
/usr/bin/ceph: stdout     mon: 1 daemons, quorum smithi002 (age 0.566787s)
/usr/bin/ceph: stdout     mgr: no daemons active
/usr/bin/ceph: stdout     osd: 0 osds: 0 up, 0 in
/usr/bin/ceph: stdout
/usr/bin/ceph: stdout   data:
/usr/bin/ceph: stdout     pools:   0 pools, 0 pgs
/usr/bin/ceph: stdout     objects: 0 objects, 0 B
/usr/bin/ceph: stdout     usage:   0 B used, 0 B / 0 B avail
/usr/bin/ceph: stdout     pgs:
/usr/bin/ceph: stdout
mon is available
Assimilating anything we can from ceph.conf...
/usr/bin/ceph: stdout
/usr/bin/ceph: stdout [global]
/usr/bin/ceph: stdout     fsid = 1c7ef528-c6e5-11ee-95b6-87774f69a715
/usr/bin/ceph: stdout     mon_host = [v2:172.21.15.2:3300,v1:172.21.15.2:6789]
/usr/bin/ceph: stdout     mon_osd_allow_pg_remap = true
/usr/bin/ceph: stdout     mon_osd_allow_primary_affinity = true
/usr/bin/ceph: stdout     mon_warn_on_no_sortbitwise = false
/usr/bin/ceph: stdout     osd_crush_chooseleaf_type = 0
/usr/bin/ceph: stdout
/usr/bin/ceph: stdout [mgr]
/usr/bin/ceph: stdout     mgr/telemetry/nag = false
/usr/bin/ceph: stdout
/usr/bin/ceph: stdout [osd]
/usr/bin/ceph: stdout     osd_map_max_advance = 10
/usr/bin/ceph: stdout     osd_sloppy_crc = true
Generating new minimal ceph.conf...
Restarting the monitor...
Setting public_network to 172.21.0.0/20 in mon config section
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 0.0.0.0:9283 ...
Verifying port 0.0.0.0:8765 ...
Verifying port 0.0.0.0:8443 ...
...
Enabling cephadm module...
Non-zero exit code 2 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /tmp/ceph-tmp0x139bqo:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpy3f55sb8:/etc/ceph/ceph.conf:z quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 mgr module enable cephadm
/usr/bin/ceph: stderr Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /tmp/ceph-tmp0x139bqo:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpy3f55sb8:/etc/ceph/ceph.conf:z quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 mgr module enable cephadm: Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)

    ***************
    Cephadm hit an issue during cluster installation. Current cluster files will be deleted automatically.
    To disable this behaviour you can pass the --no-cleanup-on-failure flag. In case of any previous
    broken installation, users must use the following command to completely delete the broken cluster:

    > cephadm rm-cluster --force --zap-osds --fsid <fsid>

    for more information please refer to https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
    ***************

Deleting cluster with fsid: 1c7ef528-c6e5-11ee-95b6-87774f69a715
/bin/podman: stdout e681ca382ed0,17.92MB / 33.27GB
/bin/podman: stdout 29e7368314f2,100.3MB / 33.27GB
/bin/podman: stdout e681ca382ed0,1.25%
/bin/podman: stdout 29e7368314f2,12.31%
systemctl: stderr Removed "/etc/systemd/system/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715.target.wants/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mon.smithi002.service".
Non-zero exit code 5 from systemctl stop ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service
systemctl: stderr Failed to stop ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service: Unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service not loaded.
Non-zero exit code 1 from systemctl reset-failed ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service
systemctl: stderr Failed to reset failed state of unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service: Unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service not loaded.
Non-zero exit code 1 from systemctl disable ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service
systemctl: stderr Failed to disable unit: Unit file ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mon.smithi002.service does not exist.
systemctl: stderr Removed "/etc/systemd/system/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715.target.wants/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mgr.smithi002.ahpaml.service".
Non-zero exit code 5 from systemctl stop ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service
systemctl: stderr Failed to stop ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service: Unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service not loaded.
Non-zero exit code 1 from systemctl reset-failed ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service
systemctl: stderr Failed to reset failed state of unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service: Unit ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service not loaded.
Non-zero exit code 1 from systemctl disable ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service
systemctl: stderr Failed to disable unit: Unit file ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-init@mgr.smithi002.ahpaml.service does not exist.
systemctl: stderr Removed "/etc/systemd/system/multi-user.target.wants/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715.target".
systemctl: stderr Removed "/etc/systemd/system/ceph.target.wants/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715.target".
systemctl: stderr Removed "/etc/systemd/system/multi-user.target.wants/ceph.target".
Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 5520, in <module>
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 5508, in main
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 2635, in _rollback
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 443, in _default_image
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 2867, in command_bootstrap
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 2329, in enable_cephadm_mgr_module
  File "/tmp/tmpk3h8peah.cephadm.build/app/__main__.py", line 2796, in cli
  File "/tmp/tmpk3h8peah.cephadm.build/app/cephadmlib/container_types.py", line 433, in run
  File "/tmp/tmpk3h8peah.cephadm.build/app/cephadmlib/call_wrappers.py", line 307, in call_throws
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /tmp/ceph-tmp0x139bqo:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpy3f55sb8:/etc/ceph/ceph.conf:z quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 mgr module enable cephadm: Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)

[root@smithi002 ~]# vim /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mgr.smithi002.hlzqyo/unit.run
[root@smithi002 ~]# cat /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mgr.smithi002.hlzqyo/unit.run
set -e
/bin/install -d -m0770 -o 167 -g 167 /var/run/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715
# mgr.smithi002.hlzqyo
! /bin/podman rm -f ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr.smithi002.hlzqyo 2> /dev/null
! /bin/podman rm -f ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo 2> /dev/null
! /bin/podman rm -f --storage ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo 2> /dev/null
! /bin/podman rm -f --storage ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr.smithi002.hlzqyo 2> /dev/null
/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mgr --init --name ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo --pids-limit=-1 -d --log-driver journald --conmon-pidfile /run/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mgr.smithi002.hlzqyo.service-pid --cidfile /run/ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mgr.smithi002.hlzqyo.service-cid --cgroups=split -e CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -e NODE_NAME=smithi002 -e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 -v /var/run/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/run/ceph:z -v /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715:/var/log/ceph:z -v /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mgr.smithi002.hlzqyo:/var/lib/ceph/mgr/ceph-smithi002.hlzqyo:z -v /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mgr.smithi002.hlzqyo/config:/etc/ceph/ceph.conf:z -v /etc/hosts:/etc/hosts:ro quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0 -n mgr.smithi002.hlzqyo -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
[root@smithi002 ~]# systemctl restart ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mgr.smithi002.hlzqyo.service
[root@smithi002 ~]# sudo /home/ubuntu/cephtest/cephadm shell
Inferring fsid 1c7ef528-c6e5-11ee-95b6-87774f69a715
Inferring config /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mon.smithi002/config
Using ceph image with id '339a52f5ffac' and tag '742e4ab9212fcc3e520b729693dcac293c946fc0' created on 2024-02-02 08:14:02 +0000 UTC
quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb
[ceph: root@smithi002 /]# ceph -s
  cluster:
    id:     1c7ef528-c6e5-11ee-95b6-87774f69a715
    health: HEALTH_WARN
            16 mgr modules have failed dependencies

  services:
    mon: 1 daemons, quorum smithi002 (age 2m)
    mgr: smithi002.hlzqyo(active, since 27s)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

[ceph: root@smithi002 /]# ceph health detail
HEALTH_WARN 16 mgr modules have failed dependencies
[WRN] MGR_MODULE_DEPENDENCY: 16 mgr modules have failed dependencies
    Module 'balancer' has failed dependency: No module named 'mgr_module'
    Module 'cephadm' has failed dependency: No module named 'mgr_module'
    Module 'crash' has failed dependency: No module named 'mgr_module'
    Module 'dashboard' has failed dependency: No module named 'mgr_module'
    Module 'devicehealth' has failed dependency: No module named 'mgr_module'
    Module 'iostat' has failed dependency: No module named 'mgr_module'
    Module 'nfs' has failed dependency: No module named 'mgr_module'
    Module 'orchestrator' has failed dependency: No module named 'mgr_module'
    Module 'pg_autoscaler' has failed dependency: No module named 'mgr_module'
    Module 'progress' has failed dependency: No module named 'mgr_module'
    Module 'prometheus' has failed dependency: No module named 'mgr_module'
    Module 'rbd_support' has failed dependency: No module named 'mgr_module'
    Module 'restful' has failed dependency: No module named 'mgr_module'
    Module 'status' has failed dependency: No module named 'mgr_module'
    Module 'telemetry' has failed dependency: No module named 'mgr_module'
    Module 'volumes' has failed dependency: No module named 'mgr_module'
[ceph: root@smithi002 /]# exit
[root@smithi002 ~]# vim /var/log/ceph/
1c7ef528-c6e5-11ee-95b6-87774f69a715/ cephadm.log
[root@smithi002 ~]# vim /var/log/ceph/
1c7ef528-c6e5-11ee-95b6-87774f69a715/ cephadm.log
[root@smithi002 ~]# vim /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/ceph-mgr.smithi002.hlzqyo.log
[root@smithi002 ~]# cp /var/log/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/ceph-mgr.smithi002.hlzqyo.log /tmp/
[root@smithi002 ~]# vim /tmp/ceph-mgr.smithi002.hlzqyo.log
[root@smithi002 ~]# sudo /home/ubuntu/cephtest/cephadm shell
Inferring fsid 1c7ef528-c6e5-11ee-95b6-87774f69a715
Inferring config /var/lib/ceph/1c7ef528-c6e5-11ee-95b6-87774f69a715/mon.smithi002/config
Using ceph image with id '339a52f5ffac' and tag '742e4ab9212fcc3e520b729693dcac293c946fc0' created on 2024-02-02 08:14:02 +0000 UTC
quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb
[ceph: root@smithi002 /]# ceph config get mgr mgr_module_path
/usr/share/ceph/mgr
[ceph: root@smithi002 /]# exit
[root@smithi002 ~]# podman ps -a
CONTAINER ID  IMAGE                                                                                                                       COMMAND               CREATED         STATUS         PORTS       NAMES
ee74b6f9d25d  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:742e4ab9212fcc3e520b729693dcac293c946fc0                                 -n mon.smithi002 ...  41 minutes ago  Up 41 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mon-smithi002
084550bab5df  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb  -n client.ceph-ex...  40 minutes ago  Up 40 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-ceph-exporter-smithi002
09155cce7e3e  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb  -n client.crash.s...  40 minutes ago  Up 40 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-crash-smithi002
015d64d6c3b1  quay.io/prometheus/node-exporter:v1.5.0                                                                                     --no-collector.ti...  40 minutes ago  Up 40 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-node-exporter-smithi002
fb5a54387a47  quay.io/ceph/ceph-grafana:9.4.12                                                                                            /bin/bash             39 minutes ago  Up 39 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-grafana-smithi002
025ce38873f7  quay.io/prometheus/prometheus:v2.43.0                                                                                       --config.file=/et...  39 minutes ago  Up 39 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-prometheus-smithi002
5566e6c58712  quay.io/prometheus/alertmanager:v0.25.0                                                                                     --cluster.listen-...  39 minutes ago  Up 39 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-alertmanager-smithi002
4aa984fb1f79  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0                                 -n mgr.smithi002....  35 minutes ago  Up 35 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo
[root@smithi002 ~]# podman exec -it ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo /bin/bash
[root@smithi002 /]# env
CEPH_VERSION=main
LANG=en_US.UTF-8
CEPH_POINT_RELEASE=
CEPH_DEVEL=true
which_declare=declare -f
container=podman
CEPH_REF=wip-yuri6-testing-2024-02-07-1000
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
PWD=/
HOME=/root
NODE_NAME=smithi002
CONTAINER_IMAGE=quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0
TERM=xterm
I_AM_IN_A_CONTAINER=1
SHLVL=1
OSD_FLAVOR=default
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LESSOPEN=||/usr/bin/lesspipe.sh %s
BASH_FUNC_which%%=() {  ( alias;
 eval ${which_declare} ) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot $@
}
_=/usr/bin/env
[root@smithi002 /]# exit
exit
[root@smithi002 ~]# podman ps -a
CONTAINER ID  IMAGE                                                                                                                       COMMAND               CREATED         STATUS         PORTS       NAMES
ee74b6f9d25d  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:742e4ab9212fcc3e520b729693dcac293c946fc0                                 -n mon.smithi002 ...  42 minutes ago  Up 42 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mon-smithi002
084550bab5df  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb  -n client.ceph-ex...  41 minutes ago  Up 41 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-ceph-exporter-smithi002
09155cce7e3e  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph@sha256:36a27b2fc44154ed789d5bc8725dbf47f3eeeccf13f781b04610840bd69ea6eb  -n client.crash.s...  41 minutes ago  Up 41 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-crash-smithi002
015d64d6c3b1  quay.io/prometheus/node-exporter:v1.5.0                                                                                     --no-collector.ti...  41 minutes ago  Up 41 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-node-exporter-smithi002
fb5a54387a47  quay.io/ceph/ceph-grafana:9.4.12                                                                                            /bin/bash             41 minutes ago  Up 41 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-grafana-smithi002
025ce38873f7  quay.io/prometheus/prometheus:v2.43.0                                                                                       --config.file=/et...  40 minutes ago  Up 40 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-prometheus-smithi002
5566e6c58712  quay.io/prometheus/alertmanager:v0.25.0                                                                                     --cluster.listen-...  40 minutes ago  Up 40 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-alertmanager-smithi002
4aa984fb1f79  quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0                                 -n mgr.smithi002....  36 minutes ago  Up 36 minutes              ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715-mgr-smithi002-hlzqyo
[root@smithi002 ~]# ps -aef|grep ceph-mgr
root       46185   46183  0 02:03 ?        00:00:00 /run/podman-init -- /usr/bin/ceph-mgr -n mgr.smithi002.hlzqyo -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
167        46187   46185  0 02:03 ?        00:00:03 /usr/bin/ceph-mgr -n mgr.smithi002.hlzqyo -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
root       46613   26390  0 02:40 pts/0    00:00:00 grep --color=auto ceph-mgr
[root@smithi002 ~]# podman images
REPOSITORY                                          TAG                                       IMAGE ID      CREATED        SIZE
quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph  eb583889924ee88759f8e8d8bd48edeb78bfd3c0  38c4039b0e98  30 hours ago   1.32 GB
quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph  742e4ab9212fcc3e520b729693dcac293c946fc0  339a52f5ffac  6 days ago     1.32 GB
quay.io/ceph/ceph-grafana                           9.4.12                                    a0ea5e3c46d9  2 months ago   648 MB
quay.io/prometheus/prometheus                       v2.43.0                                   a07b618ecd1d  10 months ago  235 MB
quay.io/prometheus/alertmanager                     v0.25.0                                   c8568f914cd2  13 months ago  66.5 MB
quay.io/prometheus/node-exporter                    v1.5.0                                    0da6a335fe13  14 months ago  23.9 MB
[root@smithi002 ~]#
[root@smithi002 ~]#
[root@smithi002 ~]# systemctl stop ceph-1c7ef528-c6e5-11ee-95b6-87774f69a715@mgr.smithi002.hlzqyo.service
[root@smithi002 ~]# podman run -it --privileged --rm quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:eb583889924ee88759f8e8d8bd48edeb78bfd3c0

[root@a63daccde388 /]# env
CEPH_VERSION=main
LANG=en_US.UTF-8
HOSTNAME=a63daccde388
CEPH_POINT_RELEASE=
CEPH_DEVEL=true
which_declare=declare -f
container=podman
CEPH_REF=wip-yuri6-testing-2024-02-07-1000
PWD=/
HOME=/root
TERM=xterm
I_AM_IN_A_CONTAINER=1
SHLVL=1
OSD_FLAVOR=default
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LESSOPEN=||/usr/bin/lesspipe.sh %s
BASH_FUNC_which%%=() {  ( alias;
 eval ${which_declare} ) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot $@
}
_=/usr/bin/env

[root@a63daccde388 /]# find /usr -iname mgr_module.py
/usr/share/ceph/mgr/mgr_module.py
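
So the file is present in the image, which points at a module search-path problem rather than a packaging one. As a rough illustration (not part of cephadm or ceph-mgr; the script and path constant are just for this check), something along these lines can be run inside the mgr container to confirm whether 'mgr_module' is locatable:

#!/usr/bin/env python3
# Illustrative check, not part of cephadm or ceph-mgr: confirm whether the
# directory holding mgr_module.py is on the interpreter's module search path
# and whether "mgr_module" can be located from it. find_spec() only locates
# the module without executing it, so it also works outside the embedded
# ceph-mgr interpreter.
import importlib.util
import sys

MGR_MODULE_DIR = "/usr/share/ceph/mgr"  # matches "ceph config get mgr mgr_module_path" above

print("dir on sys.path:       ", MGR_MODULE_DIR in sys.path)
print("mgr_module locatable:  ", importlib.util.find_spec("mgr_module") is not None)

sys.path.insert(0, MGR_MODULE_DIR)
print("locatable after insert:", importlib.util.find_spec("mgr_module") is not None)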

After adding the mgr modules path to PYTHONPATH, the modules were imported successfully:

[root@a63daccde388 /]# export PYTHONPATH=/usr/share/ceph/mgr:$PYTHONPATH
[root@a63daccde388 /]# vi /etc/ceph/ceph.conf
[root@a63daccde388 /]# vi /etc/ceph/mgr.keyring
[root@a63daccde388 /]# ceph-mgr -n mgr.smithi002.hlzqyo -f --setuser ceph --setgroup ceph --keyring /etc/ceph/mgr.keyring
2024-02-09T03:00:15.350+0000 7fe632aeb200 -1 mgr[py] Module osd_perf_query has missing NOTIFY_TYPES member
2024-02-09T03:00:15.445+0000 7fe632aeb200 -1 mgr[py] Module snap_schedule has missing NOTIFY_TYPES member
2024-02-09T03:00:16.103+0000 7fe632aeb200 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member
2024-02-09T03:00:16.197+0000 7fe632aeb200 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
2024-02-09T03:00:16.340+0000 7fe632aeb200 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
2024-02-09T03:00:16.437+0000 7fe632aeb200 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
2024-02-09T03:00:16.521+0000 7fe632aeb200 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
2024-02-09T03:00:16.787+0000 7fe632aeb200 -1 mgr[py] Module rgw has missing NOTIFY_TYPES member
2024-02-09T03:00:16.899+0000 7fe632aeb200 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
2024-02-09T03:00:16.987+0000 7fe632aeb200 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
2024-02-09T03:00:17.083+0000 7fe632aeb200 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
2024-02-09T03:00:17.259+0000 7fe632aeb200 -1 mgr[py] Module devicehealth has missing NOTIFY_TYPES member
2024-02-09T03:00:18.258+0000 7fe632aeb200 -1 mgr[py] Module rook has missing NOTIFY_TYPES member
2024-02-09T03:00:18.390+0000 7fe632aeb200 -1 mgr[py] Module diskprediction_local has missing NOTIFY_TYPES member
2024-02-09T03:00:18.726+0000 7fe632aeb200 -1 mgr[py] Module selftest has missing NOTIFY_TYPES member
2024-02-09T03:00:18.817+0000 7fe632aeb200 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
2024-02-09T03:00:19.016+0000 7fe632aeb200 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
2024-02-09T03:00:19.204+0000 7fe632aeb200 -1 mgr[py] Module crash has missing NOTIFY_TYPES member
2024-02-09T03:00:19.491+0000 7fe632aeb200 -1 mgr[py] Module orchestrator has missing NOTIFY_TYPES member
2024-02-09T03:00:19.751+0000 7fe632aeb200 -1 mgr[py] Module osd_support has missing NOTIFY_TYPES member
2024-02-09T03:00:19.991+0000 7fe632aeb200 -1 mgr[py] Module volumes has missing NOTIFY_TYPES member
2024-02-09T03:00:20.812+0000 7fe632aeb200 -1 mgr[py] Module telemetry has missing NOTIFY_TYPES member
2024-02-09T03:00:21.387+0000 7fe632aeb200 -1 mgr[py] Module prometheus has missing NOTIFY_TYPES member
2024-02-09T03:00:21.486+0000 7fe632aeb200 -1 mgr[py] Module status has missing NOTIFY_TYPES member
2024-02-09T03:00:21.706+0000 7fe632aeb200 -1 mgr[py] Module nfs has missing NOTIFY_TYPES member
[09/Feb/2024:03:00:21] ENGINE Bus STARTING
CherryPy Checker:
The Application mounted at '' has an empty config.

[09/Feb/2024:03:00:21] ENGINE Serving on http://:::9283
[09/Feb/2024:03:00:21] ENGINE Bus STARTED
[09/Feb/2024:03:00:43] ENGINE Bus STOPPING
[09/Feb/2024:03:00:43] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('::', 9283)) shut down
[09/Feb/2024:03:00:43] ENGINE Bus STOPPED
[09/Feb/2024:03:00:43] ENGINE Bus STARTING
[09/Feb/2024:03:00:43] ENGINE Serving on http://:::9283
[09/Feb/2024:03:00:43] ENGINE Bus STARTED
[09/Feb/2024:03:00:43] ENGINE Bus STOPPING
[09/Feb/2024:03:00:43] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('::', 9283)) shut down
[09/Feb/2024:03:00:43] ENGINE Bus STOPPED
[09/Feb/2024:03:00:43] ENGINE Bus STARTING
[09/Feb/2024:03:00:43] ENGINE Serving on http://:::9283
[09/Feb/2024:03:00:43] ENGINE Bus STARTED
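
As a rough, illustrative follow-up (not the mgr's real dependency check, which actually imports each module), the same idea can be extended to all of the modules flagged by the MGR_MODULE_DEPENDENCY warning above: with /usr/share/ceph/mgr prepended to the search path, each mgr module package should at least be locatable.

#!/usr/bin/env python3
# Illustrative only -- not the mgr's real dependency check. With
# /usr/share/ceph/mgr prepended to sys.path (the PYTHONPATH workaround above),
# list any mgr module package that still cannot be located.
import importlib.util
import os
import sys

MGR_DIR = "/usr/share/ceph/mgr"
sys.path.insert(0, MGR_DIR)

unresolved = [
    name
    for name in sorted(os.listdir(MGR_DIR))
    if os.path.isdir(os.path.join(MGR_DIR, name))
    and not name.startswith((".", "_"))
    and importlib.util.find_spec(name) is None
]
print("unresolvable mgr modules:", unresolved or "none")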

Actions #3

Updated by Radoslaw Zarzynski 3 months ago

Note from a scrub: the problem seems to come from an unmerged PR; if so, we should close this ticket.

Actions #4

Updated by Laura Flores about 1 month ago

/a/yuriw-2024-04-01_20:57:46-rados-wip-yuri3-testing-2024-04-01-0837-squid-distro-default-smithi/7634416

Actions #5

Updated by Laura Flores about 1 month ago

  • Backport set to squid
Actions #6

Updated by Laura Flores about 1 month ago

Possible fix: https://github.com/ceph/ceph/pull/56617

However, that PR reverted a commit that had been merged only a few days earlier, so the original report likely has a different root cause.

Actions #7

Updated by Laura Flores 25 days ago

  • Project changed from RADOS to Orchestrator
Actions #8

Updated by Laura Flores 11 days ago

/a/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7634080
