This happened on the gibba cluster:
[lflores@gibba001 ~]$ sudo ceph -s
  cluster:
    id:     5363501e-fdf2-11ed-bac8-3cecef3d8fb8
    health: HEALTH_WARN
            1 pool(s) do not have an application enabled
            1 mgr modules have recently crashed

  services:
    mon: 5 daemons, quorum gibba001,gibba002,gibba005,gibba003,gibba004 (age 38h)
    mgr: gibba006.afdywy(active, since 38h), standbys: gibba008.nemumh
    osd: 62 osds: 62 up (since 38h), 62 in (since 38h); 18 remapped pgs
    rgw: 6 daemons active (6 hosts, 1 zones)

  data:
    pools:   6 pools, 257 pgs
    objects: 83.37M objects, 318 GiB
    usage:   1.1 TiB used, 9.4 TiB / 11 TiB avail
    pgs:     20809893/250109739 objects misplaced (8.320%)
             239 active+clean
             18  active+remapped+backfilling

  io:
    client:   63 KiB/s rd, 0 B/s wr, 63 op/s rd, 42 op/s wr
    recovery: 1.0 MiB/s, 266 objects/s

  progress:
    Global Recovery Event (0s)
      [............................]
[lflores@gibba001 ~]$ sudo ceph health detail
HEALTH_WARN 1 pool(s) do not have an application enabled; 1 mgr modules have recently crashed
[WRN] POOL_APP_NOT_ENABLED: 1 pool(s) do not have an application enabled
    application not enabled on pool 'foo'
    use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
[WRN] RECENT_MGR_MODULE_CRASH: 1 mgr modules have recently crashed
    mgr module devicehealth crashed in daemon mgr.gibba001.nkuepu on host gibba001 at 2023-05-29T07:32:20.873598Z
[lflores@gibba001 ~]$ sudo ceph crash info 2023-05-29T07:32:20.873598Z_0465ae2d-0220-4d9b-9ef8-debf2e6a5d70
{
"backtrace": [
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 764, in get_recent_device_metrics\n return self._get_device_metrics(devid, min_sample=min_sample)",
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 553, in _get_device_metrics\n with self._db_lock, self.db:",
" File \"/usr/share/ceph/mgr/mgr_module.py\", line 1233, in db\n raise MgrDBNotReady();",
"mgr_module.MgrDBNotReady"
],
"ceph_version": "17.2.6",
"crash_id": "2023-05-29T07:32:20.873598Z_0465ae2d-0220-4d9b-9ef8-debf2e6a5d70",
"entity_name": "mgr.gibba001.nkuepu",
"mgr_module": "devicehealth",
"mgr_module_caller": "ActivePyModule::dispatch_remote get_recent_device_metrics",
"mgr_python_exception": "MgrDBNotReady",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-mgr",
"stack_sig": "fbbc6a4724a20738af8118fb5d84831008735002870daa3a76853a0dcaaa3f92",
"timestamp": "2023-05-29T07:32:20.873598Z",
"utsname_hostname": "gibba001",
"utsname_machine": "x86_64",
"utsname_release": "4.18.0-301.1.el8.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#1 SMP Tue Apr 13 16:24:22 UTC 2021"
}
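For triage, the fields of the crash report that matter most here can be pulled out programmatically. The snippet below is a minimal sketch that parses a trimmed copy of the JSON above (only a few keys and the last two backtrace entries are kept). `summarize_crash` is a hypothetical helper name, not part of Ceph.

```python
import json

# Trimmed copy of the `ceph crash info` output shown above.
CRASH_INFO = r'''
{
  "backtrace": [
    " File \"/usr/share/ceph/mgr/mgr_module.py\", line 1233, in db\n raise MgrDBNotReady();",
    "mgr_module.MgrDBNotReady"
  ],
  "ceph_version": "17.2.6",
  "entity_name": "mgr.gibba001.nkuepu",
  "mgr_module": "devicehealth",
  "mgr_module_caller": "ActivePyModule::dispatch_remote get_recent_device_metrics",
  "mgr_python_exception": "MgrDBNotReady"
}
'''


def summarize_crash(raw):
    """Return the fields that identify which module crashed and why."""
    info = json.loads(raw)
    frames = info.get("backtrace", [])
    # The last entry is the exception name; the one before it is the raising frame.
    raising = frames[-2].splitlines()[0].strip() if len(frames) >= 2 else ""
    return {
        "daemon": info["entity_name"],
        "version": info["ceph_version"],
        "module": info["mgr_module"],
        "caller": info["mgr_module_caller"],
        "exception": info["mgr_python_exception"],
        "raising_frame": raising,
    }


if __name__ == "__main__":
    for key, value in summarize_crash(CRASH_INFO).items():
        print(f"{key}: {value}")
```

The summary makes the call path explicit: the exception was raised in the `db` accessor in `mgr_module.py`, reached through a remote call to `get_recent_device_metrics` on the devicehealth module.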
From the mgr log:
2023-05-29T07:32:20.746+0000 7fe13d427700 0 [telemetry INFO root] Compiling and sending report to https://telemetry.ceph.com/report
2023-05-29T07:32:20.764+0000 7fe13d427700 0 [telemetry INFO root] Sending ceph report to: https://telemetry.ceph.com/report
2023-05-29T07:32:20.796+0000 7fe15c602700 0 [progress WARNING root] complete: ev c158f0be-5ee5-43ec-9dc4-5754658550ba does not exist
2023-05-29T07:32:20.796+0000 7fe15c602700 0 [progress WARNING root] complete: ev b16c5b1b-f70c-4902-a80a-58955b08c131 does not exist
2023-05-29T07:32:20.796+0000 7fe15c602700 0 [progress WARNING root] complete: ev d8460a9b-583b-4f9d-849c-3ed28768bbff does not exist
2023-05-29T07:32:20.796+0000 7fe15c602700 0 [progress WARNING root] complete: ev fbae7d8f-22a6-4ca3-8304-18a178d62c55 does not exist
2023-05-29T07:32:20.796+0000 7fe15c602700 0 [progress WARNING root] complete: ev 91c8d6fc-a976-4651-84f9-72dbc59c52b5 does not exist
2023-05-29T07:32:20.797+0000 7fe15c602700 0 [progress WARNING root] complete: ev 12f6ceb0-d855-4345-95cf-616f4429160b does not exist
2023-05-29T07:32:20.797+0000 7fe15c602700 0 [progress WARNING root] complete: ev b9af52da-d16d-4106-89b2-eb2220aff415 does not exist
2023-05-29T07:32:20.797+0000 7fe15c602700 0 [progress WARNING root] complete: ev 40bdf7b1-80d7-4fd3-beb6-069b394d7f31 does not exist
2023-05-29T07:32:20.821+0000 7fe1843a6700 0 [prometheus INFO cherrypy.error] [29/May/2023:07:32:20] ENGINE Serving on http://:::9283
2023-05-29T07:32:20.821+0000 7fe1843a6700 0 [prometheus INFO cherrypy.error] [29/May/2023:07:32:20] ENGINE Bus STARTED
2023-05-29T07:32:20.821+0000 7fe1843a6700 0 [prometheus INFO root] Engine started.
2023-05-29T07:32:20.871+0000 7fe13d427700 0 [telemetry INFO root] Sent report to https://telemetry.ceph.com/report
2023-05-29T07:32:20.872+0000 7fe13d427700 -1 Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/devicehealth/module.py", line 764, in get_recent_device_metrics
    return self._get_device_metrics(devid, min_sample=min_sample)
  File "/usr/share/ceph/mgr/devicehealth/module.py", line 553, in _get_device_metrics
    with self._db_lock, self.db:
  File "/usr/share/ceph/mgr/mgr_module.py", line 1233, in db
    raise MgrDBNotReady();
mgr_module.MgrDBNotReady
2023-05-29T07:32:20.872+0000 7fe13d427700 0 [telemetry ERROR root] Unable to get recent metrics from device with id "TOSHIBA_MG04ACA1_Y9I3K2IYF6XF": Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/devicehealth/module.py", line 764, in get_recent_device_metrics
    return self._get_device_metrics(devid, min_sample=min_sample)
  File "/usr/share/ceph/mgr/devicehealth/module.py", line 553, in _get_device_metrics
    with self._db_lock, self.db:
  File "/usr/share/ceph/mgr/mgr_module.py", line 1233, in db
    raise MgrDBNotReady();
mgr_module.MgrDBNotReady
2023-05-29T07:32:20.872+0000 7fe13d427700 0 [telemetry ERROR root] Unable to send device report: Device channel is on, but the generated report was empty.
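Going by the exception name and the backtrace, the `db` accessor appears to raise `MgrDBNotReady` when devicehealth is asked for metrics before its database is available. The following is a minimal sketch of that failure mode and of one way a caller could tolerate it. It is not the Ceph code: `MgrDBNotReady` is redeclared locally, and `FakeDeviceHealth`, `db_ready`, and `collect_device_metrics` are hypothetical names used only for illustration.

```python
import threading
from contextlib import nullcontext


class MgrDBNotReady(RuntimeError):
    """Local stand-in for mgr_module.MgrDBNotReady (assumption, not imported from Ceph)."""


class FakeDeviceHealth:
    """Hypothetical stand-in for the devicehealth module."""

    def __init__(self):
        self.db_ready = False  # hypothetical flag; flips once the database is usable
        self._db_lock = threading.Lock()

    @property
    def db(self):
        # Mirrors the shape of the last backtrace frame: refuse access until ready.
        if not self.db_ready:
            raise MgrDBNotReady()
        return nullcontext()

    def get_recent_device_metrics(self, devid, min_sample):
        with self._db_lock, self.db:
            # Placeholder payload; real metrics would come from the database.
            return {devid: {"min_sample": min_sample}}


def collect_device_metrics(module, devids, min_sample=""):
    """Gather metrics per device, skipping devices while the database is not ready."""
    report, skipped = {}, []
    for devid in devids:
        try:
            report.update(module.get_recent_device_metrics(devid, min_sample))
        except MgrDBNotReady:
            skipped.append(devid)
    return report, skipped


if __name__ == "__main__":
    module = FakeDeviceHealth()
    print(collect_device_metrics(module, ["TOSHIBA_MG04ACA1_Y9I3K2IYF6XF"]))
    module.db_ready = True
    print(collect_device_metrics(module, ["TOSHIBA_MG04ACA1_Y9I3K2IYF6XF"]))
```

In the sketch the not-ready state produces an empty report plus a list of skipped devices, which corresponds to the "generated report was empty" message in the log. Whether the real fix belongs in telemetry, in devicehealth, or in how the remote call records a crash is not something this sketch decides.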