Bug #54250
closed
mgr/telemetry: telemetry module experiences an AssertionError when generating device metrics
Added by Laura Flores about 2 years ago.
Updated about 2 years ago.
Category: telemetry module
Description
Nizamudeen and Ernesto reported experiencing this failure on the ceph-dev environment: https://github.com/rhcs-dashboard/ceph-dev
View from the CLI:
[root@ceph ceph]# ceph telemetry on --license sharing-1-0
2022-02-04T07:02:44.216+0000 7f587dea4700 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-04T07:02:44.222+0000 7f587dea4700 -1 WARNING: all dangerous and experimental features are enabled.
Telemetry is on.
Some channels are disabled, please enable with:
`ceph telemetry enable channel perf`
[root@ceph ceph]# ceph telemetry status
2022-02-04T07:02:51.761+0000 7f5768191700 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-04T07:02:51.764+0000 7f5768191700 -1 WARNING: all dangerous and experimental features are enabled.
Error EIO: Module 'telemetry' has experienced an error and cannot handle commands:
Snapshot from the mgr log:
2022-02-04T07:02:45.697+0000 7f8ead583700 0 [telemetry INFO root] Sent report to https://telemetry.ceph.com/report
2022-02-04T07:02:45.697+0000 7f8ead583700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'telemetry' while running on mgr.x:
2022-02-04T07:02:45.698+0000 7f8ead583700 -1 telemetry.serve:
2022-02-04T07:02:45.698+0000 7f8ead583700 -1 Traceback (most recent call last):
File "/ceph/src/pybind/mgr/telemetry/module.py", line 1803, in serve
self.send(self.last_report)
File "/ceph/src/pybind/mgr/telemetry/module.py", line 1271, in send
assert devices
AssertionError
Files
- Related to Bug #54120: mgr/dashboard: dashboard turns telemetry off when configuring report added
I was able to simulate this error by setting "devices = {}" right before the assert line in the telemetry module and deploying a vstart cluster. The mgr log from that instance is attached as "assertion-error.txt.gz".
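For anyone following along, here is a minimal standalone sketch of why an empty device dict trips the assert seen in the traceback. It is not the telemetry module's code and not the fix from the pull request. The function names `send_strict` and `send_tolerant` and the report layout are hypothetical, chosen only to illustrate the behavior.

```python
# Hypothetical illustration only; these functions do not exist in the
# telemetry module. They mimic the "assert devices" pattern from the traceback.

def send_strict(report):
    """Mimics the failing pattern: assert that device data is present."""
    devices = report.get("devices")
    # An empty dict is falsy in Python, so this raises AssertionError
    # whenever no device metrics were gathered.
    assert devices
    return len(devices)


def send_tolerant(report):
    """One possible defensive variant (an assumption, not the actual fix)."""
    devices = report.get("devices") or {}
    if not devices:
        # Nothing to send; skip instead of crashing the serve loop.
        return 0
    return len(devices)


if __name__ == "__main__":
    empty_report = {"devices": {}}
    try:
        send_strict(empty_report)
    except AssertionError:
        print("send_strict raised AssertionError on an empty device dict")
    print("send_tolerant handled", send_tolerant(empty_report), "devices")
```

This would be consistent with the failure showing up in an environment that reports no devices, though the attached logs are the place to confirm that.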
Even better-- I reproduced the failure in the ceph-dev environment. Seems to be specific to that setup. A new, more accurate mgr log for this scenario is attached under the name "ceph-dev-assertion-error.txt.tz".
View from the ceph-dev CLI when reproduced:
╭─root@ceph-1 /ceph ‹master›
╰─# ceph telemetry on
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:03.139+0000 7fb2fdffc640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:03.143+0000 7fb2fdffc640 -1 WARNING: all dangerous and experimental features are enabled.
Error EPERM: Telemetry data is licensed under the Community Data License Agreement - Sharing - Version 1.0 (https://cdla.io/sharing-1-0/).
To enable, add '--license sharing-1-0' to the 'ceph telemetry on' command.
╭─root@ceph-1 /ceph ‹master›
╰─# ceph telemetry on --license sharing-1-0
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:18.237+0000 7f914196c640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:18.243+0000 7f914196c640 -1 WARNING: all dangerous and experimental features are enabled.
Telemetry is on.
Some channels are disabled, please enable with:
`ceph telemetry enable channel perf`
╭─root@ceph-1 /ceph ‹master›
╰─# ceph telemetry enable channel perf
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:25.272+0000 7f2ed6868640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:25.278+0000 7f2ed6868640 -1 WARNING: all dangerous and experimental features are enabled.
Error EIO: Module 'telemetry' has experienced an error and cannot handle commands:
Also, I linked the wrong ceph-dev environment in the description. That one is strictly for testing the dashboard; this is the environment used to reproduce the error: https://github.com/ricardoasmarques/ceph-dev-docker
- Status changed from New to Fix Under Review
- Pull request ID set to 44994
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #54326: quincy: mgr/telemetry: telemetry module experiences an AssertionError when generating device metrics added
- Status changed from Pending Backport to Resolved