Bug #54250

closed

mgr/telemetry: telemetry module experiences an AssertionError when generating device metrics

Added by Laura Flores about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: telemetry module
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Nizamudeen and Ernesto reported this failure in the ceph-dev environment: https://github.com/rhcs-dashboard/ceph-dev

View from the CLI:

[root@ceph ceph]# ceph telemetry on --license sharing-1-0
2022-02-04T07:02:44.216+0000 7f587dea4700 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-04T07:02:44.222+0000 7f587dea4700 -1 WARNING: all dangerous and experimental features are enabled.
Telemetry is on.
Some channels are disabled, please enable with:
`ceph telemetry enable channel perf`
[root@ceph ceph]# ceph telemetry status
2022-02-04T07:02:51.761+0000 7f5768191700 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-04T07:02:51.764+0000 7f5768191700 -1 WARNING: all dangerous and experimental features are enabled.
Error EIO: Module 'telemetry' has experienced an error and cannot handle commands:

Snapshot from the mgr log:

2022-02-04T07:02:45.697+0000 7f8ead583700  0 [telemetry INFO root] Sent report to https://telemetry.ceph.com/report
2022-02-04T07:02:45.697+0000 7f8ead583700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'telemetry' while running on mgr.x: 
2022-02-04T07:02:45.698+0000 7f8ead583700 -1 telemetry.serve:
2022-02-04T07:02:45.698+0000 7f8ead583700 -1 Traceback (most recent call last):
  File "/ceph/src/pybind/mgr/telemetry/module.py", line 1803, in serve
    self.send(self.last_report)
  File "/ceph/src/pybind/mgr/telemetry/module.py", line 1271, in send
    assert devices
AssertionError


Files

assertion-error.txt.gz (251 KB): mgr log from fake-reproducing the error on a vstart cluster. Laura Flores, 02/10/2022 10:57 PM
ceph-dev-assertion-error.txt.gz (448 KB): mgr log from actually reproducing the error in the ceph-dev environment. Laura Flores, 02/10/2022 11:55 PM

Related issues 2 (0 open, 2 closed)

Related to Dashboard - Bug #54120: mgr/dashboard: dashboard turns telemetry off when configuring report (Resolved, Sarthak Gupta)
Copied to mgr - Backport #54326: quincy: mgr/telemetry: telemetry module experiences an AssertionError when generating device metrics (Resolved, Laura Flores)

Actions #1

Updated by Yaarit Hatuka about 2 years ago

  • Related to Bug #54120: mgr/dashboard: dashboard turns telemetry off when configuring report added
Actions #2

Updated by Laura Flores about 2 years ago

I was able to fake-reproduce this error by setting "devices = {}" immediately before the assert line in the telemetry module and then deploying a vstart cluster. The mgr log from that run is attached as "assertion-error.txt.gz".
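
To illustrate why forcing "devices = {}" triggers the failure: in Python an empty dict is falsy, so a bare "assert devices" raises AssertionError. The following is a minimal standalone sketch of that pattern, not the code from module.py and not necessarily the change made in the eventual fix. All function names here are hypothetical, and the guarded variant only shows one possible way to treat "no devices" as a normal case.

```python
from typing import Any, Dict, Union


def send_with_assert(devices: Dict[str, Any]) -> int:
    """Hypothetical stand-in for the failing pattern.

    An empty dict is falsy, so the assert fires when no devices are present.
    """
    assert devices
    return len(devices)


def send_with_guard(devices: Dict[str, Any]) -> int:
    """Hypothetical alternative: treat an empty device map as a normal case."""
    if not devices:
        # Nothing to report; skip instead of raising.
        return 0
    return len(devices)


def try_send(devices: Dict[str, Any]) -> Union[int, str]:
    """Run the asserting variant and report whether it raised."""
    try:
        return send_with_assert(devices)
    except AssertionError:
        return "AssertionError"


if __name__ == "__main__":
    print(try_send({}))                 # the empty dict trips the assert
    print(send_with_guard({}))          # the guarded variant returns normally
    print(send_with_guard({"dev1": {}}))
```

An unhandled exception like this one in a mgr module's serve loop is what leaves the module in the "has experienced an error and cannot handle commands" state seen in the CLI output in the description.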

Actions #3

Updated by Laura Flores about 2 years ago

Even better: I reproduced the failure in the ceph-dev environment. It seems to be specific to that setup. A new, more accurate mgr log for this scenario is attached as "ceph-dev-assertion-error.txt.gz".

View from the ceph-dev CLI when reproduced:

╭─root@ceph-1 /ceph ‹master› 
╰─# ceph telemetry on
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:03.139+0000 7fb2fdffc640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:03.143+0000 7fb2fdffc640 -1 WARNING: all dangerous and experimental features are enabled.
Error EPERM: Telemetry data is licensed under the Community Data License Agreement - Sharing - Version 1.0 (https://cdla.io/sharing-1-0/).
To enable, add '--license sharing-1-0' to the 'ceph telemetry on' command.
╭─root@ceph-1 /ceph ‹master› 
╰─# ceph telemetry on --license sharing-1-0
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:18.237+0000 7f914196c640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:18.243+0000 7f914196c640 -1 WARNING: all dangerous and experimental features are enabled.
Telemetry is on.
Some channels are disabled, please enable with:
`ceph telemetry enable channel perf`
╭─root@ceph-1 /ceph ‹master› 
╰─# ceph telemetry enable channel perf
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-02-10T23:41:25.272+0000 7f2ed6868640 -1 WARNING: all dangerous and experimental features are enabled.
2022-02-10T23:41:25.278+0000 7f2ed6868640 -1 WARNING: all dangerous and experimental features are enabled.
Error EIO: Module 'telemetry' has experienced an error and cannot handle commands:

Also, I linked the wrong ceph-dev environment in the description. That one is strictly for testing the dashboard; this is the environment used to reproduce the error: https://github.com/ricardoasmarques/ceph-dev-docker

Actions #4

Updated by Laura Flores about 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 44994
Actions #5

Updated by Laura Flores about 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Laura Flores about 2 years ago

  • Copied to Backport #54326: quincy: mgr/telemetry: telemetry module experiences an AssertionError when generating device metrics added
Actions #8

Updated by Laura Flores about 2 years ago

  • Status changed from Pending Backport to Resolved