Bug #53603

closed

mgr/telemetry: list index out of range in gather_device_report

Added by Yaarit Hatuka over 2 years ago. Updated 12 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: telemetry module
Target version:
% Done: 0%


Description

Telemetry crashed on the gibba cluster:

# ceph health detail
HEALTH_ERR Module 'telemetry' has failed: list index out of range; 3 daemons have recently crashed
[ERR] MGR_MODULE_ERROR: Module 'telemetry' has failed: list index out of range
    Module 'telemetry' has failed: list index out of range

2021-12-14T16:44:38.413+0000 7f3372dd4700 -1 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1152, in serve
    self.send(self.last_report)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 963, in send
    devices = self.gather_device_report()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 611, in gather_device_report
    host = d['location'][0]['host']
IndexError: list index out of range
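
The traceback shows that `d['location'][0]['host']` fails when a device's `location` list is empty. Below is a minimal sketch of a defensive lookup, assuming each device is a dict whose `location` is a list of dicts with a `host` key, as the failing line suggests. The helper name `first_host` and the sample device entries are hypothetical, and this is not necessarily the change that was merged for this bug.

```python
from typing import Any, Dict, Optional


def first_host(device: Dict[str, Any]) -> Optional[str]:
    """Return the host of the first location entry, or None if there is none."""
    # Treat a missing key, None, or an empty list the same way.
    locations = device.get('location') or []
    if not locations:
        return None
    return locations[0].get('host')


# Hypothetical sample data: one device with a location, one without.
devices = [
    {'devid': 'dev-a', 'location': [{'host': 'host-1', 'dev': 'sda'}]},
    {'devid': 'dev-b', 'location': []},
]

hosts = {d['devid']: first_host(d) for d in devices}
```

A caller could then skip or separately report devices for which the helper returns `None`, instead of letting the whole report fail.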

Related issues 3 (0 open, 3 closed)

Related to mgr - Bug #53604: mgr/telemetry: list assignment index out of range in gather_crashinfo (Resolved, Yaarit Hatuka)
Copied to mgr - Backport #53691: pacific: mgr/telemetry: list index out of range in gather_device_report (Rejected)
Copied to mgr - Backport #53692: octopus: mgr/telemetry: list index out of range in gather_device_report (Rejected)
Actions #1

Updated by Vikhyat Umrao over 2 years ago

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'ls'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 -1 mgr handle_command module 'telemetry' command handler threw exception: list assignment index out of range

2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb061b3b00
2021-12-14T18:48:04.116+0000 7ffa856af700 -1 mgr.server reply reply (22) Invalid argument Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1648, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 434, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1038, in show
    report = self.get_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1087, in get_report
    return self.compile_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 900, in compile_report
    report['crashes'] = self.gather_crashinfo()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 483, in gather_crashinfo
    c['backtrace'][-1] = '<redacted>'
IndexError: list assignment index out of range

2021-12-14T18:48:04.116+0000 7ffa856af700  1 -- [v2:172.21.2.101:7002/1319847290,v1:172.21.2.101:7003/1319847290] --> 172.21.2.102:0/1267436291 -- mgr_command_reply(tid 0: -22 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1648, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 434, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1038, in show
    report = self.get_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1087, in get_report
    return self.compile_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 900, in compile_report
    report['crashes'] = self.gather_crashinfo()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 483, in gather_crashinfo
    c['backtrace'][-1] = '<redacted>'
IndexError: list assignment index out of range
) v1 -- 0x55eb10343a20 con 0x55eb03927800
2021-12-14T18:48:04.116+0000 7ffa856af700 10 mgr.server operator()  command returned -22
2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb061b3b00
2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb061b3b00
Actions #2

Updated by Yaarit Hatuka over 2 years ago

There is a separate tracker for the gather_crashinfo bug:
https://tracker.ceph.com/issues/53604
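
For reference, the `gather_crashinfo` traceback in the log above comes from `c['backtrace'][-1] = '<redacted>'`, which raises `IndexError: list assignment index out of range` when the backtrace list is empty. Below is a minimal sketch of a guarded version, assuming each crash record is a dict with an optional `backtrace` list. The helper name `redact_last_frame` is hypothetical, and this is not necessarily the fix applied in the other tracker.

```python
from typing import Any, Dict


def redact_last_frame(crash: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the last backtrace entry with '<redacted>' if one exists."""
    backtrace = crash.get('backtrace')
    # Only assign when the list is present and non-empty; assigning to
    # [-1] on an empty list is what raises the IndexError in the log.
    if backtrace:
        backtrace[-1] = '<redacted>'
    return crash


empty = redact_last_frame({'backtrace': []})
filled = redact_last_frame({'backtrace': ['frame-1', 'frame-2']})
```

Like the original line, the sketch modifies the record in place. It just leaves records with no backtrace untouched.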

Actions #3

Updated by Yaarit Hatuka over 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 44327
Actions #4

Updated by Sebastian Wagner over 2 years ago

  • Related to Bug #53604: mgr/telemetry: list assignment index out of range in gather_crashinfo added
Actions #5

Updated by Neha Ojha over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53691: pacific: mgr/telemetry: list index out of range in gather_device_report added
Actions #7

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53692: octopus: mgr/telemetry: list index out of range in gather_device_report added
Actions #8

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #9

Updated by Konstantin Shalygin 12 months ago

  • Status changed from Pending Backport to Resolved