Bug #53603

mgr/telemetry: list index out of range in gather_device_report

Added by Yaarit Hatuka 10 months ago. Updated about 2 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: telemetry module
Target version:
% Done: 0%


Description

Telemetry crashed on the gibba cluster:

# ceph health detail
HEALTH_ERR Module 'telemetry' has failed: list index out of range; 3 daemons have recently crashed
[ERR] MGR_MODULE_ERROR: Module 'telemetry' has failed: list index out of range
    Module 'telemetry' has failed: list index out of range

2021-12-14T16:44:38.413+0000 7f3372dd4700 -1 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1152, in serve
    self.send(self.last_report)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 963, in send
    devices = self.gather_device_report()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 611, in gather_device_report
    host = d['location'][0]['host']
IndexError: list index out of range
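
The traceback suggests that at least one device entry has an empty 'location' list, so indexing [0] raises. Below is a minimal sketch of a guarded lookup, assuming each device is a dict shaped like the one indexed in the traceback. The helper name first_host and the sample device entries are hypothetical, and this is not necessarily the change made in the pull request.

```python
from typing import Any, Dict, Optional


def first_host(device: Dict[str, Any]) -> Optional[str]:
    """Return the host of the device's first location, or None if it has none."""
    # 'location' may be missing or an empty list; treat both the same way.
    locations = device.get('location') or []
    if not locations:
        return None
    return locations[0].get('host')


# Hypothetical sample data: one device with a location, one without.
devices = [
    {'devid': 'dev-a', 'location': [{'host': 'node1', 'dev': 'sda'}]},
    {'devid': 'dev-b', 'location': []},
]

for d in devices:
    host = first_host(d)
    if host is None:
        # Skip devices with no known location instead of raising IndexError.
        continue
    print(d['devid'], host)
```

Skipping such devices is only one option; a report could equally record them under a placeholder host.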

Related issues

Related to mgr - Bug #53604: mgr/telemetry: list assignment index out of range in gather_crashinfo (Resolved)
Copied to mgr - Backport #53691: pacific: mgr/telemetry: list index out of range in gather_device_report (New)
Copied to mgr - Backport #53692: octopus: mgr/telemetry: list index out of range in gather_device_report (New)

History

#1 Updated by Vikhyat Umrao 10 months ago

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'ls'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb2c4a9600
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Calling crash.do_info...

2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr dispatch_remote Success calling 'do_info'
2021-12-14T18:48:04.034+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb2c4a9600

2021-12-14T18:48:04.034+0000 7ffa856af700 -1 mgr handle_command module 'telemetry' command handler threw exception: list assignment index out of range

2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb061b3b00
2021-12-14T18:48:04.116+0000 7ffa856af700 -1 mgr.server reply reply (22) Invalid argument Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1648, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 434, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1038, in show
    report = self.get_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1087, in get_report
    return self.compile_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 900, in compile_report
    report['crashes'] = self.gather_crashinfo()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 483, in gather_crashinfo
    c['backtrace'][-1] = '<redacted>'
IndexError: list assignment index out of range

2021-12-14T18:48:04.116+0000 7ffa856af700  1 -- [v2:172.21.2.101:7002/1319847290,v1:172.21.2.101:7003/1319847290] --> 172.21.2.102:0/1267436291 -- mgr_command_reply(tid 0: -22 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1648, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 434, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1038, in show
    report = self.get_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 1087, in get_report
    return self.compile_report(channels=channels)
  File "/usr/share/ceph/mgr/telemetry/module.py", line 900, in compile_report
    report['crashes'] = self.gather_crashinfo()
  File "/usr/share/ceph/mgr/telemetry/module.py", line 483, in gather_crashinfo
    c['backtrace'][-1] = '<redacted>'
IndexError: list assignment index out of range
) v1 -- 0x55eb10343a20 con 0x55eb03927800
2021-12-14T18:48:04.116+0000 7ffa856af700 10 mgr.server operator()  command returned -22
2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr Gil Switched to new thread state 0x55eb061b3b00
2021-12-14T18:48:04.116+0000 7ffa856af700 20 mgr ~Gil Destroying new thread state 0x55eb061b3b00
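
The second traceback fails on c['backtrace'][-1] = '<redacted>', which raises IndexError when the backtrace list is empty. Below is a minimal sketch of a guarded version of that assignment, assuming a crash entry is a dict whose 'backtrace' is a list. The helper name redact_last_frame is hypothetical, and the actual fix tracked in #53604 may differ.

```python
from typing import Any, Dict


def redact_last_frame(crash: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the last backtrace entry with '<redacted>' if one exists."""
    backtrace = crash.get('backtrace')
    # An empty or missing backtrace has nothing to redact; leave it untouched.
    if backtrace:
        backtrace[-1] = '<redacted>'
    return crash


print(redact_last_frame({'backtrace': ['frame0', 'frame1']}))
print(redact_last_frame({'backtrace': []}))
```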

#2 Updated by Yaarit Hatuka 10 months ago

There is a separate tracker for the gather_crashinfo bug:
https://tracker.ceph.com/issues/53604

#3 Updated by Yaarit Hatuka 10 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 44327

#4 Updated by Sebastian Wagner 10 months ago

  • Related to Bug #53604: mgr/telemetry: list assignment index out of range in gather_crashinfo added

#5 Updated by Neha Ojha 9 months ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Backport Bot 9 months ago

  • Copied to Backport #53691: pacific: mgr/telemetry: list index out of range in gather_device_report added

#7 Updated by Backport Bot 9 months ago

  • Copied to Backport #53692: octopus: mgr/telemetry: list index out of range in gather_device_report added

#8 Updated by Backport Bot about 2 months ago

  • Tags set to backport_processed
