Bug #65073

pybind/mgr/stats/fs: log exceptions to cluster log

Added by Patrick Donnelly about 1 month ago. Updated 29 days ago.

Status: Fix Under Review
Priority: High
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: squid,reef,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): cephfs-top, tools
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There are exceptions raised in the module which are not failing tests:

2024-03-20T21:38:38.702 INFO:tasks.ceph.mgr.x.smithi007.stderr:Exception in thread Thread-3:
2024-03-20T21:38:38.702 INFO:tasks.ceph.mgr.x.smithi007.stderr:Traceback (most recent call last):
2024-03-20T21:38:38.702 INFO:tasks.ceph.mgr.x.smithi007.stderr:  File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
2024-03-20T21:38:38.704 DEBUG:teuthology.orchestra.run.smithi007:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd pool get cephfs_data pg_num
2024-03-20T21:38:38.712 INFO:tasks.ceph.mgr.x.smithi007.stderr:    self.run()
2024-03-20T21:38:38.712 INFO:tasks.ceph.mgr.x.smithi007.stderr:  File "/usr/lib64/python3.9/threading.py", line 1306, in run
2024-03-20T21:38:38.712 INFO:tasks.ceph.mgr.x.smithi007.stderr:    self.function(*self.args, **self.kwargs)
2024-03-20T21:38:38.712 INFO:tasks.ceph.mgr.x.smithi007.stderr:  File "/usr/share/ceph/mgr/stats/fs/perf_stats.py", line 222, in re_register_queries
2024-03-20T21:38:38.713 INFO:tasks.ceph.mgr.x.smithi007.stderr:    if self.mx_last_updated >= ua_last_updated:
2024-03-20T21:38:38.713 INFO:tasks.ceph.mgr.x.smithi007.stderr:AttributeError: 'FSPerfStats' object has no attribute 'mx_last_updated'

From: /teuthology/pdonnell-2024-03-20_18:16:52-fs-wip-batrick-testing-20240320.145742-distro-default-smithi/7613026/teuthology.log

Exceptions should not be in the mgr log at all. They pollute the log, making it difficult to grep for actual errors.

If this is a genuine error, log it to the cluster log (clog) so that the test fails. Otherwise, handle it quietly.
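
A minimal sketch of the "log it to the clog" option, assuming a timer thread like the one in the traceback. `FakeClusterLog`, `run_logged`, and `broken_callback` are hypothetical names used only for illustration; they are not part of the mgr/stats module, and the real module would use whatever cluster-log facility ceph-mgr provides.

```python
import threading


class FakeClusterLog:
    """Hypothetical stand-in for the cluster log; it only collects messages."""

    def __init__(self):
        self.errors = []

    def error(self, msg):
        self.errors.append(msg)


def run_logged(clog, func, *args, **kwargs):
    """Call func and report any exception to clog instead of
    letting it escape from the thread and land on stderr."""
    try:
        return func(*args, **kwargs)
    except Exception as e:
        clog.error(f"{func.__name__} failed: {type(e).__name__}: {e}")
        return None


def broken_callback():
    # Simulates the failure seen in the traceback above.
    raise AttributeError(
        "'FSPerfStats' object has no attribute 'mx_last_updated'")


clog = FakeClusterLog()
# threading.Timer runs its target in a separate thread, as in the traceback.
timer = threading.Timer(0.01, run_logged, args=(clog, broken_callback))
timer.start()
timer.join()
```

With this wrapper the exception surfaces as a single cluster-log entry rather than as a multi-line traceback interleaved with other output.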

#1

Updated by Venky Shankar about 1 month ago

  • Category set to Correctness/Safety
  • Status changed from New to Triaged
  • Assignee set to Jos Collin
  • Backport changed from squid,reef to squid,reef,quincy

#2

Updated by Jos Collin about 1 month ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 56525
  • Component(FS) cephfs-top, tools added

#3

Updated by Venky Shankar 29 days ago

This can happen when FSPerfStats.re_register_queries is called before mgr/stats has processed a single MDS report.
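
The ordering problem can be reproduced in isolation. The sketch below is not the code in perf_stats.py; `LazyStats`, `EagerStats`, and `process_report` are hypothetical, and only the attribute name and the comparison are taken from the traceback. It assumes the attribute is first assigned while handling a report, which is one way to arrive at the observed AttributeError.

```python
class LazyStats:
    """Hypothetical: mx_last_updated is only created when a report arrives."""

    def process_report(self, timestamp):
        self.mx_last_updated = timestamp

    def re_register_queries(self, ua_last_updated):
        # Same comparison as in the traceback; it raises AttributeError
        # if no report has been processed yet.
        return self.mx_last_updated >= ua_last_updated


class EagerStats(LazyStats):
    """Hypothetical fix: give the attribute a default in __init__."""

    def __init__(self):
        self.mx_last_updated = 0.0


lazy = LazyStats()
try:
    lazy.re_register_queries(1.0)
    lazy_failed = False
except AttributeError:
    lazy_failed = True

eager = EagerStats()
eager_result = eager.re_register_queries(1.0)  # no report yet, no exception
```

Initializing the attribute up front is only one possible way to "handle it quietly"; the fix under review may take a different approach.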
