Project

General

Profile

Actions

Feature #59561

open

mgr/prometheus,exporter: add integration test against TargetDown alert, "reading text format failed" error, etc

Added by Pere Díaz Bou about 1 year ago. Updated 5 months ago.

Status:
In Progress
Priority:
Urgent
Assignee:
Category:
prometheus module
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

The exporter and the mgr prometheus module lacks integration testing and since both should behave in similarly when it comes to perf counters, we must add integration tests
that are smart to test the same thing by switching the module endpoint and the exporter endpoint.

Furthermore, unit testing the exporter module is a need.


Related issues 1 (1 open0 closed)

Related to Dashboard - Bug #63849: exporter: Enhance unit-tests coverage for exporterFix Under ReviewAvan Thakkar

Actions
Actions #1

Updated by Ilya Dryomov about 1 year ago

  • Subject changed from mgr/prometheus/exporter: add testing infrastructure (integration and unittest) to mgr/prometheus/exporter: add integration infrastructure (integration and unittest)
  • Priority changed from Normal to Urgent
Actions #2

Updated by Ilya Dryomov about 1 year ago

  • Subject changed from mgr/prometheus/exporter: add integration infrastructure (integration and unittest) to mgr/prometheus,exporter: add integration test against TargetDown alert, "reading text format failed" error, etc

I have bumped the priority and adjusted the subject to not mention unit tests -- that part is specific to exporter and is not as pressing.

Actions #3

Updated by Ilya Dryomov 6 months ago

This ticket already has the priority of Urgent, but it clearly hasn't taken effect. I'm writing this comment to raise awareness of this gap. End-to-end test coverage for perf counters -> exporter daemon -> prometheus metrics is desperately needed because the thing keeps getting broken.

In April, 17.2.6 went out with a regression reported by a user in https://github.com/ceph/ceph/pull/50718. The immediate workaround was to disable exporter deployment in Rook (https://github.com/rook/rook/pull/12077) and later revert the corresponding changes in cephadm (https://github.com/ceph/ceph/pull/51053).

Pere addressed the root cause in exporter daemon in https://github.com/ceph/ceph/pull/51069. That PR was approved based on manual tests and the commitment to fulfill this ticket, see https://github.com/ceph/ceph/pull/51069#pullrequestreview-1399729291.

About a month later, in May, https://github.com/ceph/ceph/pull/50749 got merged into quincy and completely wrecked exporter daemon. Because there are no automated tests, this went unnoticed until yesterday -- just a day before the planned 17.2.7 release date. The revert happened in https://github.com/ceph/ceph/pull/54169 and builds would need to be respinned now.

Despite exporter being partially disabled in quincy, it's still something that is shipped. Further, and much more importantly, I don't think the situation with regards to automated testing is noticeably better in reef or main, which means that exporter is pretty much bound to break again in a release where it's enabled by default.

Avan, what can we do to actually prioritize this ticket?

Actions #4

Updated by Avan Thakkar 6 months ago

  • Assignee deleted (Avan Thakkar)

Thank you for raising this concern. While I understand its importance, my current workload prevents me from taking on additional tasks at this time atleast for a week or 2. I recommend discussing this with Juan Mi/ Divyansh involved in Monitoring with whom I've discussed atleast plan to start with the testing of exporter a couple weeks back so they might be able to assist. Please feel free to reach out if you have any other questions or concerns.

Actions #5

Updated by Avan Thakkar 5 months ago

  • Status changed from New to In Progress
  • Assignee set to Avan Thakkar
Actions #6

Updated by Avan Thakkar 4 months ago

  • Related to Bug #63849: exporter: Enhance unit-tests coverage for exporter added
Actions

Also available in: Atom PDF