Feature #12773

open

automate identifying infrastructure issues in nightly runs

Added by Tamilarasi muthamizhan over 8 years ago. Updated over 8 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Reviewed:
Affected Versions:
Description

The feature to add: identify the most common error messages reported in the nightlies that result from infrastructure issues.

This would work by searching the database for the corresponding error messages and reporting the matches as part of the nightly runs.

Automating this would save the manual effort and time that different teams spend tracking those failures and filing individual tickets for them.

Please add your thoughts on the best way to resolve this.
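
As a starting point for discussion, here is a minimal sketch of the search-and-report idea described above. It assumes the failure messages have already been pulled from the database as plain strings; the pattern names and regular expressions are hypothetical placeholders, not signatures taken from real nightly runs.

```python
import re
from collections import Counter

# Hypothetical signatures of infrastructure failures. Real entries would
# come from triaging past nightly runs; these are placeholders only.
INFRA_PATTERNS = {
    "ssh_connection": re.compile(
        r"SSH connection .* was lost|connection refused", re.IGNORECASE
    ),
    "package_fetch": re.compile(
        r"could not resolve|failed to fetch", re.IGNORECASE
    ),
}


def classify(failure_reason):
    """Return the label of the first matching pattern, or None."""
    for label, pattern in INFRA_PATTERNS.items():
        if pattern.search(failure_reason):
            return label
    return None


def summarize(failure_reasons):
    """Count how many failures fall under each infrastructure label."""
    counts = Counter()
    for reason in failure_reasons:
        label = classify(reason)
        if label is not None:
            counts[label] += 1
    return counts
```

Calling `summarize()` on the failure messages of one nightly run gives a per-category count that could be appended to the run report. Failures that match no pattern are left out, so test failures would still be triaged by hand.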

Actions #1

Updated by Zack Cerza over 8 years ago

  • Assignee deleted (Zack Cerza)

Typed this up a while back and forgot to hit submit:

I spent some time in the last couple days investigating some of the ways we could approach this:

  • An approach similar to John Spray's scrape.py
  • Normalization and grouping of failure_reason values in paddles
  • Sentry
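
For reference, the normalization-and-grouping option could look roughly like the sketch below. It is only an illustration: it assumes the failure_reason values have already been fetched as (job_id, reason) pairs and does not use any paddles API. The placeholder tokens and the substitution rules are my own assumptions.

```python
import re
from collections import defaultdict


def normalize(reason):
    """Replace run-specific details so similar failures compare equal."""
    reason = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", reason)        # addresses
    reason = re.sub(r"\b[\w.-]+@[\w.-]+\b", "<HOST>", reason)  # user@host
    reason = re.sub(r"\d+", "<N>", reason)                     # other numbers
    return reason.strip()


def group(jobs):
    """Group (job_id, reason) pairs by normalized reason, largest first."""
    groups = defaultdict(list)
    for job_id, reason in jobs:
        groups[normalize(reason)].append(job_id)
    return sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
```

The largest groups would then be the candidates for a single shared ticket instead of one ticket per failed job.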

For now, Sentry makes the most sense. We had only been using it for Sepia, but I have also configured it for Octo.
