Feature #12773

automate identifying infrastructure issues in nightly runs

Added by Tamilarasi muthamizhan almost 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 08/24/2015
Due date:
% Done: 0%
Source: Q/A
Tags:
Backport:
Reviewed:
Affected Versions:

Description

The feature request is to identify the most common error messages that are reported in the nightlies as a result of infrastructure issues.

This would be done by searching the database for the corresponding error messages and reporting the matches as part of the nightly runs.

By automating this, we can save the manual effort and time that different teams currently spend tracking those failures and filing individual tickets for them.
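As a rough illustration of the idea, here is a minimal Python sketch that matches failure messages against a list of known infrastructure error patterns and counts the hits. The pattern list, the labels, and the function names (INFRA_PATTERNS, classify, summarize) are all hypothetical; a real list would have to come from triaging past nightlies, and the messages would be read from the database rather than passed in as a plain list.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only; not taken from any real run.
INFRA_PATTERNS = {
    "ssh connection lost": re.compile(r"SSH connection .* was lost", re.I),
    "machine lock failure": re.compile(r"could not lock|unable to lock", re.I),
    "package install failure": re.compile(r"(apt-get|yum) .*fail", re.I),
}


def classify(failure_reason):
    """Return the label of the first matching pattern, or None if none match."""
    for label, pattern in INFRA_PATTERNS.items():
        if pattern.search(failure_reason or ""):
            return label
    return None


def summarize(failure_reasons):
    """Count infrastructure-related failures per label, most frequent first."""
    counts = Counter()
    for reason in failure_reasons:
        label = classify(reason)
        if label is not None:
            counts[label] += 1
    return counts.most_common()
```

The summary returned by summarize could then be appended to the nightly report, so that failures matching a known infrastructure pattern are flagged without anyone reading each log by hand.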

Please add your thoughts on the best way to resolve this.

History

#1 Updated by Zack Cerza almost 4 years ago

  • Assignee deleted (Zack Cerza)

Typed this up a while back and forgot to hit submit:

I spent some time in the last couple of days investigating some of the ways we could approach this:

  • An approach similar to John Spray's scrape.py
  • Normalization and grouping of failure_reason values in paddles
  • Sentry
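For reference, the second option (normalizing and grouping failure_reason values) could look roughly like the Python sketch below. It is only a sketch under assumptions: the substitution rules, the placeholder tokens, and the function names (normalize, group_failures) are made up for illustration, and jobs are passed in as plain (job_id, failure_reason) pairs rather than fetched from paddles.

```python
import re
from collections import defaultdict

# Hypothetical substitution rules, applied in order. Order matters: hex
# literals must be replaced before bare digit runs.
_SUBSTITUTIONS = [
    # Dotted tokens such as hostnames and IP addresses (this also catches
    # dotted file names, which is acceptable for grouping purposes).
    (re.compile(r"\b[\w-]+(?:\.[\w-]+)+\b"), "<host>"),
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def normalize(failure_reason):
    """Replace variable parts of a message with fixed placeholder tokens."""
    text = failure_reason
    for pattern, token in _SUBSTITUTIONS:
        text = pattern.sub(token, text)
    return text


def group_failures(jobs):
    """Group job ids by normalized failure reason.

    jobs is an iterable of (job_id, failure_reason) pairs.
    """
    groups = defaultdict(list)
    for job_id, reason in jobs:
        groups[normalize(reason)].append(job_id)
    return dict(groups)
```

With rules like these, messages that differ only in host name or timeout value collapse into one group, which makes the most frequent failure reasons easy to rank.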

Ultimately, Sentry makes the most sense for now. We had only been using it for Sepia, but I have also configured it for Octo.
