Project

General

Profile

Actions

Bug #62534

open

"Misunderstood line" in teuthology scrape.log

Added by Laura Flores 8 months ago. Updated 8 months ago.

Status:
Fix Under Review
Priority:
Normal
Category:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi/scrape.log

Found 309 jobs
Misunderstood line:
Missing teuthology log /home/teuthworker/archive/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi/7372213/teuthology.log
Misunderstood line:
Misunderstood line:
Misunderstood line:

Actions #1

Updated by Kamoltat (Junior) Sirivadhna 8 months ago

  • Assignee set to Kamoltat (Junior) Sirivadhna
Actions #2

Updated by Kamoltat (Junior) Sirivadhna 8 months ago

This issue occurred due to greping through teuthology.log for `</kind> in `
Once it finds the the line that has this grep, it will try to searche for a match for a regular expression at the beginning of a string.
In this case it searches for "<kind>(.+)</kind> in .+/(.+)"

These are all the lines in the run that contains `</kind> in `
There are 4 distinct jobs that grep was able to catch

ksirivad@teuthology:/a/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi$ grep -nr "</kind> in " 

7372158/teuthology.log:3164:2023-08-17T23:09:30.783 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
7372158/teuthology.log:3166:2023-08-17T23:09:30.784 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
7372158/teuthology.log:3168:2023-08-17T23:09:30.784 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
7372092/teuthology.log:33841:2023-08-17T23:09:36.407 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
7372092/teuthology.log:33843:2023-08-17T23:09:36.407 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
7372092/teuthology.log:33846:2023-08-17T23:09:36.432 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
7372159/teuthology.log:7464:2023-08-17T23:30:35.303 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
7372159/teuthology.log:7466:2023-08-17T23:30:35.303 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
7372159/teuthology.log:7469:2023-08-17T23:30:35.334 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
7372228/teuthology.log:19527:2023-08-18T01:07:25.312 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
7372228/teuthology.log:19529:2023-08-18T01:07:25.312 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
7372228/teuthology.log:19532:2023-08-18T01:07:25.338 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log

Scrape correctly identifies 4 Jobs with valgrind error:

Found 16 distinct failure reasons
Valgrind: mon (Leak_StillReachable)
4 jobs: ['7372158', '7372092', '7372159', '7372228']
suites intersection: []
suites union: ['2-inject-leak/none', 'centos_latest}', 'ceph', 'clusters/{fixed-2', 'd-thrash/none', 'mon_election/classic', 'msgr-failures/few', 'msgr/async', 'msgr/async-v1only', 'msgr/async-v2only', 'objectstore/bluestore-bitmap', 'objectstore/bluestore-comp-snappy', 'objectstore/bluestore-comp-zstd', 'openstack}', 'rados', 'rados/valgrind-leaks/{1-start', 'rados/verify/{centos_latest', 'tasks/mon_recovery', 'tasks/rados_api_tests', 'tasks/rados_cls_all', 'validater/valgrind}']

Real question is how did empty string gets picked up by the initial grep of `</kind> in ` and obviously gets rejected during regex matching against "<kind>(.+)</kind> in .+/(.+)"
Since:

            if not match:
                log.warning("Misunderstood line: {0}".format(line))
                continue

We are printing blank, meaning we caught empty string during the grep process

Misunderstood line:
Actions #3

Updated by Kamoltat (Junior) Sirivadhna 8 months ago

This is the possible culprit `teuthology/scrape.py`

def grep(path, expr):
    """ 
    Call out to native grep rather than feeding massive log files through python line by line
    """ 
    p = subprocess.Popen(["grep", expr, path], stdout=subprocess.PIPE,
                         universal_newlines=True)
    p.wait()
    out, err = p.communicate()
    if p.returncode == 0:
        return out.split("\n")
    else:
        return []

This chunk of code returns a list which gets traversed by the function `_get_service_types`, the list probably contains empty string

Actions #4

Updated by Kamoltat (Junior) Sirivadhna 8 months ago

  • Status changed from New to Fix Under Review
Actions

Also available in: Atom PDF