h1. HOWTO describe a test result


h3. Format of teuthology analysis entries 

 For instance, the analysis entry at http://tracker.ceph.com/issues/11153#rgw is structured as follows: 

 * In chronological order 
 * The command line (that can be copy/pasted) used to run the suite 
 * A bullet point with the URL to the suite run in pulpito prefixed by 
 ** *running* if the run is not complete 
 ** *red* if the run has at least one error 
 ** *green* if the run has no error 
 * If the run has at least one error, the output of the fail formatter snippet below is added, then edited as the errors are analyzed. Run it with *python fail.py loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi fail*, or, when using teuthology-openstack, with *python fail.py loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi fail 167.114.249.14* where the IP address is the host serving the pulpito results  
 <pre> 
import re
import sys

import requests

# Optional third argument: the IP address of a teuthology-openstack
# instance, where paddles listens on port 8080 and pulpito on port 8081.
if len(sys.argv) > 3:
    host = sys.argv[3]
    paddle_host = host + ":8080"
    pulpito_host = host + ":8081"
else:
    paddle_host = 'paddles.front.sepia.ceph.com'
    pulpito_host = 'pulpito.ceph.com'

# sys.argv[1] is the name of the run, sys.argv[2] the job status
# to filter on (e.g. fail).
paddle = ("http://" + paddle_host + "/runs/" +
          sys.argv[1] +
          "/jobs/?status=" + sys.argv[2])
pulpito = ("http://" + pulpito_host + "/" +
           sys.argv[1])
failure2jobs = {}


def normalize(failure):
    """Strip the variable parts of a failure reason so that jobs
    failing for the same reason are grouped together."""
    if 'wget -O- ' in failure:
        return re.findall('"(.*)"', failure)[0]
    if 'Command failed' in failure:
        return re.findall(r'Command failed.*?:\s*(.*)', failure)[0]
    return failure


# Group the jobs returned by the paddles REST API by normalized
# failure reason.
for job in requests.get(paddle).json():
    failure2jobs.setdefault(normalize(job['failure_reason']), []).append(job)

# Print one bullet point per failure reason and, below it, one
# sub-bullet linking each affected job to its pulpito page.
for (failure, jobs) in failure2jobs.items():
    print("** *" + failure + "*")
    for job in jobs:
        print('*** "' + job['description'] + '":' + pulpito + '/' + job['job_id'])
 </pre> 
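
 The point of *normalize* is to strip the variable parts of a failure reason (host name, exit status, etc.) so that the same failure seen on several machines collapses into a single bullet point. As a rough illustration, with invented failure strings: 

 <pre>
 >>> normalize("Command failed on vpm001 with status 1: 'ceph_test_rados'")
 "'ceph_test_rados'"
 >>> normalize("Command failed on vpm002 with status 1: 'ceph_test_rados'")
 "'ceph_test_rados'"
 </pre>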

 The idea is to run the script after the entire suite completes (this is not strictly necessary: jobs can be restarted while the original suite is still running, but doing so makes it easy to get confused). The output of the script is copy/pasted into the release tracker issue. Each failure is then analyzed and marked as belonging to one of the following categories: 
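
 For illustration, the pasted output might look like the following (the failure reason, job descriptions and job ids below are made up): 

 <pre>
 ** *'ceph_test_rados'*
 *** "rados/thrash/{...}":http://pulpito.ceph.com/loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi/123456
 *** "rados/thrash/{...}":http://pulpito.ceph.com/loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi/123457
 </pre>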

 * When an error is analyzed, the link to the error is prefixed with one of the following (see the example after this list) 
 ** *environmental noise* if the job must be run again because it failed for reasons unrelated to the test itself (DNS error, etc.) 
 ** *known bug* and a URL to the bug (not just the number of the bug) 
 ** *new bug* and a URL to the newly created bug if it was discovered during the analysis of this error: it is likely to be a regression 
 ** *can be ignored* and a URL to the bug (not just the number of the bug) and the reason why it can be ignored (possibly a link to a mail thread)
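
 Hypothetically, once analyzed, the example entry above could be edited to read as follows (the bug URL is a placeholder, not a real issue): 

 <pre>
 ** *known bug* http://tracker.ceph.com/issues/NNNNN *'ceph_test_rados'*
 *** "rados/thrash/{...}":http://pulpito.ceph.com/loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi/123456
 *** "rados/thrash/{...}":http://pulpito.ceph.com/loic-2015-04-21_10:20:06-rados-firefly-backports---basic-multi/123457
 </pre>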