HOWTO forensic analysis of integration and upgrade tests » History » Version 5
Loïc Dachary, 05/15/2015 12:07 PM
1 | 2 | Loïc Dachary | h3. Steps |
---|---|---|---|
2 | 2 | Loïc Dachary | |
3 | 3 | Loïc Dachary | * For a given job, there is a pulpito page (for instance http://pulpito.ceph.com/loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/888125/) |
4 | 4 | Loïc Dachary | * Search tracker.ceph.com for the error string to find existing issues. For instance, http://pulpito.ceph.com/loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/888125/ reports *ceph-objectstore-tool: import failure with status 139*, which has "a few issues associated with it":http://tracker.ceph.com/projects/ceph/search?utf8=%E2%9C%93&issues=1&q=ceph-objectstore-tool%3A+import+failure+with+status+139 |
5 | 5 | Loïc Dachary | * If a matching issue exists and recording one more occurrence seems useful, add a comment with a link to the failed job and the relevant quote from the logs. |
6 | 3 | Loïc Dachary | * Click *All details...* to show the YAML file, then search the page (Control-f) for *description* to find the job description, which is the list of YAML files that were used to create the job. They can be found at https://github.com/ceph/ceph-qa-suite/blob/firefly/suites (where *firefly* can be replaced by the stable release name). |
7 | 3 | Loïc Dachary | * Download the teuthology logs from the link provided by the pulpito page (for instance http://qa-proxy.ceph.com/teuthology/loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/888125/teuthology.log) |
8 | 3 | Loïc Dachary | * Explore the logs and core dumps collected by teuthology. If the log is at http://qa-proxy.ceph.com/teuthology/loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/888125/teuthology.log, the rest can be found by removing the teuthology.log part of the path, i.e. http://qa-proxy.ceph.com/teuthology/loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/888125/ |
9 | 3 | Loïc Dachary | * In the teuthology log, find the first *Traceback* and read the lines around it: this is where something first went wrong. |
10 | 3 | Loïc Dachary | <pre> |
11 | 3 | Loïc Dachary | 2015-05-15T03:56:10.905 ERROR:teuthology.contextutil:Saw exception from nested tasks |
12 | 3 | Loïc Dachary | Traceback (most recent call last): |
13 | 3 | Loïc Dachary | File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 30, in nested |
14 | 3 | Loïc Dachary | yield vars |
15 | 3 | Loïc Dachary | File "/home/teuthworker/src/teuthology_master/teuthology/task/install.py", line 1298, in task |
16 | 3 | Loïc Dachary | yield |
17 | 3 | Loïc Dachary | File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 125, in run_tasks |
18 | 3 | Loïc Dachary | suppress = manager.__exit__(*exc_info) |
19 | 3 | Loïc Dachary | File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__ |
20 | 3 | Loïc Dachary | self.gen.next() |
21 | 3 | Loïc Dachary | File "/var/lib/teuthworker/src/ceph-qa-suite_firefly/tasks/thrashosds.py", line 183, in task |
22 | 3 | Loïc Dachary | thrash_proc.do_join() |
23 | 3 | Loïc Dachary | File "/var/lib/teuthworker/src/ceph-qa-suite_firefly/tasks/ceph_manager.py", line 356, in do_join |
24 | 3 | Loïc Dachary | self.thread.get() |
25 | 3 | Loïc Dachary | File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get |
26 | 3 | Loïc Dachary | raise self._exception |
27 | 3 | Loïc Dachary | Exception: ceph-objectstore-tool: import failure with status 139 |
28 | 3 | Loïc Dachary | </pre> |
29 | 1 | Loïc Dachary | * Examine the relevant OSD, MDS or MON logs |
30 | 3 | Loïc Dachary | * If the backtrace is not already in the OSD, MDS or MON logs (it usually is), obtain it from the core dumps (see http://dachary.org/?p=3568 for a way to do that) |
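
Locating the first *Traceback* in a downloaded teuthology.log can be sketched in Python. This is a minimal sketch, not part of teuthology: the function name first_traceback and the placeholder lines in the sample are assumptions. It relies on the usual Python traceback layout, in which stack frames are indented and the exception line is not.

```python
TRACEBACK_MARKER = "Traceback (most recent call last):"


def first_traceback(lines, before=1):
    """Return (context, exception) for the first traceback in lines.

    context is the list of lines from `before` lines ahead of the marker
    up to and including the exception line; exception is that last line.
    Returns None when the log contains no traceback.
    """
    for i, line in enumerate(lines):
        if TRACEBACK_MARKER not in line:
            continue
        start = max(0, i - before)
        end = i + 1
        # Stack frames are indented; the first unindented line after the
        # marker is the exception itself.
        while end < len(lines) and lines[end][:1] in (" ", "\t"):
            end += 1
        exception = lines[end].rstrip("\n") if end < len(lines) else None
        context = [l.rstrip("\n") for l in lines[start:end + 1]]
        return context, exception
    return None


if __name__ == "__main__":
    # The first and last entries are placeholders, not real log output.
    sample = [
        "placeholder: earlier log line\n",
        "2015-05-15T03:56:10.905 ERROR:teuthology.contextutil:Saw exception from nested tasks\n",
        "Traceback (most recent call last):\n",
        '  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get\n',
        "    raise self._exception\n",
        "Exception: ceph-objectstore-tool: import failure with status 139\n",
        "placeholder: later log line\n",
    ]
    context, exception = first_traceback(sample)
    print("\n".join(context))
    print("error string to search for:", exception)
```

For a real log, pass the result of open("teuthology.log").readlines() and raise before to see more of what happened ahead of the failure.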
31 | 2 | Loïc Dachary | |
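
The two URL conventions used in the steps above (removing teuthology.log to reach the archive directory, and searching the tracker for an error string) can be sketched as follows. The helper names archive_url and tracker_search_url are hypothetical; the query parameters are copied from the search link quoted in the steps, and whether the tracker accepts other parameters is not covered here.

```python
from urllib.parse import urlencode

# Search endpoint taken from the example link in the steps.
TRACKER_SEARCH = "http://tracker.ceph.com/projects/ceph/search"
LOG_NAME = "teuthology.log"


def archive_url(log_url):
    """Return the directory holding the other logs and core dumps."""
    if not log_url.endswith(LOG_NAME):
        raise ValueError("expected a URL ending in " + LOG_NAME)
    return log_url[:-len(LOG_NAME)]


def tracker_search_url(error_string):
    """Build a tracker issue search URL for an error string."""
    query = urlencode({
        "utf8": "\u2713",   # check mark, as in the quoted link
        "issues": "1",      # restrict the search to issues
        "q": error_string,
    })
    return TRACKER_SEARCH + "?" + query


if __name__ == "__main__":
    log = ("http://qa-proxy.ceph.com/teuthology/"
           "loic-2015-05-13_00:58:29-rados-firefly-backports---basic-multi/"
           "888125/teuthology.log")
    print(archive_url(log))
    print(tracker_search_url(
        "ceph-objectstore-tool: import failure with status 139"))
```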
32 | 2 | Loïc Dachary | h3. Tools |
33 | 2 | Loïc Dachary | |
34 | 2 | Loïc Dachary | * https://github.com/jcsp/scrape/blob/master/scrape.py |
35 | 2 | Loïc Dachary | ** command line example: |
36 | 2 | Loïc Dachary | <pre> |
37 | 2 | Loïc Dachary | user@machine:~$ python ~/<scrape_dir>/scrape.py /a/<run_name> |
38 | 2 | Loïc Dachary | </pre> |
39 | 1 | Loïc Dachary | *** this will generally run in all labs (sepia, octo, typica) as */a* exists in all of them |