2F - Testing buildrelease & Teuthology » History » Version 1
Jessica Mack, 06/22/2015 04:57 AM

h1. 2F - Testing buildrelease & Teuthology

h3. Live Pad

The live pad can be found here: "[pad]":http://pad.ceph.com/p/test-teuthology

h3. Summit Snapshot

Building and testing Ceph automatically

Build infrastructure (gitbuilders):

p(. https://github.com/ceph/gitbuilder + https://github.com/ceph/autobuild-ceph on-demand
Ceph gitbuilder status: http://ceph.com/gitbuilder.cgi

Release builds
1. target platforms?
Ubuntu (precise, quantal, soon raring), Debian squeeze, CentOS, SuSE
2. process

Teuthology http://github.com/ceph/teuthology

p(. automated test cluster setup/run/collect output
uses physical or virtual machines
cooperative locking of machine resources to avoid trampling other users (optional)
write YAML configuration files to select machines/roles, Ceph versions to install, tests to run
a few hundred physical machines internal to Inktank to run nightly tests
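
A rough sketch of what such a job description might contain, built in Python and printed as JSON (which YAML 1.2 parsers also accept, so no third-party YAML library is needed). The key names (@roles@, @tasks@, @install@, @ceph@, @workunit@) mirror the layout commonly seen in ceph-qa-suite files, but they are assumptions here and should be checked against the teuthology version in use. The helper @build_job_config@, the branch name and the workunit path are hypothetical.

```python
import json


def build_job_config(roles, ceph_branch, workunits):
    """Return a dict shaped like a teuthology-style job description.

    Key names are assumptions modelled on typical ceph-qa-suite YAML;
    verify them against the teuthology checkout being used.
    """
    return {
        # One inner list per machine: the roles that machine will host.
        "roles": [list(machine_roles) for machine_roles in roles],
        # Tasks run in order: install packages, start a cluster, run tests.
        "tasks": [
            {"install": {"branch": ceph_branch}},
            {"ceph": None},
            {"workunit": {"clients": {"client.0": list(workunits)}}},
        ],
    }


config = build_job_config(
    roles=[["mon.a", "osd.0", "osd.1"], ["mon.b", "osd.2", "client.0"]],
    ceph_branch="master",          # hypothetical branch to install
    workunits=["rados/test.sh"],   # hypothetical test script
)

# JSON output doubles as YAML input for YAML 1.2 parsers.
print(json.dumps(config, indent=2))
```

Keeping machines/roles, versions and tests as separate keys is what lets the same test be rerun against a different cluster layout or Ceph branch by changing one value.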

gtest for unit tests (make check):

p(. require tests for new code
refactor existing code to be testable

Test suite http://github.com/ceph/ceph-qa-suite

* combination/permutation of teuthology tests and cluster configurations
* many different functional/regression tests, with and without failure injection
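
The combination/permutation idea can be sketched as a Cartesian product over independent "facets". This is only an illustration of the technique, not how ceph-qa-suite is actually implemented; the facet names and values below are hypothetical.

```python
from itertools import product


def build_matrix(facets):
    """Yield one job description per combination of facet values."""
    names = sorted(facets)  # fixed order keeps the output reproducible
    for combo in product(*(facets[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical facets: every cluster layout is paired with every
# failure-injection mode and every workload.
facets = {
    "cluster": ["fixed-2.yaml", "fixed-3.yaml"],
    "failure_injection": ["none", "thrash-osds"],
    "workload": ["rados-api-tests", "rbd-fsx", "cephfs-fsstress"],
}

jobs = list(build_matrix(facets))
print(len(jobs))  # 2 * 2 * 3 = 12 jobs
print(jobs[0])
```

Adding one new workload to the list adds a job for every existing cluster/failure combination, which is why the matrix grows quickly and why nightly runs need a large machine pool.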

upgrade testing

p(. run tests in a mixed-version environment with slow rolling upgrades

integration tests:

p(. openstack
cloudstack
chef.py
libvirt (pools and volumes)
qemu

Allocate/create VMs using EC2 or OpenStack APIs?

Improve documentation on how to get it set up and working
- how to set up test machines (http://github.com/ceph/ceph-qa-chef )

Work items:

p(. build out a large cluster test suite
parallel.py and sequential.py tasks
rados.py radosmodel test should infer the list of clients and run them in parallel
task to slurp up/archive perf counters
identifying key metrics to monitor

p((. osd: small/large write performance
mds: metadata ops/sec
...

p(. qemu gitbuilder
build large long-term clusters on burnupi?
samba (and others?) don't register as running daemons and thus can't be restarted by the upgrade task
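
A minimal sketch of the parallel/sequential work items, plus inferring the client list from the roles instead of hard-coding it. This is not the teuthology implementation: the function names are hypothetical, threads stand in for whatever concurrency teuthology uses, and the @client.@ prefix convention is assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def clients_from_roles(roles):
    """Infer client names from per-machine role lists (assumed 'client.' prefix)."""
    return [r for machine in roles for r in machine if r.startswith("client.")]


def run_sequential(tasks):
    """Run callables one after another; the first exception stops the run."""
    return [task() for task in tasks]


def run_parallel(tasks):
    """Run callables concurrently; results come back in submission order."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() re-raises any exception raised inside a task.
        return [future.result() for future in futures]


def make_client_task(client):
    """Hypothetical stand-in for a per-client workload."""
    return lambda: f"{client}: done"


roles = [["mon.a", "osd.0", "client.0"], ["osd.1", "client.1"], ["client.2"]]
tasks = [make_client_task(c) for c in clients_from_roles(roles)]

print(run_sequential(tasks))
print(run_parallel(tasks))
```

Both calls return the same list here; the difference only shows up in wall-clock time and in how failures interleave when the tasks do real work.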

Performance:

p(. need to be able to identify performance regressions
memory, CPU usage, network usage (please) data

p((. perf task?
collectl?
store aggregated data in summary.yaml for each run

p(. time (have raw timer task; need to log results)
identify data warehouse and make something to import into it
build chart.io graphs :)
aggregate/slurp the osd perf counters at end of run?
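
One possible shape for the regression check: collapse the raw samples of each run into a small summary (the sort of thing that could be stored per run in summary.yaml), then compare the means of two runs. The metric names, the 10% tolerance and the function names are all hypothetical, and the sketch assumes higher values are worse.

```python
from statistics import mean


def summarize(samples):
    """Collapse raw samples per metric into min/mean/max."""
    return {
        name: {"min": min(values), "mean": mean(values), "max": max(values)}
        for name, values in samples.items()
        if values  # skip metrics with no samples
    }


def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose mean rose by more than `tolerance` over baseline.

    Assumes higher is worse (memory, CPU, latency); throughput-style
    metrics would need the comparison flipped.
    """
    regressions = {}
    for name, stats in current.items():
        base = baseline.get(name)
        if base is None or base["mean"] == 0:
            continue  # nothing to compare against
        change = (stats["mean"] - base["mean"]) / base["mean"]
        if change > tolerance:
            regressions[name] = round(change, 3)
    return regressions


# Hypothetical samples from two runs of the same test.
baseline = summarize({"rss_mb": [500, 520, 510], "cpu_pct": [40, 42, 44]})
current = summarize({"rss_mb": [600, 640, 620], "cpu_pct": [41, 43, 45]})

print(find_regressions(baseline, current))  # {'rss_mb': 0.216}
```

Storing only the summary per run keeps the archive small, at the cost of losing the raw samples needed to tell a steady rise from a single spike.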

scribe (facebook)
flume (cloudera)