2F - Testing buildrelease & Teuthology
Live Pad
The live pad can be found here: [pad]
Summit Snapshot
Building and testing Ceph automatically
Build infrastructure (gitbuilders):
https://github.com/ceph/gitbuilder + https://github.com/ceph/autobuild-ceph on-demand
Ceph gitbuilder status: http://ceph.com/gitbuilder.cgi
Release builds
1. target platforms?
Ubuntu (precise, quantal, soon raring), Debian squeeze, CentOS, SUSE
2. process
Teuthology http://github.com/ceph/teuthology
automated test cluster setup/run/collect output
uses physical or virtual machines
cooperative locking of machine resources to avoid trampling other users (optional)
write YAML configuration files to select machines/roles, Ceph versions to install, tests to run
a few hundred physical machines internal to Inktank to run nightly tests
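A minimal sketch of what such a job description might contain, built here as a plain Python dict and printed as JSON so that it runs with the standard library only. The key names (`roles`, `tasks`, `install`, `ceph`, `rados`) and the option names are assumptions for illustration, not the confirmed teuthology schema; a real job would be written as a YAML file.

```python
import json


def build_job(roles, ceph_branch, tests):
    """Assemble a job description as a plain dict.

    All key names here are illustrative assumptions; check the
    teuthology documentation for the real schema.
    """
    # First install the requested Ceph version, then bring up the cluster.
    tasks = [{"install": {"branch": ceph_branch}}, {"ceph": None}]
    # Append one task entry per requested test, in order.
    tasks.extend({name: opts} for name, opts in tests)
    return {"roles": roles, "tasks": tasks}


job = build_job(
    # One inner list per machine, naming the roles that machine hosts.
    roles=[["mon.a", "osd.0", "osd.1"], ["client.0"]],
    ceph_branch="master",
    tests=[("rados", {"clients": ["client.0"], "ops": 1000})],
)
print(json.dumps(job, indent=2))
```

Keeping machines/roles, the version to install, and the tests as three separate inputs mirrors the three things the notes say a configuration file selects.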
gtest for unit tests (make check):
require tests for new code
refactor existing code to be testable
- combination/permutation of teuthology tests, cluster configurations
many different functional/regression tests, with and without failure injection
upgrade testing
run tests in mixed-version environments with slow rolling upgrades
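The combination/permutation idea above can be sketched as a Cartesian product over independent "facets" of a test run. The facet names and values below are hypothetical placeholders, and this is not necessarily how teuthology itself expands its suites.

```python
from itertools import product


def expand_matrix(facets):
    """Yield one dict per combination, picking one option from each facet."""
    names = sorted(facets)  # fixed order so the output is deterministic
    for combo in product(*(facets[n] for n in names)):
        yield dict(zip(names, combo))


# Hypothetical facets: cluster layout, workload, and failure injection.
facets = {
    "cluster": ["3-node", "6-node"],
    "workload": ["rados-model", "rbd-fsx"],
    "failure": ["none", "osd-thrash"],
}
jobs = list(expand_matrix(facets))
print(len(jobs), "jobs; first:", jobs[0])
```

Adding a facet multiplies the job count, so a real suite would probably need a way to sample or prune the matrix.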
integration tests:
openstack
cloudstack
chef.py
libvirt (pools and volumes)
qemu
Allocate/create VMs using EC2 or Openstack APIs?
Improve documentation on how to get it set up and working
- how to set up test machines (http://github.com/ceph/ceph-qa-chef)
Work items:
build out a large cluster test suite
parallel.py and sequential.py task
rados.py radosmodel test should infer the list of clients and run them in parallel
task to slurp up/archive perf counters
identifying key metrics to monitor
osd: small/large write performance
mds: metadata ops/sec
...
qemu gitbuilder
build large long-term clusters on burnupi?
samba (and others?) don't register as running daemons and thus can't be restarted by the upgrade task
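For the rados.py work item above (infer the list of clients and run them in parallel), a rough sketch under stated assumptions: roles are given as one list per machine, client roles are named `client.<id>`, and the workload is any callable. The function names are hypothetical and this is not the actual teuthology task API.

```python
from concurrent.futures import ThreadPoolExecutor


def infer_clients(roles):
    """Return every role named 'client.<id>' found across all machines."""
    return sorted(
        role for machine in roles for role in machine
        if role.startswith("client.")
    )


def run_parallel(clients, workload):
    """Run workload(client) for every client concurrently.

    Returns a dict mapping each client name to its workload result.
    """
    if not clients:
        return {}  # ThreadPoolExecutor rejects max_workers=0
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        return dict(zip(clients, pool.map(workload, clients)))


roles = [["mon.a", "osd.0", "client.0"], ["osd.1", "client.1"]]
results = run_parallel(infer_clients(roles), lambda c: f"{c}: ok")
print(results)
```

Deriving the client list from the roles means the test definition does not have to repeat it, which is the point of the work item.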
Performance:
need to be able to identify performance regressions
memory, CPU usage, and network usage (please) data
perf task?
collectl?
store aggregated data in summary.yaml for each run
time (have raw timer task; need to log results)
identify data warehouse and make something to import into it
build chart.io graphs :)
aggregate/slurp the osd perf counters at end of run?
scribe (facebook)
flume (cloudera)
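One way the per-run summaries could be used to flag performance regressions is to compare each metric against a baseline run. This is only a sketch: the metric names, the 10% tolerance, and the assumption that higher values are better for every metric are all illustrative choices, not something decided in the session.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics that dropped by more than `tolerance` vs. the baseline.

    Assumes higher is better for every metric (e.g. MB/s, ops/sec).
    Values are the relative change, so -0.2 means a 20% drop.
    """
    regressions = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None or old <= 0:
            continue  # metric missing in this run, or baseline unusable
        change = (new - old) / old
        if change < -tolerance:
            regressions[name] = round(change, 3)
    return regressions


# Hypothetical aggregated values, as might be stored per run.
baseline = {
    "osd_small_write_mbps": 40.0,
    "osd_large_write_mbps": 400.0,
    "mds_ops_per_sec": 2000.0,
}
current = {
    "osd_small_write_mbps": 39.0,
    "osd_large_write_mbps": 320.0,
    "mds_ops_per_sec": 2100.0,
}
print(find_regressions(baseline, current))
```

A single baseline comparison is noisy; averaging several nightly runs before comparing would likely give fewer false alarms.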
Updated by Jessica Mack almost 9 years ago · 1 revisions