2F - Testing build/release & Teuthology
[11:14] <rturk> Anyone want to be in the hangout for this next track?
[11:15] <sagewk> me!
[11:15] <joshd> me too
[11:16] <sagewk> muted..
[11:18] <saras> what is teuthology
[11:18] <sjust> saras: our testing framework
[11:18] <saras> suite
[11:23] <saras> this is going to sound funny, but can you use an OpenVAS scan as one of the tests?
[11:24] <saras> not asking you to do it, just looking at running it in my use case for teuthology
[11:26] <Karcaw> what is the largest number of OSDs tested regularly?
[11:29] <saras> can you please explain how teuthology works?
[11:29] <dmick> saras: not familiar with the tool, but check out the teuthology repo
[11:30] <dmick> it does a lot of execution of external tests
[11:30] <dmick> as well as its own
[11:36] <saras> what is the minimum size of a teuthology cluster?
[11:36] <dmick> can run a cluster on one node if you wish, and either very soon or now, one VM
[11:37] <saras> dmick: good, I'll add trying to get it up and running to my todo list
[11:39] <dmick> the first hurdle is usually locking. you don't have to implement locking, but it's on by default in our setup because we have lots of machines and users
[11:40] <saras> I will be more than happy to tell you where I find pain with setting up teuthology
[11:41] <saras> that sounds great
[11:41] <mikedawson> what is considered a large-size Teuthology ceph cluster? What is a long-running test?
[11:42] <dmick> most of our experience is 2-5 machines or so in a cluster, and maybe an hour or two of test run (the longer ones tend to be "steady load with failure injection")
[11:42] <dmick> but Sage has done some recent much-larger setups
[11:43] <dmick> (burnupi: our cluster has groups of identical machines named after cuttlefish species, so overall it's "sepia", and the individual machine groups are plana, burnupi, senta, mira, vercoi, etc.)
[11:43] <joshd> a while back we had some many-node tests, but at the time we had too few machines to run many of them, so we scaled them down for more general coverage
[11:46] <sjust> perhaps we could create a "performance" suite with the messenger delays/failure injection, osd thrashing etc. turned off
[11:46] <saras> network load
[11:46] <sjust> and then scrape the nightly runs for the summary yaml?
[11:46] <dmick> sjust: +1
[11:46] <sjust> the summary yaml probably needs a way to associate the test with prior runs of the same test
[11:47] <sjust> sha1 of the config.yaml?
[11:47] <saras> network load for sync agents
[11:47] <mikedawson> Sensu is pretty good for dynamic infrastructure
[11:49] <sjust> we should expose such info via admin socket
[11:49] <elder> Or a perf socket?
[11:50] <dmick> specifically: ceph --admin-daemon <socket> perf dump
[11:50] <sjust> I had a patch set at one point which allows streaming output from admin socket for osd op events
[11:50] <sjust> yep
[11:52] <dmick> sjust: I was hearing "snap at end of test run for a pile of reducible data", but graphing over the time of the run could be interesting too
[11:52] <kbader> for network fault injection I've used the 'tc' utilities before
[11:52] <sjust> dmick: the advantage to grabbing the stream of events is you can get op latency histograms
[11:52] <kbader> you can inject arbitrary packet loss and latency for an interface
[11:53] <dmick> sjust: yep, also interesting data
[11:56] <saras> sounds like a lot of what Salt does
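For readers following along with the "minimum size" discussion: a teuthology job is described by a small YAML file listing roles (how daemons map onto machines) and tasks (what to run). A minimal single-node sketch of that format is below; exact task options vary by teuthology version, so treat this as an illustrative example rather than a canonical job.

```yaml
# Minimal single-node job: one machine hosting a monitor,
# three OSDs, and a client that runs a workunit test.
roles:
- [mon.a, osd.0, osd.1, osd.2, client.0]
tasks:
- install:
- ceph:
- workunit:
    clients:
      client.0:
        - rados/test.sh
```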
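sjust's idea of keying a test against prior runs via a sha1 of the config.yaml could look something like this Python sketch (assuming PyYAML). Parsing and re-dumping with sorted keys means two configs that differ only in key order or whitespace get the same digest.

```python
import hashlib
import yaml  # PyYAML

def config_digest(path):
    """Return a stable sha1 for a teuthology config.yaml.

    Canonicalizing the parsed YAML before hashing lets a run be
    matched against prior runs of the same test even if the file
    was reformatted in between.
    """
    with open(path) as f:
        data = yaml.safe_load(f)
    canonical = yaml.safe_dump(data, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```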
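The admin-socket command dmick quotes returns the daemon's performance counters as JSON. A small sketch of snapshotting it from Python at the end of a test run (the socket path is an example; substitute your daemon's .asok):

```python
import json
import subprocess

def perf_dump(sock="/var/run/ceph/ceph-osd.0.asok"):
    """Fetch a Ceph daemon's perf counters as a dict via the admin socket."""
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", sock, "perf", "dump"])
    return json.loads(out)

# Example: snap the counters once at the end of a run, as discussed above.
counters = perf_dump()
print(json.dumps(counters, indent=2))
```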
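sjust's streaming patch set isn't shown here, but the payoff he describes, op latency histograms, only needs per-op start/finish times. A hypothetical sketch follows; the (op_id, state, timestamp) event format is an assumption standing in for whatever the admin-socket stream would actually emit.

```python
from collections import defaultdict

def latency_histogram(events, bucket_ms=10):
    """Bucket per-op latencies from a stream of (op_id, state, t) events.

    `events` is assumed to yield tuples like (op_id, "queued" | "done", t)
    with t in seconds; this event shape is hypothetical.
    """
    start = {}
    hist = defaultdict(int)
    for op_id, state, t in events:
        if state == "queued":
            start[op_id] = t
        elif state == "done" and op_id in start:
            latency_ms = (t - start.pop(op_id)) * 1000.0
            hist[int(latency_ms // bucket_ms)] += 1
    return dict(hist)
```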
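For the network fault injection kbader mentions, the 'tc' netem queueing discipline can add latency and packet loss on an interface. These are standard iproute2 commands; the interface name and values are examples:

```
# add 100ms +/- 10ms of delay and 1% packet loss on eth0
tc qdisc add dev eth0 root netem delay 100ms 10ms loss 1%

# remove the fault injection again
tc qdisc del dev eth0 root
```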