-2F - Testing buildrelease & Teuthology

Jessica Mack, 06/22/2015 05:49 AM

h1. -2F - Testing buildrelease & Teuthology
<pre>
rturk	Anyone want to be in the hangout for this next track?	11:14
sagewk	me!	11:15
joshd	me too	11:15
sagewk	muted..	11:16
saras	what is teuthology	11:18
sjust	saras: our testing framework	11:18
saras	suit	11:18
saras	this is going to sound funny but can you use openvas scan as one of the tests	11:23
saras	not asking you to do it, looking at running it in my use case for teuthology	11:24
*** paravoid_ has joined #ceph-summit2	11:26
Karcaw	what is the largest number of osd's tested regularly?	11:26
*** paravoid is now known as Guest4609	11:27
*** paravoid_ is now known as paravoid	11:27
*** Guest4609 has quit IRC	11:27
saras	can you please explain how teuthology works	11:29
dmick	saras: not familiar with the tool, but check out the teuthology repo	11:29
dmick	it does a lot of execution of external tests	11:30
dmick	as well as its own	11:30
*** michael_dreamhost has joined #ceph-summit2	11:34
saras	what is the min size of a teuthology cluster	11:36
dmick	can run a cluster on one node if you wish, and either very soon or now, one VM	11:36
saras	dmick: good, I'll add trying to get it up and running to my todo list	11:37
dmick	the first hurdle is usually locking.  you don't have to implement locking, but it's on by default in our setup because we have lots of machines and users	11:39
saras	saras: I will be more than happy to tell you where I find pain with setting up teuthology	11:40
*** mikedawson has joined #ceph-summit2	11:41
saras	that sounds great	11:41
mikedawson	what is considered a large size Teuthology ceph cluster? What is a long-running test?	11:41
dmick	most of our experience is 2-5 machines or so in a cluster, and maybe an hour or two of test run (the longer ones tend to be "steady load with failure injection")	11:42
dmick	but Sage has done some recent much-larger setups	11:42
dmick	(burnupi: our cluster has groups of identical machines named after cuttlefish species, so overall it's "sepia", and the individual machine groups are plana, burnupi, senta, mira, vercoi, etc.)	11:43
joshd	a while back we had some many-node tests, but at the time we had too few machines to run many of them, so we scaled them down for more general coverage	11:43
sjust	perhaps to create a "performance" suite with the messenger delays/failure injection, osd thrashing etc. turned off	11:46
saras	network load	11:46
sjust	and then scrape the nightly runs for the summary yaml?	11:46
dmick	sjust: +1	11:46
sjust	the summary yaml probably needs a way to associate the test with prior runs of the same test	11:46
sjust	sha1 of the config.yaml?	11:47
saras	network load for sync agents	11:47
mikedawson	Sensu is pretty good for dynamic infrastructure	11:47
sjust	we should expose such info via admin socket	11:49
elder	Or a perf socket?	11:49
dmick	specifically: ceph --admin-daemon <socket> perf dump	11:50
sjust	I had a patch set at one point which allows streaming output from admin socket for osd op events	11:50
sjust	yep	11:50
dmick	sjust: I was hearing "snap at end of test run for a pile of reducible data", but graphing over the time of the run could be interesting too	11:52
kbader	for network fault injection I've used the 'tc' utilities before	11:52
sjust	dmick: the advantage to grabbing the stream of events is you can get op latency histograms	11:52
kbader	you can inject arbitrary packet loss and latency for an interface	11:52
dmick	sjust: yep, also interesting data	11:53
saras	sounds like a lot of what salt does	11:56
</pre>
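
As noted in the discussion, teuthology can drive a whole cluster on a single node. A minimal job file along those lines might look roughly like the sketch below; the roles/tasks layout is the usual teuthology convention, but the file name, role list, and task options are illustrative and vary between teuthology versions.

<pre>
# Sketch of a minimal single-node teuthology job file (file name is arbitrary).
# The roles/tasks layout follows the usual teuthology convention; exact task
# options differ between versions, so treat this as illustrative only.
cat > one-node.yaml <<'EOF'
roles:
- [mon.a, osd.0, osd.1, osd.2, client.0]
tasks:
- install:
- ceph:
EOF
</pre>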
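
One way to associate a summary yaml with prior runs of the same test, along the lines of the sha1 idea above, is simply to hash the job's config.yaml and use that digest as the test's identifier across nightly runs, for example:

<pre>
# Compute a stable identifier for a job from its config.yaml so that results
# from repeated runs of the same test can be grouped together.
sha1sum config.yaml | awk '{print $1}'
</pre>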
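
The admin-socket command quoted above can be run against any daemon's socket to snapshot its perf counters; the socket path below is only an example of the typical default location.

<pre>
# Dump perf counters from a running daemon over its admin socket.
# /var/run/ceph/ceph-osd.0.asok is an example of the usual default path;
# adjust it for the daemon and cluster layout actually in use.
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump
</pre>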
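
For the kind of network fault injection mentioned with the 'tc' utilities, the netem queueing discipline can add latency and packet loss on an interface; the interface name and values below are placeholders.

<pre>
# Add 100ms of delay and 1% packet loss on eth0 (interface and values are examples).
tc qdisc add dev eth0 root netem delay 100ms loss 1%
# Remove the injected faults again once the test run is finished.
tc qdisc del dev eth0 root netem
</pre>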