Jessica Mack, 06/08/2015 10:09 PM
h1. Ceph stats and monitoring tools

h3. Summary

Ceph tracks some state internally for its own purposes, but also exposes a wealth of other information for consumption by external tools. There is little consensus or shared knowledge about what a full solution is or should look like, for either health monitoring or performance monitoring.

h3. Owners

* Kyle Bader (DreamHost)

h3. Interested Parties

* Sage Weil (Inktank)
* Josh Durgin
* Dan Mick
* Xiaobing Zhou (xzhou40 (AT) hawk.iit.edu)

h3. Current Status

There is a collectd plugin, and others have experimented with statsd; both have been used in combination with Graphite.
A Nagios plugin is available.
What about Ganglia?

h3. Detailed Description

The collectd plugin has been used successfully by DreamHost, but my efforts to get it merged upstream have stalled because of a poor choice of JSON library.
The Nagios plugin is available on GitHub, but not as part of the Ceph tree. Should it be upstream? Should it be documented?
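
For discussion purposes, the core of a Nagios-style health check can be sketched in a few lines. This is not the existing plugin on GitHub; it is a minimal illustration that assumes the first word printed by @ceph health@ is one of HEALTH_OK, HEALTH_WARN, or HEALTH_ERR, and maps it to the standard Nagios exit codes. The function names @nagios_status@ and @check_ceph_health@ are hypothetical.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_status(health_output):
    """Map the text printed by `ceph health` to a Nagios exit code."""
    text = health_output.strip()
    if text.startswith("HEALTH_OK"):
        return OK
    if text.startswith("HEALTH_WARN"):
        return WARNING
    if text.startswith("HEALTH_ERR"):
        return CRITICAL
    # Anything else (empty output, an error message) is reported as UNKNOWN.
    return UNKNOWN


def check_ceph_health(timeout=30):
    """Run `ceph health`, print its output for Nagios, and return the exit code.

    Assumes the `ceph` CLI is on PATH and can reach the cluster.
    """
    try:
        out = subprocess.check_output(["ceph", "health"], timeout=timeout)
    except (OSError, subprocess.SubprocessError) as exc:
        print("UNKNOWN: could not run ceph health: %s" % exc)
        return UNKNOWN
    text = out.decode("utf-8", "replace").strip()
    print(text)
    return nagios_status(text)
```

A wrapper script would finish with @sys.exit(check_ceph_health())@ so that Nagios sees the status as the process exit code.
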
Graphite looks good (to me) for warehousing the stats. What is the best way to get them from all the daemons into Graphite? Collectd works okay if you have a single Graphite server, but the 'proxy' functionality of collectd does not work if the meters are dynamically defined, as they are with Ceph: we add them all the time and the plugin autoconfigures itself accordingly.
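
One possible approach, sketched here only as a starting point for discussion, is to read each daemon's perf counters through its admin socket and push them to Graphite over carbon's plaintext protocol (one @path value timestamp@ line per metric, port 2003 by default). Because the nested JSON is flattened generically, newly added counters are picked up without configuration. The function names, the metric prefix, and the sample counters below are illustrative assumptions, not part of any existing plugin.

```python
import json
import socket
import subprocess


def read_perf_counters(asok_path):
    """Return the perf counters of one daemon as a nested dict.

    Runs `ceph --admin-daemon <socket> perf dump`; the socket path is
    supplied by the caller and depends on the local installation.
    """
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", asok_path, "perf", "dump"])
    return json.loads(out.decode("utf-8"))


def flatten(counters, prefix):
    """Yield (dotted_path, value) for every numeric leaf of a nested dict."""
    for key, value in sorted(counters.items()):
        path = "%s.%s" % (prefix, key)
        if isinstance(value, dict):
            for item in flatten(value, path):
                yield item
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value


def graphite_lines(counters, prefix, timestamp):
    """Format counters as Graphite plaintext lines: 'path value timestamp'."""
    return ["%s %s %d\n" % (path, value, timestamp)
            for path, value in flatten(counters, prefix)]


def send_to_graphite(lines, host, port=2003):
    """Send preformatted lines to a carbon plaintext listener over TCP."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Made-up sample shaped like a small slice of perf dump output.
sample = {"osd": {"op": 12, "op_latency": {"avgcount": 3, "sum": 0.5}}}
lines = graphite_lines(sample, "ceph.osd0", 1000)
```

In a real collector the timestamp would come from @time.time()@ and the loop would cover every admin socket on the host. This sketch does not sanitize counter names, so a key containing a dot or a space would need to be rewritten before being sent.
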
Discuss possible approaches and available tools, and come to some consensus on which tools and integrations should be fully developed and documented.

h3. Work items