Jessica Mack, 06/08/2015 10:09 PM

h1. Ceph stats and monitoring tools

h3. Summary

Ceph tracks some state internally for its own purposes, but it also exposes a wealth of other information for consumption by external tools. There is little consensus or shared knowledge about what a complete solution is or should look like, for either health monitoring or performance monitoring.

h3. Owners

* Kyle Bader (DreamHost)

h3. Interested Parties

* Sage Weil (Inktank)
* Josh Durgin
* Dan Mick
* Xiaobing Zhou (xzhou40 (AT) hawk.iit.edu)

h3. Current Status

There is a collectd plugin, and others have experimented with statsd; both have been used in combination with Graphite.
There is a Nagios plugin available.
What about Ganglia?

h3. Detailed Description

The collectd plugin has been used successfully by DreamHost, but my efforts to get it upstream have stalled due to a poor choice of JSON library.
The Nagios plugin is available on GitHub, but not as part of the Ceph tree. Should it be moved upstream? Should it be documented?
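For reference, the core of a Nagios-style health check is small: it maps the cluster's overall health status onto the standard Nagios plugin exit codes. The following is a minimal sketch and not the code of the existing plugin. It assumes the check receives the text printed by @ceph health@, which begins with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR. The function name @nagios_status@ and the sample output are hypothetical.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping from Ceph's overall health status to a Nagios state.
HEALTH_TO_NAGIOS = {
    "HEALTH_OK": (OK, "OK"),
    "HEALTH_WARN": (WARNING, "WARNING"),
    "HEALTH_ERR": (CRITICAL, "CRITICAL"),
}


def nagios_status(health_output):
    """Return (exit_code, message) for the text printed by `ceph health`.

    Hypothetical helper: anything that does not start with a known
    status keyword is reported as UNKNOWN.
    """
    text = health_output.strip()
    first_word = text.split(None, 1)[0] if text else ""
    code, label = HEALTH_TO_NAGIOS.get(first_word, (UNKNOWN, "UNKNOWN"))
    return code, f"{label}: {text or 'no output from ceph health'}"


if __name__ == "__main__":
    # Made-up sample output, for illustration only.
    code, message = nagios_status("HEALTH_WARN 1 near full osd(s)")
    print(message)
```

A wrapper script would run @ceph health@, pass its output to @nagios_status@, print the message, and exit with the returned code.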
Graphite looks good (to me) for warehousing the stats. What is the best way to get them from all the daemons into Graphite? Collectd works okay if you have a single Graphite server, but the 'proxy' functionality of collectd does not work if the meters are dynamically defined (as they are with Ceph; we add them all the time, and the plugin autoconfigures itself accordingly).
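One way to cope with dynamically defined counters is to avoid a fixed metric list altogether and walk whatever the daemon reports. The sketch below is an illustration under assumptions, not the collectd plugin's code. It assumes the counters arrive as nested JSON, such as the output of a daemon's admin socket @perf dump@ command, and that Graphite's Carbon plaintext listener is reachable (port 2003 by default). The names @flatten_counters@, @to_graphite_lines@, and @send_to_graphite@, the @ceph.osd0@ prefix, and the sample values are all hypothetical.

```python
import json
import socket


def flatten_counters(node, prefix):
    """Yield (dotted_path, number) pairs for every numeric leaf in nested JSON."""
    for key, value in node.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict):
            # Descend into nested counter groups.
            yield from flatten_counters(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value


def to_graphite_lines(perf_dump, prefix, timestamp):
    """Format counters as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    return [
        f"{path} {value} {int(timestamp)}"
        for path, value in flatten_counters(perf_dump, prefix)
    ]


def send_to_graphite(lines, host, port=2003):
    """Send lines to a Carbon plaintext listener (not called in this sketch)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Made-up sample shaped like nested perf counter output, for illustration only.
sample = json.loads('{"osd": {"op": 42, "op_latency": {"avgcount": 10, "sum": 0.5}}}')
lines = to_graphite_lines(sample, "ceph.osd0", 1433801340)
print("\n".join(lines))
```

Because the walker emits every numeric leaf it finds, a counter added to a daemon later shows up as a new Graphite path without any configuration change. Feeding more than one Graphite server could then be as simple as calling @send_to_graphite@ once per host.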
Discuss possible approaches and available tools, and come to some consensus on which tools and integrations should be fully developed and documented.

h3. Work items