BE: Cluster IOPS for 1.x
Neil reckons Sage told him that we could get global (i.e. not per-pool) IOPS stats out of existing dumpling releases.
If that's so, then haul them in to the 1.x big blobs 'o' data so that we can expose them to an IOPS dashboard widget. If it's not, think again.
#1 Updated by John Spray almost 8 years ago
As Dan pointed out in chat, Ceph has had these statistics for some time, although they are in the form of counters (num_read/num_write) rather than gauges.
There are already cluster-wide gauge versions of the statistics in dumpling (in 'ceph pg dump summary'->pg_stats_delta; note that you have to divide by mon_tick_interval). The "new" per-pool stats are the deltas ("pg dump delta", "osd pool stats"), which are new in emperor.
We have a few options:
- Wait for backports of the new stats
- Use the existing counter stats (have to do some very rough division in calamari or graphite)
- Use existing gauge stats (only gets you cluster-wide number)
I think we're expecting 1.x UI work to consume only a global IOPS number, so option 3 (the existing gauge stats) seems the most expedient.
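For illustration, the division by mon_tick_interval mentioned above could look roughly like this. It is only a sketch: the function name and the flat dict shape are hypothetical, and it assumes the num_read/num_write values have already been pulled out of pg_stats_delta and that the tick interval is known in seconds.

```python
def iops_from_delta(delta, mon_tick_interval):
    """Turn per-tick op counts into ops per second.

    `delta` is a hypothetical flat dict holding the num_read/num_write
    values taken from pg_stats_delta; extracting them from the real
    'ceph pg dump summary' output is not shown here.
    """
    if mon_tick_interval <= 0:
        raise ValueError("mon_tick_interval must be positive")
    interval = float(mon_tick_interval)
    reads = delta.get("num_read", 0) / interval
    writes = delta.get("num_write", 0) / interval
    return {"read": reads, "write": writes, "total": reads + writes}


# Made-up numbers: 50 reads and 25 writes seen over a 5 second tick.
print(iops_from_delta({"num_read": 50, "num_write": 25}, 5))
```

This only ever gives the cluster-wide figure, which matches what option 3 offers.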
#2 Updated by Dan Mick almost 8 years ago
So I interviewed Sage, and I think the actual situation is that:
1) OSD-centric io counters are available from the perfcounters mechanism (op, r_op, w_op, rw_op for client
I/O, sop_* for replication I/O, rop for recovery I/O, three sets supposedly disjoint)
2) pool-centric io counters are available from the pgmap, summarized in pg dump pools so the load isn't
and both of those have been in Ceph for a long time.
The later additions have been rate estimates, which are called "deltas" (erroneously in my opinion),
and are, I think, not relevant to our collection in Calamari.
So I think we can do this when we want and not require Ceph backporting.
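As a rough sketch of what "doing this ourselves" from the counters might involve: take two samples of a monotonically increasing counter (e.g. the per-OSD op counter), difference them, and divide by the elapsed time. The function names and the sample dict shape below are hypothetical, and skipping OSDs whose counter went backwards (e.g. after a restart) is an assumption about how we'd want to handle resets, not something Ceph dictates.

```python
def counter_rate(prev, curr, prev_t, curr_t):
    """Ops/sec between two counter samples, or None if not computable."""
    dt = curr_t - prev_t
    if dt <= 0:
        return None
    diff = curr - prev
    if diff < 0:
        # Counter went backwards: assume it was reset, skip this interval.
        return None
    return diff / float(dt)


def cluster_iops(prev_sample, curr_sample, prev_t, curr_t, counter="op"):
    """Sum per-OSD rates into one cluster-wide number.

    Samples are hypothetical dicts of the form {"osd.0": {"op": 123}, ...}.
    """
    total = 0.0
    for osd_id, counters in curr_sample.items():
        if osd_id not in prev_sample:
            continue  # OSD appeared between samples; no baseline yet
        rate = counter_rate(prev_sample[osd_id][counter],
                            counters[counter], prev_t, curr_t)
        if rate is not None:
            total += rate
    return total


# Made-up samples taken 10 seconds apart; osd.1 looks like it restarted.
prev = {"osd.0": {"op": 100}, "osd.1": {"op": 200}}
curr = {"osd.0": {"op": 160}, "osd.1": {"op": 50}}
print(cluster_iops(prev, curr, 0, 10))
```

Whether this division happens in the collector or in graphite is a separate choice; the arithmetic is the same either way.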
#9 Updated by John Spray almost 8 years ago
Diamond bits are currently here: https://github.com/jcsp/Diamond/tree/cluster-stats