Feature #6886

BE: Cluster IOPS for 1.x

Added by John Spray almost 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Backend (services)
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Neil reckons Sage told him that we could get global (i.e. not per-pool) IOPS stats out of existing dumpling releases.

If that's so, then haul them into the 1.x big blobs o' data so that we can expose them to an IOPS dashboard widget. If it's not, think again.

History

#1 Updated by John Spray almost 8 years ago

As Dan pointed out in chat, Ceph has had these statistics for some time, although they are in the form of counters (num_read/num_write) rather than gauges.

Gauge versions of the cluster-wide statistics are already in dumpling (in 'ceph pg dump summary' -> pg_stats_delta; note that you have to divide by mon_tick_interval). The "new" per-pool stats are the deltas ("pg dump delta", "osd pool stats"), which are new in emperor.
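A minimal sketch of that division, in Python. The layout of the delta (a "stat_sum" dict holding num_read/num_write) and the 5 second tick are assumptions for illustration, not checked against a dumpling cluster; the function name is hypothetical.

```python
def cluster_iops_from_delta(pg_stats_delta, tick_interval_s=5.0):
    """Turn per-tick op counts into ops per second.

    pg_stats_delta: dict assumed to look like
        {"stat_sum": {"num_read": N, "num_write": M}}
    tick_interval_s: value of mon_tick_interval (5.0 is an assumed default).
    """
    if tick_interval_s <= 0:
        raise ValueError("tick_interval_s must be positive")
    stat_sum = pg_stats_delta.get("stat_sum", {})
    reads = stat_sum.get("num_read", 0)
    writes = stat_sum.get("num_write", 0)
    return {
        "read_iops": reads / tick_interval_s,
        "write_iops": writes / tick_interval_s,
        "total_iops": (reads + writes) / tick_interval_s,
    }


# Made-up sample: 150 reads and 50 writes seen during one 5 s tick.
sample = {"stat_sum": {"num_read": 150, "num_write": 50}}
rates = cluster_iops_from_delta(sample, tick_interval_s=5.0)
```

With the made-up sample this gives 30 read, 10 write and 40 total ops per second.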

We have a few options:
  1. Wait for backports of the new stats
  2. Use the existing counter stats (have to do some very rough division in Calamari or Graphite)
  3. Use the existing gauge stats (only gets you a cluster-wide number)

I think we're expecting 1.x UI work to only consume a global IOPS number, so option 3 seems the most expedient.
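For comparison, the "very rough division" that option 2 implies would look something like the sketch below: take two samples of a monotonically increasing counter (e.g. num_read) and divide the difference by the elapsed time. The function name and the (timestamp, value) sample shape are hypothetical, and treating a decreasing counter as a reset is an assumption about how one would want to handle restarts.

```python
def counter_rate(prev, curr):
    """Estimate ops/sec from two (timestamp_seconds, counter_value) samples.

    Returns None when no rate can be computed: the samples are not in
    time order, or the counter went backwards (assumed to mean a reset).
    """
    (t0, v0), (t1, v1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0 or v1 < v0:
        return None
    return (v1 - v0) / float(elapsed)


# Made-up samples: the counter moved from 1000 to 1500 over 10 seconds.
rate = counter_rate((100.0, 1000), (110.0, 1500))
# A counter that dropped between samples yields no rate.
after_reset = counter_rate((100.0, 1500), (110.0, 20))
```

The result is only as good as the sampling interval, which is why this is described as rough next to the gauge in option 3.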

#2 Updated by Dan Mick almost 8 years ago

So I interviewed Sage, and I think the actual situation is that:

1) OSD-centric I/O counters are available from the perfcounters mechanism (op, r_op, w_op, rw_op for client I/O, sop_* for replication I/O, rop for recovery I/O, three sets supposedly disjoint)

2) pool-centric I/O counters are available from the pgmap, summarized in pg dump pools so the load isn't crazy

and both of those have been in Ceph for a long time.

The later additions have been rate estimates, which are called "deltas" (erroneously in my opinion), and are, I think, not relevant to our collection in Calamari.

So I think we can do this when we want and not require Ceph backporting.
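If we went the OSD-centric route, a cluster-wide total would just be the sum of each counter over all OSDs. A rough sketch follows, assuming the per-OSD counters have already been collected into a dict. The counter names are the ones written above and may not match the exact perfcounter keys; the function name and the sample data are made up.

```python
def sum_osd_counters(per_osd, keys):
    """Sum the selected counters across all OSDs.

    per_osd: dict mapping an OSD name to its dict of counter values.
    keys: counter names to total; a counter missing on an OSD counts as 0.
    """
    totals = dict((key, 0) for key in keys)
    for counters in per_osd.values():
        for key in keys:
            totals[key] += counters.get(key, 0)
    return totals


# Made-up data for two OSDs, client read and write ops only.
per_osd = {
    "osd.0": {"r_op": 1200, "w_op": 300},
    "osd.1": {"r_op": 800, "w_op": 700},
}
totals = sum_osd_counters(per_osd, ("r_op", "w_op"))
```

The totals are still counters, so a rate needs a difference between two collections over time.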

#3 Updated by Neil Levine almost 8 years ago

  • Subject changed from Cluster IOPS for 1.x? to Cluster IOPS for 1.x

#4 Updated by Neil Levine almost 8 years ago

  • Status changed from New to 12

#5 Updated by Ian Colle almost 8 years ago

  • Target version changed from v1.1rc2 to v1.1rc3

#6 Updated by Neil Levine almost 8 years ago

  • Subject changed from Cluster IOPS for 1.x to BE: Cluster IOPS for 1.x

#7 Updated by Neil Levine almost 8 years ago

  • Status changed from 12 to In Progress

#8 Updated by Neil Levine almost 8 years ago

  • Story points set to 3.0

#9 Updated by John Spray almost 8 years ago

Diamond bits are currently here: https://github.com/jcsp/Diamond/tree/cluster-stats

#10 Updated by John Spray over 7 years ago

  • Status changed from In Progress to Resolved

Believe this is all done.
