Feature #6883

Graphite statistics names should be based on FSIDs, not hostnames

Added by John Spray almost 8 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Backend (graphite/diamond)
Target version:
% Done:

0%

Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Namespacing statistics under hostnames is bad because:

- Ceph services can potentially be relocated between hosts (e.g. unplug and move an OSD drive)
- Some statistics have no affinity to a particular host (cluster-wide statistics)

Service statistics (i.e. per mon, per OSD) should be like:
ceph.[fsid|cluster_name].[osd|mon|mds].[id|uuid]

Cluster statistics should be like:
ceph.[fsid|cluster_name].<stat>

The choice between fsid and cluster_name, and between id and uuid, should be configurable, depending on the preferences of the consumer. This helps non-Calamari users of the Ceph collector, who won't be using a frontend that knows how to map UUIDs.
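A rough sketch of how a collector might assemble these names. This is illustrative only and is not the code in the Calamari or Diamond collectors: the function names, the trailing `<stat>` on service paths, and the example stat names are all assumptions. The caller passes whichever identifier it is configured to use (FSID or cluster name, numeric id or UUID).

```python
SERVICE_TYPES = ("osd", "mon", "mds")


def _component(value):
    """Return value as a string, rejecting periods (Graphite's path separator)."""
    text = str(value)
    if not text or "." in text:
        raise ValueError("invalid path component: %r" % text)
    return text


def service_stat_path(cluster, service_type, service_id, stat):
    """Build ceph.<fsid|cluster_name>.<osd|mon|mds>.<id|uuid>.<stat>.

    Appending <stat> is an assumption; the ticket only gives the prefix.
    """
    if service_type not in SERVICE_TYPES:
        raise ValueError("unknown service type: %r" % service_type)
    return ".".join(
        ["ceph", _component(cluster), service_type, _component(service_id), stat]
    )


def cluster_stat_path(cluster, stat):
    """Build ceph.<fsid|cluster_name>.<stat> for cluster-wide statistics."""
    return ".".join(["ceph", _component(cluster), stat])


if __name__ == "__main__":
    # Placeholder FSID and hypothetical stat names, for illustration only.
    fsid = "00000000-0000-0000-0000-000000000000"
    print(service_stat_path(fsid, "osd", 3, "op_latency"))
    print(cluster_stat_path(fsid, "df.total_used"))
```

Neither path contains a hostname, so moving an OSD drive to another host leaves its statistics name unchanged.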

Associated revisions

Revision 1fd033af (diff)
Added by John Spray over 7 years ago

rest_api: Use FSID instead of name in graphite

Fixes: #6883

History

#1 Updated by John Spray over 7 years ago

NB in my current dev branch I'm adding comments like this:

# TODO: Change names to FSIDs for #6883

grep for those when doing this ticket

#2 Updated by John Spray over 7 years ago

  • Story points set to 3.0

#3 Updated by John Spray over 7 years ago

  • Assignee deleted (John Spray)

#4 Updated by John Spray over 7 years ago

  • Target version changed from v1.2 Backlog to v1.2-dev7

#5 Updated by Ian Colle over 7 years ago

  • Assignee set to John Spray

#6 Updated by John Spray over 7 years ago

Backend change is here: https://github.com/inktankstorage/calamari/tree/wip-6883

Yan - could you update the frontend stats paths please? These look like the places where we should use FSID instead of name:

grep -r ceph.cluster dashboard/ 2>/dev/null
dashboard//app/scripts/models/graphite-pool-iops-model.js:            return this.graphiteHost + '/metrics/find?query=ceph.cluster.' + name + '.pool.*';
dashboard//app/scripts/templates/graphite/PoolDiskFreeTarget.ejs:ceph.cluster.<%- clusterName %>.df.<%- metric %>
dashboard//app/scripts/templates/graphite/PoolIOPSTarget.ejs:ceph.cluster.<%- clusterName %>.pool.<%- id %>.<%- metric %>

(The win from making this change before releasing 1.2 is that we won't have to change these paths later by moving Whisper database files around.)
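For illustration, here is what the three paths found by the grep would look like with the FSID in place of the cluster name. This is a Python sketch of the string layout only, not the dashboard code (which is JavaScript and EJS); the function names are made up, and the metric names in the usage lines are placeholders.

```python
def pool_find_query(fsid):
    """Query listing all pools of one cluster (cf. graphite-pool-iops-model.js)."""
    return "ceph.cluster.%s.pool.*" % fsid


def pool_df_target(fsid, metric):
    """Disk-free target (cf. PoolDiskFreeTarget.ejs)."""
    return "ceph.cluster.%s.df.%s" % (fsid, metric)


def pool_iops_target(fsid, pool_id, metric):
    """Per-pool IOPS target (cf. PoolIOPSTarget.ejs)."""
    return "ceph.cluster.%s.pool.%s.%s" % (fsid, pool_id, metric)


if __name__ == "__main__":
    # Placeholder FSID; "some_metric" is a hypothetical metric name.
    fsid = "00000000-0000-0000-0000-000000000000"
    print(pool_find_query(fsid))
    print(pool_df_target(fsid, "some_metric"))
    print(pool_iops_target(fsid, 2, "some_metric"))
```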

#7 Updated by John Spray over 7 years ago

  • Assignee changed from John Spray to Yan-Fa Li

#8 Updated by Yan-Fa Li over 7 years ago

This is a pretty easy change. I have a branch ready: manage-fsid. While looking at this, though, I noticed another problem.

Server stats are stored under the short hostnames instead of the FQDN. To avoid issues in the future, I guess we should also switch graphite to always use the FQDN instead of the short name. What do you think?

#9 Updated by Yan-Fa Li over 7 years ago

  • Status changed from New to 4
  • Assignee changed from Yan-Fa Li to John Spray

#10 Updated by John Spray over 7 years ago

  • Assignee changed from John Spray to Yan-Fa Li

Sounds good to me: backend change is simple https://github.com/inktankstorage/calamari/pull/91

#12 Updated by John Spray over 7 years ago

  • Assignee changed from John Spray to Yan-Fa Li

oops, I meant to say: graphite replaces periods in fqdns with underscores (because it also uses periods as its own path separator)
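A minimal sketch of that substitution, so the frontend can derive the same name from an FQDN. The helper name is hypothetical and this is not taken from the pull request; it only shows the period-to-underscore mapping described above.

```python
def graphite_safe_hostname(fqdn):
    """Replace periods with underscores so the FQDN forms a single path component."""
    return fqdn.replace(".", "_")


if __name__ == "__main__":
    # Example hostname is a placeholder.
    print(graphite_safe_hostname("node1.example.com"))
```

Note that the mapping is not reversible in general: a hostname that already contains underscores cannot be told apart from one that had periods.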

#13 Updated by Yan-Fa Li over 7 years ago

  • Status changed from 4 to In Progress

OK, updated pull request. Let me know when you want to pull the trigger.

#14 Updated by John Spray over 7 years ago

  • Status changed from In Progress to Resolved

The code for this has all landed and is reasonably expected to work, though I haven't seen an end-to-end test yet.
