Feature #6168

UI: Graphs: Add view mode selector for host graphs.

Added by Neil Levine over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Normal
Assignee: Neil Levine
Category: -
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Currently we are showing CPU graphs for all nodes in a cluster. We should add a view mode selector which allows the admin to choose among CPU, memory, IO, and processes.

The selector should indicate which view options are inactive.

cpu-day.png (19.9 KB) Neil Levine, 09/05/2013 04:03 PM

History

#1 Updated by Yan-Fa Li over 10 years ago

Take a look at the targets on mira022 and pick the graphs you want from the graphite tool and paste them back into this bug.

You can discover the target names by picking the graphs, then right-clicking on the PNG, copying the URL, and pasting it into an editor.

#2 Updated by Neil Levine over 10 years ago

Yan-Fa Li wrote:

Take a look at the targets on mira022 and pick the graphs you want from the graphite tool and paste them back into this bug.

Not sure what you mean by targets? Also don't know how to get to the graphite tool now that it is not linked from the Graph button.

#3 Updated by Yan-Fa Li over 10 years ago

http://mira022.front.sepia.ceph.com:8080/ -> Graphite

Targets are like this:

http://mira022.front.sepia.ceph.com:8080/render/?width=586&height=308&_salt=1377843192.539&target=servers.mira045.memory.MemFree

See the target parameter in the path? I need those to generate graphs. Pick the ones you like using the UI and copy the URL to the bug. Thanks
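To illustrate what "target" means here, the following is a minimal Python sketch that pulls the target value(s) out of a copied render URL. The helper name `extract_targets` is hypothetical; the URL is the one quoted above.

```python
from typing import List
from urllib.parse import urlparse, parse_qs


def extract_targets(render_url: str) -> List[str]:
    """Return every `target` query parameter found in a Graphite render URL."""
    query = urlparse(render_url).query
    # parse_qs collects repeated parameters into a list, so a composite
    # graph with several targets comes back as several entries.
    return parse_qs(query).get("target", [])


url = (
    "http://mira022.front.sepia.ceph.com:8080/render/"
    "?width=586&height=308&_salt=1377843192.539"
    "&target=servers.mira045.memory.MemFree"
)
print(extract_targets(url))
```

A URL copied from a graph with several series would yield one list entry per series.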

#4 Updated by Yan-Fa Li over 10 years ago

Looking at the graphite data, IO is going to be problematic. First, IO is collected per device, not system-wide. Second, diskspace is collected under an unusual file system path name, so I'm not sure how to go about discovering that info automatically without some assistance on the server side.

There does appear to be a search API on the graphite browser but I'm not sure how useful it is for consumption by clients rather than people.

#5 Updated by Yan-Fa Li over 10 years ago

  • Story points set to 3.00

#6 Updated by Neil Levine over 10 years ago

  • Subject changed from UI: Graphs: Add view mode selector for cluster graphs. to UI: Graphs: Add view mode selector for host graphs.

#7 Updated by Neil Levine over 10 years ago

CPU

For each CPU, a composite of:

<host>.cpu.<cpu_unit>.system
<host>.cpu.<cpu_unit>.user
<host>.cpu.<cpu_unit>.nice
<host>.cpu.<cpu_unit>.idle
<host>.cpu.<cpu_unit>.iowait
<host>.cpu.<cpu_unit>.irq
<host>.cpu.<cpu_unit>.softirq
<host>.cpu.<cpu_unit>.steal

Diskspace

For each disk, a composite of:

<host>.diskspace.<device>.byte_avail
<host>.diskspace.<device>.byte_free
<host>.diskspace.<device>.byte_used

and a composite of:

<host>.diskspace.<device>.inodes_avail
<host>.diskspace.<device>.inodes_free
<host>.diskspace.<device>.inodes_used

IO

One graph for each device for:

<host>.iostat.<device>.iops

and a composite graph per disk for:
<host>.iostat.<device>.read_byte_per_second
<host>.iostat.<device>.write_byte_per_second

and a composite graph per disk for:
<host>.iostat.<device>.read_await
<host>.iostat.<device>.write_await

Load

Composite of:

<host>.loadavg.01
<host>.loadavg.5
<host>.loadavg.15

Memory

Composite of:
<host>.memory.memfree
<host>.memory.memtotal

and composite of:
<host>.memory.swapfree
<host>.memory.swaptotal

Stepping back a bit: this is a long list, but it is the minimum set of graphs that ops people seem to look at. If it is not easy to put something in place to aggregate these graphs, let's discuss. It may push the task of using a drop-in third-party app like Tasseo further up the list.
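As a rough illustration of what a "composite" means in practice, the sketch below builds one render URL carrying several target parameters. It assumes the `servers.` prefix and the host and port from the URL in comment #3; the function name `composite_url` is hypothetical, and the exact metric capitalization on the server may differ from the names listed above.

```python
from urllib.parse import urlencode

# Assumption: same Graphite host and port as the URL quoted in comment #3.
GRAPHITE_BASE = "http://mira022.front.sepia.ceph.com:8080"


def composite_url(host, metrics, width=586, height=308):
    """Build a render URL that draws all `metrics` for `host` on one graph."""
    # Assumption: metrics live under the "servers." prefix seen in comment #3.
    targets = ["servers.{}.{}".format(host, metric) for metric in metrics]
    # doseq=True emits one target=... pair per list entry.
    query = urlencode(
        {"width": width, "height": height, "target": targets}, doseq=True
    )
    return "{}/render/?{}".format(GRAPHITE_BASE, query)


memory_url = composite_url("mira045", ["memory.memfree", "memory.memtotal"])
print(memory_url)
```

Each composite in the list would be one such call with a different metric list.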

#8 Updated by Yan-Fa Li over 10 years ago

There are some practical issues with this.

<host>.cpu.<cpu_unit>.system
<host>.cpu.<cpu_unit>.user
<host>.cpu.<cpu_unit>.nice
<host>.cpu.<cpu_unit>.idle
<host>.cpu.<cpu_unit>.iowait
<host>.cpu.<cpu_unit>.irq
<host>.cpu.<cpu_unit>.softirq
<host>.cpu.<cpu_unit>.steal

1. That's a lot of lines for one graph. The legends alone will take up a lot of space.
2. I don't know how many CPUs there are on a host, and I'm not sure there's an easy way for me to find out without some help from the server, short of scraping the graphite search APIs. Same problem with IO: I don't have a list of devices.
3. Per-CPU graphs are generally not that useful on an SMP system, because you have no way to tie the load to a specific process or driver over time. The kernel is free to load-balance processes and interrupts, and for higher performance sysadmins sometimes tie device interrupts to specific CPUs.

Maybe Noah has a good idea on how to approach doing automatic discovery using the built in graphite APIs.
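One possible approach to discovery, sketched in Python under assumptions: Graphite's web app exposes a `/metrics/find` endpoint that takes a glob-style `query`. The response below is a hand-written sample rather than output captured from mira022, and the node names (`cpu0`, `cpu1`) and field layout are illustrative assumptions. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def find_url(base, pattern):
    """Build a metrics-find URL for a glob pattern such as 'servers.host.cpu.*'."""
    return "{}/metrics/find?{}".format(
        base, urlencode({"query": pattern, "format": "treejson"})
    )


def branch_names(payload):
    """Return the names of non-leaf children (e.g. CPU units or devices)."""
    nodes = json.loads(payload)
    return sorted(node["text"] for node in nodes if not node.get("leaf"))


# Hand-written sample response (assumed shape, not real server output).
sample_response = json.dumps([
    {"text": "cpu0", "id": "servers.mira045.cpu.cpu0", "leaf": 0},
    {"text": "cpu1", "id": "servers.mira045.cpu.cpu1", "leaf": 0},
])

print(find_url("http://mira022.front.sepia.ceph.com:8080", "servers.mira045.cpu.*"))
print(branch_names(sample_response))
```

The same lookup with an `iostat.*` or `diskspace.*` pattern would, under the same assumptions, give the device list needed for the IO and diskspace graphs.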

#9 Updated by Neil Levine over 10 years ago

Re: CPU graph - look at the attached graph from Munin... can Graphite render in a similar way? But I take your point about SMP. We can drop this for the time being (especially if we show load).

Wrt discovering the number of disks: I'll leave you to work this out with Noah & Dan. It's probably the most important thing, so again it may be pointing us toward using an embedded Graphite console which just displays everything and allows the user to composite their own graphs.

#10 Updated by Neil Levine over 10 years ago

  • Target version set to v.14

#11 Updated by Yan-Fa Li over 10 years ago

OK, Noah and I have come up with a way to get the info out of Graphite. The problem is that we need to change the graphite install to allow cross-origin requests, because the JS side can't make requests to a different port number.

http://enable-cors.org/server.html

If we add that header to graphite, I should be able to solve the problem in JS.
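For illustration only, here is a minimal Python sketch of adding that header at the application layer. It assumes Graphite's web app is served as a WSGI application; setting the header in the web server configuration, as the linked page describes, is the other route. `allow_cross_origin`, `demo_app`, and `record` are hypothetical names, and `demo_app` merely stands in for the real application.

```python
def allow_cross_origin(app, origin="*"):
    """Wrap a WSGI app so every response carries Access-Control-Allow-Origin."""
    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is not duplicated.
            headers = [
                (name, value) for name, value in headers
                if name.lower() != "access-control-allow-origin"
            ]
            headers.append(("Access-Control-Allow-Origin", origin))
            return start_response(status, headers, exc_info)
        return app(environ, cors_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for the real application (assumption for this sketch)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


captured = {}


def record(status, headers, exc_info=None):
    """Fake start_response that records what the wrapped app sent."""
    captured["status"] = status
    captured["headers"] = headers


body = allow_cross_origin(demo_app)({}, record)
print(captured["headers"])
```

A wildcard origin is the simplest setting; restricting `origin` to the UI's own host and port would be the tighter choice.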

#12 Updated by Yan-Fa Li over 10 years ago

  • Assignee changed from Yan-Fa Li to Noah Watkins

#14 Updated by Yan-Fa Li over 10 years ago

OK, basic graphing frontend support for iostats, diskspace and cpu metrics is done. Next up: integrating them into the UI.

#15 Updated by Yan-Fa Li over 10 years ago

  • Status changed from 12 to In Progress
  • Assignee changed from Noah Watkins to Yan-Fa Li

#16 Updated by Neil Levine over 10 years ago

We should add traffic in and out of each network card to the list of targets we need to display.

I don't see any ethernet device targets in Graphite however, which is why I didn't have them in the list above.

#17 Updated by Yan-Fa Li over 10 years ago

Neil Levine wrote:

We should add traffic in and out of each network card to the list of targets we need to display.

I don't see any ethernet device targets in Graphite however, which is why I didn't have them in the list above.

We're going to need someone to configure graphite. This is made all the more peculiar on Linux systems by the change to hardware-based interface names that systemd made. Legacy boxes use eth(0-n) and newer systems use funky names like enp[0-9]s[0-9], so they are not automatically discoverable without using a tool like 'ip addr'. ifconfig has been deprecated on some Linux distros in favor of ip, though it'll be some years before it's totally gone.
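A small Python sketch of telling the two naming styles apart, assuming only the two patterns mentioned above; other naming forms exist and simply fall into "other" here. The function name `classify_interface` is hypothetical.

```python
import re

# Only the two styles mentioned in this comment are covered.
LEGACY_NAME = re.compile(r"^eth\d+$")
PCI_SLOT_NAME = re.compile(r"^enp\d+s\d+$")


def classify_interface(name):
    """Label an interface name as 'legacy', 'predictable', or 'other'."""
    if LEGACY_NAME.match(name):
        return "legacy"
    if PCI_SLOT_NAME.match(name):
        return "predictable"
    return "other"


for candidate in ("eth0", "enp3s0", "lo"):
    print(candidate, classify_interface(candidate))
```

Pattern matching like this only classifies names that are already known; it does not replace asking the host (or Graphite) which interfaces exist.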

#18 Updated by Yan-Fa Li over 10 years ago

  • Status changed from In Progress to 12
  • Assignee changed from Yan-Fa Li to Noah Watkins

Neil Levine wrote:

Re: CPU graph - look at the attached graph from Munin... can Graphite render in a similar way? But I take your point about SMP. We can drop this for the time being (especially if we show load).

Wrt discovering the number of disks: I'll leave you to work this out with Noah & Dan. It's probably the most important thing, so again it may be pointing us toward using an embedded Graphite console which just displays everything and allows the user to composite their own graphs.

I don't think the graphite png renderer can do anything this sophisticated. We'd have to use a 3rd party JS plugin like Flotr2 and customize the view for each kind of graph.

#19 Updated by Yan-Fa Li over 10 years ago

  • Assignee changed from Noah Watkins to Neil Levine

Dan figured out how to get Graphite to log network stats. Neil, can you take a look at the stats Graphite collects and update this bug with the metrics you want?

#20 Updated by Yan-Fa Li over 10 years ago

  • Status changed from 12 to Need More Info

#21 Updated by Neil Levine over 10 years ago

  • Status changed from Need More Info to 12
  • Assignee changed from Neil Levine to Yan-Fa Li

Going insane. I am sure I updated this ticket yesterday with the targets. Grr. Obviously didn't hit submit.

Composite:
<host>.network.<device>.rx_byte
<host>.network.<device>.tx_byte

Composite:
<host>.network.<device>.rx_packets
<host>.network.<device>.tx_packets

Composite:
<host>.network.<device>.rx_drop
<host>.network.<device>.tx_drop

Composite:
<host>.network.<device>.rx_errors
<host>.network.<device>.tx_errors
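The four composites above can be written down as a small table and expanded per host and device. This is a Python sketch; the dictionary labels and the function name `network_targets` are hypothetical, and the host and device values in the example are placeholders.

```python
# One entry per composite graph requested above: label -> (rx counter, tx counter).
NETWORK_COMPOSITES = {
    "bytes": ("rx_byte", "tx_byte"),
    "packets": ("rx_packets", "tx_packets"),
    "drops": ("rx_drop", "tx_drop"),
    "errors": ("rx_errors", "tx_errors"),
}


def network_targets(host, device):
    """Expand the <host>.network.<device>.<counter> templates for one device."""
    return {
        label: ["{}.network.{}.{}".format(host, device, counter)
                for counter in counters]
        for label, counters in NETWORK_COMPOSITES.items()
    }


for label, targets in network_targets("mira045", "eth0").items():
    print(label, targets)
```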

#22 Updated by Yan-Fa Li over 10 years ago

  • Status changed from 12 to Resolved
  • Assignee changed from Yan-Fa Li to Neil Levine

Should be on mira022
