Bug #8548

UI: IOPS do not appear to be detected.

Added by Warren Usui over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Category: Backend (graphite/diamond)
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Occurs on both RHEL 6.4 and CentOS 6.4. On the VM where I installed ceph, I ran the following command:

sudo rados bench -p rbd --show-time 100 write

The IOPs part of the dashboard still says 0. I ran that rados operation on every osd that I could find.

Associated revisions

Revision 4cdcba72
Added by Dan Mick about 7 years ago

Merge pull request #152 from ceph/wip-fix-diamond-requisites

fix files to properly require the package
Believed to fix 8548

References: #8548
Reviewed-by: Dan Mick <>

History

#1 Updated by Yan-Fa Li over 7 years ago

  • Category set to Backend (graphite/diamond)
  • Status changed from New to Need More Info
  • Assignee set to John Spray
  • Target version set to v1.2-dev11
  • Source changed from other to Q/A

So this I'm not so sure about. The UI is polling the backend at regular intervals, but it's taking info from graphite. If graphite isn't registering the data, there's very little the UI can do. I'll assign this to John so he can give you some more detailed analysis.

#2 Updated by John Spray over 7 years ago

  • Assignee changed from John Spray to Warren Usui

Things to check:

  • As with any issue: are there any errors or backtraces in any of the logs in /var/log/calamari?
  • Is diamond running on all the mon servers, and are there any errors in /var/log/diamond?
  • Run "salt-call state.highstate" from an SSH terminal on one of the mon servers. Are any errors reported?
  • Is the version of diamond installed on the mon servers correct? It should be the inktank-built one.
  • Open the graphite dashboard in your browser at /graphite/dashboard/ on the Calamari server. Navigate to ceph.cluster.<your fsid>.pool.all.num_read -- is that statistic present? When you click it to get a chart in the graphite dashboard, is there any data?
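The last check above can also be scripted. The following is a minimal sketch, assuming that Graphite's render API is reachable under /graphite/render/ on the Calamari server (next to the dashboard path mentioned above) and that it returns the usual JSON list of series with `datapoints` pairs. The function names `render_url` and `has_data` are hypothetical helpers, not part of Calamari.

```python
import json
from urllib.parse import urlencode


def render_url(base_url, fsid, minutes=10):
    """Build a Graphite render-API URL for the pool-wide read counter."""
    # Assumption: the render endpoint sits under /graphite/ beside the dashboard.
    target = "ceph.cluster.{}.pool.all.num_read".format(fsid)
    query = urlencode({
        "target": target,
        "from": "-{}min".format(minutes),
        "format": "json",
    })
    return "{}/graphite/render/?{}".format(base_url.rstrip("/"), query)


def has_data(render_json):
    """Return True if any series in a render-API JSON reply has a non-null value."""
    series = json.loads(render_json)
    # Each series looks like {"target": "...", "datapoints": [[value, timestamp], ...]}.
    return any(
        value is not None
        for entry in series
        for value, _timestamp in entry.get("datapoints", [])
    )
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `has_data` gives a quick yes/no on whether graphite has recorded anything for that statistic. An empty list or all-null values would point at the collection side rather than the UI.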

#3 Updated by Warren Usui over 7 years ago

I've recreated this, but I think that the problem is with Ceph itself. I can't get rados puts to save objects. Then again, my ceph cluster ain't healthy either. I'm still messing with this.

#4 Updated by Warren Usui about 7 years ago

Latest version -- Diamond does not appear to be running. I brought up 1 Calamari Server, 1 mon server, and 3 osd servers on rhel7. Ran sudo rados -p data bench 300 write

sudo service --status-all on all OSDs shows:

diamond.service - LSB: System statistics collector for Graphite.
   Loaded: loaded (/etc/rc.d/init.d/diamond)
   Active: inactive (dead)

#5 Updated by Warren Usui about 7 years ago

Further details:

I reinstalled RHEL 7 from scratch on a 5 node cluster.

I finished running ceph-deploy calamari connect on all the sites.

The web page was at the ADD prompt and listed the mon site and all the osd's.

sudo service --status-all before the ADD prompt was hit showed:

=== osd.0 === 
osd.0: running {"version":"0.81"}
netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
Scheduled yum updates are disabled.

sudo service --status-all after the ADD prompt was hit (and the hosts came up in
http://vpm050.front.sepia.ceph.com/manage/#/first) was:

=== osd.0 === 
osd.0: running {"version":"0.81"}
diamond.service - LSB: System statistics collector for Graphite.
   Loaded: loaded (/etc/rc.d/init.d/diamond)
   Active: inactive (dead)

netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
Scheduled yum updates are disabled.

So it looks like diamond is inactive after the hosts are added to Calamari.
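The before/after comparison above can be automated across hosts. This is a rough sketch that assumes the `service --status-all` output has the layout pasted in this comment: a unit header line followed by indented `Loaded:` and `Active:` lines. The helper name `unit_active_state` is made up for illustration.

```python
import re


def unit_active_state(status_text, unit="diamond.service"):
    """Return the text after 'Active:' for `unit`, or None if the unit is not listed."""
    lines = status_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith(unit):
            # The Active: line is expected inside the indented block under the header.
            for follow in lines[index + 1:]:
                if not follow.startswith((" ", "\t")):
                    break  # left the indented block for this unit
                match = re.match(r"\s*Active:\s*(.+)", follow)
                if match:
                    return match.group(1).strip()
    return None
```

Run against the two outputs above, this would return None for the capture taken before ADD (diamond is not listed at all) and the `Active:` text for the capture taken after it, which makes the "installed but never started" case easy to spot on each node.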

#6 Updated by Warren Usui about 7 years ago

Further details:

I reinstalled RHEL 7 from scratch on a 5 node cluster.

I finished running ceph-deploy calamari connect on all the sites.

The web page was at the ADD prompt and listed the mon site and all the osd's.

sudo service --status-all before the ADD prompt was hit showed:

=== osd.0 === 
osd.0: running {"version":"0.81"}
netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
Scheduled yum updates are disabled.

sudo service --status-all after the ADD prompt was hit (and the hosts came up in
http://vpm050.front.sepia.ceph.com/manage/#/first)

=== osd.0 === 
osd.0: running {"version":"0.81"}
diamond.service - LSB: System statistics collector for Graphite.
   Loaded: loaded (/etc/rc.d/init.d/diamond)
   Active: inactive (dead)

netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
Scheduled yum updates are disabled.

So it looks like diamond is inactive after the hosts are added to Calamari.

#7 Updated by Warren Usui about 7 years ago

  • Status changed from Need More Info to In Progress

See the above comments. It looks like Diamond is not starting correctly after the page add prompt is hit.

#8 Updated by Warren Usui about 7 years ago

  • Assignee changed from Warren Usui to Christina Meno

#9 Updated by Warren Usui about 7 years ago

Note: Greg, I assigned this to you because I heard that you were working on some diamond startup issues.

#10 Updated by Warren Usui about 7 years ago

vpm050 locked wusui@aardvark "Rhel 7 Server"
vpm063 locked wusui@aardvark "Rhel 7 Ceph Monitor"
vpm072 locked wusui@aardvark "Rhel 7 OSD"
vpm074 locked wusui@aardvark "Rhel 7 OSD"
vpm114 locked wusui@aardvark "Rhel 7 OSD"

These machines are the aftermath of this testing. I am leaving them up in case someone wants to look at them.

#11 Updated by Dan Mick about 7 years ago

Indeed diamond isn't running; there are these warnings in /var/log/salt/minion:

2014-06-30 20:14:22,489 [salt.state ][ERROR ] Parent directory not present

I don't know for sure what they mean, but it could be resulting from the missing require that Gregory's fix addresses. Server packages are rebuilding now; I suggest retesting with the fix in place.
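To collect these entries from /var/log/salt/minion on several nodes, something like the following could be used. It is a sketch only: the line layout is inferred from the single log line quoted above, and `salt_errors` is a hypothetical helper name, not a Salt API.

```python
import re

# Assumed layout, based on the one line quoted above:
#   <date> <time>,<ms> [<logger> ][<LEVEL> ] <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+)\s+"
    r"\[(?P<logger>[^\]]+)\]\[(?P<level>[^\]]+)\]\s*(?P<message>.*)$"
)


def salt_errors(log_text):
    """Return (timestamp, logger, message) for every ERROR-level line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        # Logger and level fields are padded with spaces, so strip before comparing.
        if match and match.group("level").strip() == "ERROR":
            errors.append((
                match.group("timestamp"),
                match.group("logger").strip(),
                match.group("message"),
            ))
    return errors
```

Lines that do not match the assumed layout (continuation lines, tracebacks) are skipped, so the result is a summary of state errors and not a complete picture of the log.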

#12 Updated by Christina Meno about 7 years ago

  • Assignee changed from Christina Meno to Warren Usui

#13 Updated by Dan Mick about 7 years ago

  • Status changed from In Progress to 7

#14 Updated by Warren Usui about 7 years ago

  • Status changed from 7 to Resolved

Appears to be working now.

#15 Updated by Warren Usui about 7 years ago

  • Status changed from Resolved to 4
  • Assignee changed from Warren Usui to Christina Meno

Diamond does not seem to come up in 6.4. Is this expected right now?

#16 Updated by Dan Mick about 7 years ago

No, the build was made with a branch that didn't contain the fix for this bug.

#17 Updated by Dan Mick about 7 years ago

  • Status changed from 4 to Resolved
