Bug #22226 (closed)

ceph zabbix plugin sends incorrect monitoring info to zabbix server

Added by Peter Hardon over 6 years ago. Updated about 6 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
zabbix module
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The zabbix plugin sends an incorrect cluster status to the zabbix server.

ceph -s shows the cluster in HEALTH_WARN state:
root@ocata-ceph-node4:~# ceph -s
  cluster:
    id:     c11fa860-98ee-4dd2-9243-1788f7ee2364
    health: HEALTH_WARN
            1 osds down
            1 host (1 osds) down
            Degraded data redundancy: 830/2490 objects degraded (33.333%), 44 pgs unclean, 44 pgs degraded, 44 pgs undersized
            too few PGs per OSD (22 < min 30)

  services:
    mon: 3 daemons, quorum ocata-ceph-node2,ocata-ceph-node1,ocata-ceph-node3
    mgr: ocata-ceph-node1(active), standbys: ocata-ceph-node3, ocata-ceph-node2
    mds: cephfs-1/1/1 up {0=ocata-ceph-node1=up:active}, 2 up:standby
    osd: 4 osds: 3 up, 4 in
    rgw: 1 daemon active

  data:
    pools:   7 pools, 44 pgs
    objects: 830 objects, 978 MB
    usage:   13036 MB used, 69503 MB / 82539 MB avail
    pgs:     830/2490 objects degraded (33.333%)
             44 active+undersized+degraded

From a tcpdump capture between the zabbix client and server: {"host":"ocata-ceph-node1","key":"ceph.overall_status","value":"HEALTH_OK"}
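
A capture like this can be reproduced on the mgr host by filtering on the Zabbix trapper port (10051, matching zabbix_port in the config shown below); the interface here is only an example:
root@ocata-ceph-node1:~# tcpdump -i any -A -s 0 'tcp port 10051'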

The issue disappears when the active mgr is restarted.
Running ceph 12.2.1-1~bpo90+1 on Debian Stretch.
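
For the workaround above, the active mgr can be bounced either through Ceph itself or via systemd (the daemon name follows the active mgr listed in ceph -s); either should make a fresh mgr pick up the current health:
root@ocata-ceph-node1:~# ceph mgr fail ocata-ceph-node1
root@ocata-ceph-node1:~# systemctl restart ceph-mgr@ocata-ceph-node1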

Traces attached.

zabbix plugin configuration:
root@ocata-ceph-node1:~# ceph zabbix config-show
{"zabbix_port": 10051, "zabbix_host": "192.168.10.9", "identifier": "ocata-ceph-node1", "zabbix_sender": "/usr/bin/zabbix_sender", "interval": 60}
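
As a cross-check, the same item key can be pushed by hand with the configured sender binary (values below taken from the config and cluster state above); if the Zabbix server then shows HEALTH_WARN, the stale HEALTH_OK is coming from the mgr module rather than the server side:
root@ocata-ceph-node1:~# /usr/bin/zabbix_sender -z 192.168.10.9 -p 10051 -s ocata-ceph-node1 -k ceph.overall_status -o HEALTH_WARN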


Files

zabbix_ceph.pcap (3.05 KB), Peter Hardon, 11/22/2017 03:17 PM