Fix #9701

salt-call ceph.get_heartbeats fails if radosgw is installed on a node in the cluster (e.g. mon)

Added by Orlando Mendes almost 7 years ago. Updated over 6 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Hello all,

I have discovered a small issue in Calamari (salt-minion, module ceph.py) when a RADOS gateway is deployed on one of the cluster nodes. Calamari (salt-minion) is unable to find the "mon" and the FSID of the cluster.

To reproduce this, I set up the cluster with:
1 node: mon + radosgw
5 nodes: osd

When I run the command "salt-call ceph.get_heartbeats" on the "mon" node (with radosgw), I get this output:

[ERROR ] An un-handled exception was caught by salt's global exception handler:
AttributeError: 'NoneType' object has no attribute 'groups'
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.6/site-packages/salt/scripts.py", line 82, in salt_call
    client.run()
  File "/usr/lib/python2.6/site-packages/salt/cli/__init__.py", line 319, in run
    caller.run()
  File "/usr/lib/python2.6/site-packages/salt/cli/caller.py", line 144, in run
    ret = self.call()
  File "/usr/lib/python2.6/site-packages/salt/cli/caller.py", line 81, in call
    ret['return'] = func(*args, **kwargs)
  File "/var/cache/salt/minion/extmods/modules/ceph.py", line 468, in get_heartbeats
    service_data = service_status(filename)
  File "/var/cache/salt/minion/extmods/modules/ceph.py", line 514, in service_status
    cluster_name, service_type, service_id = re.match("^(.+)-(mon|osd|mds)\.(.+)\.asok$", os.path.basename(socket_path)).groups()
AttributeError: 'NoneType' object has no attribute 'groups'
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.6/site-packages/salt/scripts.py", line 82, in salt_call
    client.run()
  File "/usr/lib/python2.6/site-packages/salt/cli/__init__.py", line 319, in run
    caller.run()
  File "/usr/lib/python2.6/site-packages/salt/cli/caller.py", line 144, in run
    ret = self.call()
  File "/usr/lib/python2.6/site-packages/salt/cli/caller.py", line 81, in call
    ret['return'] = func(*args, **kwargs)
  File "/var/cache/salt/minion/extmods/modules/ceph.py", line 468, in get_heartbeats
    service_data = service_status(filename)
  File "/var/cache/salt/minion/extmods/modules/ceph.py", line 514, in service_status
    cluster_name, service_type, service_id = re.match("^(.+)-(mon|osd|mds)\.(.+)\.asok$", os.path.basename(socket_path)).groups()
AttributeError: 'NoneType' object has no attribute 'groups'

I found that the "ceph.py" module checks for "*.asok" files in "/var/run/ceph". In our directory we have these files:

ceph-client.radosgw.gateway.asok
ceph-mon.fc-cont-ih-01.asok

Even though I followed the documentation for radosgw, I don't know whether the Ceph object gateway user is supposed to start with "client.radosgw", so I worked around the problem in the module "ceph.py" (/opt/calamari/salt/salt/_modules), line 514:

Original:

cluster_name, service_type, service_id = re.match("^(.+)-(mon|osd|mds)\.(.+)\.asok$", os.path.basename(socket_path)).groups()

Modified:

cluster_name, service_type, service_id = re.match("^(.+)-(mon|osd|mds|.*)\.(.+)\.asok$", os.path.basename(socket_path)).groups()

or

cluster_name, service_type, service_id = re.match("^(.+)-(mon|osd|mds|client)\.(.+)\.asok$", os.path.basename(socket_path)).groups()
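
The two modified patterns do not capture the same groups, so it may be worth comparing them before picking one. A small sketch under stated assumptions: the radosgw file name is the one from this report, while "ceph-mon.host-a.example.asok" is a made-up name used only to show how the wildcard variant behaves when a service ID contains both a hyphen and a dot:

```python
import re

# The two modified patterns from the workaround, written as raw strings.
WILDCARD_RE = re.compile(r"^(.+)-(mon|osd|mds|.*)\.(.+)\.asok$")
CLIENT_RE = re.compile(r"^(.+)-(mon|osd|mds|client)\.(.+)\.asok$")

rgw_name = "ceph-client.radosgw.gateway.asok"
wildcard_groups = WILDCARD_RE.match(rgw_name).groups()
client_groups = CLIENT_RE.match(rgw_name).groups()
print(wildcard_groups)  # ('ceph', 'client.radosgw', 'gateway')
print(client_groups)    # ('ceph', 'client', 'radosgw.gateway')

# Hypothetical monitor socket whose ID contains a hyphen and a dot.
tricky_name = "ceph-mon.host-a.example.asok"
tricky_wildcard = WILDCARD_RE.match(tricky_name).groups()
tricky_client = CLIENT_RE.match(tricky_name).groups()
print(tricky_wildcard)  # ('ceph-mon.host', 'a', 'example')
print(tricky_client)    # ('ceph', 'mon', 'host-a.example')
```

In this sketch the wildcard variant splits the hypothetical monitor name at the wrong hyphen, because the greedy first group combined with ".*" accepts any text as a service type. The explicit "client" alternative looks less surprising, although it reports the type as "client" and the ID as "radosgw.gateway".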

After restarting the service to rebuild the cache, I reran "salt-call ceph.get_heartbeats", which produced this output:

[INFO ] Executing command 'repoquery --queryformat="%{NAME}_|-%{VERSION}_|-%{RELEASE}_|-%{ARCH}_|-%{REPOID}" --all --pkgnarrow=installed' in directory '/root'
local:
----------
- boot_time:
1400063592
- ceph_version:
0.80.1-0.el6
- services:
----------
ceph-client.radosgw.gateway:
----------
cluster:
ceph
fsid:
144338aa-4d2e-497e-8d83-f36f669088dd
id:
gateway
status:
None
type:
client.radosgw
version:
0.80.1
ceph-mon.fc-cont-ih-01:
----------
cluster:
ceph
fsid:
144338aa-4d2e-497e-8d83-f36f669088dd
id:
fc-cont-ih-01
status:
----------
election_epoch:
1
extra_probe_peers:
monmap:
----------
created:
0.000000
epoch:
1
fsid:
144338aa-4d2e-497e-8d83-f36f669088dd
modified:
0.000000
mons:
----------
- addr:
20.2.2.5:6789/0
- name:
fc-cont-ih-01
- rank:
0
name:
fc-cont-ih-01
outside_quorum:
quorum:
- 0
rank:
0
state:
leader
sync_provider:
type:
mon
version:
0.80.1
----------
- 144338aa-4d2e-497e-8d83-f36f669088dd:
----------
fsid:
144338aa-4d2e-497e-8d83-f36f669088dd
name:
ceph
versions:
----------
config:
6a40a712ac406e1d3f44998e81cac37a
health:
ce97edb1c60b19366535de2a1b3e5724
mds_map:
1
mon_map:
1
mon_status:
1
osd_map:
124
pg_summary:
01cf8d7de4601ac0df80804438041401

I hope this will help somebody.

History

#1 Updated by Christina Meno over 6 years ago

  • Status changed from New to Duplicate
