Bug #43551
closed
Trying to enable the CEPH Telegraf module errors 'No such file or directory'
Description
I followed the steps at https://docs.ceph.com/docs/master/mgr/telegraf/ and enabled the telegraf module.
I then configured the Telegraf daemon to open a local listener port and ran the second command from the docs to set the address, which returned this error:
$ sudo ceph telegraf config-set address udp://:8094
Error EIO: Module 'telegraf' has experienced an error and cannot handle commands: (2, 'No such file or directory')
I also noticed that the cluster health degraded:
2020-01-09 22:02:25.317507 mon.ceph-mon0 [ERR] Health check failed: Module 'telegraf' has failed: (2, 'No such file or directory') (MGR_MODULE_ERROR)
Every telegraf-related command I ran after that returned the same error:
$ sudo ceph telegraf config-show
Error EIO: Module 'telegraf' has experienced an error and cannot handle commands: (2, 'No such file or directory')
The ceph-mgr log shows that it loads the modules, but the telegraf module eventually raises an exception:
2020-01-09 22:02:13.021 7fb3c44e7700 1 mgr load Constructed class from module: rbd_support
2020-01-09 22:02:13.021 7fb3c44e7700 1 mgr load Constructed class from module: restful
2020-01-09 22:02:13.021 7fb3c44e7700 1 mgr load Constructed class from module: status
2020-01-09 22:02:13.021 7fb3c44e7700 1 mgr load Constructed class from module: telegraf
2020-01-09 22:02:13.025 7fb3b94d1700 1 mgr[restful] server not running: no certificate configured
2020-01-09 22:02:13.025 7fb3c44e7700 1 mgr load Constructed class from module: volumes
2020-01-09 22:02:23.077 7fb3b84cf700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'telegraf' while running on mgr.ceph-mon0: (2, 'No such file or directory')
2020-01-09 22:02:23.077 7fb3b84cf700 -1 telegraf.serve:
2020-01-09 22:02:23.077 7fb3b84cf700 -1 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/telegraf/module.py", line 295, in serve
    self.send_to_telegraf()
  File "/usr/share/ceph/mgr/telegraf/module.py", line 243, in send_to_telegraf
    with sock as s:
  File "/usr/share/ceph/mgr/telegraf/basesocket.py", line 41, in __enter__
    self.connect()
  File "/usr/share/ceph/mgr/telegraf/basesocket.py", line 29, in connect
    return self.sock.connect(self.address)
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: (2, 'No such file or directory')
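For anyone trying to reproduce this outside of ceph-mgr, the snippet below is a minimal sketch of what the traceback seems to show: connecting a Unix domain socket to a path that does not exist fails with errno 2 (ENOENT). It is not the module's code. The helper name try_unix_connect, the datagram socket type, and the temporary path are assumptions made for illustration.

```python
import errno
import os
import socket
import tempfile


def try_unix_connect(path):
    """Try to connect a Unix datagram socket to `path`.

    Returns None on success, or the errno of the failure.
    (Hypothetical helper, not part of the Ceph telegraf module.)
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.connect(path)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


# A socket path inside a fresh, empty temporary directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "telegraf.sock")
result = try_unix_connect(missing)

# Compare against the errno seen in the traceback: (2, 'No such file or directory').
print(result == errno.ENOENT)
```

Running the same helper as the user that ceph-mgr runs as, against the real socket path, might help tell apart "the path is not visible to this process" from "nothing is listening on it".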
I tried creating the Telegraf Unix socket at the default location, /tmp/telegraf.sock, but the error persisted.
What finally worked was enabling the module and then setting the config address immediately afterwards, in quick succession:
sudo ceph mgr module enable telegraf
sudo ceph telegraf config-set address udp://:8094
There appear to be two problems:
1: Even if the socket file exists, the telegraf module is not able to see or read it.
2: If the module cannot find the Unix socket file, that should not break other commands that try to set the module's options.
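To illustrate the second point, here is a rough sketch of how a send step could treat a missing or unreachable socket as a recoverable condition instead of letting the exception escape. This is not the actual Ceph code or its eventual fix. The function name send_once, the datagram socket type, and the example path are hypothetical.

```python
import errno
import socket


def send_once(address, payload):
    """Send one datagram to a Unix socket at `address`.

    Returns True on success and False if the socket is missing or nobody
    is listening. Other errors are re-raised.
    (Hypothetical helper, not part of the Ceph telegraf module.)
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.connect(address)
        sock.send(payload)
        return True
    except OSError as exc:
        # Treat "no such file" and "connection refused" as temporary,
        # so the caller can log a warning and retry on the next interval.
        if exc.errno in (errno.ENOENT, errno.ECONNREFUSED):
            return False
        raise
    finally:
        sock.close()


# Example with a path assumed not to exist: returns False instead of raising.
ok = send_once("/nonexistent-dir/telegraf.sock", b"ceph_example value=1\n")
print(ok)
```

With something along these lines, a failed send would only produce a warning, and commands such as config-set and config-show could presumably still be used to correct the address.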
My Ceph version:
ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
The cluster has 3 mgr/mon nodes and 5 OSD nodes.