Bug #48019
cephadm: `ceph daemon <daemon-name> ...` is broken (Closed)
Description
$ ceph daemon mgr.ceph03 config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
$ ceph --admin-daemon /var/run/ceph/mgr.ceph03 config show
No such file or directory
It seems like a bug that we are requiring users to enter a running container just to exercise a standard ceph feature.
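For reference, the workaround tracked in the related documentation ticket below is to enter the daemon's container, where the compiled-in socket path does resolve. A minimal sketch, assuming the daemon name from the report above and a root shell on the host running that daemon:

# cephadm enter --name mgr.ceph03
[ceph: root@host /]# ceph daemon mgr.ceph03 config show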
Updated by Nathan Cutler over 3 years ago
- Related to Documentation #45564: cephadm: document workaround for accessing the admin socket by entering running container added
Updated by Sebastian Wagner over 3 years ago
I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy
Updated by Nathan Cutler over 3 years ago
Sebastian Wagner wrote:
I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy
Does not seem to be the case:
master:~ # ceph -s
  cluster:
    id:     3d58c86e-2990-11eb-8e8e-525400cd9699
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node3,node2 (age 9h)
    mgr: node2.ankmgz(active, since 9h), standbys: node3.ddjhkx, node1.jptrux
OK, the active mgr's name is "mgr.node2.ankmgz". I ssh over to node2 to check it out.
node2:~ # ceph daemon mgr.node2.ankmgz config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
Nope. If I know how to find the asok file, and go through the trouble of doing so, then this does work:
node2:~ # ceph --admin-daemon /var/run/ceph/3d58c86e-2990-11eb-8e8e-525400cd9699/ceph-mgr.node2.ankmgz.asok config show
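A sketch of one way to locate the asok on the host without knowing the fsid up front, assuming `ceph fsid` can reach the cluster and that cephadm keeps the host-side sockets under /var/run/ceph/$fsid/ as shown above:

node2:~ # fsid=$(ceph fsid)
node2:~ # ls /var/run/ceph/$fsid/*.asok
/var/run/ceph/3d58c86e-2990-11eb-8e8e-525400cd9699/ceph-mgr.node2.ankmgz.asok
node2:~ # ceph --admin-daemon /var/run/ceph/$fsid/ceph-mgr.node2.ankmgz.asok config show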
Updated by Nathan Cutler over 3 years ago
Could it be that the admin_socket code has not been updated to reflect the fact that the asok files have moved?
This error seems to indicate it's still looking for the asok directly under /var/run/ceph:
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
Updated by Sebastian Wagner over 3 years ago
Hm. Seems that when calling

ceph daemon mgr.x.y

the ceph binary calls something like

ceph-conf --show-config-value admin_socket

whose value is determined by the default in options.cc. This means it won't find the admin socket, because the socket is at a different location on the host than within the container. Thus it is actually important that the admin socket is in the exact same location within the container and outside of the container.
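To illustrate (a sketch; the exact value depends on the local ceph.conf, and the daemon name is taken from the comment above):

node2:~ # ceph-conf --name mgr.node2.ankmgz --show-config-value admin_socket
/var/run/ceph/ceph-mgr.node2.ankmgz.asok

Note there is no $fsid component in the resolved path, so it does not exist on the host.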
So, the path that is set within the container, per options.cc, is:

/var/run/ceph/ceph-$daemon-name.asok

but on the host we're mounting it at:

/var/run/ceph/$fsid/ceph-$daemon-name.asok
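Roughly what happens when cephadm starts the daemon's container (a simplified sketch, not the exact podman/docker invocation cephadm generates):

podman run ... \
    -v /var/run/ceph/$fsid:/var/run/ceph:z \
    ...
# inside the container: /var/run/ceph/ceph-$daemon-name.asok        (matches the default)
# on the host:          /var/run/ceph/$fsid/ceph-$daemon-name.asok  (does not)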
That's a bummer.
Updated by Sebastian Wagner over 3 years ago
- Subject changed from cephadm: not possible to access the admin socket from outside container to cephadm: `ceph daemon <daemon-name> ...` is broken
Updated by Redouane Kachach Elhichou about 2 years ago
Closing because I can't reproduce the issue on the latest version of Ceph. If you think the issue still exists, please re-open and provide detailed instructions on how to reproduce it on the latest Ceph version.
[root@ceph-node-0 ~]# cephadm shell
Inferring fsid 35e99e04-aeab-11ec-a051-525400283b7b
Using recent ceph image registry.hub.docker.com/rkachach/ceph@sha256:565277a8bcb13d2e123d444c4e5433cd28b221ec48fb97d99e4fcaae2233cf38
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph --version
ceph version 17.0.0-10469-g29e1fc17 (29e1fc1722aa5915b44828a5ad02ec45ce760aa3) quincy (dev)
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph orch ps | grep mgr
mgr.ceph-node-0.scljdt  ceph-node-0  *:9283  running (19h)  4m ago  19h  501M  -  17.0.0-10469-g29e1fc17  8bc723507631  38b25a1d5b12
mgr.ceph-node-1.ndtirr  ceph-node-1  *:9283  running (19h)  4m ago  19h  426M  -  17.0.0-10469-g29e1fc17  8bc723507631  f153a85e7708
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph daemon mgr.ceph-node-0.scljdt config show | head -10
{
    "name": "mgr.ceph-node-0.scljdt",
    "cluster": "ceph",
    "admin_socket": "/var/run/ceph/ceph-mgr.ceph-node-0.scljdt.asok",
    "admin_socket_mode": "",
    "auth_allow_insecure_global_id_reclaim": "true",
    "auth_client_required": "cephx, none",
    "auth_cluster_required": "cephx",
    "auth_debug": "false",
    "auth_expose_insecure_global_id_reclaim": "true",
admin_socket: [Errno 32] Broken pipe
[ceph: root@ceph-node-0 /]#
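For completeness, recent cephadm can also run a single command from the host without keeping an interactive shell open (assuming the same daemon name as above):

[root@ceph-node-0 ~]# cephadm shell -- ceph daemon mgr.ceph-node-0.scljdt config show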
Updated by Redouane Kachach Elhichou about 2 years ago
- Status changed from New to Can't reproduce