Bug #48019
cephadm: `ceph daemon <daemon-name> ...` is broken (Closed)
Description
$ ceph daemon mgr.ceph03 config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
$ ceph --admin-daemon /var/run/ceph/mgr.ceph03 config show
No such file or directory
It seems like a bug that we are requiring users to enter a running container just to exercise a standard ceph feature.
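For reference, the workaround tracked in the related documentation ticket below is to enter the daemon's container, where the compiled-in socket path does resolve. A minimal sketch, assuming the daemon name from the report above and a root shell on the host running that daemon:

# cephadm enter --name mgr.ceph03
[ceph: root@host /]# ceph daemon mgr.ceph03 config show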
Updated by Nathan Cutler over 3 years ago
- Related to Documentation #45564: cephadm: document workaround for accessing the admin socket by entering running container added
Updated by Sebastian Wagner over 3 years ago
I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy
Updated by Nathan Cutler over 3 years ago
Sebastian Wagner wrote:
I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy
Does not seem to be the case:
master:~ # ceph -s
  cluster:
    id:     3d58c86e-2990-11eb-8e8e-525400cd9699
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node3,node2 (age 9h)
    mgr: node2.ankmgz(active, since 9h), standbys: node3.ddjhkx, node1.jptrux
OK, the active mgr's name is "mgr.node2.ankmgz". I ssh over to node2 to check it out.
node2:~ # ceph daemon mgr.node2.ankmgz config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
Nope. If I know how to find the asok file, and go through the trouble of doing so, then this does work:
node2:~ # ceph --admin-daemon /var/run/ceph/3d58c86e-2990-11eb-8e8e-525400cd9699/ceph-mgr.node2.ankmgz.asok config show
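A sketch of one way to locate the asok on the host without knowing the fsid up front, assuming `ceph fsid` can reach the cluster and that cephadm keeps the host-side sockets under /var/run/ceph/$fsid/ as shown above:

node2:~ # fsid=$(ceph fsid)
node2:~ # ls /var/run/ceph/$fsid/*.asok
/var/run/ceph/3d58c86e-2990-11eb-8e8e-525400cd9699/ceph-mgr.node2.ankmgz.asok
node2:~ # ceph --admin-daemon /var/run/ceph/$fsid/ceph-mgr.node2.ankmgz.asok config show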
Updated by Nathan Cutler over 3 years ago
Could it be that the admin_socket code has not been updated to reflect the fact that the asok files have moved?
This error seems to indicate it's still looking for the asok directly under /var/run/ceph:
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
Updated by Sebastian Wagner over 3 years ago
Hm. Seems that when calling

ceph daemon mgr.x.y

the ceph binary calls something like

ceph-conf --show-config-value admin_socket

whose value is determined by the default in options.cc. This means it won't find the admin socket, because the socket is at a different location on the host than within the container. Thus it is actually important that the admin socket is in the exact same location within the container and outside of the container.
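To illustrate (a sketch; the exact value depends on the local ceph.conf, and the daemon name is taken from the comment above):

node2:~ # ceph-conf --name mgr.node2.ankmgz --show-config-value admin_socket
/var/run/ceph/ceph-mgr.node2.ankmgz.asok

Note there is no $fsid component in the resolved path, so it does not exist on the host.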
So, the path that is set within the container, per options.cc, is:

/var/run/ceph/ceph-$daemon-name.asok

but on the host we're mounting it at:

/var/run/ceph/$fsid/ceph-$daemon-name.asok
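Roughly what happens when cephadm starts the daemon's container (a simplified sketch, not the exact podman/docker invocation cephadm generates):

podman run ... \
    -v /var/run/ceph/$fsid:/var/run/ceph:z \
    ...
# inside the container: /var/run/ceph/ceph-$daemon-name.asok        (matches the default)
# on the host:          /var/run/ceph/$fsid/ceph-$daemon-name.asok  (does not)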
That's a bummer.
Updated by Sebastian Wagner over 3 years ago
- Subject changed from cephadm: not possible to access the admin socket from outside container to cephadm: `ceph daemon <daemon-name> ...` is broken
Updated by Redouane Kachach Elhichou about 2 years ago
Closing because I can't reproduce the issue on the latest version of Ceph. If you think the issue still exists, please re-open and provide detailed instructions on how to reproduce it on the latest Ceph version.
[root@ceph-node-0 ~]# cephadm shell
Inferring fsid 35e99e04-aeab-11ec-a051-525400283b7b
Using recent ceph image registry.hub.docker.com/rkachach/ceph@sha256:565277a8bcb13d2e123d444c4e5433cd28b221ec48fb97d99e4fcaae2233cf38
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph --version
ceph version 17.0.0-10469-g29e1fc17 (29e1fc1722aa5915b44828a5ad02ec45ce760aa3) quincy (dev)
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph orch ps | grep mgr
mgr.ceph-node-0.scljdt  ceph-node-0  *:9283  running (19h)  4m ago  19h  501M  -  17.0.0-10469-g29e1fc17  8bc723507631  38b25a1d5b12
mgr.ceph-node-1.ndtirr  ceph-node-1  *:9283  running (19h)  4m ago  19h  426M  -  17.0.0-10469-g29e1fc17  8bc723507631  f153a85e7708
[ceph: root@ceph-node-0 /]#
[ceph: root@ceph-node-0 /]# ceph daemon mgr.ceph-node-0.scljdt config show | head -10
{
    "name": "mgr.ceph-node-0.scljdt",
    "cluster": "ceph",
    "admin_socket": "/var/run/ceph/ceph-mgr.ceph-node-0.scljdt.asok",
    "admin_socket_mode": "",
    "auth_allow_insecure_global_id_reclaim": "true",
    "auth_client_required": "cephx, none",
    "auth_cluster_required": "cephx",
    "auth_debug": "false",
    "auth_expose_insecure_global_id_reclaim": "true",
admin_socket: [Errno 32] Broken pipe
[ceph: root@ceph-node-0 /]#
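For completeness, recent cephadm can also run a single command from the host without keeping an interactive shell open (assuming the same daemon name as above):

[root@ceph-node-0 ~]# cephadm shell -- ceph daemon mgr.ceph-node-0.scljdt config show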
Updated by Redouane Kachach Elhichou about 2 years ago
- Status changed from New to Can't reproduce