Bug #55334

mgr/cephadm: socket path too long for some daemons

Added by Avan Thakkar 8 months ago. Updated 7 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The admin socket commands fail for daemons like rgw/rbd-mirror/cephfs-mirror with:
"admin_socket: exception getting command descriptions: AF_UNIX path too long".

This is probably because the file names are too long. An example path name for rbd-mirror:
/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok

116 chars

This exceeds sizeof(sockaddr_un.sun_path), which is currently 108:

struct sockaddr_un
  {
    __SOCKADDR_COMMON (sun_);
    char sun_path[108];   /* Path name.  */
  };
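
The length check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from Ceph or cephadm: the helper name `fits_in_sun_path` is hypothetical, and the sketch assumes that the path must fit in the 108-byte `sun_path` buffer together with a terminating NUL byte.

```python
# Size of sockaddr_un.sun_path on Linux, as quoted in the report.
SUN_PATH_SIZE = 108


def fits_in_sun_path(path: str, size: int = SUN_PATH_SIZE) -> bool:
    """Return True if the path plus a terminating NUL byte fits in sun_path.

    Hypothetical helper for illustration; the NUL terminator is an assumption.
    """
    return len(path.encode("utf-8")) + 1 <= size


# Example path taken from the report.
asok = (
    "/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/"
    "ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok"
)

# Print the byte length of the path and whether it fits.
print(len(asok.encode("utf-8")), fits_in_sun_path(asok))
```

Running the same check against a shorter socket path (for example, one without the fsid directory) shows how much headroom the remaining name components have.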

History

#1 Updated by Avan Thakkar 8 months ago

Avan Thakkar wrote:

The admin socket commands fail for daemons like rgw/rbd-mirror/cephfs-mirror with:
"admin_socket: exception getting command descriptions: AF_UNIX path too long".

This is probably because the file names are too long. An example path name for rbd-mirror:
/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok

116 chars

This exceeds sizeof(sockaddr_un.sun_path), which is currently 108:

struct sockaddr_un
  {
    __SOCKADDR_COMMON (sun_);
    char sun_path[108];   /* Path name.  */
  };

#2 Updated by Avan Thakkar 8 months ago

OK, I see that in src/cephadm the names for these containers are stored in the form:
client.rbd-mirror.{daemon_id}

However, the asok files stored under /var/run/ceph/<fsid> are named differently, for example:
ceph-client.rbd-mirror.ceph-node-00.fqbmns.2.94595577574976.asok

So where is the extra 2.94595577574976 coming from? I couldn't see it in the daemon spec of rbd-mirror either.

#3 Updated by Redouane Kachach Elhichou 7 months ago

I'm afraid cephadm has nothing to do with this issue. cephadm's responsibility ends at daemon name generation and the like. The admin socket is created and managed by the RGW code. The 94595577574976 part corresponds to the cctid, I guess (from sample.ceph.conf):

    # The Ceph admin socket allows you to query a daemon via a socket interface
    # From a client perspective this can be a virtual machine using librbd
    # Type: String
    # Required: No
    ;admin socket                       = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

I'm not sure whether this could be changed (the RGW team can probably help here), but this is something that cephadm cannot control, AFAIK.
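
To make the mapping concrete, here is a small Python sketch that expands the file-name part of the template quoted above. It is not how Ceph itself expands metavariables; it only uses `string.Template`, which happens to share the `$name` syntax. The values assigned to each variable are assumptions inferred from the file name in comment #2 (in particular that `2` is the pid and `94595577574976` is the cctid).

```python
from string import Template

# File-name part of the "admin socket" template from sample.ceph.conf.
ADMIN_SOCKET_NAME = "$cluster-$type.$id.$pid.$cctid.asok"

# Assumed values, inferred from the file name reported in comment #2.
values = {
    "cluster": "ceph",
    "type": "client",
    "id": "rbd-mirror.ceph-node-00.fqbmns",
    "pid": "2",
    "cctid": "94595577574976",
}

# Substitute each $variable with its assumed value.
expanded = Template(ADMIN_SOCKET_NAME).substitute(values)
print(expanded)
```

Under these assumptions, the two trailing components come from the process and the Ceph context rather than from the daemon name that cephadm generates.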

#4 Updated by Redouane Kachach Elhichou 7 months ago

  • Status changed from New to Rejected

Please feel free to reopen this if you still think it's a cephadm issue. Otherwise, you can move it to the corresponding component.

#5 Updated by Sebastian Wagner 7 months ago

This issue already popped up a while ago. I think there is an internal rhbz about it. $cctid was added years ago to avoid conflicts between multiple RBD volumes, each of which has its own librados instance and therefore opens its own socket. Back then there was no easy way to solve it. Neha might help?

Don't think that simply closing this as "Rejected" is going to make this issue go away any time soon.
