Bug #43701

open

systemd units for mon, mgr and mds fail to start with ms_type=async+rdma

Added by Mikko Tanner over 4 years ago. Updated about 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
rdma
Backport:
Regression:
Yes
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

At some point, the Ceph systemd units ceph-mon@.service, ceph-mgr@.service and ceph-mds@.service seem to have adopted 'PrivateDevices=yes' in the [Service] stanza. With this setting, the services fail with the following error when cluster communication is switched to RDMA with ms_type=async+rdma:

DeviceList failed to get rdma device list.  (19) No such device
/build/ceph-14.2.6/src/msg/async/rdma/Infiniband.h: In function 'DeviceList::DeviceList(CephContext*)'
/build/ceph-14.2.6/src/msg/async/rdma/Infiniband.h: 106: ceph_abort_msg("abort() called")

Stracing the mon process with a modified systemd unit reveals the following:

2885913 stat("/sys/class/infiniband_verbs/abi_version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
2885913 stat("/sys/class/infiniband_verbs/uverbs0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
2885913 openat(AT_FDCWD, "/sys/class/infiniband_verbs/uverbs0/ibdev", O_RDONLY|O_CLOEXEC) = 24
2885913 read(24, "mlx4_0\n", 64)        = 7
2885913 close(24)                       = 0
2885913 stat("/sys/class/infiniband/mlx4_0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
2885913 stat("/dev/infiniband/uverbs0", 0x7faefcd479a0) = -1 ENOENT (No such file or directory)

The service crashes right after this ENOENT, even though /dev/infiniband/uverbs0 exists on the host. From the PrivateDevices description (https://www.freedesktop.org/software/systemd/man/systemd.exec.html#PrivateDevices=):

"If true, sets up a new /dev mount for the executed processes and only adds API pseudo devices such as /dev/null, /dev/zero or /dev/random (as well as the pseudo TTY subsystem) to it, but no physical devices"
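The lookup seen in the strace output can be approximated with a small script. This is only a sketch of the same sysfs-then-/dev check, not the code path libibverbs or Ceph actually uses; the function name missing_uverbs_nodes and its parameters are made up for illustration. To be meaningful it has to run inside the same mount namespace as the service (for example, assuming a recent systemd, through systemd-run with -p PrivateDevices=yes).

```python
from pathlib import Path


def missing_uverbs_nodes(sys_root="/sys/class/infiniband_verbs",
                         dev_root="/dev/infiniband"):
    """Return (uverbs name, ibdev name) pairs that are listed in sysfs
    but have no matching device node under dev_root.

    Hypothetical helper: it mirrors the stat() sequence from the strace
    output above and is not taken from Ceph or libibverbs.
    """
    missing = []
    sys_dir = Path(sys_root)
    if not sys_dir.is_dir():
        # No RDMA verbs class in sysfs at all: nothing to compare.
        return missing
    for entry in sorted(sys_dir.glob("uverbs*")):
        ibdev_file = entry / "ibdev"
        # The ibdev file holds the device name, e.g. "mlx4_0".
        ibdev = ibdev_file.read_text().strip() if ibdev_file.is_file() else "?"
        if not (Path(dev_root) / entry.name).exists():
            missing.append((entry.name, ibdev))
    return missing


if __name__ == "__main__":
    for uverbs, ibdev in missing_uverbs_nodes():
        print(f"{uverbs} ({ibdev}): present in sysfs, /dev node not visible")
```

Run directly on the host, this would be expected to print nothing on a working RDMA setup; run under a private /dev, it should list every uverbs device that the service cannot see.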

It appears that for async+rdma messaging to work, 'PrivateDevices=yes' has to be removed from these systemd units, or some other solution has to be devised. In the meantime, a simple workaround is to add a systemd override for each of the affected services (with 'systemctl edit ceph-[mon|mgr|mds]@.service'):

[Service]
PrivateDevices=false

Please note that simply stracing the affected binary outside of systemd will not reveal the bug: without systemd's "protection", the service starts without issue.
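To confirm that an override is actually in effect, the property can be read back from systemd. The sketch below assumes that 'systemctl show' with '--property=PrivateDevices' prints a single 'PrivateDevices=yes' or 'PrivateDevices=no' line; the helper names and the instance name ceph-mon@myhost.service are placeholders, not values from this report.

```python
import subprocess


def parse_show_output(text):
    """Parse 'KEY=VALUE' lines as printed by 'systemctl show' into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def private_devices_enabled(unit):
    """Return True if systemd reports PrivateDevices=yes for the unit.

    Hypothetical helper; requires systemctl on the PATH.
    """
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=PrivateDevices"],
        capture_output=True, text=True, check=True,
    )
    return parse_show_output(result.stdout).get("PrivateDevices") == "yes"


if __name__ == "__main__":
    # Placeholder instance name; substitute the real mon/mgr/mds instance.
    unit = "ceph-mon@myhost.service"
    print(unit, "PrivateDevices enabled:", private_devices_enabled(unit))
```

After applying the override and running 'systemctl daemon-reload', the check should report False for the affected unit if the drop-in was picked up.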

Actions #1

Updated by Greg Farnum over 4 years ago

  • Project changed from Ceph to Messengers
Actions #2

Updated by Yiu Chung Lee about 4 years ago

Actually, ProtectSystem=no is needed too; otherwise the daemon cannot open /dev/infiniband/uverbs0 as read-write.

Actions #3

Updated by Yiu Chung Lee about 4 years ago

Yiu Chung Lee wrote:

Actually, ProtectSystem=no is needed too; otherwise the daemon cannot open /dev/infiniband/uverbs0 as read-write.

Ignore this. Seems unrelated.
