Bug #21967

'ceph tell mds' commands result in 'File exists' errors on client admin socket

Added by Brad Hubbard about 1 year ago. Updated 12 months ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
Start date: 10/30/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): tools
Labels (FS):
Pull request ID:

Description

$ bin/ceph tell mds.a help
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2017-10-30 13:58:59.114 7f4d452a3700 -1 WARNING: all dangerous and experimental features are enabled.
2017-10-30 13:58:59.124 7f4d452a3700 -1 WARNING: all dangerous and experimental features are enabled.
2017-10-30 13:58:59.149 7f4d5d392700 -1 WARNING: all dangerous and experimental features are enabled.
2017-10-30 13:58:59.149 7f4d5d392700 -1 asok(0x56527b661b40) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/tmp/ceph-asok.FY9ZEz/client.admin.23393.asok': (17) File exists
2017-10-30 13:58:59.152 7f4cef7fe700  0 client.4124 ms_handle_reset on 127.0.0.1:6813/1197852257
...

The above command succeeds but generates the EEXIST error. Tracking calls to AdminSocket::bind_and_listen in gdb shows why.

$ gdb -q --args python bin/ceph tell mds.a help
Reading symbols from python...Reading symbols from /home/brad/working/src/ceph/build/python...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install python2-2.7.13-12.fc26.x86_64
(gdb) b AdminSocket::bind_and_listen
Function "AdminSocket::bind_and_listen" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (AdminSocket::bind_and_listen) pending.
(gdb) r
...
(gdb) c
Continuing.

Thread 5 "ceph" hit Breakpoint 1, AdminSocket::bind_and_listen (this=this@entry=0x7fffd808bd80, sock_path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok", fd=fd@entry=0x7fffdfed60c4)
    at /home/brad/working/src/ceph/src/common/admin_socket.cc:164
164     {
(gdb) bt
#0  AdminSocket::bind_and_listen (this=this@entry=0x7fffd808bd80, sock_path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok", fd=fd@entry=0x7fffdfed60c4) at /home/brad/working/src/ceph/src/common/admin_socket.cc:164
#1  0x00007fffe5d2a9be in AdminSocket::init (this=0x7fffd808bd80, path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok") at /home/brad/working/src/ceph/src/common/admin_socket.cc:575
#2  0x00007fffe5f90745 in CephContext::start_service_thread (this=this@entry=0x7fffd8000b30) at /home/brad/working/src/ceph/src/common/ceph_context.cc:751
#3  0x00007fffe5f8d04c in common_init_finish (cct=0x7fffd8000b30) at /home/brad/working/src/ceph/src/common/common_init.cc:99
#4  0x00007fffee845630 in librados::RadosClient::connect (this=this@entry=0x7fffd808f6f0) at /home/brad/working/src/ceph/src/librados/RadosClient.cc:240
#5  0x00007fffee7f326f in rados_connect (cluster=0x7fffd808f6f0) at /home/brad/working/src/ceph/src/librados/librados.cc:2851
#6  0x00007fffeeb37309 in __pyx_pf_5rados_5Rados_26connect (__pyx_v_timeout=<optimized out>, __pyx_v_self=0x7fffe2438a60) at /home/brad/working/src/ceph/build/src/pybind/rados/rados.c:11959
#7  __pyx_pw_5rados_5Rados_27connect (__pyx_v_self=0x7fffe2438a60, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>) at /home/brad/working/src/ceph/build/src/pybind/rados/rados.c:11907
#8  0x00007ffff7b104c7 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#9  0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#10 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#11 0x00007ffff7b10d99 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#12 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#13 0x00007ffff7a627ee in function_call.lto_priv () from /lib64/libpython2.7.so.1.0
#14 0x00007ffff7a2ba53 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#15 0x00007ffff7a5788e in instancemethod_call.lto_priv () from /lib64/libpython2.7.so.1.0
#16 0x00007ffff7a2ba53 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#17 0x00007ffff7b09a67 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#18 0x00007ffff7ad2e82 in t_bootstrap () from /lib64/libpython2.7.so.1.0
#19 0x00007ffff777a36d in start_thread () from /lib64/libpthread.so.0
#20 0x00007ffff6d91e1f in clone () from /lib64/libc.so.6
(gdb) c
...
(gdb) c
Continuing.

Thread 1 "ceph" hit Breakpoint 1, AdminSocket::bind_and_listen (this=this@entry=0x555555a8ab00, sock_path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok", fd=fd@entry=0x7fffffffc564)
    at /home/brad/working/src/ceph/src/common/admin_socket.cc:164
164     {
(gdb) bt
#0  AdminSocket::bind_and_listen (this=this@entry=0x555555a8ab00, sock_path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok", fd=fd@entry=0x7fffffffc564) at /home/brad/working/src/ceph/src/common/admin_socket.cc:164
#1  0x00007fffe5d2a9be in AdminSocket::init (this=0x555555a8ab00, path="/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok") at /home/brad/working/src/ceph/src/common/admin_socket.cc:575
#2  0x00007fffe5f90745 in CephContext::start_service_thread (this=this@entry=0x555555866b40) at /home/brad/working/src/ceph/src/common/ceph_context.cc:751
#3  0x00007fffe5f8d04c in common_init_finish (cct=0x555555866b40) at /home/brad/working/src/ceph/src/common/common_init.cc:99
#4  0x00007fffdc1ce896 in ceph_mount_info::init (this=<optimized out>) at /home/brad/working/src/ceph/src/libcephfs.cc:75
#5  ceph_init (cmount=0x555555a8e5a0) at /home/brad/working/src/ceph/src/libcephfs.cc:440
#6  0x00007fffdc4ee9e6 in __pyx_pf_6cephfs_9LibCephFS_26init (__pyx_v_self=0x7fffdc17d910) at /home/brad/working/src/ceph/build/src/pybind/cephfs/cephfs.c:7153
#7  __pyx_pw_6cephfs_9LibCephFS_27init (__pyx_v_self=0x7fffdc17d910, unused=<optimized out>) at /home/brad/working/src/ceph/build/src/pybind/cephfs/cephfs.c:7101
#8  0x00007ffff7b12243 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#9  0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#10 0x00007ffff7a62987 in function_call.lto_priv () from /lib64/libpython2.7.so.1.0
#11 0x00007ffff7a2ba53 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#12 0x00007ffff7b0ec43 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#13 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#14 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#15 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#16 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#17 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#18 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#19 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#20 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#21 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#22 0x00007ffff7b10433 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#23 0x00007ffff7b133f8 in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#24 0x00007ffff7b13609 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#25 0x00007ffff7aeb81f in run_mod () from /lib64/libpython2.7.so.1.0
#26 0x00007ffff7aeb8ca in PyRun_FileExFlags () from /lib64/libpython2.7.so.1.0
#27 0x00007ffff7aec8ce in PyRun_SimpleFileExFlags () from /lib64/libpython2.7.so.1.0
#28 0x00007ffff7adf61e in Py_Main () from /lib64/libpython2.7.so.1.0
#29 0x00007ffff6ca188a in __libc_start_main () from /lib64/libc.so.6
#30 0x000055555555477a in _start ()
(gdb) c
Continuing.
2017-10-30 14:09:01.955 7ffff7fc1700 -1 asok(0x555555866b40) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/tmp/ceph-asok.FY9ZEz/client.admin.32062.asok': (17) File exists
...
[Inferior 1 (process 32062) exited normally]

So librados calls common_init_finish -> CephContext::start_service_thread -> AdminSocket::init -> AdminSocket::bind_and_listen. Later, libcephfs' ceph_mount_info::init goes through the same chain with the same socket path, and that second call fails with EEXIST. Could this result in some 'tell' commands failing?
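
For anyone wanting to see the underlying behaviour outside of Ceph, here is a minimal Python sketch of two sockets in one process trying to bind the same UNIX domain socket path. It does not use any Ceph code: the helper name bind_twice and the file name client.admin.demo.asok are made up for illustration. Note that the errno a plain bind() raises here may differ from the (17) EEXIST that Ceph prints, since that value comes from Ceph's own handling in AdminSocket::bind_and_listen.

```python
import errno
import os
import socket
import tempfile


def bind_twice(path):
    """Bind two UNIX domain sockets to the same path.

    Returns the errno raised by the second bind(), or None if it
    unexpectedly succeeds. This only mimics the "two contexts, one
    path" situation from the backtraces; it is not Ceph code.
    """
    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # First owner of the path (librados' context in the report).
        first.bind(path)
        first.listen(1)
        try:
            # Second owner, same path (libcephfs' context in the report).
            second.bind(path)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()
        # bind() leaves a socket file behind; remove it.
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical file name, shaped like the one in the log above.
        sock_path = os.path.join(tmpdir, "client.admin.demo.asok")
        err = bind_twice(sock_path)
        print("second bind failed with errno", err, errno.errorcode.get(err))
```

The first bind succeeds and keeps working, which would be consistent with the command itself succeeding while the second context only logs an error.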


Related issues

Duplicates fs - Bug #21406: ceph.in: tell mds does not understand --cluster Resolved 09/15/2017
Copied to fs - Backport #22076: luminous: 'ceph tell mds' commands result in 'File exists' errors on client admin socket Resolved

History

#1 Updated by Patrick Donnelly about 1 year ago

  • Status changed from New to Verified
  • Assignee set to Patrick Donnelly
  • Backport set to luminous
  • Component(FS) tools added

#2 Updated by Patrick Donnelly about 1 year ago

  • Status changed from Verified to Need Review

#3 Updated by Jos Collin about 1 year ago

  • Status changed from Need Review to In Progress

#4 Updated by Patrick Donnelly about 1 year ago

  • Status changed from In Progress to Need Review

Jos, "In Progress" indicates the Assignee is working on a fix. "Need Review" indicates the fix is undergoing review/testing. (We don't use "Need Test".)

#5 Updated by Patrick Donnelly about 1 year ago

  • Status changed from Need Review to Pending Backport

#6 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #22076: luminous: 'ceph tell mds' commands result in 'File exists' errors on client admin socket added

#7 Updated by Nathan Cutler 12 months ago

  • Duplicates Bug #21406: ceph.in: tell mds does not understand --cluster added

#8 Updated by Nathan Cutler 12 months ago

  • Status changed from Pending Backport to Resolved
