Bug #61968

closed

rados::connect() gets segmentation fault

Added by Han Han 10 months ago. Updated 3 months ago.

Status: Closed
Priority: Urgent
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Version:
librados2-18.1.2-0.2.fc39.x86_64
python3-rados-18.1.2-0.2.fc39.x86_64

Steps to reproduce:
Set up a Ceph cluster on Fedora 38 (ceph-17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5)).

Connect to the cluster with this script:

#!/usr/bin/python3
import rados

CONFS={"key":"AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==", "mon_host":"10.73.114.79,"}

rad=rados.Rados('admin')
for i in CONFS.items():
    rad.conf_set(*i)

rad.connect()

Run the script:
# python3 rados_connect.py
Segmentation fault (core dumped)
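
Because the crash kills the whole interpreter, it can help to run the reproducer in a child process and inspect how it exited. The following is a minimal sketch, not part of the original report: the helper names `run_isolated` and `run_python_isolated` are hypothetical, and it assumes a POSIX system, where `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(argv, timeout=60):
    """Run argv in a child process and describe how it exited."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        sig = signal.Signals(-proc.returncode)
        return {"crashed": True, "signal": sig.name, "returncode": proc.returncode}
    return {"crashed": False, "signal": None, "returncode": proc.returncode}


def run_python_isolated(args, timeout=60):
    """Run the current Python interpreter with the given arguments."""
    return run_isolated([sys.executable, *args], timeout=timeout)
```

For example, `run_python_isolated(["rados_connect.py"])` would be expected to report `SIGSEGV` on an affected system, while the calling process keeps running.
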
Backtrace:
(gdb) bt
#0  std::_Rb_tree_rebalance_for_erase (__z=0x1f14020, __header=...) at ../../../../../libstdc++-v3/src/c++98/tree.cc:289
#1  0x00007f5a54281709 in CommonSafeTimer<std::mutex>::cancel_all_events() () from /usr/lib64/ceph/libceph-common.so.2
#2  0x00007f5a542819e1 in CommonSafeTimer<std::mutex>::shutdown() () from /usr/lib64/ceph/libceph-common.so.2
#3  0x00007f5a544c6281 in MonClient::shutdown() () from /usr/lib64/ceph/libceph-common.so.2
#4  0x00007f5a544c7387 in MonClient::get_monmap_and_config() () from /usr/lib64/ceph/libceph-common.so.2
#5  0x00007f5a54a2521e in librados::v14_2_0::RadosClient::connect() () from /lib64/librados.so.2
#6  0x00007f5a549bfcb1 in rados_connect () from /lib64/librados.so.2
#7  0x00007f5a54bcc0f1 in __pyx_pw_5rados_5Rados_29connect () from /usr/lib64/python3.12/site-packages/rados.cpython-312-x86_64-linux-gnu.so
#8  0x00007f5a54b3f715 in __Pyx_CyFunction_CallAsMethod () from /usr/lib64/python3.12/site-packages/rados.cpython-312-x86_64-linux-gnu.so
#9  0x00007f5a553e9a66 in _PyObject_MakeTpCall (tstate=0x7f5a558488b0 <_PyRuntime+459824>, 
    callable=<cython_function_or_method at remote 0x7f5a52b4ec20>, args=0x7f5a55888070, nargs=1, keywords=0x0)
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Objects/call.c:240
#10 0x00007f5a553f2304 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>)
    at Python/bytecodes.c:2645
#11 0x00007f5a5548b336 in PyEval_EvalCode (co=, globals=<optimized out>, 
    locals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')})
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/ceval.c:567
#12 0x00007f5a554ade8a in run_eval_code_obj (tstate=tstate@entry=0x7f5a558488b0 <_PyRuntime+459824>, co=co@entry=0x7f5a54c2c030, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')})
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/pythonrun.c:1695
#13 0x00007f5a554a8c5e in run_mod (mod=mod@entry=0x55b201d7caf8, filename=filename@entry='/root/rados_connect.py', 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')}, 
    flags=flags@entry=0x7ffcd589c7b8, arena=arena@entry=0x7f5a54d1bdd0)
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/pythonrun.c:1716
#14 0x00007f5a554c93c3 in pyrun_file (fp=fp@entry=0x55b201cf7490, filename=filename@entry='/root/rados_connect.py', start=start@entry=257, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/root/rados_connect.py') at remote 0x7f5a54dfe0f0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7f5a54d95760>, '__file__': '/root/rados_connect.py', '__cached__': None, 'rados': <module at remote 0x7f5a54c098f0>, 'CONFS': {'key': 'AQB8eaZkEnvHCxAASrS+Q9/OMAZo8BotCg7Oaw==', 'mon_host': '10.73.114.79,'}, 'rad': <rados.Rados at remote 0x7f5a54cae2c0>, 'i': ('mon_host', '10.73.114.79,')}, 
    closeit=closeit@entry=1, flags=0x7ffcd589c7b8) at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/pythonrun.c:1616
#15 0x00007f5a554c870a in _PyRun_SimpleFileObject (fp=0x55b201cf7490, filename='/root/rados_connect.py', closeit=1, flags=0x7ffcd589c7b8)
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/pythonrun.c:433
#16 0x00007f5a554c7d5f in _PyRun_AnyFileObject (fp=0x55b201cf7490, filename='/root/rados_connect.py', closeit=1, flags=0x7ffcd589c7b8)
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Python/pythonrun.c:78
#17 0x00007f5a554b9592 in pymain_run_file_obj (skip_source_first_line=0, filename='/root/rados_connect.py', program_name='python3')
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Modules/main.c:360
#18 pymain_run_file (config=0x7f5a5582b870 <_PyRuntime+340976>) at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Modules/main.c:379
#19 pymain_run_python (exitcode=0x7ffcd589c7b4) at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Modules/main.c:610
#20 Py_RunMain () at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Modules/main.c:689
#21 0x00007f5a5547520c in Py_BytesMain (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/python3.12-3.12.0~b3-2.fc39.x86_64/Modules/main.c:743
#22 0x00007f5a54e2814a in __libc_start_call_main (main=main@entry=0x55b200400160 <main>, argc=argc@entry=2, argv=argv@entry=0x7ffcd589ca18)
    at ../sysdeps/nptl/libc_start_call_main.h:58
#23 0x00007f5a54e2820b in __libc_start_main_impl (main=0x55b200400160 <main>, argc=2, argv=0x7ffcd589ca18, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcd589ca08) at ../csu/libc-start.c:360
#24 0x000055b200400095 in _start ()

Files

script-backtrace.zip (2.52 KB) Han Han, 07/12/2023 06:35 AM

Related issues 1 (1 open, 0 closed)

Is duplicate of Ceph - Bug #63867: Segfault in CommonSafeTimer::cancel_all_events due to uninitialized data (New)

Actions #1

Updated by Neha Ojha 10 months ago

  • Description updated (diff)
Actions #2

Updated by Radoslaw Zarzynski 10 months ago

  • Description updated (diff)
Actions #3

Updated by Radoslaw Zarzynski 10 months ago

  • Subject changed from rbd_connect gets segement fault to rados::connect() gets segement fault

bump up

Actions #4

Updated by Ilya Dryomov 10 months ago

  • Target version deleted (v18.1.2)
Actions #5

Updated by Neha Ojha 9 months ago

  • Assignee set to Pere Díaz Bou
Actions #6

Updated by Radoslaw Zarzynski 9 months ago

  • Status changed from New to In Progress
Actions #7

Updated by Brad Hubbard 5 months ago

I believe this is the same issue that is being reported on Fedora 39 at https://bugzilla.redhat.com/show_bug.cgi?id=2252160

Actions #8

Updated by Brad Hubbard 5 months ago

  • Priority changed from Normal to Urgent
Actions #9

Updated by Radoslaw Zarzynski 5 months ago

  • Assignee changed from Pere Díaz Bou to Nitzan Mordechai

Hi Nitzan! Would you mind taking a look?

Actions #10

Updated by Nitzan Mordechai 5 months ago

I was able to reproduce it on Fedora 39, but only with the Python API (since it comes as a compiled package). I was not able to recreate it with my own compiled code; I guess I'm missing a few flags that cause the Timer issue to appear.

Actions #11

Updated by Nitzan Mordechai 4 months ago

  • Is duplicate of Bug #63867: Segfault in CommonSafeTimer::cancel_all_events due to uninitialized data added
Actions #12

Updated by Nitzan Mordechai 4 months ago

There is a duplicate bug for this, and Hector Martin has already found that it is actually a GCC bug and explained it in the BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2241339

Actions #13

Updated by Radoslaw Zarzynski 3 months ago

  • Status changed from In Progress to Closed

Closing per the last comment from Nitzan. For the fix, please refer to https://bugzilla.redhat.com/show_bug.cgi?id=2241339.
