Bug #41354
RBD image manipulation using python API crashing since Nautilus (closed)
% Done: 0%
Source: Community (user)
Backport: luminous,mimic,nautilus
Regression: Yes
Severity: 2 - major
Description
Since Nautilus, our Python-based management tools keep crashing. From the GDB backtraces, we suspect a locking issue. I'm attaching simple reproducer scripts: the multithreaded one crashes within seconds, the single-threaded one within minutes. A backtrace is attached as well.
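The attached scripts are not reproduced in this ticket. As a rough illustration only, a multithreaded reproducer of this kind might look like the sketch below. It assumes the standard `rados` and `rbd` Python bindings, a pool named `rbd`, and the default `ceph.conf` path; the helper names (`image_name`, `churn`, `main`), the image size, and the choice to reconnect on every iteration are all hypothetical and not taken from the reporter's scripts.

```python
import threading

POOL = "rbd"                    # assumption: pool name
CONF = "/etc/ceph/ceph.conf"    # assumption: default config path
IMAGE_SIZE = 4 * 1024 ** 2      # assumption: 4 MiB test images


def image_name(worker_id, iteration):
    """Build a unique image name per worker and iteration."""
    return f"crashtest-{worker_id}-{iteration}"


def churn(rados, rbd, worker_id, iterations):
    """Repeatedly connect, create/open/remove an image, and disconnect.

    `rados` and `rbd` are the imported binding modules; they are passed in
    so the loop itself does not depend on a global import.
    """
    for i in range(iterations):
        cluster = rados.Rados(conffile=CONF)
        cluster.connect()
        try:
            ioctx = cluster.open_ioctx(POOL)
            try:
                name = image_name(worker_id, i)
                rbd.RBD().create(ioctx, name, IMAGE_SIZE)
                with rbd.Image(ioctx, name) as image:
                    image.size()
                rbd.RBD().remove(ioctx, name)
            finally:
                ioctx.close()
        finally:
            cluster.shutdown()


def main(threads=8, iterations=1000):
    """Run `threads` workers in parallel against a reachable cluster."""
    import rados  # Ceph Python bindings, imported lazily
    import rbd

    workers = [
        threading.Thread(target=churn, args=(rados, rbd, w, iterations))
        for w in range(threads)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
```

Calling `main()` on a host with a reachable test cluster starts the workers; `main(threads=1)` gives a single-threaded variant. Whether this particular sequence of calls triggers the crash is not confirmed by the ticket.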
Updated by Jason Dillaman over 4 years ago
- Project changed from Linux kernel client to rbd
- Category deleted (rbd)
Updated by Jason Dillaman over 4 years ago
- Status changed from New to In Progress
- Assignee set to Jason Dillaman
- Backport set to nautilus
It looks like a bug in librados associated with retrieving config overrides from the MON config store, but I'll take a look.
Updated by Jason Dillaman over 4 years ago
- Backport changed from nautilus to luminous,mimic,nautilus
Regression introduced via [1]. It's already been backported to luminous and it's pending for mimic.
Updated by Jason Dillaman over 4 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 29809
Updated by Jason Dillaman over 4 years ago
/home/jdillaman/ceph_nautilus/src/common/config_proxy.h: In function 'void ConfigProxy::call_gate_leave(ConfigProxy::md_config_obs_t*)' thread 7fffd6ffd700 time 2019-08-21 23:08:51.961269
/home/jdillaman/ceph_nautilus/src/common/config_proxy.h: 70: FAILED ceph_assert(p != obs_call_gate.end())

ceph version 14.2.2-421-g5286e37857 (5286e37857aad3901636f3c9a3a301c0eaa35f68) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x7fffe1574fcb]
2: (()+0x2771b5) [0x7fffe15751b5]
3: (()+0x65ba92) [0x7fffe1959a92]
4: (FunctionContext::finish(int)+0x2c) [0x7fffe16588dc]
5: (Context::complete(int)+0x9) [0x7fffe160a639]
6: (Finisher::finisher_thread_entry()+0x15e) [0x7fffe1612c0e]
7: (()+0x85a2) [0x7ffff7a635a2]
8: (clone()+0x43) [0x7ffff7ec7303]

Thread 2198290 "fn_anonymous" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffd6ffd700 (LWP 10321)]
0x00007ffff7e03e75 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7e03e75 in raise () from /lib64/libc.so.6
#1 0x00007ffff7dee895 in abort () from /lib64/libc.so.6
#2 0x00007fffe1575026 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>) at /home/jdillaman/ceph_nautilus/src/common/assert.cc:73
#3 0x00007fffe15751b5 in ceph::__ceph_assert_fail (ctx=...) at /home/jdillaman/ceph_nautilus/src/common/assert.cc:78
#4 0x00007fffe1959a92 in ConfigProxy::call_gate_leave (obs=<optimized out>, this=0x55555576b248) at /usr/include/c++/9/bits/stl_tree.h:1010
#5 ConfigProxy::call_observers (rev_obs=std::map with 1 element = {...}, this=0x55555576b248) at /home/jdillaman/ceph_nautilus/src/common/config_proxy.h:89
#6 ConfigProxy::set_mon_vals(CephContext*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, std::function<bool (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>) (config_cb=..., kv=..., cct=<optimized out>, this=0x55555576b248) at /home/jdillaman/ceph_nautilus/src/common/config_proxy.h:291
#7 MonClient::<lambda(int)>::operator() (__closure=0x7fffbc013d10, __closure=0x7fffbc013d10, r=<optimized out>) at /home/jdillaman/ceph_nautilus/src/mon/MonClient.cc:418
#8 boost::detail::function::void_function_obj_invoker1<MonClient::handle_config(MConfig*)::<lambda(int)>, void, int>::invoke(boost::detail::function::function_buffer &, int) (function_obj_ptr=..., a0=<optimized out>) at /home/jdillaman/ceph_nautilus/build/boost/include/boost/function/function_template.hpp:159
#9 0x00007fffe16588dc in boost::function1<void, int>::operator() (a0=<optimized out>, this=<optimized out>) at /home/jdillaman/ceph_nautilus/build/boost/include/boost/function/function_template.hpp:682
#10 FunctionContext::finish (this=<optimized out>, r=<optimized out>) at /home/jdillaman/ceph_nautilus/src/include/Context.h:487
#11 0x00007fffe160a639 in Context::complete (this=0x7fffbc013d00, r=<optimized out>) at /home/jdillaman/ceph_nautilus/src/include/Context.h:77
#12 0x00007fffe1612c0e in Finisher::finisher_thread_entry (this=0x5555557749c0) at /home/jdillaman/ceph_nautilus/src/common/Finisher.cc:67
#13 0x00007ffff7a635a2 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff7ec7303 in clone () from /lib64/libc.so.6
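The assertion in the trace fires when `call_gate_leave` looks up an observer in `obs_call_gate` and finds no entry. The sketch below is a simplified Python model of that check, not the Ceph source: the class `ConfigProxyModel`, the `CallGate` counter, and `demo` are hypothetical, and only the identifier names are borrowed from the trace. The ticket does not state how the entry goes missing in librados; here it is forced by removing the observer between enter and leave.

```python
import threading


class CallGate:
    """Counts callbacks currently in flight for one observer."""

    def __init__(self):
        self.in_flight = 0


class ConfigProxyModel:
    """Toy stand-in for the observer bookkeeping named in the trace."""

    def __init__(self):
        self.lock = threading.Lock()
        self.obs_call_gate = {}  # observer -> CallGate

    def add_observer(self, obs):
        with self.lock:
            self.obs_call_gate[obs] = CallGate()

    def remove_observer(self, obs):
        with self.lock:
            del self.obs_call_gate[obs]

    def call_gate_enter(self, obs):
        with self.lock:
            gate = self.obs_call_gate.get(obs)
            assert gate is not None, "p != obs_call_gate.end()"
            gate.in_flight += 1

    def call_gate_leave(self, obs):
        with self.lock:
            gate = self.obs_call_gate.get(obs)
            # Mirrors the failed check: the observer must still have a gate.
            assert gate is not None, "p != obs_call_gate.end()"
            gate.in_flight -= 1


def demo():
    """Return the assertion message raised when the gate entry is gone."""
    proxy = ConfigProxyModel()
    proxy.add_observer("observer")
    proxy.call_gate_enter("observer")
    proxy.remove_observer("observer")  # entry disappears mid-callback
    try:
        proxy.call_gate_leave("observer")
    except AssertionError as exc:
        return str(exc)
    return None
```

In this model the lookup and the container updates all happen under one lock, so the only way to hit the assertion is the explicit removal in `demo`. In the real library, the path to the missing entry may differ.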
Updated by Patrick Donnelly over 4 years ago
- Status changed from Fix Under Review to Pending Backport
- Target version changed from v14.2.2 to v15.0.0
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41770: mimic: RBD image manipulation using python API crashing since Nautilus added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41771: nautilus: RBD image manipulation using python API crashing since Nautilus added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41772: luminous: RBD image manipulation using python API crashing since Nautilus added
Updated by Nathan Cutler over 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Patrick Donnelly 8 months ago
- Related to Bug #62832: common: config_proxy deadlock during shutdown (and possibly other times) added