Bug #44570
rgw segfault when uploading files with BEAST
Description
This happens 100% of the time with a command like this:
aws --endpoint-url http://<host>:7480 s3 cp <big file> s3://<user>
This is with ceph-12.2.8 using the bundled version of boost, compiled with gcc 9.2.0.
I have attached a full backtrace ("thread apply all bt"). Please let me know if there is any other information I can provide.
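The reproduction steps above can be sketched as a small script. The host, port, and bucket placeholders are taken from the report; the actual upload command is shown but commented out, since it requires a live RGW endpoint:

```shell
# Generate a test object. The report says the crash reproduces with files
# from 1M up to several GB, so a 1M file is enough to start with.
dd if=/dev/zero of=/tmp/rgw-repro.bin bs=1M count=1 2>/dev/null
ls -l /tmp/rgw-repro.bin

# Against a live RGW instance (placeholders as in the report; not run here):
#   aws --endpoint-url http://<host>:7480 s3 cp /tmp/rgw-repro.bin s3://<user>
```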
History
#1 Updated by Patrick McLean over 3 years ago
More readable version of command:
aws --endpoint-url http://[host]:7480 s3 cp [big file] s3://[user]
#2 Updated by Mark Kogan over 3 years ago
- Status changed from New to Need More Info
For reproduction purposes, a few questions please:
- what is the size of the large file?
- were the default multipart settings of the aws cli command changed?
- please attach ceph.conf and radosgw log file with the seg fault.
Thanks
#3 Updated by Mark Kogan over 3 years ago
- Affected Versions v12.2.8 added
#4 Updated by Casey Bodley over 3 years ago
this is luminous? does this reproduce in later (supported) releases?
#5 Updated by Patrick McLean over 3 years ago
Argh, I had an inadvertent typo; this is actually 14.2.8. The backtraces attached and everything are from 14.2.8 (nautilus). We had the same issue on 12 using civetweb, and were hoping it would be fixed with nautilus.
#6 Updated by JuanJose Galvez over 3 years ago
- Affected Versions v14.2.8 added
- Affected Versions deleted (
v12.2.8)
Updated affected version. This is happening on v14.2.8 with the bundled boost packages.
#7 Updated by JuanJose Galvez over 3 years ago
- File ceph-conf-redacted.txt View added
- File redacted-small-and-large-file-upload.zip added
Mark Kogan wrote:
For reproduction purposes, a few questions please:
- what is the size of the large file?
The file is a few GB in size; however, the crash happens at seemingly random points early on. I can also reproduce this error with a small 1M file.
- were the default multipart settings of the aws cli command changed?
No changes to the aws cli multipart settings.
- please attach ceph.conf and radosgw log file with the seg fault.
I've attached a redacted ceph.conf and a log where the crash was reproduced with both a large file and a small file upload. debug_rgw = 20 and debug_ms = 1
Thanks
Please let us know if there is anything more we can provide to help identify and resolve this.
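For reference, the debug levels mentioned above correspond to a ceph.conf fragment like the following (the client section name is an assumption; use whatever section your radosgw instance reads):

```ini
[client.rgw.<instance-name>]
debug_rgw = 20
debug_ms = 1
```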
#8 Updated by Mark Kogan over 3 years ago
- Assignee set to Mark Kogan
#10 Updated by Mark Kogan over 3 years ago
Posting the callstacks from the attached logs:
ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
 1: (()+0x149f0) [0x715520b039f0]
 2: (()+0x56f6d) [0x715529d60f6d]
 3: (ceph::buffer::v14_2_0::ptr_node::cloner::operator()(ceph::buffer::v14_2_0::ptr_node const&)+0x4c) [0x715529d7d36c]
 4: (()+0xded8c) [0x715529de8d8c]
 5: (()+0xae6d4) [0x715529db86d4]
 6: (()+0xb8371) [0x715529dc2371]
 7: (()+0xb9c1b) [0x715529dc3c1b]
 8: (()+0xc8b5b) [0x715529dd2b5b]
 9: (()+0xc8d7f) [0x715529dd2d7f]
 10: (()+0x8978a) [0x715529d9378a]
 11: (librados::v14_2_0::IoCtx::operate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, librados::v14_2_0::ObjectReadOperation*, ceph::buffer::v14_2_0::list*)+0x5f) [0x715529d62fef]
 12: (rgw_rados_operate(librados::v14_2_0::IoCtx&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, librados::v14_2_0::ObjectReadOperation*, ceph::buffer::v14_2_0::list*, optional_yield)+0x2cc) [0x13e39434290c]
 13: (RGWSI_SysObj_Core::read(RGWSysObjectCtxBase&, RGWSI_SysObj_Core::GetObjState&, RGWObjVersionTracker*, rgw_raw_obj const&, ceph::buffer::v14_2_0::list*, long, long, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v14_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v14_2_0::list> > >*, bool, rgw_cache_entry_info*, boost::optional<obj_version>)+0x2e2) [0x13e393fba362]
 14: (RGWSI_SysObj_Cache::read(RGWSysObjectCtxBase&, RGWSI_SysObj_Core::GetObjState&, RGWObjVersionTracker*, rgw_raw_obj const&, ceph::buffer::v14_2_0::list*, long, long, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v14_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v14_2_0::list> > >*, bool, rgw_cache_entry_info*, boost::optional<obj_version>)+0x486) [0x13e3943bbdd6]
 15: (RGWSI_SysObj::Obj::ROp::read(long, long, ceph::buffer::v14_2_0::list*)+0x9b) [0x13e393fb6d5b]
 16: (rgw_get_system_obj(RGWRados*, RGWSysObjectCtx&, rgw_pool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v14_2_0::list&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v14_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v14_2_0::list> > >*, rgw_cache_entry_info*, boost::optional<obj_version>)+0x312) [0x13e39433fd92]
 17: (rgw_get_user_info_by_uid(RGWRados*, rgw_user const&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, rgw_cache_entry_info*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v14_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v14_2_0::list> > >*)+0x18a) [0x13e39434f1ba]
 18: (rgw_get_user_info_from_index(RGWRados*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw_pool const&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*)+0x565) [0x13e39434fec5]
 19: (rgw::auth::s3::LocalEngine::authenticate(DoutPrefixProvider const*, boost::basic_string_view<char, std::char_traits<char> > const&, boost::basic_string_view<char, std::char_traits<char> > const&, boost::basic_string_view<char, std::char_traits<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<basic_sstring<char, unsigned short, (unsigned short)65> (CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)> const&, std::function<std::shared_ptr<rgw::auth::Completer> (boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)> const&, req_state const*) const+0x361) [0x13e394321631]
 20: (rgw::auth::s3::AWSEngine::authenticate(DoutPrefixProvider const*, req_state const*) const+0x141) [0x13e3942ff181]
 21: (rgw::auth::Strategy::authenticate(DoutPrefixProvider const*, req_state const*) const+0x565) [0x13e393fe98f5]
 22: (rgw::auth::Strategy::authenticate(DoutPrefixProvider const*, req_state const*) const+0x565) [0x13e393fe98f5]
 23: (rgw::auth::Strategy::apply(DoutPrefixProvider const*, rgw::auth::Strategy const&, req_state*)+0x3f) [0x13e393fea40f]
 24: (RGW_Auth_S3::authorize(DoutPrefixProvider const*, RGWRados*, rgw::auth::StrategyRegistry const&, req_state*)+0x50) [0x13e39431d5f0]
 25: (RGWHandler_REST_S3::authorize(DoutPrefixProvider const*)+0x9a) [0x13e39431d82a]
 26: (process_request(RGWRados*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, int*)+0x1042) [0x13e393f390d2]
 27: (()+0x31de2f) [0x13e393e7de2f]
 28: (()+0x31f16c) [0x13e393e7f16c]
 29: (make_fcontext()+0x2f) [0x13e39446334f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Thread 1 (Thread 0x64a6e4bba700 (LWP 14952)):
#0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000013f502628651 in reraise_fatal (signum=11) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/global/signal_handler.cc:326
#2  handle_fatal_signal (signum=11) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/global/signal_handler.cc:326
#3  <signal handler called>
#4  0x000064a6ef114963 in ceph::buffer::v14_2_0::list::buffers_t::clear_and_dispose (this=0x13f5082457b8) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/include/buffer.h:648
#5  ceph::buffer::v14_2_0::list::buffers_t::clone_from (other=..., this=0x13f5082457b8) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/include/buffer.h:639
#6  ceph::buffer::v14_2_0::list::operator= (other=..., this=0x13f5082457b8) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/include/buffer.h:1003
#7  ceph::buffer::v14_2_0::list::operator= (other=..., this=0x13f5082457b8) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/include/buffer.h:1000
#8  Objecter::handle_osd_op_reply (this=0x13f507c75700, m=0x13f50820eb00) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/osdc/Objecter.cc:3502
#9  0x000064a6ef115cab in Objecter::ms_dispatch (this=0x13f507c75700, m=0x13f50820eb00) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/osdc/Objecter.cc:966
#10 0x000064a6ef1347d7 in non-virtual thunk to Objecter::ms_fast_dispatch(Message*) () at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/osdc/Objecter.h:2110
#11 0x000064a6ef0e5440 in Dispatcher::ms_fast_dispatch2 (this=0x13f507c75708, m=...) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/common/RefCountedObj.h:41
#12 0x000064a6e67d530f in Messenger::ms_fast_dispatch (m=..., this=<optimized out>) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8_build/boost/include/boost/container/new_allocator.hpp:165
#13 DispatchQueue::fast_dispatch (this=0x13f506fcf558, m=...) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/DispatchQueue.cc:72
#14 0x000064a6e68fa75e in DispatchQueue::fast_dispatch (m=0x13f50820eb00, this=<optimized out>) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8_build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:67
#15 ProtocolV2::handle_message (this=<optimized out>) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/ProtocolV2.cc:1501
#16 0x000064a6e690cec8 in ProtocolV2::handle_read_frame_dispatch (this=0x13f5080fb700) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/ProtocolV2.cc:1128
#17 0x000064a6e690d245 in ProtocolV2::handle_read_frame_epilogue_main (this=0x13f5080fb700, buffer=..., r=<optimized out>) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/ProtocolV2.cc:1352
#18 0x000064a6e68f203f in ProtocolV2::run_continuation (this=0x13f5080fb700, continuation=...) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/ProtocolV2.cc:45
#19 0x000064a6e68b898e in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=<optimized out>, __args#0=<optimized out>, this=0x13f508165610) at /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/include/g++-v9/bits/std_function.h:685
#20 AsyncConnection::process (this=0x13f508165200) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/AsyncConnection.cc:450
#21 0x000064a6e69181b8 in EventCenter::process_events (this=this@entry=0x13f506f53700, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x64a6e4bb7d28) at /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/include/g++-v9/bits/basic_ios.h:282
#22 0x000064a6e691e25c in NetworkStack::<lambda()>::operator() (__closure=0x13f506f93808, __closure=0x13f506f93808) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8/src/msg/async/Stack.cc:53
#23 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/include/g++-v9/bits/std_function.h:300
#24 0x000064a6e5ca2130 in std::execute_native_thread_routine (__p=0x13f506f93800) at /var/tmp/portage/sys-devel/gcc-9.2.0-r2/work/gcc-9.2.0/libstdc++-v3/src/c++11/thread.cc:80
#25 0x000064a6e5e2c482 in start_thread (arg=<optimized out>) at pthread_create.c:486
#26 0x000064a6e5abc5cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Running a reproducer load on version v14.2.8, consisting of 10GB object PUTs with a ceph.conf similar to the one provided above,
the segmentation fault does not reproduce.
The difference between the civetweb and beast frontends under load is that the memory footprint of the beast frontend is larger.
A clue that this is related to memory pressure is the following line from the segmentation fault:
#12 0x000064a6e67d530f in Messenger::ms_fast_dispatch (m=..., this=<optimized out>) at /var/tmp/portage/sys-cluster/ceph-14.2.8/work/ceph-14.2.8_build/boost/include/boost/container/new_allocator.hpp:165
To investigate whether the segmentation fault was a result of memory pressure, there are debug steps that can assist in advancing the investigation:
- provide a sosreport, for information about the system spec such as the total memory and available memory during usage,
- and/or provide atop or sar system load logs (from /var/log/sa/ or /var/log/atop/) from the time surrounding the segfault reproduction.
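As a sketch of the kind of data being asked for, the memory headroom figures can be captured directly from /proc/meminfo even when sar/atop history is unavailable (the sar/atop invocations are the standard ones; the exact file names under /var/log/sa/ and /var/log/atop/ vary by distribution):

```shell
# Snapshot of total vs. available memory (kB) -- the figures relevant to
# a memory-pressure hypothesis.
grep -E '^(MemTotal|MemAvailable):' /proc/meminfo

# Historical data, if sysstat / atop are collecting (file names vary):
#   sar -r -f /var/log/sa/sa<DD>
#   atop -r /var/log/atop/atop_<YYYYMMDD>
```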
#11 Updated by JuanJose Galvez over 3 years ago
Mark Kogan wrote:
Posting the callstacks from the attached logs:
[...][...]
Running a reproducer load on version v14.2.8, consisting of 10GB object PUTs with a ceph.conf similar to the one provided above,
the segmentation fault does not reproduce.
The difference between the civetweb and beast frontends under load is that the memory footprint of the beast frontend is larger.
A clue that this is related to memory pressure is the following line from the segmentation fault:
[...]
To investigate whether the segmentation fault was a result of memory pressure, there are debug steps that can assist in advancing the investigation:
- provide a sosreport, for information about the system spec such as the total memory and available memory during usage,
- and/or provide atop or sar system load logs (from /var/log/sa/ or /var/log/atop/) from the time surrounding the segfault reproduction.
Unfortunately the sa and atop logs don't go back that far for us; however, we do have Prometheus data. I can confirm that these nodes all have 128GB and run only two instances of radosgw each (s3api and s3website). The lowest dip in memory over the last month was 117GB available. I suppose a large spike could have occurred in the time between Prometheus scrapes, but it does seem unlikely.
Is there a combination of debug options that can provide greater detail of the activity going on in the RGW leading to the crash?
Also, I've noted a few grsec messages in our dmesg logs; they are unrelated (ntpdate), but do you know if there is anything RGW-related we should watch for when grsecurity is enabled?
#12 Updated by Mark Kogan over 3 years ago
Thank you,
a couple of questions to help pin down the source of the problem:
- Under which OS and OS version is it running?
- Is SELinux or an equivalent active?
- Is radosgw installed from packages (rpm/deb, which one?) or compiled on the machine?
- tcmalloc package name & version?
- Is it running in a Docker containerized environment?
- Is the radosgw process constrained by cgroups (systemd unit configuration)?
- Please paste the content of 'cat /proc/$(pidof radosgw)/limits'
#13 Updated by Patrick McLean over 3 years ago
- Under which OS and OS version is it running?
It's running under Gentoo, which is a rolling distro with no specific version.
- Is SELinux or an equivalent active?
No SELinux or equivalent; it does run under grsecurity normally, but we can reproduce this exactly the same without it enabled.
- Is radosgw installed from packages (rpm/deb, which one?) or compiled on the machine?
It's installed via the Gentoo package; it's compiled on a local build server.
- tcmalloc package name & version
The Gentoo google-perftools package version google-perftools-2.7-r1
The source of the code this package uses is https://github.com/gperftools/gperftools
- Is it running in a docker containerized environment?
No.
- Is the radosgw process constrained by cgroups (systemd unit configuration)?
No, it's running under OpenRC with zero control group constraints.
- Please paste the content of 'cat /proc/$(pidof radosgw)/limits'
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 2097152 2097152 processes
Max open files 1048576 1048576 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 513634 513634 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
#14 Updated by Mark Kogan over 3 years ago
Patrick McLean wrote:
- Is radosgw installed from packages (rpm/deb, which one?) or compiled on the machine?
It's installed via the Gentoo package, it's compiled on a local build server
This may be related to the Gentoo compilation flags.
Is it possible to provide the optimization flags that are used in the Gentoo compilation?
(I will try to compile with the same flags; was it compiled with gcc or clang?)
PS: found this, but it does not contain compilation flags:
https://wiki.gentoo.org/wiki/Ceph/Installation#Installation --> https://packages.gentoo.org/packages/sys-cluster/ceph --> https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-cluster/ceph/ceph-15.2.3.ebuild
#15 Updated by Patrick McLean over 3 years ago
The compilation flags on Gentoo are user selected.
These hosts are being compiled with gcc-9.3.0 with the flags "-O2 -march=znver1 -mindirect-branch=thunk -frecord-gcc-switches -fno-omit-frame-pointer -g -pipe"
We have tried removing "-mindirect-branch=thunk" which had no effect on this issue.
#16 Updated by Patrick McLean over 3 years ago
I will also note that I am the maintainer of the package in Gentoo.
If there are compile flags that should be avoided, I can have the ebuild filter them out automatically
#17 Updated by Mark Kogan over 3 years ago
Patrick McLean wrote:
The compilation flags on Gentoo are user selected.
These hosts are being compiled with gcc-9.3.0 with the flags "-O2 -march=znver1 -mindirect-branch=thunk -frecord-gcc-switches -fno-omit-frame-pointer -g -pipe"
We have tried removing "-mindirect-branch=thunk" which had no effect on this issue.
Updating - in the process of looking for a machine with an AMD Epyc processor to work on reproducing with "-march=znver1"
#18 Updated by Mark Kogan over 3 years ago
Updating, the seg fault did not reproduce on
lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             1
Vendor ID:             AuthenticAMD
CPU family:            23
Model:                 1
Model name:            AMD Ryzen 7 1700 Eight-Core Processor
Stepping:              1
CPU MHz:               3000.047
BogoMIPS:              6000.09
Hypervisor vendor:     Microsoft
Virtualization type:   full
L1d cache:             32K
L1i cache:             64K
L2 cache:              512K
L3 cache:              8192K
and vstart environment built with
-DCMAKE_CXX_FLAGS="-O2 -march=znver1 -mindirect-branch=thunk -frecord-gcc-switches -fno-omit-frame-pointer -g -pipe"
and
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
#19 Updated by Mark Kogan over 3 years ago
Is it possible to please check whether this reproduces with version 15.x or master on the reproducing environment?
#20 Updated by Sage Weil over 2 years ago
- Project changed from Ceph to rgw