Bug #57163 (closed)
free(): invalid pointer
Added by Nitzan Mordechai over 1 year ago. Updated over 1 year ago.
Description
2022-08-16T19:03:26.666 INFO:tasks.workunit.client.0.smithi195.stderr:+ for b in $BINARIES_TO_RUN
2022-08-16T19:03:26.667 INFO:tasks.workunit.client.0.smithi195.stderr:+ ./hello_world_cpp -c /etc/ceph/ceph.conf
2022-08-16T19:03:26.686 INFO:tasks.workunit.client.0.smithi195.stdout:we just set up a rados cluster object
2022-08-16T19:03:26.688 INFO:tasks.workunit.client.0.smithi195.stdout:we just parsed our config options
2022-08-16T19:03:26.696 INFO:tasks.workunit.client.0.smithi195.stderr:free(): invalid pointer
2022-08-16T19:03:26.717 INFO:tasks.workunit.client.0.smithi195.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_librados_build.sh: line 65: 16668 Aborted (core dumped) ./$b -c /etc/ceph/ceph.conf
2022-08-16T19:03:26.717 INFO:tasks.workunit.client.0.smithi195.stderr:+ cleanup
/home/teuthworker/archive/lflores-2022-08-16_18:51:57-rados:singleton-nomsgr-wip-yuri4-testing-2022-08-15-0951-distro-default-smithi/6975747
Updated by Kefu Chai over 1 year ago
/a/kchai-2022-08-23_13:19:39-rados-wip-kefu-testing-2022-08-22-2243-distro-default-smithi/6987883/teuthology.log
Updated by Laura Flores over 1 year ago
This failure (as of right now) only occurs on Ubuntu 20.04. See https://github.com/ceph/ceph/pull/47642 for some examples.
Updated by Laura Flores over 1 year ago
Kefu Chai wrote:
/a/kchai-2022-08-23_13:19:39-rados-wip-kefu-testing-2022-08-22-2243-distro-default-smithi/6987883/teuthology.log
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f5656e26859 in __GI_abort () at abort.c:79
#2 0x00007f5656e9126e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f5656fbb298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f5656e992fc in malloc_printerr (str=str@entry=0x7f5656fb94c1 "free(): invalid pointer") at malloc.c:5347
#4 0x00007f5656e9ab2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5 0x00007f56570cd8ca in std::locale::_Impl::~_Impl() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f56570cdb17 in std::locale::~locale() () from /lib/x86_64-linux-gnu/libstdc++.so.6
... many question marks
#29 0x00007f56565ffb4a in ?? () from /usr/lib/ceph/libceph-common.so.2
#30 0x00007f5656b8706d in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#34 0x00007f5656b8707a in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#38 0x00007f5656b8708b in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#42 0x00007f5656b87098 in ?? () from /usr/lib/ceph/libceph-common.so.2
#43 0x00000000000001e8 in ?? ()
#44 0x0000000000000102 in ?? ()
#45 0x000000000000000f in ?? ()
#46 0x00007f5656b870a7 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#50 0x00007f5656b75c63 in typeinfo name for boost::spirit::qi::detail::parser_binder<boost::spirit::qi::expect_operator<boost::fusion::cons<boost::spirit::qi::alternative<boost::fusion::cons<boost::spirit::qi::action<boost::spirit::qi::reference<boost::spirit::qi::rule<boost::spirit::line_pos_iterator<char const*>, conf_line_t (), boost::spirit::qi::rule<boost::spirit::line_pos_iterator<char const*>, boost::spirit::unused_type, boost::spirit::unused_type, boost::spirit::unused_type, boost::spirit::unused_type>, boost::spirit::unused_type, boost::spirit::unused_type> const>, boost::phoenix::actor<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::assign, boost::proto::argsns_::list2<boost::phoenix::actor<boost::spirit::attribute<0> >, boost::phoenix::actor<boost::proto::exprns_::basic_expr<boost::phoenix::tag::construct, boost::proto::argsns_::list2<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::terminal, boost::proto::argsns_::term<boost::phoenix::detail::target<ConfFile> >, 0l>, boost::phoenix::actor<boost::spirit::argument<0> > >, 2l> > >, 2l> > >, boost::fusion::cons<boost::spirit::qi::sequence<boost::fusion::cons<boost::spirit::qi::kleene<boost::spirit::qi::eol_parser>, boost::fusion::cons<boost::spirit::qi::action<boost::spirit::qi::kleene<boost::spirit::qi::reference<boost::spirit::qi::rule<boost::spirit::line_pos_iterator<char const*>, conf_section_t (), boost::spirit::qi::rule<boost::spirit::line_pos_iterator<char const*>, boost::spirit::unused_type, boost::spirit::unused_type, boost::spirit::unused_type, boost::spirit::unused_type>, boost::spirit::unused_type, boost::spirit::unused_type> const> >, boost::phoenix::actor<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::assign, boost::proto::argsns_::list2<boost::phoenix::actor<boost::spirit::attribute<0> >, boost::phoenix::actor<boost::proto::exprns_::basic_expr<boost::phoenix::tag::construct, 
boost::proto::argsns_::list2<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::terminal, boost::proto::argsns_::term<boost::phoenix::detail::target<ConfFile> >, 0l>, boost::phoenix::actor<boost::spirit::argument<0> > >, 2l> > >, 2l> > >, boost::fusion::nil_> > >, boost::fusion::nil_> > >, boost::fusion::cons<boost::spirit::qi::eoi_parser, boost::fusion::nil_> > >, mpl_::bool_<false> > () from /usr/lib/ceph/libceph-common.so.2
... question marks
#54 0x00007f5656b870b7 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#58 0x00007f5656b870cf in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#62 0x00007f5656b7ceaa in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#66 0x00007f5656b870d7 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#70 0x00007f5656b870e9 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#74 0x00007f5656b870f3 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
... question marks
#78 0x00007f5656b870fb in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#82 0x00007f5656b87104 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#86 0x00007f5656b87117 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#90 0x00007f5656b87120 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#94 0x00007f5656b87126 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#98 0x00007f5656b8713c in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#102 0x00007f5656b87146 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#106 0x00007f5656b87158 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#110 0x00007f5656b87169 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#114 0x00007f5656b8a043 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#118 0x00007f5656b7edfb in ?? () from /usr/lib/ceph/libceph-common.so.2
#119 0x00000000000003e0 in ?? ()
... question marks
#122 0x00007f5656b7ee07 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#126 0x00007f5656b8a02b in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#130 0x00007f5656b89ff7 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#134 0x00007f5656b7eddb in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#138 0x00007f5656b8a011 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#142 0x00007f5656b7edae in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#146 0x00007f5656b827bb in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#150 0x00007f5656b8717a in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#154 0x00007f5656b8a073 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#158 0x00007f5656b7ee16 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#162 0x00007f5656b7ee25 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
... question marks
#166 0x00007f5656b7ee36 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#170 0x00007f5656b78144 in typeinfo name for boost::iostreams::access_control<boost::iostreams::detail::chain_client<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> > >, boost::iostreams::public_, boost::iostreams::detail::pub_<boost::iostreams::detail::chain_client<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> > > > > () from /usr/lib/ceph/libceph-common.so.2
... question marks
#174 0x00007f5656b78155 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#178 0x00007f5656b7817c in typeinfo name for boost::iostreams::detail::filtering_stream_base<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> >, boost::iostreams::public_> () from /usr/lib/ceph/libceph-common.so.2
... question marks
#182 0x00007f5656b78164 in typeinfo name for boost::iostreams::detail::filtering_stream_base<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> >, boost::iostreams::public_> () from /usr/lib/ceph/libceph-common.so.2
... question marks
#186 0x00007f5656b781a1 in typeinfo name for boost::iostreams::detail::filtering_stream_base<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> >, boost::iostreams::public_> () from /usr/lib/ceph/libceph-common.so.2
... question marks
#190 0x00007f5656b781b6 in typeinfo name for boost::iostreams::detail::filtering_stream_base<boost::iostreams::chain<boost::iostreams::output, char, std::char_traits<char>, std::allocator<char> >, boost::iostreams::public_> () from /usr/lib/ceph/libceph-common.so.2
... question marks
#194 0x00007f5656b7eef0 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#198 0x00007f5656b87192 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#202 0x00007f5656b871a7 in ?? () from /usr/lib/ceph/libceph-common.so.2
... question marks
#204 0x00007f56565fef00 in get_rbd_options() () from /usr/lib/ceph/libceph-common.so.2
Backtrace stopped: Cannot access memory at address 0x7ffe87531c58
Updated by Matan Breizman over 1 year ago
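A local run with a sanitizer enabled reports a data race at the same stage, which may be relevant. (The output below comes from ThreadSanitizer, i.e. -fsanitize=thread, rather than -fsanitize=address.)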
./hello_world_cpp -c ~/ceph/build/ceph.conf
we just set up a rados cluster object
we just parsed our config options
==================
WARNING: ThreadSanitizer: data race (pid=1144724)
Read of size 8 at 0x7b100003a0c0 by thread T3:
#0 memcmp <null> (libtsan.so.0+0x5077a)
#1 std::char_traits<char>::compare(char const*, char const*, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/char_traits.h:389 (libcep$
-common.so.2+0x48c999)
#2 std::basic_string_view<char, std::char_traits<char> >::compare(std::basic_string_view<char, std::char_traits<char> >) const /opt/rh/gcc-toolset-11/roo$
/usr/include/c++/11/string_view:315 (libceph-common.so.2+0x48c999)
#3 _ZStssIcSt11char_traitsIcEEDTcl21__char_traits_cmp_catIT0_ELi0EEESt17basic_string_viewIT_S2_ENSt15__type_identityIS6_E4typeE /opt/rh/gcc-toolset-11/ro$
t/usr/include/c++/11/string_view:560 (libceph-common.so.2+0x48c999)
#4 bool ceph::common::CephContext::associated_objs_cmp::operator()<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::$
asic_string_view<char, std::char_traits<char> > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>
const&, std::pair<std::basic_string_view<char, std::char_traits<char> >, std::type_index> const&) const ../src/common/ceph_context.h:342 (libceph-common.so.2$
0x48c999)
#5 std::_Rb_tree_const_iterator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> con$
t, ceph::immobile_any<576ul> > > std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, s$
d::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::$
Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576$
l> > >, ceph::common::CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allo$
ator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_lower_bound_tr<std::pair<std::basic_string_view<char, std::char_traits<char> >, std$
:type_index>, void>(std::pair<std::basic_string_view<char, std::char_traits<char> >, std::type_index> const&) const /opt/rh/gcc-toolset-11/root/usr/include/c+
+/11/bits/stl_tree.h:1337 (libceph-common.so.2+0x48c999)
#6 std::_Rb_tree_const_iterator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > std::_Rb_tree<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index>, std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> >, std::_Select1st<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > >, ceph::common::CephContext::associated_objs_cmp, std::allocator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::type_index> const, ceph::immobile_any<576ul> > > >::_M_find_tr<std::pair<std::basic_string_view<char, std::char_traits<char> >, std::type_index>, void>(std::pair<std::basic_string_view<char, std::char_traits<char> >, std::type_index> const&) const /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/stl_tree.h:1305 (libceph-common.so.2+0x48c999)
Previous write of size 8 at 0x7b100003a0c0 by thread T2:
#0 operator new(unsigned long) <null> (libtsan.so.0+0x7583a)
#1 __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/ext/new_allocator.h:127 (libceph-common.so.2+0x3b4e24)
#2 std::allocator<char>::allocate(unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/allocator.h:185 (libceph-common.so.2+0x3b4e24)
#3 std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/alloc_traits.h:464 (libceph-common.so.2+0x3b4e24)
#4 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/basic_string.tcc:153 (libceph-common.so.2+0x3b4e24)
Location is heap block of size 50 at 0x7b100003a0c0 allocated by thread T2:
#0 operator new(unsigned long) <null> (libtsan.so.0+0x7583a)
#1 __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/ext/new_allocator.h:127 (libceph-common.so.2+0x3b4e24)
#2 std::allocator<char>::allocate(unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/allocator.h:185 (libceph-common.so.2+0x3b4e24)
#3 std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/alloc_traits.h:464 (libceph-common.so.2+0x3b4e24)
#4 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/basic_string.tcc:153 (libceph-common.so.2+0x3b4e24)
Thread T3 'msgr-worker-1' (tid=1144739, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2cd42)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xc2e88)
#2 __libc_start_main <null> (libc.so.6+0x3aca2)
Thread T2 'msgr-worker-0' (tid=1144738, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2cd42)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xc2e88)
#2 __libc_start_main <null> (libc.so.6+0x3aca2)
SUMMARY: ThreadSanitizer: data race (/lib64/libtsan.so.0+0x5077a) in memcmp
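For illustration only: the report above pairs a key comparison during a map lookup (thread T3, inside _M_find_tr) with a string allocation on another thread (T2). Whether this is a genuine race in Ceph or a report caused by partially instrumented code is not established in this thread. The sketch below shows the general pattern of serializing lookup and insertion on a name-keyed map. Registry, lookup_or_create and size are hypothetical names, not Ceph's API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <string_view>

// Hypothetical registry; the names are illustrative and not Ceph's API.
class Registry {
 public:
  // Returns the value stored under `name`, inserting `init` on first use.
  // Lookup and insertion share one critical section, so no thread can
  // compare against a key while another thread is still inserting it.
  int lookup_or_create(std::string_view name, int init) {
    std::lock_guard<std::mutex> guard(mu_);
    auto it = objs_.find(name);  // heterogeneous lookup via std::less<>
    if (it == objs_.end()) {
      it = objs_.emplace(std::string(name), init).first;
    }
    return it->second;
  }

  // Number of registered entries, read under the same lock.
  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mu_);
    return objs_.size();
  }

 private:
  mutable std::mutex mu_;
  std::map<std::string, int, std::less<>> objs_;
};
```

A single mutex is the simplest choice here. A reader/writer lock would allow concurrent lookups, at the cost of re-checking the map after upgrading to the exclusive lock.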
Updated by Laura Flores over 1 year ago
/a/yuriw-2022-08-22_20:21:58-rados-wip-yuri11-testing-2022-08-22-1005-distro-default-smithi/6986255
2022-08-22T21:46:09.687 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/rocksdb-shared.dir/utilities/env_librados.cc.o
2022-08-22T21:46:10.346 INFO:tasks.workunit.client.0.smithi165.stderr:/home/ubuntu/cephtest/mnt.0/client.0/tmp/rocksdb/utilities/env_librados.cc: In member function 'virtual rocksdb::Status rocksdb::LibradosWritableFile::Truncate(uint64_t)':
2022-08-22T21:46:10.347 INFO:tasks.workunit.client.0.smithi165.stderr:/home/ubuntu/cephtest/mnt.0/client.0/tmp/rocksdb/utilities/env_librados.cc:380:24: warning: 'void ceph::buffer::v15_2_0::list::claim(ceph::buffer::v15_2_0::list&)' is deprecated: in favor of operator=(list&&) [-Wdeprecated-declarations]
2022-08-22T21:46:10.347 INFO:tasks.workunit.client.0.smithi165.stderr: 380 | tmp.claim(_buffer);
2022-08-22T21:46:10.347 INFO:tasks.workunit.client.0.smithi165.stderr: | ^
2022-08-22T21:46:10.348 INFO:tasks.workunit.client.0.smithi165.stderr:In file included from /usr/include/rados/librados.hpp:11,
2022-08-22T21:46:10.348 INFO:tasks.workunit.client.0.smithi165.stderr: from /home/ubuntu/cephtest/mnt.0/client.0/tmp/rocksdb/include/rocksdb/utilities/env_librados.h:14,
2022-08-22T21:46:10.348 INFO:tasks.workunit.client.0.smithi165.stderr: from /home/ubuntu/cephtest/mnt.0/client.0/tmp/rocksdb/utilities/env_librados.cc:4:
2022-08-22T21:46:10.348 INFO:tasks.workunit.client.0.smithi165.stderr:/usr/include/rados/buffer.h:1082:58: note: declared here
2022-08-22T21:46:10.349 INFO:tasks.workunit.client.0.smithi165.stderr: 1082 | [[deprecated("in favor of operator=(list&&)")]] void claim(list& bl) {
2022-08-22T21:46:10.349 INFO:tasks.workunit.client.0.smithi165.stderr: | ^~~~~
2022-08-22T21:46:12.388 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Linking CXX shared library librocksdb.so
2022-08-22T21:46:18.058 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Built target rocksdb-shared
2022-08-22T21:46:18.094 INFO:tasks.workunit.client.0.smithi165.stdout:Scanning dependencies of target testutillib
2022-08-22T21:46:18.119 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/testutillib.dir/monitoring/thread_status_updater_debug.cc.o
2022-08-22T21:46:18.119 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/testutillib.dir/db/db_test_util.cc.o
2022-08-22T21:46:18.119 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/testutillib.dir/utilities/cassandra/test_utils.cc.o
2022-08-22T21:46:18.120 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/testutillib.dir/table/mock_table.cc.o
2022-08-22T21:46:24.285 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Linking CXX static library libtestutillib.a
2022-08-22T21:46:24.505 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Built target testutillib
2022-08-22T21:46:24.528 INFO:tasks.workunit.client.0.smithi165.stdout:Scanning dependencies of target rocksdb_env_librados_test
2022-08-22T21:46:24.540 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Building CXX object CMakeFiles/rocksdb_env_librados_test.dir/utilities/env_librados_test.cc.o
2022-08-22T21:46:26.983 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Linking CXX executable env_librados_test
2022-08-22T21:46:27.445 INFO:tasks.workunit.client.0.smithi165.stdout:[100%] Built target rocksdb_env_librados_test
2022-08-22T21:46:27.455 INFO:tasks.workunit.client.0.smithi165.stdout:Copy ceph.conf
2022-08-22T21:46:27.456 INFO:tasks.workunit.client.0.smithi165.stderr:+ echo 'Copy ceph.conf'
2022-08-22T21:46:27.457 INFO:tasks.workunit.client.0.smithi165.stderr:+ mkdir -p ../ceph/src/
2022-08-22T21:46:27.457 INFO:tasks.workunit.client.0.smithi165.stderr:+ '[' -f /etc/ceph/ceph.conf ']'
2022-08-22T21:46:27.458 INFO:tasks.workunit.client.0.smithi165.stderr:+ cp /etc/ceph/ceph.conf ../ceph/src/
2022-08-22T21:46:27.458 INFO:tasks.workunit.client.0.smithi165.stderr:+ echo 'Run EnvLibrados test'
2022-08-22T21:46:27.459 INFO:tasks.workunit.client.0.smithi165.stdout:Run EnvLibrados test
2022-08-22T21:46:27.460 INFO:tasks.workunit.client.0.smithi165.stderr:+ '[' -f ../ceph/src/ceph.conf ']'
2022-08-22T21:46:27.460 INFO:tasks.workunit.client.0.smithi165.stderr:+ cp env_librados_test /home/ubuntu/cephtest/archive
2022-08-22T21:46:27.465 INFO:tasks.workunit.client.0.smithi165.stderr:+ ./env_librados_test
2022-08-22T21:46:27.489 INFO:tasks.workunit.client.0.smithi165.stdout:[==========] Running 16 tests from 2 test cases.
2022-08-22T21:46:27.490 INFO:tasks.workunit.client.0.smithi165.stdout:[----------] Global test environment set-up.
2022-08-22T21:46:27.490 INFO:tasks.workunit.client.0.smithi165.stdout:[----------] 12 tests from EnvLibradosTest
2022-08-22T21:46:27.490 INFO:tasks.workunit.client.0.smithi165.stdout:[ RUN ] EnvLibradosTest.Basics
2022-08-22T21:46:27.491 INFO:tasks.workunit.client.0.smithi165.stderr:free(): invalid pointer
2022-08-22T21:46:27.506 DEBUG:teuthology.orchestra.run:got remote process result: 134
2022-08-22T21:46:27.508 INFO:tasks.workunit.client.0.smithi165.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_envlibrados_for_rocksdb.sh: line 96: 16240 Aborted (core dumped) ./env_librados_test
gdb shows:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fe4fb50f859 in __GI_abort () at abort.c:79
#2 0x00007fe4fb57a26e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fe4fb6a4298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007fe4fb5822fc in malloc_printerr (str=str@entry=0x7fe4fb6a24c1 "free(): invalid pointer") at malloc.c:5347
#4 0x00007fe4fb583b2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5 0x00007fe4fb7b68ca in char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fe4fb8ecc10 in ?? () from /lib/x86_64-linux-gnu/libpthread.so.0
#7 0x00007ffcb2f962f0 in ?? ()
#8 0x00007fe4fb7b6b17 in std::string::replace(unsigned long, unsigned long, char const*, unsigned long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x000055c3dd70e1d0 in ?? ()
#10 0x00007ffcb2f962b0 in ?? ()
#11 0x00007fe4fcbe5b01 in ?? ()
#12 0x000055c3dd70e1d0 in ?? ()
#13 0x000055c3dd70e1d0 in ?? ()
#14 0x00007ffcb2f962e0 in ?? ()
#15 0x00007fe4fcbe589d in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x000055c3dd70e1c0 in ?? ()
#18 0x00007ffcb2f964a0 in ?? ()
#19 0x000055c3dd684850 in ?? ()
#20 0x000055c3dd70e1c0 in ?? ()
#21 0x00007fe4face8cda in ?? () from /usr/lib/ceph/libceph-common.so.2
#22 0x00007fe4fac44016 in Option::dump_value(char const*, std::variant<std::monostate, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long, long, double, bool, entity_addr_t, entity_addrvec_t, std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000l> >, Option::size_t, uuid_d> const&, ceph::Formatter*) const () from /usr/lib/ceph/libceph-common.so.2
#23 0x00007ffcb2f96510 in ?? ()
#24 0x00007ffcb2f96438 in ?? ()
#25 0x00007ffcb2f96480 in ?? ()
#26 0x0000000100000000 in ?? ()
#27 0x0000000000000000 in ?? ()
Updated by Laura Flores over 1 year ago
- Priority changed from Normal to Urgent
Maybe "urgent" is too dramatic, but this seems to be affecting a lot of tests in main.
Updated by Radoslaw Zarzynski over 1 year ago
- Priority changed from Urgent to High
How about having it as High?
Updated by Laura Flores over 1 year ago
- Tags set to test-failure
Updated by Laura Flores over 1 year ago
/a/lflores-2022-08-17_21:04:23-rados:singleton-nomsgr-wip-yuri4-testing-2022-08-15-0951-distro-default-smithi/6977853
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fb242eba859 in __GI_abort () at abort.c:79
#2 0x00007fb242f2526e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fb24304f298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007fb242f2d2fc in malloc_printerr (str=str@entry=0x7fb24304d4c1 "free(): invalid pointer") at malloc.c:5347
#4 0x00007fb242f2eb2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5 0x00007fb2431618ca in char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fb23e01a908 in ?? ()
#7 0x00007fb242e3d610 in ?? () from /usr/lib/ceph/libceph-common.so.2
#8 0x00007fb243161b17 in std::string::replace(unsigned long, unsigned long, char const*, unsigned long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x0000000000000000 in ?? ()
Both this backtrace and the one above show that "std::string::replace" is involved. And judging from Matan's -fsanitize output, a memory comparison (memcmp) is part of the path that misbehaves. These are definite clues.
Updated by Brad Hubbard over 1 year ago
I did an interactive rerun of /a/lflores-2022-08-17_21:04:23-rados:singleton-nomsgr-wip-yuri4-testing-2022-08-15-0951-distro-default-smithi/6977853 and manually ran the failing binary with valgrind.
# valgrind --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --tool=memcheck --leak-check=full --num-callers=50 -v --log-file=leaky.log ./hello_world_cpp -c /etc/ceph/ceph.conf
This produces the following clues.
==25304== ERROR SUMMARY: 15 errors from 8 contexts (suppressed: 0 from 0)
==25304==
==25304== 1 errors in context 1 of 8:
==25304== Conditional jump or move depends on uninitialised value(s)
==25304==    at 0x483EF58: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4D16BCD: std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x10C75C: main (hello_world.cc:272)
==25304== Uninitialised value was created by a heap allocation
==25304==    at 0x483E0F0: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x483E212: posix_memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x48F20A7: ceph::buffer::v15_2_0::list::refill_append_space(unsigned int) (in /usr/lib/librados.so.2.0.0)
==25304==    by 0x48F2281: ceph::buffer::v15_2_0::list::append(char const*, unsigned int) (in /usr/lib/librados.so.2.0.0)
==25304==    by 0x10CFF2: ceph::buffer::v15_2_0::list::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (buffer.h:1128)
==25304==    by 0x10C57D: main (hello_world.cc:254)
==25304==
==25304==
==25304== 1 errors in context 2 of 8:
==25304== Conditional jump or move depends on uninitialised value(s)
==25304==    at 0x483EF58: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4D16BCD: std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x10C2CB: main (hello_world.cc:216)
==25304== Uninitialised value was created by a heap allocation
==25304==    at 0x483E0F0: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x483E212: posix_memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x48F20A7: ceph::buffer::v15_2_0::list::refill_append_space(unsigned int) (in /usr/lib/librados.so.2.0.0)
==25304==    by 0x48F2281: ceph::buffer::v15_2_0::list::append(char const*, unsigned int) (in /usr/lib/librados.so.2.0.0)
==25304==    by 0x10CFF2: ceph::buffer::v15_2_0::list::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (buffer.h:1128)
==25304==    by 0x10C134: main (hello_world.cc:202)
==25304==
==25304==
==25304== 1 errors in context 3 of 8:
==25304== Thread 6 service:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483CFBF: operator delete(void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d26780 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304==
==25304== 1 errors in context 4 of 8:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483D74F: operator delete[](void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4C9FB16: std::locale::~locale() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d26700 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304==
==25304== 1 errors in context 5 of 8:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483D74F: operator delete[](void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4C9F96D: std::locale::_Impl::~_Impl() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4C9FB16: std::locale::~locale() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d266e0 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304==
==25304== 1 errors in context 6 of 8:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483D74F: operator delete[](void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4C9F951: std::locale::_Impl::~_Impl() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4C9FB16: std::locale::~locale() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d263e0 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304==
==25304== 1 errors in context 7 of 8:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483D74F: operator delete[](void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4C9F8C9: std::locale::_Impl::~_Impl() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4C9FB16: std::locale::~locale() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d26560 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304==
==25304== 8 errors in context 8 of 8:
==25304== Invalid free() / delete / delete[] / realloc()
==25304==    at 0x483CFBF: operator delete(void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==25304==    by 0x4C9F9D9: std::locale::_Impl::~_Impl() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4C9FB16: std::locale::~locale() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==25304==    by 0x4AB3847: StackStringStream<4096ul>::~StackStringStream() (in /usr/lib/libradosstriper.so.1.0.0)
==25304==    by 0x53552B8: CachedStackStringStream::Cache::~Cache() (in /usr/lib/ceph/libceph-common.so.2)
==25304==    by 0x4E272BE: __call_tls_dtors (cxa_thread_atexit_impl.c:155)
==25304==    by 0x5D36616: start_thread (pthread_create.c:485)
==25304==    by 0x4EFF132: clone (clone.S:95)
==25304==  Address 0x5d258c0 is in the BSS segment of /usr/lib/ceph/libceph-common.so.2
==25304==
==25304== ERROR SUMMARY: 15 errors from 8 contexts (suppressed: 0 from 0)
I suspect it's the errors in the first two contexts that are causing the problem.
254     bl.append(hello);
255     bl.append("v3");
256     old_version_bl.clear();
257     old_version_bl.append('2');
258     version_bl.clear();
259     version_bl.append('3');
260     librados::ObjectWriteOperation update_op;
261     update_op.cmpxattr("version", LIBRADOS_CMPXATTR_OP_EQ, old_version_bl);
262     update_op.write_full(bl);
263     update_op.setxattr("version", version_bl);
264     ret = io_ctx.operate(object_name, &update_op);
265     if (ret < 0) {
266       std::cerr << "failed to do a compound write update! error " << ret
267                 << std::endl;
268       ret = EXIT_FAILURE;
269       goto out;
270     }
271     std::cout << "we overwrote our object " << object_name
272               << " following an xattr test with contents\n" << bl.c_str()
273               << std::endl;
274   }
195   /*
196    * And if we want to be really cool, we can do multiple things in a single
197    * atomic operation. For instance, we can update the contents of our object
198    * and set the version at the same time.
199    */
200   {
201     librados::bufferlist bl;
202     bl.append(hello);
203     bl.append("v2");
204     librados::ObjectWriteOperation write_op;
205     write_op.write_full(bl);
206     librados::bufferlist version_bl;
207     version_bl.append('2');
208     write_op.setxattr("version", version_bl);
209     ret = io_ctx.operate(object_name, &write_op);
210     if (ret < 0) {
211       std::cerr << "failed to do compound write! error " << ret << std::endl;
212       ret = EXIT_FAILURE;
213       goto out;
214     }
215     std::cout << "we overwrote our object " << object_name
216               << " with contents\n" << bl.c_str() << std::endl;
217   }
# g++ -std=c++11 -Wno-unused-parameter -Wall -Wextra -Werror -g -fsanitize=address -o hello_world_cpp hello_world.cc -lrados -lradosstriper
# ASAN_OPTIONS=alloc_dealloc_mismatch=0 ./hello_world_cpp -c /etc/ceph/ceph.conf
we just set up a rados cluster object
we just parsed our config options
=================================================================
==26544==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x7fe462f3e560 in thread T5 (ms_dispatch)
    #0 0x7fe46394f6ef in operator delete[](void*) ../../../../src/libsanitizer/asan/asan_new_delete.cc:168
    #1 0x7fe46338d8c9 in std::locale::_Impl::~_Impl() (/lib/x86_64-linux-gnu/libstdc++.so.6+0xbc8c9)
    #2 0x7fe46338db16 in std::locale::~locale() (/lib/x86_64-linux-gnu/libstdc++.so.6+0xbcb16)
    #3 0x7fe463532847 in std::basic_ios<char, std::char_traits<char> >::~basic_ios() /usr/include/c++/11/bits/basic_ios.h:282
    #4 0x7fe463532847 in StackStringStream<4096ul>::~StackStringStream() src/common/StackStringStream.h:100
    #5 0x7fe463532847 in StackStringStream<4096ul>::~StackStringStream() src/common/StackStringStream.h:100
    #6 0x7fe46256d2b8 in CachedStackStringStream::Cache::~Cache() (/usr/lib/ceph/libceph-common.so.2+0x3812b8)
    #7 0x7fe462fba2be in __GI___call_tls_dtors /build/glibc-SzIz7B/glibc-2.31/stdlib/cxa_thread_atexit_impl.c:155
    #8 0x7fe462f48616 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:485
    #9 0x7fe463092132 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11f132)

Address 0x7fe462f3e560 is a wild pointer.
SUMMARY: AddressSanitizer: bad-free ../../../../src/libsanitizer/asan/asan_new_delete.cc:168 in operator delete[](void*)
Thread T5 (ms_dispatch) created by T0 here:
    #0 0x7fe463879815 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cc:208
    #1 0x7fe4625ec693 in Thread::try_create(unsigned long) (/usr/lib/ceph/libceph-common.so.2+0x400693)

==26544==ABORTING
Not clear yet but we have lots of hints ;)
I'll try to come back to this Monday if not before.
Updated by Brad Hubbard over 1 year ago
Many thanks to Josh for suggesting we may be dealing with a compiler mismatch here and sorry if you were working on this too Matan (I heard you were speaking to Josh about it).
# g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
# g++ -std=c++11 -Wno-unused-parameter -Wall -Wextra -Werror -g -o hello_world_cpp hello_world.cc -lrados -lradosstriper
# ./hello_world_cpp -c /etc/ceph/ceph.conf
we just set up a rados cluster object
we just parsed our config options
free(): invalid pointer
Aborted
# add-apt-repository -y ppa:ubuntu-toolchain-r/test
# sudo apt -y install gcc-11 g++-11
# g++-11 -std=c++20 -Wno-unused-parameter -Wall -Wextra -Werror -g -o hello_world_cpp hello_world.cc -lrados -lradosstriper
# ./hello_world_cpp -c /etc/ceph/ceph.conf
we just set up a rados cluster object
we just parsed our config options
we just connected to the rados cluster
we just created a new pool named hello_world_pool
we just created an ioctx for our pool
we just wrote new object hello_object, with contents
hello world!
we read our object hello_object, and got back 12 bytes with contents
hello world!
we set the xattr 'version' on our object!
we overwrote our object hello_object with contents
hello world!v2
we just failed a write because the xattr wasn't as specified
we overwrote our object hello_object following an xattr test with contents
hello world!v3
So the hello_world code needs to be compiled with -std=c++20 (built here with g++-11), not with the stock g++ 9.4 and -std=c++11 that produced the abort.
Updated by Nitzan Mordechai over 1 year ago
/a/yuriw-2022-09-01_00:21:36-rados-wip-yuri7-testing-2022-08-31-0841-distro-default-smithi/7003413
Updated by Matan Breizman over 1 year ago
- Status changed from New to Fix Under Review
- Pull request ID set to 47900
Brad Hubbard wrote:
Many thanks to Josh for suggesting we may be dealing with a compiler mismatch here and sorry if you were working on this too Matan (I heard you were speaking to Josh about it).
No worries! It was worth mentioning that to Josh then :)
Updated by Matan Breizman over 1 year ago
- Status changed from Fix Under Review to Resolved
test_envlibrados_for_rocksdb failure will be tracked here: https://tracker.ceph.com/issues/57632