Bug #16908
closed
InvalidRead in OSD
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Seen on master today:
<error> <unique>0x0</unique> <tid>49</tid> <threadname>tp_osd_tp</threadname> <kind>InvalidRead</kind> <what>Invalid read of size 8</what> <stack> <frame> <ip>0x8EF764</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)</fn> </frame> <frame> <ip>0x8AB73B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)</fn> </frame> <frame> <ip>0x763FF4</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)</fn> </frame> <frame> <ip>0x76421C</ip> <obj>/usr/bin/ceph-osd</obj> <fn>PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)</fn> </frame> <frame> <ip>0x78523B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)</fn> </frame> <frame> <ip>0xD5DD64</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::shardedthreadpool_worker(unsigned int)</fn> </frame> <frame> <ip>0xD5FEBF</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::WorkThreadSharded::entry()</fn> </frame> <frame> <ip>0xBD6F181</ip> <obj>/lib/x86_64-linux-gnu/libpthread-2.19.so</obj> <fn>start_thread</fn> <dir>/build/buildd/eglibc-2.19/nptl</dir> <file>pthread_create.c</file> <line>312</line> </frame> <frame> <ip>0xD22247C</ip> <obj>/lib/x86_64-linux-gnu/libc-2.19.so</obj> <fn>clone</fn> <dir>/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64</dir> <file>clone.S</file> <line>111</line> </frame> </stack> <auxwhat>Address 0x353022b8 is 1,464 bytes inside a block of size 1,632 free'd</auxwhat> <stack> <frame> <ip>0xA22C2BC</ip> <obj>/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so</obj> <fn>operator delete(void*)</fn> </frame> <frame> <ip>0x8EB9E6</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)</fn> </frame> <frame> <ip>0x8EF6FE</ip> <obj>/usr/bin/ceph-osd</obj> 
<fn>ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)</fn> </frame> <frame> <ip>0x8AB73B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)</fn> </frame> <frame> <ip>0x763FF4</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)</fn> </frame> <frame> <ip>0x76421C</ip> <obj>/usr/bin/ceph-osd</obj> <fn>PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)</fn> </frame> <frame> <ip>0x78523B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)</fn> </frame> <frame> <ip>0xD5DD64</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::shardedthreadpool_worker(unsigned int)</fn> </frame> <frame> <ip>0xD5FEBF</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::WorkThreadSharded::entry()</fn> </frame> <frame> <ip>0xBD6F181</ip> <obj>/lib/x86_64-linux-gnu/libpthread-2.19.so</obj> <fn>start_thread</fn> <dir>/build/buildd/eglibc-2.19/nptl</dir> <file>pthread_create.c</file> <line>312</line> </frame> <frame> <ip>0xD22247C</ip> <obj>/lib/x86_64-linux-gnu/libc-2.19.so</obj> <fn>clone</fn> <dir>/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64</dir> <file>clone.S</file> <line>111</line> </frame> </stack> </error>
Updated by Samuel Just over 7 years ago
sjust@teuthology:/a/dzafman-2016-08-03_16:16:17-rados-wip-zafman-testing-distro-basic-smithi/348187/remote
Updated by Josh Durgin over 7 years ago
Still occurring - with a longer backtrace in /a/yuriw-2016-08-20_15:36:43-rados-master_2016_08_19-distro-basic-smithi/375963/remote/smithi026/log/valgrind/osd.0.log.gz:
<error> <unique>0x1</unique> <tid>55</tid> <threadname>tp_osd_tp</threadname> <kind>InvalidRead</kind> <what>Invalid read of size 8</what> <stack> <frame> <ip>0x6607A2</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>ReplicatedPG.cc</file> <line>2145</line> </frame> <frame> <ip>0x61BD26</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>ReplicatedPG.cc</file> <line>1480</line> </frame> <frame> <ip>0x4CADFC</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.cc</file> <line>8867</line> </frame> <frame> <ip>0x4CB04C</ip> <obj>/usr/bin/ceph-osd</obj> <fn>PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.cc</file> <line>163</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>internal_visit<std::shared_ptr<OpRequest> ></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>1017</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest> ></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>130</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest>, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>173</line> </frame> <frame> <ip>0x4EC66B</ip> 
<obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl<mpl_::int_<0>, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, std::shared_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, PGSnapTrim, boost::mpl::l_item<mpl_::long_<2l>, PGScrub, boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>256</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2326</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis> ></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2337</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>apply_visitor<PGQueueable::RunVis></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2360</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>apply_visitor<PGQueueable::RunVis, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery> ></fn> <dir>/usr/include/boost/variant/detail</dir> <file>apply_visitor_unary.hpp</file> <line>60</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>run</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.h</file> <line>410</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.cc</file> 
<line>8748</line> </frame> <frame> <ip>0xAF94E6</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::shardedthreadpool_worker(unsigned int)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/common</dir> <file>WorkQueue.cc</file> <line>356</line> </frame> <frame> <ip>0xAFB63F</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::WorkThreadSharded::entry()</fn> <dir>/usr/src/debug/ceph-11.0.0/src/common</dir> <file>WorkQueue.h</file> <line>685</line> </frame> <frame> <ip>0xD0A7DC4</ip> <obj>/usr/lib64/libpthread-2.17.so</obj> <fn>start_thread</fn> </frame> <frame> <ip>0xE1F821C</ip> <obj>/usr/lib64/libc-2.17.so</obj> <fn>clone</fn> </frame> </stack> <auxwhat>Address 0x3ab72588 is 1,464 bytes inside a block of size 1,632 free'd</auxwhat> <stack> <frame> <ip>0x9FD9131</ip> <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj> <fn>operator delete(void*)</fn> </frame> <frame> <ip>0x65D9C3</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>ReplicatedPG.cc</file> <line>3015</line> </frame> <frame> <ip>0x66073F</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>ReplicatedPG.cc</file> <line>2139</line> </frame> <frame> <ip>0x61BD26</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>ReplicatedPG.cc</file> <line>1480</line> </frame> <frame> <ip>0x4CADFC</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.cc</file> <line>8867</line> </frame> <frame> <ip>0x4CB04C</ip> <obj>/usr/bin/ceph-osd</obj> <fn>PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> 
<file>OSD.cc</file> <line>163</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>internal_visit<std::shared_ptr<OpRequest> ></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>1017</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest> ></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>130</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest>, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>173</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>visitation_impl<mpl_::int_<0>, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, std::shared_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, PGSnapTrim, boost::mpl::l_item<mpl_::long_<2l>, PGScrub, boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_></fn> <dir>/usr/include/boost/variant/detail</dir> <file>visitation_impl.hpp</file> <line>256</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2326</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> 
<fn>internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis> ></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2337</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>apply_visitor<PGQueueable::RunVis></fn> <dir>/usr/include/boost/variant</dir> <file>variant.hpp</file> <line>2360</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>apply_visitor<PGQueueable::RunVis, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery> ></fn> <dir>/usr/include/boost/variant/detail</dir> <file>apply_visitor_unary.hpp</file> <line>60</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>run</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.h</file> <line>410</line> </frame> <frame> <ip>0x4EC66B</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/osd</dir> <file>OSD.cc</file> <line>8748</line> </frame> <frame> <ip>0xAF94E6</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::shardedthreadpool_worker(unsigned int)</fn> <dir>/usr/src/debug/ceph-11.0.0/src/common</dir> <file>WorkQueue.cc</file> <line>356</line> </frame> <frame> <ip>0xAFB63F</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::WorkThreadSharded::entry()</fn> <dir>/usr/src/debug/ceph-11.0.0/src/common</dir> <file>WorkQueue.h</file> <line>685</line> </frame> <frame> <ip>0xD0A7DC4</ip> <obj>/usr/lib64/libpthread-2.17.so</obj> <fn>start_thread</fn> </frame> <frame> <ip>0xE1F821C</ip> <obj>/usr/lib64/libc-2.17.so</obj> <fn>clone</fn> </frame> </stack> </error>
Updated by Samuel Just over 7 years ago
The report from Josh seems to indicate that the issue is with the pending_reads access introduced in 52be772788d9d96accaa7af9eaf9f29a3792df49.
Updated by Samuel Just over 7 years ago
Yeah, in the error path, execute_ctx closes and deletes the OpContext, so the access that do_op makes to it afterwards is invalid.
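For readers unfamiliar with this kind of bug, here is a minimal sketch of the pattern the backtraces point at: a callee frees a context on its error path while the caller still reads from it. Everything below (`Ctx`, `ExecResult`, `execute`, `do_op`) is a hypothetical stand-in, not the actual ReplicatedPG code, and the ownership-based shape shown is only one possible way to avoid the read, not necessarily the fix that was applied.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an op context; NOT the real ReplicatedPG::OpContext.
struct Ctx {
  int pending_reads = 0;
  bool fail = false;
};

// Buggy shape (shown as a comment only, since running it is undefined behaviour):
//   void execute(Ctx* ctx) { if (ctx->fail) { delete ctx; return; } /* ... */ }
//   void do_op() {
//     Ctx* ctx = new Ctx;
//     execute(ctx);
//     if (ctx->pending_reads) { /* ... */ }  // invalid read if execute() deleted ctx
//   }

// One possible safer shape: the callee tells the caller whether the context survived.
enum class ExecResult { kept, consumed };

ExecResult execute(std::unique_ptr<Ctx>& ctx) {
  if (ctx->fail) {
    ctx.reset();                 // error path: context is released here
    return ExecResult::consumed;
  }
  ctx->pending_reads = 1;        // normal path: context stays alive
  return ExecResult::kept;
}

// Returns pending_reads, or -1 if the context was released on the error path.
int do_op(bool fail) {
  auto ctx = std::make_unique<Ctx>();
  ctx->fail = fail;
  if (execute(ctx) == ExecResult::consumed) {
    return -1;                   // never touch ctx after it has been released
  }
  return ctx->pending_reads;
}
```

The caller only dereferences the context when the callee reports that it is still alive, which removes the read that valgrind flags in the traces above.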
Updated by Samuel Just over 7 years ago
- Status changed from New to 7
- Assignee set to Samuel Just