Bug #52124
Invalid read of size 8 in handle_recovery_delete()
Description
<error> <unique>0x55a2b5</unique> <tid>54</tid> <threadname>tp_osd_tp</threadname> <kind>InvalidRead</kind> <what>Invalid read of size 8</what> <stack> <frame> <ip>0x156A06D</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ceph::common::RefCountedObject::put() const</fn> <dir>/usr/src/debug/ceph-17.0.0-6641.g626e0d0d.el8.x86_64/src/common</dir> <file>RefCountedObj.cc</file> <line>18</line> </frame> <frame> <ip>0x1000B60</ip> <obj>/usr/bin/ceph-osd</obj> <fn>intrusive_ptr_release</fn> <dir>/usr/src/debug/ceph-17.0.0-6641.g626e0d0d.el8.x86_64/src/common</dir> <file>RefCountedObj.h</file> <line>194</line> </frame> <frame> <ip>0x1000B60</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~intrusive_ptr</fn> <dir>/usr/src/debug/ceph-17.0.0-6641.g626e0d0d.el8.x86_64/x86_64-redhat-linux-gnu/boost/include/boost/smart_ptr</dir> <file>intrusive_ptr.hpp</file> <line>98</line> </frame> <frame> <ip>0x1000B60</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~<lambda></fn> <dir>/usr/src/debug/ceph-17.0.0-6641.g626e0d0d.el8.x86_64/src/osd</dir> <file>PGBackend.cc</file> <line>156</line> ... <auxwhat>Address 0x17592060 is 16 bytes inside a block of size 376 free'd</auxwhat> <stack> <frame> <ip>0x4C3210C</ip> <obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj> <fn>free</fn> <dir>/builddir/build/BUILD/valgrind-3.16.0/coregrind/m_replacemalloc</dir> <file>vg_replace_malloc.c</file> <line>538</line> </frame> <frame> <ip>0x100F93E</ip> <obj>/usr/bin/ceph-osd</obj> <fn>MOSDPGRecoveryDeleteReply::~MOSDPGRecoveryDeleteReply()</fn> <dir>/usr/src/debug/ceph-17.0.0-6641.g626e0d0d.el8.x86_64/src/messages</dir> <file>MOSDPGRecoveryDeleteReply.h</file> <line>9</line>
/a/nojha-2021-08-09_20:10:33-rados-wip_gbenhano_ncbz-distro-basic-smithi/6328976/remote/smithi049/log/valgrind
Related issues
History
#1 Updated by Neha Ojha about 2 years ago
/a/yuriw-2021-08-26_18:40:53-rados-wip-yuri7-testing-2021-08-26-0841-distro-basic-smithi/6360450/remote/smithi052/log/valgrind
#2 Updated by Neha Ojha about 2 years ago
- Backport set to pacific
/a/yuriw-2021-08-31_22:30:47-rados-wip-yuri8-testing-2021-08-30-0930-pacific-distro-basic-smithi/6369129/remote/smithi133/log/valgrind
#3 Updated by Neha Ojha almost 2 years ago
/a/yuriw-2021-10-21_13:40:38-rados-wip-yuri2-testing-2021-10-20-1700-pacific-distro-basic-smithi/6454961/remote/smithi191/log/valgrind
#4 Updated by Sridhar Seshasayee almost 2 years ago
/a/yuriw-2021-12-07_16:02:55-rados-wip-yuri11-testing-2021-12-06-1619-distro-default-smithi/6550873
#5 Updated by Laura Flores almost 2 years ago
/a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580436
#6 Updated by Laura Flores almost 2 years ago
/a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580187
#7 Updated by Sridhar Seshasayee over 1 year ago
/a/yuriw-2022-01-08_17:57:43-rados-wip-yuri8-testing-2022-01-07-1541-distro-default-smithi/6603232
#8 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608445/
#9 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-02-08_17:00:23-rados-wip-yuri5-testing-2022-02-08-0733-pacific-distro-default-smithi/6670360
#10 Updated by Neha Ojha over 1 year ago
- Backport changed from pacific to pacific,quincy
#11 Updated by Neha Ojha over 1 year ago
/a/yuriw-2022-02-15_22:40:39-rados-wip-yuri7-testing-2022-02-15-1102-quincy-distro-default-smithi/6686655/remote/smithi062/log/valgrind
#12 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-02-16_15:53:49-rados-wip-yuri11-testing-2022-02-15-1643-distro-default-smithi/6688846
#13 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-02-21_18:20:15-rados-wip-yuri11-testing-2022-02-21-0831-quincy-distro-default-smithi/6699270
Happened this time on osd.1, which exited after osd.2:
lflores@teuthology:/a/yuriw-2022-02-21_18:20:15-rados-wip-yuri11-testing-2022-02-21-0831-quincy-distro-default-smithi/6699270$ cat teuthology.log | grep "Exit program on first error"
2022-02-22T00:25:28.903 INFO:tasks.ceph.osd.2.smithi039.stderr:==00:00:20:56.306 37217== Exit program on first error (--exit-on-first-error=yes)
2022-02-22T00:29:11.442 INFO:tasks.ceph.osd.1.smithi039.stderr:==00:00:19:15.980 122658== Exit program on first error (--exit-on-first-error=yes)
2022-02-22T00:33:38.817 INFO:tasks.ceph.osd.0.smithi039.stderr:==00:00:29:06.358 37214== Exit program on first error (--exit-on-first-error=yes)
2022-02-22T00:33:39.373 INFO:tasks.ceph.osd.6.smithi065.stderr:==00:00:29:06.876 37038== Exit program on first error (--exit-on-first-error=yes)
2022-02-22T00:33:39.436 INFO:tasks.ceph.osd.5.smithi065.stderr:==00:00:29:06.918 37040== Exit program on first error (--exit-on-first-error=yes)
#14 Updated by Laura Flores over 1 year ago
Happened in a dead job.
/a/yuriw-2022-02-21_15:40:41-rados-wip-yuri4-testing-2022-02-18-0800-distro-default-smithi/6698528
/a/yuriw-2022-02-22_16:14:07-rados-wip-yuri4-testing-2022-02-18-0800-distro-default-smithi/6700753
#15 Updated by Radoslaw Zarzynski over 1 year ago
- Tags set to low-hanging-fruit
#16 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-02-15_16:22:25-rados-wip-yuri6-testing-2022-02-14-1456-distro-default-smithi/6685226
#17 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-03-01_22:42:19-rados-wip-yuri4-testing-2022-03-01-1206-distro-default-smithi/6715365
#18 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-03-19_14:37:23-rados-wip-yuri6-testing-2022-03-18-1104-distro-default-smithi/6746705
#19 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-03-25_18:42:52-rados-wip-yuri7-testing-2022-03-24-1341-pacific-distro-default-smithi/6761328
#20 Updated by Aishwarya Mathuria over 1 year ago
/a/yuriw-2022-03-31_21:45:19-rados-wip-yuri5-testing-2022-03-31-1158-quincy-distro-default-smithi/6770388
#21 Updated by Nitzan Mordechai over 1 year ago
- Status changed from New to In Progress
- Assignee set to Nitzan Mordechai
#22 Updated by Aishwarya Mathuria over 1 year ago
/a/yuriw-2022-04-06_16:35:43-rados-wip-yuri5-testing-2022-04-05-1720-distro-default-smithi/6779876
#23 Updated by Laura Flores over 1 year ago
/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6872050
#24 Updated by Sridhar Seshasayee over 1 year ago
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-default-smithi/6881215
#25 Updated by Aishwarya Mathuria about 1 year ago
/a/yuriw-2022-07-13_19:41:18-rados-wip-yuri7-testing-2022-07-11-1631-distro-default-smithi/6929396/remote/smithi204/log/valgrind/osd.5.log.gz
#26 Updated by Radoslaw Zarzynski about 1 year ago
- Tags changed from low-hanging-fruit to medium-hanging-fruit
Looks like a race condition. Does one of our Context objects take a dependency on a RefCountedObject (e.g. TrackedOp) but forget to extend its lifetime?
<error> <unique>0xe9deb</unique> <tid>60</tid> <threadname>tp_osd_tp</threadname> <kind>InvalidRead</kind> <what>Invalid read of size 8</what> <stack> <frame> <ip>0xF53F93</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ceph::common::RefCountedObject::put() const</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/common</dir> <file>RefCountedObj.cc</file> <line>18</line> </frame> <frame> <ip>0xA2112D</ip> <obj>/usr/bin/ceph-osd</obj> <fn>UnknownInlinedFun</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/common</dir> <file>RefCountedObj.h</file> <line>194</line> </frame> <frame> <ip>0xA2112D</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~intrusive_ptr</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/x86_64-redhat-linux-gnu/boost/include/boost/smart_ptr</dir> <file>intrusive_ptr.hpp</file> <line>98</line> </frame> <frame> <ip>0xA2112D</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~<lambda></fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/osd</dir> <file>PGBackend.cc</file> <line>157</line> </frame> <frame> <ip>0xA2112D</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~LambdaContext</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>161</line> </frame> <frame> <ip>0xA20ACA</ip> <obj>/usr/bin/ceph-osd</obj> <fn>delete_me</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>343</line> </frame> <frame> <ip>0xA20ACA</ip> <obj>/usr/bin/ceph-osd</obj> <fn>sub_finish</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>338</line> </frame> <frame> <ip>0xA20ACA</ip> <obj>/usr/bin/ceph-osd</obj> <fn>C_GatherBase<Context, Context>::sub_finish(Context*, int)</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>319</line> </frame> <frame> <ip>0xA20F04</ip> <obj>/usr/bin/ceph-osd</obj> <fn>finish</fn> 
<dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>361</line> </frame> <frame> <ip>0xA20F04</ip> <obj>/usr/bin/ceph-osd</obj> <fn>UnknownInlinedFun</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>99</line> </frame> <frame> <ip>0xA20F04</ip> <obj>/usr/bin/ceph-osd</obj> <fn>C_GatherBase<Context, Context>::C_GatherSub::complete(int)</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>358</line> </frame> <frame> <ip>0x93AD90</ip> <obj>/usr/bin/ceph-osd</obj> <fn>PrimaryLogPG::remove_missing_object(hobject_t const&, eversion_t, Context*)::{lambda(int)#2}::operator()(int) const [clone .isra.6767]</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/osd</dir> <file>PrimaryLogPG.cc</file> <line>12416</line> </frame> <frame> <ip>0x8701BC</ip> <obj>/usr/bin/ceph-osd</obj> <fn>Context::complete(int)</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>99</line> </frame> <frame> <ip>0x9CD558</ip> <obj>/usr/bin/ceph-osd</obj> <fn>UnknownInlinedFun</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>155</line> </frame> <frame> <ip>0x9CD558</ip> <obj>/usr/bin/ceph-osd</obj> <fn>destroy<RunOnDelete></fn> <dir>/usr/include/c++/8/ext</dir> <file>new_allocator.h</file> <line>140</line> </frame> <frame> <ip>0x9CD558</ip> <obj>/usr/bin/ceph-osd</obj> <fn>destroy<RunOnDelete></fn> <dir>/usr/include/c++/8/bits</dir> <file>alloc_traits.h</file> <line>487</line> </frame> <frame> <ip>0x9CD558</ip> <obj>/usr/bin/ceph-osd</obj> <fn>std::_Sp_counted_ptr_inplace<RunOnDelete, std::allocator<RunOnDelete>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr_base.h</file> <line>554</line> </frame> <frame> <ip>0x9D3F86</ip> 
<obj>/usr/bin/ceph-osd</obj> <fn>UnknownInlinedFun</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr_base.h</file> <line>155</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>UnknownInlinedFun</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr_base.h</file> <line>148</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~__shared_count</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr_base.h</file> <line>728</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~__shared_ptr</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr_base.h</file> <line>1167</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~shared_ptr</fn> <dir>/usr/include/c++/8/bits</dir> <file>shared_ptr.h</file> <line>103</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>~ContainerContext</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>129</line> </frame> <frame> <ip>0x9D3F86</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ContainerContext<std::shared_ptr<RunOnDelete> >::~ContainerContext()</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/include</dir> <file>Context.h</file> <line>129</line> </frame> <frame> <ip>0x851C23</ip> <obj>/usr/bin/ceph-osd</obj> <fn>handle_oncommits</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/osd</dir> <file>OSD.h</file> <line>1671</line> </frame> <frame> <ip>0x851C23</ip> <obj>/usr/bin/ceph-osd</obj> <fn>OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/osd</dir> <file>OSD.cc</file> <line>10897</line> </frame> <frame> <ip>0xF701A3</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::shardedthreadpool_worker(unsigned int)</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/common</dir> <file>WorkQueue.cc</file> <line>313</line> </frame> 
<frame> <ip>0xF71543</ip> <obj>/usr/bin/ceph-osd</obj> <fn>ShardedThreadPool::WorkThreadSharded::entry()</fn> <dir>/usr/src/debug/ceph-17.0.0-13509.g5b6cadda.el8.x86_64/src/common</dir> <file>WorkQueue.h</file> <line>643</line> </frame> <frame> <ip>0x6D6D1C9</ip> <obj>/usr/lib64/libpthread-2.28.so</obj> <fn>start_thread</fn> </frame> <frame> <ip>0x7FBFDD2</ip> <obj>/usr/lib64/libc-2.28.so</obj> <fn>clone</fn> </frame> </stack>
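To illustrate the lifetime question raised in this comment, here is a minimal, self-contained sketch of a deferred callback that takes its own reference on an intrusively ref-counted object, so the object outlives its original owner until the callback itself is destroyed. Everything in it (`Counted`, `Ref`, `make_callback`, `callback_keeps_object_alive`) is hypothetical and invented for illustration; it is not Ceph's `RefCountedObject`, `Context`, or the change in the linked pull request, and it is single-threaded, so it shows only the ownership rule and not the race itself.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for an intrusively ref-counted object
// (an assumption for illustration, not Ceph's RefCountedObject).
struct Counted {
  int refs = 1;     // the creator holds the first reference
  bool* destroyed;  // flipped to true when the object deletes itself
  explicit Counted(bool* d) : destroyed(d) {}
  void get() { ++refs; }
  void put() {
    if (--refs == 0) {
      *destroyed = true;
      delete this;
    }
  }
};

// Minimal owning handle, loosely modelled on an intrusive pointer.
class Ref {
  Counted* p;
public:
  explicit Ref(Counted* q, bool add_ref = true) : p(q) {
    if (p && add_ref) p->get();
  }
  Ref(const Ref& o) : p(o.p) { if (p) p->get(); }
  Ref(Ref&& o) noexcept : p(o.p) { o.p = nullptr; }
  Ref& operator=(const Ref&) = delete;
  ~Ref() { if (p) p->put(); }
  Counted* get() const { return p; }
};

// The callback captures a copy of the handle, so it holds its own
// reference for as long as the callback object exists.
std::function<int()> make_callback(const Ref& r) {
  return [keep = r]() { return keep.get()->refs; };
}

// Returns true if the object stayed alive while the callback was
// pending and was destroyed only after the callback was dropped.
bool callback_keeps_object_alive() {
  bool destroyed = false;
  std::function<int()> cb;
  {
    Ref owner(new Counted(&destroyed), /*add_ref=*/false);  // adopt initial ref
    cb = make_callback(owner);
  }  // owner releases its reference here
  bool alive_while_pending = !destroyed;
  int refs_seen = cb();  // safe: the callback still holds a reference
  cb = nullptr;          // destroying the callback drops the last reference
  return alive_while_pending && refs_seen == 1 && destroyed;
}
```

If the lambda captured a raw `Counted*` instead of a `Ref`, the object would already be freed when `owner` goes out of scope, and the later call would read freed memory, which is the general kind of access valgrind reports as an invalid read.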
#27 Updated by Radoslaw Zarzynski about 1 year ago
Moving to next week's bug scrub.
#28 Updated by Kamoltat (Junior) Sirivadhna about 1 year ago
/a/yuriw-2022-07-22_03:30:40-rados-wip-yuri3-testing-2022-07-21-1604-distro-default-smithi/6943721/remote/smithi042/log/valgrind/osd.3.log.gz
#29 Updated by Nitzan Mordechai about 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 47379
#30 Updated by Kamoltat (Junior) Sirivadhna about 1 year ago
/a/yuriw-2022-08-04_11:58:29-rados-wip-yuri3-testing-2022-08-03-0828-pacific-distro-default-smithi/6958376
#31 Updated by Kefu Chai about 1 year ago
- Status changed from Fix Under Review to Pending Backport
#32 Updated by Backport Bot about 1 year ago
- Copied to Backport #57076: pacific: Invalid read of size 8 in handle_recovery_delete() added
#33 Updated by Backport Bot about 1 year ago
- Tags changed from medium-hanging-fruit to medium-hanging-fruit backport_processed
#34 Updated by Laura Flores about 1 year ago
/a/yuriw-2022-09-05_13:59:13-rados-wip-yuri10-testing-2022-09-04-0811-quincy-distro-default-smithi/7012481
Needs a Quincy backport.
#35 Updated by Nitzan Mordechai about 1 year ago
- Copied to Backport #57496: quincy: Invalid read of size 8 in handle_recovery_delete() added