Bug #63445

closed

valgrind leak from D3nDataCache::d3n_libaio_create_write_request

Added by Casey Bodley 6 months ago. Updated 2 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Target version: -
% Done: 0%
Source:
Tags: d3n backport_processed
Backport: quincy reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

teuthology.log: http://qa-proxy.ceph.com/teuthology/cbodley-2023-11-03_18:34:54-rgw-wip-cbodley-testing-distro-default-smithi/7445940/teuthology.log
valgrind log: http://qa-proxy.ceph.com/teuthology/cbodley-2023-11-03_18:34:54-rgw-wip-cbodley-testing-distro-default-smithi/7445940/remote/smithi089/log/valgrind/ceph.client.0.log.gz

<error>
  <unique>0x70055</unique>
  <tid>1</tid>
  <kind>Leak_PossiblyLost</kind>
  <xwhat>
    <text>512 bytes in 1 blocks are possibly lost in loss record 29 of 38</text>
    <leakedbytes>512</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0x484DA83</ip>
      <obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>calloc</fn>
    </frame>
    <frame>
      <ip>0x40147D9</ip>
      <obj>/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2</obj>
      <fn>calloc</fn>
      <dir>./elf/../include</dir>
      <file>rtld-malloc.h</file>
      <line>44</line>
    </frame>
    <frame>
      <ip>0x40147D9</ip>
      <obj>/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2</obj>
      <fn>allocate_dtv</fn>
      <dir>./elf/../elf</dir>
      <file>dl-tls.c</file>
      <line>375</line>
    </frame>
    <frame>
      <ip>0x40147D9</ip>
      <obj>/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2</obj>
      <fn>_dl_allocate_tls</fn>
      <dir>./elf/../elf</dir>
      <file>dl-tls.c</file>
      <line>634</line>
    </frame>
    <frame>
      <ip>0x6557834</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>allocate_stack</fn>
      <dir>./nptl/./nptl</dir>
      <file>allocatestack.c</file>
      <line>430</line>
    </frame>
    <frame>
      <ip>0x6557834</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>pthread_create@@GLIBC_2.34</fn>
      <dir>./nptl/./nptl</dir>
      <file>pthread_create.c</file>
      <line>647</line>
    </frame>
    <frame>
      <ip>0x6560BB4</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__aio_create_helper_thread</fn>
      <dir>./rt/../sysdeps/unix/sysv/linux</dir>
      <file>aio_misc.h</file>
      <line>58</line>
    </frame>
    <frame>
      <ip>0x6560BB4</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__aio_enqueue_request</fn>
      <dir>./rt/./rt</dir>
      <file>aio_misc.c</file>
      <line>437</line>
    </frame>
    <frame>
      <ip>0x65618B1</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>aio_write@@GLIBC_2.34</fn>
      <dir>./rt/./rt</dir>
      <file>aio_write.c</file>
      <line>35</line>
    </frame>
    <frame>
      <ip>0x92B5E4</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>D3nDataCache::d3n_libaio_create_write_request(ceph::buffer::v15_2_0::list&amp;, unsigned int, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;)</fn>
    </frame>
    <frame>
      <ip>0x938F43</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>D3nDataCache::put(ceph::buffer::v15_2_0::list&amp;, unsigned int, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;&amp;)</fn>
    </frame>
    <frame>
      <ip>0x97DA3E</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>get_obj_data::flush(rgw::OwningList&lt;rgw::AioResultEntry&gt;&amp;&amp;)</fn>
    </frame>
    <frame>
      <ip>0x99899A</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)</fn>
    </frame>
    <frame>
      <ip>0x7B4BF8</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>RGWGetObj::execute(optional_yield)</fn>
    </frame>
    <frame>
      <ip>0x66CFCB</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>rgw_process_authenticated(RGWHandler_REST*, RGWOp*&amp;, RGWRequest*, req_state*, optional_yield, rgw::sal::Driver*, bool)</fn>
    </frame>
    <frame>
      <ip>0x6714CD</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>process_request(RGWProcessEnv const&amp;, RGWRequest*, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt; const&amp;, RGWRestfulIO*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;*, std::chrono::duration&lt;unsigned long, std::ratio&lt;1l, 1000000000l&gt; &gt;*, int*)</fn>
    </frame>
    <frame>
      <ip>0x10C758C</ip>
      <obj>/usr/bin/radosgw</obj>
    </frame>
    <frame>
      <ip>0x5BB1A5</ip>
      <obj>/usr/bin/radosgw</obj>
    </frame>
    <frame>
      <ip>0x115DA36</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>make_fcontext</fn>
    </frame>
  </stack>
</error>


Related issues 3 (1 open, 2 closed)

Related to rgw - Bug #64835: valgrind invalid read related to D3nDataCache::d3n_libaio_create_write_request() (New, Mark Kogan)
Copied to rgw - Backport #63752: quincy: valgrind leak from D3nDataCache::d3n_libaio_create_write_request (Resolved, Mark Kogan)
Copied to rgw - Backport #63753: reef: valgrind leak from D3nDataCache::d3n_libaio_create_write_request (Resolved, Mark Kogan)
Actions #1

Updated by Casey Bodley 6 months ago

  • Priority changed from Normal to Urgent
Actions #2

Updated by Mark Kogan 6 months ago

  • Status changed from New to In Progress
  • Assignee set to Mark Kogan

Was not able to reproduce in a local environment; working on generating a valgrind suppression.

Actions #3

Updated by Mark Kogan 6 months ago

Reproduced with valgrind suppression generation enabled in this run:
https://pulpito.ceph.com/mkogan-2023-11-15_16:30:36-rgw-rgw-wip-t63445-valg-supp_i001-distro-default-smithi/
==>
http://qa-proxy.ceph.com/teuthology/mkogan-2023-11-15_16:30:36-rgw-rgw-wip-t63445-valg-supp_i001-distro-default-smithi/7459204/remote/smithi098/log/valgrind/ceph.client.0.log.gz

the suggested suppression for review/discussion is:

{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: possible
   fun:calloc
   fun:calloc
   fun:allocate_dtv
   fun:_dl_allocate_tls
   fun:allocate_stack
   fun:pthread_create@@GLIBC_2.34
   fun:__aio_create_helper_thread
   fun:__aio_enqueue_request
   fun:aio_write@@GLIBC_2.34
   fun:_ZN12D3nDataCache31d3n_libaio_create_write_requestERN4ceph6buffer7v15_2_04listEjNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
   fun:_ZN12D3nDataCache3putERN4ceph6buffer7v15_2_04listEjRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
   fun:_ZN12get_obj_data5flushEON3rgw10OwningListINS0_14AioResultEntryEJEEE
   fun:_ZN8RGWRados6Object4Read7iterateEPK18DoutPrefixProviderllP12RGWGetDataCB14optional_yield
   fun:_ZN9RGWGetObj7executeE14optional_yield
   fun:_Z25rgw_process_authenticatedP15RGWHandler_RESTRP5RGWOpP10RGWRequestP9req_state14optional_yieldPN3rgw3sal6DriverEb
   fun:_Z15process_requestRK13RGWProcessEnvP10RGWRequestRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP12RGWRestfulIO14optional_yieldPN3rgw7dmclock9SchedulerEPS9_PNSt6chrono8durationImSt5ratioILl1ELl1000000000EEEEPi
   obj:/usr/bin/radosgw
   obj:/usr/bin/radosgw
   fun:make_fcontext
}
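
A shorter variant of the suppression above, offered as a sketch for discussion and not as what was merged. Valgrind suppression files accept `...` as a wildcard for zero or more frames and `*` inside `fun:` patterns, so the entry does not have to pin the glibc symbol version or the full radosgw call chain below the D3N frame. The suppression name is hypothetical.

```
{
   d3n_aio_write_helper_thread_tls_possible_leak
   Memcheck:Leak
   match-leak-kinds: possible
   fun:calloc
   ...
   fun:_dl_allocate_tls
   ...
   fun:pthread_create@@GLIBC_*
   fun:__aio_create_helper_thread
   fun:__aio_enqueue_request
   fun:aio_write@@GLIBC_*
   fun:_ZN12D3nDataCache31d3n_libaio_create_write_request*
}
```

The trade-off is that a looser pattern also hides any other possible-leak report that reaches pthread_create through this aio_write call site, so the fully spelled-out form may still be preferable for review.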

Actions #4

Updated by Mark Kogan 5 months ago

  • Pull request ID set to 54810
Actions #5

Updated by Casey Bodley 5 months ago

  • Status changed from In Progress to Pending Backport
  • Backport set to quincy reef
Actions #6

Updated by Backport Bot 5 months ago

  • Copied to Backport #63752: quincy: valgrind leak from D3nDataCache::d3n_libaio_create_write_request added
Actions #7

Updated by Backport Bot 5 months ago

  • Copied to Backport #63753: reef: valgrind leak from D3nDataCache::d3n_libaio_create_write_request added
Actions #8

Updated by Backport Bot 5 months ago

  • Tags changed from d3n to d3n backport_processed
Actions #9

Updated by Mark Kogan 5 months ago

  • Status changed from Pending Backport to In Progress

for future reference, previous similar tracker issue was:
https://tracker.ceph.com/issues/61661 -- valgrind: Leak_PossiblyLost under aio_read() in d3n

Actions #10

Updated by Casey Bodley 5 months ago

  • Status changed from In Progress to Pending Backport
Actions #11

Updated by Mark Kogan 4 months ago

  • Status changed from Pending Backport to Resolved

backports merged

Actions #12

Updated by Casey Bodley 2 months ago

D3nDataCache::d3n_libaio_create_write_request popped up again on main, this time as an InvalidRead under __libc_freeres at exit rather than a leak. Maybe just an unlikely race vs ainit.aio_idle_time = 5?

from http://qa-proxy.ceph.com/teuthology/cbodley-2024-02-20_21:16:10-rgw-wip-rgw-meta-topic-distro-default-smithi/7569000/remote/smithi067/log/valgrind/ceph.client.0.log.gz


<error>
  <unique>0xaa3c6</unique>
  <tid>1</tid>
  <kind>InvalidRead</kind>
  <what>Invalid read of size 8</what>
  <stack>
    <frame>
      <ip>0x66BB848</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>free_res</fn>
    </frame>
    <frame>
      <ip>0x66BB8C1</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__libc_freeres</fn>
    </frame>
    <frame>
      <ip>0x483F1B2</ip>
      <obj>/usr/libexec/valgrind/vgpreload_core-amd64-linux.so</obj>
      <fn>_vgnU_freeres</fn>
    </frame>
    <frame>
      <ip>0x6545551</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__run_exit_handlers</fn>
      <dir>./stdlib/./stdlib</dir>
      <file>exit.c</file>
      <line>136</line>
    </frame>
    <frame>
      <ip>0x654560F</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>exit</fn>
      <dir>./stdlib/./stdlib</dir>
      <file>exit.c</file>
      <line>143</line>
    </frame>
    <frame>
      <ip>0x6529D96</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>(below main)</fn>
      <dir>./csu/../sysdeps/nptl</dir>
      <file>libc_start_call_main.h</file>
      <line>74</line>
    </frame>
  </stack>
  <auxwhat>Address 0x8f35758 is 56 bytes inside a block of size 64 free'd</auxwhat>
  <stack>
    <frame>
      <ip>0x484B27F</ip>
      <obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>free</fn>
    </frame>
    <frame>
      <ip>0x66BB853</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>free_res</fn>
    </frame>
    <frame>
      <ip>0x66BB8C1</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__libc_freeres</fn>
    </frame>
    <frame>
      <ip>0x483F1B2</ip>
      <obj>/usr/libexec/valgrind/vgpreload_core-amd64-linux.so</obj>
      <fn>_vgnU_freeres</fn>
    </frame>
    <frame>
      <ip>0x6545551</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__run_exit_handlers</fn>
      <dir>./stdlib/./stdlib</dir>
      <file>exit.c</file>
      <line>136</line>
    </frame>
    <frame>
      <ip>0x654560F</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>exit</fn>
      <dir>./stdlib/./stdlib</dir>
      <file>exit.c</file>
      <line>143</line>
    </frame>
    <frame>
      <ip>0x6529D96</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>(below main)</fn>
      <dir>./csu/../sysdeps/nptl</dir>
      <file>libc_start_call_main.h</file>
      <line>74</line>
    </frame>
  </stack>
  <auxwhat>Block was alloc'd at</auxwhat>
  <stack>
    <frame>
      <ip>0x48487A9</ip>
      <obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>malloc</fn>
    </frame>
    <frame>
      <ip>0x659ECA7</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>get_elem</fn>
      <dir>./rt/./rt</dir>
      <file>aio_misc.c</file>
      <line>140</line>
    </frame>
    <frame>
      <ip>0x659ECA7</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>__aio_enqueue_request</fn>
      <dir>./rt/./rt</dir>
      <file>aio_misc.c</file>
      <line>352</line>
    </frame>
    <frame>
      <ip>0x659F8B1</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libc.so.6</obj>
      <fn>aio_write@@GLIBC_2.34</fn>
      <dir>./rt/./rt</dir>
      <file>aio_write.c</file>
      <line>35</line>
    </frame>
    <frame>
      <ip>0x982394</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>D3nDataCache::d3n_libaio_create_write_request(ceph::buffer::v15_2_0::list&amp;, unsigned int, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;)</fn>
    </frame>
    <frame>
      <ip>0x9848A3</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>D3nDataCache::put(ceph::buffer::v15_2_0::list&amp;, unsigned int, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;&amp;)</fn>
    </frame>
    <frame>
      <ip>0x9DBAAE</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>get_obj_data::flush(rgw::OwningList&lt;rgw::AioResultEntry&gt;&amp;&amp;)</fn>
    </frame>
    <frame>
      <ip>0x9DF77A</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)</fn>
    </frame>
    <frame>
      <ip>0x7F7E15</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>RGWGetObj::execute(optional_yield)</fn>
    </frame>
    <frame>
      <ip>0x6B792B</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>rgw_process_authenticated(RGWHandler_REST*, RGWOp*&amp;, RGWRequest*, req_state*, optional_yield, rgw::sal::Driver*, bool)</fn>
    </frame>
    <frame>
      <ip>0x6BB95D</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>process_request(RGWProcessEnv const&amp;, RGWRequest*, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt; const&amp;, RGWRestfulIO*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;*, std::chrono::duration&lt;unsigned long, std::ratio&lt;1l, 1000000000l&gt; &gt;*, int*)</fn>
    </frame>
    <frame>
      <ip>0x5E2172</ip>
      <obj>/usr/bin/radosgw</obj>
    </frame>
    <frame>
      <ip>0x62370F</ip>
      <obj>/usr/bin/radosgw</obj>
    </frame>
    <frame>
      <ip>0x12461B6</ip>
      <obj>/usr/bin/radosgw</obj>
      <fn>make_fcontext</fn>
    </frame>
  </stack>
</error>
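
To make the aio_idle_time race hypothesis in comment #12 concrete, here is a minimal standalone sketch, assuming glibc's POSIX AIO behaviour as I understand it: requests run on detached helper threads, and an idle helper lingers for aio_idle_time seconds before exiting. If the process exits inside that window, valgrind may still see the helper's TLS block or glibc's request pool in use. The function names, thread and request counts, and file path handling below are hypothetical and are not taken from rgw_d3n_datacache.cc.

```cpp
#include <aio.h>
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>
#include <string>

// Tune glibc's AIO helper-thread pool. aio_init() and struct aioinit are
// GNU extensions (g++ defines _GNU_SOURCE by default). The thread and
// request counts are illustrative values, not radosgw's settings.
void configure_aio(int idle_seconds) {
  struct aioinit ainit;
  std::memset(&ainit, 0, sizeof(ainit));
  ainit.aio_threads = 2;
  ainit.aio_num = 8;
  ainit.aio_idle_time = idle_seconds;  // seconds an idle helper waits before exiting
  ::aio_init(&ainit);
}

// Write `data` to `path` with aio_write() and block until it completes.
// Returns the number of bytes written, or -1 on any error.
ssize_t aio_write_and_wait(const std::string& path, const std::string& data) {
  int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0) return -1;

  struct aiocb cb;
  std::memset(&cb, 0, sizeof(cb));
  cb.aio_fildes = fd;
  cb.aio_buf = const_cast<char*>(data.data());
  cb.aio_nbytes = data.size();
  cb.aio_offset = 0;
  cb.aio_sigevent.sigev_notify = SIGEV_NONE;  // no signal or callback; we poll

  if (::aio_write(&cb) != 0) {
    ::close(fd);
    return -1;
  }

  // aio_suspend() can return early (for example on EINTR), so loop on
  // aio_error() until the request is no longer in progress.
  const struct aiocb* pending[1] = {&cb};
  while (::aio_error(&cb) == EINPROGRESS) {
    ::aio_suspend(pending, 1, nullptr);
  }

  int err = ::aio_error(&cb);
  ssize_t written = ::aio_return(&cb);
  ::close(fd);
  return err == 0 ? written : -1;
}

// Crude way to let idle helper threads time out before the process exits:
// sleep slightly longer than the configured idle time.
void wait_for_idle_helpers(int idle_seconds) {
  ::sleep(idle_seconds + 1);
}
```

Calling configure_aio(5), then aio_write_and_wait(), then exiting immediately would leave the helper thread idle at exit, which is the window the comment speculates about; calling wait_for_idle_helpers(5) first should close that window in this sketch. Whether the same timing explains the radosgw report is not established here.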

Actions #13

Updated by Casey Bodley about 2 months ago

  • Related to Bug #64835: valgrind invalid read related to D3nDataCache::d3n_libaio_create_write_request() added