Bug #14794

Objecter: valgrind unclean

Added by Greg Farnum about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Objecter
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://pulpito.ceph.com/gregf-2016-02-15_18:08:49-fs-greg-fs-testing-215-1---basic-smithi/10830/

Looks like our unique_lock either has an issue or the boost bit needs a valgrind exception.

<error>
  <unique>0xd</unique>
  <tid>1</tid>
  <kind>Leak_StillReachable</kind>
  <xwhat>
    <text>8 bytes in 1 blocks are still reachable in loss record 13 of 66</text>
    <leakedbytes>8</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0xA564BFD</ip>
      <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>malloc</fn>
    </frame>
    <frame>
      <ip>0xA9BE7F9</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::detail::get_once_per_thread_epoch()</fn>
    </frame>
    <frame>
      <ip>0x4CF647</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>void boost::call_once&lt;void (*)()&gt;(boost::once_flag&amp;, void (*)())</fn>
    </frame>
    <frame>
      <ip>0xA9B8C16</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::detail::get_current_thread_data()</fn>
    </frame>
    <frame>
      <ip>0xA9B8C38</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::this_thread::interruption_enabled()</fn>
    </frame>
    <frame>
      <ip>0xA9B8C68</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::this_thread::disable_interruption::disable_interruption()</fn>
    </frame>
    <frame>
      <ip>0x384A1E</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>boost::shared_mutex::lock()</fn>
    </frame>
    <frame>
      <ip>0x35F3EF</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>Objecter::update_crush_location()</fn>
    </frame>
    <frame>
      <ip>0x35FA6E</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>Objecter::init()</fn>
    </frame>
    <frame>
      <ip>0x2AF74A</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>Client::init()</fn>
    </frame>
    <frame>
      <ip>0x2909E3</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>main</fn>
    </frame>
  </stack>
</error>
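
When comparing a report like the one above against a suppression, it can help to flatten the stack into one line per frame. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `summarize_error` and the shortened `SAMPLE` (the first three frames of the report above) are illustrative assumptions, not part of any Ceph or teuthology tooling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <error> element above (first three frames only).
SAMPLE = """\
<error>
  <kind>Leak_StillReachable</kind>
  <stack>
    <frame>
      <ip>0xA564BFD</ip>
      <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>malloc</fn>
    </frame>
    <frame>
      <ip>0xA9BE7F9</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::detail::get_once_per_thread_epoch()</fn>
    </frame>
    <frame>
      <ip>0x4CF647</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>void boost::call_once&lt;void (*)()&gt;(boost::once_flag&amp;, void (*)())</fn>
    </frame>
  </stack>
</error>
"""


def summarize_error(xml_text):
    """Return (kind, [(function, object), ...]) for one valgrind <error>."""
    error = ET.fromstring(xml_text)
    kind = error.findtext("kind")
    # Frames are listed innermost first, the same order valgrind prints them.
    frames = [
        (frame.findtext("fn", default="?"), frame.findtext("obj", default="?"))
        for frame in error.iter("frame")
    ]
    return kind, frames


if __name__ == "__main__":
    kind, frames = summarize_error(SAMPLE)
    print(kind)
    for fn, obj in frames:
        print(f"  {fn}  [{obj}]")
```

Note that the XML report carries demangled function names, which matters when a suppression is written by hand from this output.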

Related issues

Copied to Ceph - Bug #15356: Objecter: valgrind unclean and suppression not working in CentOS Resolved 02/18/2016

History

#1 Updated by Greg Farnum about 8 years ago

  • Assignee set to Adam Emerson

I think this came in as part of the locking changes you've been making? If this is an internal boost issue we can just whitelist the error, but let's check that it's the case.

#2 Updated by Adam Emerson about 8 years ago

  • Status changed from New to 4

This is known behavior in Boost, which the Boost developers claim is a false positive. Their bug history is at https://svn.boost.org/trac/boost/ticket/3296

#3 Updated by Greg Farnum about 8 years ago

  • Status changed from 4 to Resolved

https://github.com/ceph/teuthology/pull/817

Added a valgrind exception and it passed the tests without getting angry.

#4 Updated by John Spray almost 8 years ago

Hmm, saw this again here: http://pulpito.ceph.com/teuthology-2016-03-28_18:04:01-fs-master---basic-smithi/92567/

The log claims this was run with master teuthology so it should have had the suppression (checked that it was still in master), so I don't know what's going on.

#5 Updated by Greg Farnum almost 8 years ago

  • Copied to Bug #15356: Objecter: valgrind unclean and suppression not working in CentOS added

#6 Updated by Sage Weil almost 8 years ago

I saw it too: http://pulpito.ceph.com/sage-2016-04-05_18:12:02-fs-wip-sage-testing---basic-smithi/110231

The host was CentOS.

It appears to match the exception perfectly... :/

    <frame>
      <ip>0xA9E27F9</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::detail::get_once_per_thread_epoch()</fn>
    </frame>
    <frame>
      <ip>0x4EB827</ip>
      <obj>/usr/bin/ceph-fuse</obj>
      <fn>void boost::call_once&lt;void (*)()&gt;(boost::once_flag&amp;, void (*)())</fn>
      <dir>/usr/include/boost/thread/pthread</dir>
      <file>once.hpp</file>
      <line>84</line>
    </frame>
    <frame>
      <ip>0xA9DCC16</ip>
      <obj>/usr/lib64/libboost_thread-mt.so.1.53.0</obj>
      <fn>boost::detail::get_current_thread_data()</fn>
    </frame>

#7 Updated by Loïc Dachary almost 8 years ago

Could it be because the names must be mangled?

#8 Updated by Greg Farnum almost 8 years ago

<Aside: Well, I tried creating a separate ticket. >

Loic: The mangling shouldn't matter given our use of wildcards, and the suppression does its matching the same way as all the other exceptions. :/

The only difference I could come up with was that we were handling the bottom-most frame differently (referencing it or not) compared to some, but not all, of the other suppressions. :/
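
The mangling question from comments #7 and #8 can be illustrated with a small sketch. Valgrind's manual states that C++ function names in suppressions must be mangled, so a pattern is compared against the mangled symbol rather than the demangled text shown in the XML report. The example below uses Python's `fnmatch.fnmatchcase` only as an approximation of valgrind's `*` and `?` wildcards; it is not valgrind's matcher, and both patterns are hypothetical rather than copied from the teuthology suppression file. The mangled name is the Itanium ABI encoding of `boost::detail::get_once_per_thread_epoch()`.

```python
from fnmatch import fnmatchcase

# The same function as it appears in the XML report and as a mangled symbol.
DEMANGLED = "boost::detail::get_once_per_thread_epoch()"
MANGLED = "_ZN5boost6detail25get_once_per_thread_epochEv"

# Hypothetical patterns, for illustration only.
WILDCARD = "*get_once_per_thread_epoch*"                 # wildcards on both sides
QUALIFIED = "boost::detail::get_once_per_thread_epoch*"  # demangled prefix


def matches(pattern, name):
    """Approximate a suppression 'fun:' match with shell-style wildcards."""
    return fnmatchcase(name, pattern)


if __name__ == "__main__":
    for pattern in (WILDCARD, QUALIFIED):
        for name in (DEMANGLED, MANGLED):
            print(f"{pattern!r} vs {name!r}: {matches(pattern, name)}")
```

Under this approximation, a pattern wrapped in wildcards matches either spelling because the bare identifier survives mangling, while a pattern that spells out the `boost::detail::` qualifier matches only the demangled form. That is consistent with the view that mangling should not matter when wildcards surround the identifier, but it does not explain why the suppression failed on CentOS.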

#9 Updated by John Spray almost 8 years ago

  • Status changed from Resolved to 12

Reopening because we're still seeing this regularly: the suppression is in there but for some reason it isn't taking effect. Today's instance: http://pulpito.ceph.com/teuthology-2016-05-21_17:15:02-fs-master---basic-smithi/206167/

#10 Updated by Greg Farnum almost 8 years ago

  • Status changed from 12 to Resolved
