Bug #10720

closed

MDS: valgrind leaks

Added by Greg Farnum about 9 years ago. Updated almost 8 years ago.

Status:
Rejected
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-30_23:04:02-fs-master-testing-basic-multi/733168/
http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-30_23:04:02-fs-master-testing-basic-multi/733172/

These both contain PossiblyLost and DefinitelyLost records. Not sure if they're directly related to each other, but they're both very new so it shouldn't be too difficult to correlate them with the patches which introduced them.

Actions #1

Updated by Zheng Yan about 9 years ago

  • Project changed from CephFS to Ceph
  • Subject changed from MDS: valgrind leaks to OSD : valgrind leaks
  • Category changed from 47 to OSD

Both failures are leaks in the OSD.

Actions #2

Updated by Greg Farnum about 9 years ago

  • Project changed from Ceph to CephFS
  • Category changed from OSD to 47

I know there are some OSD leaks, but there are records in the MDS too. Unless I mis-parsed the valgrind logs and they're actually in the Objecter, we need to deal with this...

(I think they came in as part of John's perfcounter additions, based on the traces I examined.)

Actions #4

Updated by Greg Farnum about 9 years ago

  • Status changed from New to 7

Merged. Do you think there might be more leaks, John? Your comment was ambiguous. :)

Actions #5

Updated by John Spray about 9 years ago

This is a global string constant, which valgrind apparently doesn't understand:

    <frame>
      <ip>0x5EAEEF7</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>  
      <fn>std::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;::basic_string(char const*, std::allocator&lt;char&gt; const&amp;)</fn>
    </frame>
    <frame>
      <ip>0x594F55</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>_GLOBAL__sub_I_MDSAuthCaps.cc</fn>
      <dir>/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-0.91-1066-gcab246d/src/mds</dir>
      <file>MDSAuthCaps.cc</file>
      <line>30</line>
    </frame>
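
For readers unfamiliar with the `_GLOBAL__sub_I_<file>` frame: it is the compiler-generated static initializer for a translation unit, which runs before main() and constructs file-scope objects. A minimal sketch of the pattern follows; `kExampleGlobal` and `example_global()` are hypothetical names, not the actual contents of MDSAuthCaps.cc. With the reference-counted std::string in older libstdc++ releases (such as the 6.0.19 build in the trace), the string object points past a header into the middle of its heap block, which is the likely reason memcheck classifies such blocks as PossiblyLost rather than reachable.

```cpp
#include <string>

// Hypothetical file-scope constant, standing in for the one in MDSAuthCaps.cc.
// The compiler constructs it in the translation unit's static initializer
// (shown by valgrind as _GLOBAL__sub_I_<file>), and it is destroyed at exit.
static const std::string kExampleGlobal = "an example global string constant";

// Accessor so that other code (and a test) can observe the constant.
const std::string& example_global() {
  return kExampleGlobal;
}
```

The allocation is live for the whole process lifetime, so a report against it does not indicate a growing leak.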

This one is weird. I wonder if the compiler is optimising and leaving the constant function-scoped FeatureSet instances in get_mdsmap_compat_set_default allocated between calls? That's not a very credible suggestion, but it's all I've got...

    <frame>
      <ip>0x5EAEEF7</ip>
      <obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>  
      <fn>std::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;::basic_string(char const*, std::allocator&lt;char&gt; const&amp;)</fn>
    </frame>
    <frame>
      <ip>0x89A9F8</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>get_mdsmap_compat_set_default()</fn>
      <dir>/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-0.91-1066-gcab246d/src/./include</dir>
      <file>CompatSet.h</file>
      <line>28</line>
    </frame>
    <frame>
      <ip>0x5CB936</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>Beacon::_notify_mdsmap(MDSMap const*)</fn>
      <dir>/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-0.91-1066-gcab246d/src/mds</dir>
      <file>Beacon.cc</file>
      <line>198</line>
    </frame>

Actions #6

Updated by Greg Farnum about 9 years ago

  • Status changed from 7 to Need More Info

There's one DefinitelyLost in addition to a whole bunch of PossiblyLost. It's unfortunately a Message, so the origination point is just some buffer calls inside Pipe::read_message(). I'm not quite sure how to go about debugging that further.

One of them is in Beacon.cc::_notify_mdsmap:

    compat = get_mdsmap_compat_set_default();

CompatSet doesn't have an operator=(), so that might be doing the wrong thing. But that line is from August...
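
As a sanity check on the operator=() concern, here is a small sketch of what the compiler-generated copy assignment does. It assumes the class holds only standard containers, strings, and integers; `FeatureSetSketch` and `CompatSetSketch` are hypothetical stand-ins and have not been checked against the real definitions in CompatSet.h. Under that assumption the implicit operator=() assigns each member in turn, so the old contents of the target are released and the new contents are deep-copied.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a feature set: a bitmask plus id -> name map.
struct FeatureSetSketch {
  std::uint64_t mask = 1;
  std::map<std::uint64_t, std::string> names;

  void insert(std::uint64_t id, const std::string& name) {
    mask |= (std::uint64_t{1} << id);
    names[id] = name;
  }
};

// Hypothetical stand-in for CompatSet. No operator=() is declared, so the
// compiler generates one that copy-assigns each member.
struct CompatSetSketch {
  FeatureSetSketch compat, ro_compat, incompat;
};

// Returns true if implicit assignment replaced the target's old contents
// and left it independent of later changes to the source.
bool implicit_assignment_is_deep() {
  CompatSetSketch source;
  source.incompat.insert(1, "example feature");

  CompatSetSketch target;
  target.incompat.insert(2, "stale entry");

  target = source;  // compiler-generated copy assignment
  source.incompat.names[1] = "mutated after copy";

  return target.incompat.names.at(1) == "example feature" &&
         target.incompat.names.count(2) == 0 &&
         target.incompat.mask == source.incompat.mask;
}
```

If the real class owns raw pointers instead, this reasoning does not apply and a hand-written operator=() would be needed.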

The others I've spot-checked are largely string constants in common code. A couple are in the OSDMap (erasure code plugins?), for which I believe there are already tickets they're working on. Maybe when that's resolved it'll be clearer what the problems in our code are.

Actions #7

Updated by Greg Farnum about 9 years ago

  • Subject changed from OSD : valgrind leaks to MDS: valgrind leaks
Actions #8

Updated by Greg Farnum about 9 years ago

  • Status changed from Need More Info to Rejected

Dur, we aren't valgrind clean yet anyway...these failures are all due to the OSD failures associated with them.

Actions #9

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added