Bug #17035

closed

"saw valgrind issues" in hammer 0.94.8 release

Added by Yuri Weinstein over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs, rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

CEPH_BRANCH=08277b7bc7c0e533c3fd56a0040dc0ddc74637d6

In several runs:
http://pulpito.ceph.com/yuriw-2016-08-15_16:43:26-rgw-master-distro-basic-smithi/
http://pulpito.ceph.com/yuriw-2016-08-15_17:41:47-fs-master---basic-smithi/

Per badone's assessment on IRC:

(03:44:46 PM) badone: yuriw: looks like it's triggering in ceph::crypto::init but no clue whether it could be a problem in one of the libraries involved or ceph itself
(03:47:08 PM) badone: yuriw: all I can tell you is we are calling msync with a parameter (start) which points to uninitialised bytes
(03:48:48 PM) badone: We'd need to reproduce it and get a better stack to see more I guess
(03:49:00 PM) badone: yuriw: I'd say it's bad enough, yes
(03:49:28 PM) badone: it's a ticking time bomb
<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>SyscallParam</kind>
  <what>Syscall param msync(start) points to uninitialised byte(s)</what>
  <stack>
    <frame>
      <ip>0x5E9A90D</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
    </frame>
    <frame>
      <ip>0x76B5F63</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
    </frame>
    <frame>
      <ip>0x76B8EAE</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
    </frame>
    <frame>
      <ip>0x76BA181</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
    </frame>
    <frame>
      <ip>0x76BA518</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
    </frame>
    <frame>
      <ip>0x76B6900</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
      <fn>_ULx86_64_step</fn>
    </frame>
    <frame>
      <ip>0x4E688CA</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
    </frame>
    <frame>
      <ip>0x4E690BD</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>GetStackTrace(void**, int, int)</fn>
    </frame>
    <frame>
      <ip>0x4E5A313</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::PageHeap::GrowHeap(unsigned long)</fn>
    </frame>
    <frame>
      <ip>0x4E5A632</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::PageHeap::New(unsigned long)</fn>
    </frame>
    <frame>
      <ip>0x4E58F63</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::CentralFreeList::Populate()</fn>
    </frame>
    <frame>
      <ip>0x4E59147</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)</fn>
    </frame>
    <frame>
      <ip>0x4E591DC</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)</fn>
    </frame>
    <frame>
      <ip>0x4E5C234</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long)</fn>
    </frame>
    <frame>
      <ip>0x4E6B31A</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>calloc</fn>
    </frame>
    <frame>
      <ip>0x78E3431</ip>
      <obj>/usr/lib64/libnssutil3.so</obj>
      <fn>PORT_ZAlloc_Util</fn>
    </frame>
    <frame>
      <ip>0x8F1241B</ip>
      <obj>/usr/lib64/libsoftokn3.so</obj>
    </frame>
    <frame>
      <ip>0x8F1303B</ip>
      <obj>/usr/lib64/libsoftokn3.so</obj>
    </frame>
    <frame>
      <ip>0x8F13279</ip>
      <obj>/usr/lib64/libsoftokn3.so</obj>
    </frame>
    <frame>
      <ip>0x5965ADE</ip>
      <obj>/usr/lib64/libnss3.so</obj>
    </frame>
    <frame>
      <ip>0x5966109</ip>
      <obj>/usr/lib64/libnss3.so</obj>
    </frame>
    <frame>
      <ip>0x5971DCA</ip>
      <obj>/usr/lib64/libnss3.so</obj>
      <fn>SECMOD_LoadModule</fn>
    </frame>
    <frame>
      <ip>0x5971EBF</ip>
      <obj>/usr/lib64/libnss3.so</obj>
      <fn>SECMOD_LoadModule</fn>
    </frame>
    <frame>
      <ip>0x59413AA</ip>
      <obj>/usr/lib64/libnss3.so</obj>
    </frame>
    <frame>
      <ip>0x5941D80</ip>
      <obj>/usr/lib64/libnss3.so</obj>
      <fn>NSS_InitContext</fn>
    </frame>
    <frame>
      <ip>0x809277</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>ceph::crypto::init(CephContext*)</fn>
    </frame>
    <frame>
      <ip>0x7FFEA8</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>CephContext::init_crypto()</fn>
    </frame>
    <frame>
      <ip>0x81240F</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>common_init_finish(CephContext*, int)</fn>
    </frame>
    <frame>
      <ip>0x5792FB</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>main</fn>
    </frame>
  </stack>
  <auxwhat>Address 0xffeffe010 is on thread 1's stack</auxwhat>
</error>
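
The stack above is easier to scan once each frame is reduced to its object and symbol. The following is a minimal Python sketch of that reduction, assuming the <error> element is available as a string. SAMPLE is an abbreviated copy of the error above with only four of its frames, and summarize_error and objects_in_order are hypothetical helper names, not part of Valgrind or Ceph.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <error> element above; only four frames are kept.
SAMPLE = """
<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>SyscallParam</kind>
  <what>Syscall param msync(start) points to uninitialised byte(s)</what>
  <stack>
    <frame>
      <ip>0x5E9A90D</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
    </frame>
    <frame>
      <ip>0x76B6900</ip>
      <obj>/usr/lib64/libunwind.so.8.0.1</obj>
      <fn>_ULx86_64_step</fn>
    </frame>
    <frame>
      <ip>0x4E690BD</ip>
      <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj>
      <fn>GetStackTrace(void**, int, int)</fn>
    </frame>
    <frame>
      <ip>0x809277</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>ceph::crypto::init(CephContext*)</fn>
    </frame>
  </stack>
</error>
"""


def summarize_error(xml_text):
    """Return (kind, what, frames) for one Valgrind <error> element.

    Each frame is an (object basename, function name) pair; frames
    without a <fn> child get the placeholder "<no symbol>".
    """
    error = ET.fromstring(xml_text.strip())
    frames = []
    for frame in error.find("stack").findall("frame"):
        obj = frame.findtext("obj", default="?")
        fn = frame.findtext("fn", default="<no symbol>")
        frames.append((obj.rsplit("/", 1)[-1], fn))
    return error.findtext("kind"), error.findtext("what"), frames


def objects_in_order(frames):
    """List each object once, in order of first appearance (innermost first)."""
    seen = []
    for obj, _ in frames:
        if obj not in seen:
            seen.append(obj)
    return seen


kind, what, frames = summarize_error(SAMPLE)
print(kind, "-", what)
for obj, fn in frames:
    print(f"  {obj}: {fn}")
print("objects:", " -> ".join(objects_in_order(frames)))
```

For the abbreviated sample, the object list runs from libpthread through libunwind and libtcmalloc to ceph-mon, which matches the ordering in the full stack: the msync call is reached through tcmalloc's GetStackTrace and libunwind, not directly from Ceph code.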

Related issues 1 (0 open, 1 closed)

Has duplicate Ceph - Bug #15117: hammer: CentOS 7 tcmalloc::ThreadCache valgrind error (Duplicate, 03/14/2016)

Actions #1

Updated by Yuri Weinstein over 7 years ago

  • Description updated (diff)
Actions #2

Updated by Brad Hubbard over 7 years ago

It appears this may at least be related to http://tracker.ceph.com/issues/14799 and http://tracker.ceph.com/issues/15117 since the run is notcmalloc and the issue is in tcmalloc::ThreadCache. These trackers tend to indicate a build issue and perhaps a valgrind false positive for tcmalloc.

I think this is probably a duplicate of http://tracker.ceph.com/issues/15117?

Actions #4

Updated by Kefu Chai over 7 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Kefu Chai
Actions #5

Updated by Kefu Chai over 7 years ago

  • Status changed from Fix Under Review to Resolved
Actions #6

Updated by Kefu Chai over 7 years ago

  • Has duplicate Bug #15117: hammer: CentOS 7 tcmalloc::ThreadCache valgrind error added