Bug #17035
closed
"saw valgrind issues" in hammer 0.94.8 release
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs, rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
CEPH_BRANCH=08277b7bc7c0e533c3fd56a0040dc0ddc74637d6
In several runs:
http://pulpito.ceph.com/yuriw-2016-08-15_16:43:26-rgw-master-distro-basic-smithi/
http://pulpito.ceph.com/yuriw-2016-08-15_17:41:47-fs-master---basic-smithi/
Per badone's assessment on IRC:
(03:44:46 PM) badone: yuriw: looks like it's triggering in ceph::crypto::init but no clue whether it could be a problem in one of the libraries involved or ceph itself
(03:47:08 PM) badone: yuriw: all I can tell you is we are calling msync with a parameter (start) which points to uninitilaised bytes
(03:48:48 PM) badone: We'd need to reproduce it and get a better stack to see more I guess
(03:49:00 PM) badone: yuriw: I'd say it's bad enough, yes
(03:49:28 PM) badone: it's a ticking time bomb
<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>SyscallParam</kind>
  <what>Syscall param msync(start) points to uninitialised byte(s)</what>
  <stack>
    <frame> <ip>0x5E9A90D</ip> <obj>/usr/lib64/libpthread-2.17.so</obj> </frame>
    <frame> <ip>0x76B5F63</ip> <obj>/usr/lib64/libunwind.so.8.0.1</obj> </frame>
    <frame> <ip>0x76B8EAE</ip> <obj>/usr/lib64/libunwind.so.8.0.1</obj> </frame>
    <frame> <ip>0x76BA181</ip> <obj>/usr/lib64/libunwind.so.8.0.1</obj> </frame>
    <frame> <ip>0x76BA518</ip> <obj>/usr/lib64/libunwind.so.8.0.1</obj> </frame>
    <frame> <ip>0x76B6900</ip> <obj>/usr/lib64/libunwind.so.8.0.1</obj> <fn>_ULx86_64_step</fn> </frame>
    <frame> <ip>0x4E688CA</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> </frame>
    <frame> <ip>0x4E690BD</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>GetStackTrace(void**, int, int)</fn> </frame>
    <frame> <ip>0x4E5A313</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::PageHeap::GrowHeap(unsigned long)</fn> </frame>
    <frame> <ip>0x4E5A632</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::PageHeap::New(unsigned long)</fn> </frame>
    <frame> <ip>0x4E58F63</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::CentralFreeList::Populate()</fn> </frame>
    <frame> <ip>0x4E59147</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)</fn> </frame>
    <frame> <ip>0x4E591DC</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)</fn> </frame>
    <frame> <ip>0x4E5C234</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long)</fn> </frame>
    <frame> <ip>0x4E6B31A</ip> <obj>/usr/lib64/libtcmalloc.so.4.2.6</obj> <fn>calloc</fn> </frame>
    <frame> <ip>0x78E3431</ip> <obj>/usr/lib64/libnssutil3.so</obj> <fn>PORT_ZAlloc_Util</fn> </frame>
    <frame> <ip>0x8F1241B</ip> <obj>/usr/lib64/libsoftokn3.so</obj> </frame>
    <frame> <ip>0x8F1303B</ip> <obj>/usr/lib64/libsoftokn3.so</obj> </frame>
    <frame> <ip>0x8F13279</ip> <obj>/usr/lib64/libsoftokn3.so</obj> </frame>
    <frame> <ip>0x5965ADE</ip> <obj>/usr/lib64/libnss3.so</obj> </frame>
    <frame> <ip>0x5966109</ip> <obj>/usr/lib64/libnss3.so</obj> </frame>
    <frame> <ip>0x5971DCA</ip> <obj>/usr/lib64/libnss3.so</obj> <fn>SECMOD_LoadModule</fn> </frame>
    <frame> <ip>0x5971EBF</ip> <obj>/usr/lib64/libnss3.so</obj> <fn>SECMOD_LoadModule</fn> </frame>
    <frame> <ip>0x59413AA</ip> <obj>/usr/lib64/libnss3.so</obj> </frame>
    <frame> <ip>0x5941D80</ip> <obj>/usr/lib64/libnss3.so</obj> <fn>NSS_InitContext</fn> </frame>
    <frame> <ip>0x809277</ip> <obj>/usr/bin/ceph-mon</obj> <fn>ceph::crypto::init(CephContext*)</fn> </frame>
    <frame> <ip>0x7FFEA8</ip> <obj>/usr/bin/ceph-mon</obj> <fn>CephContext::init_crypto()</fn> </frame>
    <frame> <ip>0x81240F</ip> <obj>/usr/bin/ceph-mon</obj> <fn>common_init_finish(CephContext*, int)</fn> </frame>
    <frame> <ip>0x5792FB</ip> <obj>/usr/bin/ceph-mon</obj> <fn>main</fn> </frame>
  </stack>
  <auxwhat>Address 0xffeffe010 is on thread 1's stack</auxwhat>
</error>
Updated by Brad Hubbard over 7 years ago
It appears this may at least be related to http://tracker.ceph.com/issues/14799 and http://tracker.ceph.com/issues/15117 since the run is notcmalloc and the issue is in tcmalloc::ThreadCache. These trackers tend to indicate a build issue and perhaps a valgrind false positive for tcmalloc.
I think this is probably a duplicate of http://tracker.ceph.com/issues/15117 ?
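For anyone triaging similar reports, here is a rough sketch of how the observation above (the error is raised beneath tcmalloc frames, not in Ceph code) could be checked against a valgrind XML `<error>` element. The helper name `summarize_error` and the abbreviated `SAMPLE` are hypothetical and only for illustration. `SAMPLE` keeps a handful of frames from the error in the description and omits the rest. This is not part of any Ceph tooling.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <error> element from the description.
# Most frames are omitted; this is illustrative input, not the full report.
SAMPLE = """<error>
  <kind>SyscallParam</kind>
  <what>Syscall param msync(start) points to uninitialised byte(s)</what>
  <stack>
    <frame><ip>0x5E9A90D</ip><obj>/usr/lib64/libpthread-2.17.so</obj></frame>
    <frame><ip>0x76B6900</ip><obj>/usr/lib64/libunwind.so.8.0.1</obj><fn>_ULx86_64_step</fn></frame>
    <frame><ip>0x4E5A313</ip><obj>/usr/lib64/libtcmalloc.so.4.2.6</obj><fn>tcmalloc::PageHeap::GrowHeap(unsigned long)</fn></frame>
    <frame><ip>0x809277</ip><obj>/usr/bin/ceph-mon</obj><fn>ceph::crypto::init(CephContext*)</fn></frame>
  </stack>
</error>"""


def summarize_error(xml_text):
    """Return the error kind, message, and the ordered list of objects in the stack."""
    err = ET.fromstring(xml_text)
    objects = []
    for frame in err.iter("frame"):
        # Keep only the file name of the shared object or binary.
        name = frame.findtext("obj", default="?").rsplit("/", 1)[-1]
        # Collapse consecutive frames that belong to the same object.
        if not objects or objects[-1] != name:
            objects.append(name)
    return {
        "kind": err.findtext("kind"),
        "what": err.findtext("what"),
        "objects": objects,
        "through_tcmalloc": any(o.startswith("libtcmalloc") for o in objects),
    }


summary = summarize_error(SAMPLE)
print(summary["kind"], "->", " <- ".join(summary["objects"]))
```

A stack whose innermost objects are libpthread, libunwind and libtcmalloc, with Ceph code appearing only further out, is consistent with the allocator-side explanation suggested in the comment above, though it does not prove a false positive on its own.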
Updated by Kefu Chai over 7 years ago
see also #8225 and https://github.com/gperftools/gperftools/issues/101
Updated by Kefu Chai over 7 years ago
- Status changed from New to Fix Under Review
- Assignee set to Kefu Chai
Updated by Kefu Chai over 7 years ago
- Status changed from Fix Under Review to Resolved
Updated by Kefu Chai over 7 years ago
- Has duplicate Bug #15117: hammer: CentOS 7 tcmalloc::ThreadCache valgrind error added