Bug #16504

ctest rados failures: *** Error in `rados': double free or corruption (fasttop): 0x0000000001db47d0 ***

Added by David Zafman about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

$ do_cmake.sh
$ cd build ; make -j4
$ ctest
Test project /home/dzafman/ceph/build
    Start 1: test-ceph-helpers.sh
    Start 2: erasure-decode-non-regression.sh
    Start 3: ceph_objectstore_tool.py
    Start 4: cephtool-test-mds.sh
*** Error in `/home/dzafman/ceph/build/bin/rados': double free or corruption (fasttop): 0x00000000016817a0 ***
1/144 Test #4: cephtool-test-mds.sh .................... Passed 57.67 sec
    Start 5: cephtool-test-mon.sh
*** Error in `rados': double free or corruption (fasttop): 0x0000000001db47d0 ***
2/144 Test #2: erasure-decode-non-regression.sh ........ Passed 136.36 sec
    Start 6: cephtool-test-osd.sh
*** Error in `rados': double free or corruption (fasttop): 0x00000000011c27e0 ***
*** Error in `rados': double free or corruption (fasttop): 0x000000000120b7a0 ***
^C

$ gdb bin/rados
(gdb) run
...
Thread 1 (Thread 0x7ffff7fd5640 (LWP 113008)):
#0 0x00007fffe5193cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fffe51970d8 in __GI_abort () at abort.c:89
#2 0x00007fffe51d0394 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7fffe52deb28 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007fffe51dc66e in malloc_printerr (ptr=<optimized out>, str=0x7fffe52decf0 "double free or corruption (fasttop)", action=1) at malloc.c:4996
#4 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
#5 0x00007fffe51995ea in __cxa_finalize (d=0x7fffe694b0a0) at cxa_finalize.c:56
#6 0x00007fffe6338683 in __do_global_dtors_aux () from /home/dzafman/ceph/build/lib/libradosstriper.so.1
#7 0x00007fffffffddb0 in ?? ()
#8 0x00007ffff7dea73a in _dl_fini () at dl-fini.c:252

History

#1 Updated by Jason Dillaman about 4 years ago

@David: did you intend to assign me to this ticket?

#2 Updated by David Zafman about 4 years ago

  • Assignee changed from Jason Dillaman to Casey Bodley

reassigned

#3 Updated by Kefu Chai about 4 years ago

  • Assignee changed from Casey Bodley to Kefu Chai

#4 Updated by Kefu Chai about 4 years ago

  • Status changed from New to Fix Under Review

Brad has an excellent analysis root-causing this issue.

https://github.com/ceph/ceph/pull/9995

#5 Updated by Brad Hubbard about 4 years ago

NObjectIterator::__EndObjectIterator is static and present in both librados.so.2 and libradosstriper.so.1, so its destructor is run by __do_global_dtors_aux for each of the two libraries. The second run attempts to delete the "impl" pointer again, which causes the abort. This can be demonstrated by setting a breakpoint on the destructor, and it also shows up in valgrind memcheck output. Kefu's change means the destructor will only be called once, which should resolve this issue.

#6 Updated by Kefu Chai about 4 years ago

  • Status changed from Fix Under Review to Resolved
