Bug #16504 (closed)
ctest rados failures: *** Error in `rados': double free or corruption (fasttop): 0x0000000001db47d0 ***
Description
$ cd build ; make -j4
$ ctest
Test project /home/dzafman/ceph/build
Start 1: test-ceph-helpers.sh
Start 2: erasure-decode-non-regression.sh
Start 3: ceph_objectstore_tool.py
Start 4: cephtool-test-mds.sh
*** Error in `/home/dzafman/ceph/build/bin/rados': double free or corruption (fasttop): 0x00000000016817a0 ***
1/144 Test #4: cephtool-test-mds.sh .................... Passed 57.67 sec
Start 5: cephtool-test-mon.sh
*** Error in `rados': double free or corruption (fasttop): 0x0000000001db47d0 ***
2/144 Test #2: erasure-decode-non-regression.sh ........ Passed 136.36 sec
Start 6: cephtool-test-osd.sh
*** Error in `rados': double free or corruption (fasttop): 0x00000000011c27e0 ***
*** Error in `rados': double free or corruption (fasttop): 0x000000000120b7a0 ***
^C
$ gdb bin/rados
(gdb) run
...
Thread 1 (Thread 0x7ffff7fd5640 (LWP 113008)):
#0 0x00007fffe5193cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fffe51970d8 in __GI_abort () at abort.c:89
#2 0x00007fffe51d0394 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7fffe52deb28 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007fffe51dc66e in malloc_printerr (ptr=<optimized out>, str=0x7fffe52decf0 "double free or corruption (fasttop)", action=1) at malloc.c:4996
#4 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
#5 0x00007fffe51995ea in __cxa_finalize (d=0x7fffe694b0a0) at cxa_finalize.c:56
#6 0x00007fffe6338683 in __do_global_dtors_aux () from /home/dzafman/ceph/build/lib/libradosstriper.so.1
#7 0x00007fffffffddb0 in ?? ()
#8 0x00007ffff7dea73a in _dl_fini () at dl-fini.c:252
Updated by Jason Dillaman almost 8 years ago
@David: did you intend to assign me to this ticket?
Updated by David Zafman almost 8 years ago
- Assignee changed from Jason Dillaman to Casey Bodley
reassigned
Updated by Kefu Chai almost 8 years ago
- Assignee changed from Casey Bodley to Kefu Chai
Updated by Kefu Chai almost 8 years ago
- Status changed from New to Fix Under Review
Brad has an excellent analysis root-causing this issue.
Updated by Brad Hubbard almost 8 years ago
Since NObjectIterator::__EndObjectIterator is static and present in both librados.so.2 and libradosstriper.so.1, its destructor is run by __do_global_dtors_aux for both libraries, so the "impl" pointer is deleted twice, which causes the abort. This can be demonstrated by setting a breakpoint on the destructor, and it also shows up in valgrind memcheck output. Kefu's change means the destructor will only be called once, which should resolve this issue.
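To make the failure mode concrete, here is a minimal single-file sketch of the mechanism described above. It is not Ceph code: TrackingHeap, EndIterator, Library and simulate_exit are hypothetical names, the "heap" only counts frees instead of calling the real allocator, and each Library object merely stands in for the list of exit-time destructors that one shared library runs. It assumes, as in the analysis, that both libraries end up registering a destructor for the same object.

```cpp
// Sketch only: models two shared libraries that each run an exit-time
// destructor for the same static object. All names are hypothetical.
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Stand-in for the allocator: tracks live allocations and counts every
// attempt to release something that is no longer live (a double free).
struct TrackingHeap {
    std::set<int> live;
    int next_id = 1;
    int double_frees = 0;

    int allocate() {
        live.insert(next_id);
        return next_id++;
    }
    void release(int id) {
        if (live.erase(id) == 0) {
            ++double_frees;  // this handle was already released
        }
    }
};

// Stand-in for the static end iterator: it owns one "impl" allocation.
struct EndIterator {
    TrackingHeap* heap;
    int impl;  // handle of the owned allocation
};

// Stand-in for the destructor body: releases impl unconditionally.
void destroy_end_iterator(void* p) {
    auto* it = static_cast<EndIterator*>(p);
    it->heap->release(it->impl);
}

// Stand-in for one library's list of exit-time destructors.
struct Library {
    std::vector<std::pair<void (*)(void*), void*>> dtors;

    void register_dtor(void (*fn)(void*), void* obj) {
        assert(fn != nullptr);
        dtors.emplace_back(fn, obj);
    }
    void run_dtors() {
        // Run in reverse registration order, as exit handlers do.
        for (auto it = dtors.rbegin(); it != dtors.rend(); ++it) {
            it->first(it->second);
        }
    }
};

// Simulates process exit with two libraries loaded. libs_registering is
// how many of them (0 to 2) register a destructor for the single shared
// object. Returns the number of double frees observed.
int simulate_exit(std::size_t libs_registering) {
    TrackingHeap heap;
    EndIterator end{&heap, heap.allocate()};
    std::vector<Library> libs(2);
    assert(libs_registering <= libs.size());
    for (std::size_t i = 0; i < libs_registering; ++i) {
        libs[i].register_dtor(&destroy_end_iterator, &end);
    }
    for (auto& lib : libs) {
        lib.run_dtors();
    }
    return heap.double_frees;
}
```

In this model, simulate_exit(2) corresponds to the reported behaviour (the second library's destructor run releases an already released impl), while simulate_exit(1) corresponds to the destructor being run only once.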
Updated by Kefu Chai almost 8 years ago
- Status changed from Fix Under Review to Resolved