Bug #11578

closed

make check hangs due to LTTNG (developer build)

Added by David Zafman almost 9 years ago. Updated about 7 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On my development build machine running Ubuntu 14.04.1 (Trusty Tahr), make check hangs. The ceph-mon started by test-erasure-code.sh is hung.

Interestingly, running just this test manually does not hang, but going through make check always seems to hang at this same test. Running do_autogen.sh -L to disable LTTNG clears the problem for now.
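
Since the hang only shows up under make check, it can help to run the suite under a time limit so a stuck test is reported instead of blocking the terminal. The following is a minimal sketch, not something from the Ceph tree: run_with_limit is a hypothetical helper name, the 3600-second limit is an arbitrary assumption, and it relies on the coreutils timeout command being installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Ceph tree): run a command under a time
# limit so that a hang shows up as a failure instead of blocking forever.
run_with_limit() {
    local limit=$1
    shift
    # coreutils `timeout` exits with status 124 when the limit is reached.
    timeout "$limit" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' exceeded ${limit}s" >&2
    fi
    return "$rc"
}

# Example usage from the top of a build tree (commented out so that sourcing
# this file has no side effects; the 3600s limit is an arbitrary choice):
#   ./do_autogen.sh -L              # the workaround described in this report
#   run_with_limit 3600 make check
```

A status of 124 from run_with_limit means the limit was hit; any other status is whatever the wrapped command returned.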

make check
...
Running /bin/bash ./test/erasure-code/test-erasure-code.sh

$ more test/erasure-code/test-erasure-code.sh.log
main: 115: setup testdir/test-erasure-code
setup: 21: local dir=testdir/test-erasure-code
setup: 22: teardown testdir/test-erasure-code
teardown: 27: local dir=testdir/test-erasure-code
teardown: 28: kill_daemons testdir/test-erasure-code
kill_daemons: 68: local dir=testdir/test-erasure-code
kkill_daemons: 69: find testdir/test-erasure-code
kkill_daemons: 69: grep '\.pid'
find: `testdir/test-erasure-code': No such file or directory
tteardown: 29: stat -f -c %T .
teardown: 29: '[' ext2/ext3 == btrfs ']'
teardown: 32: rm -fr testdir/test-erasure-code
setup: 23: mkdir -p testdir/test-erasure-code
main: 116: local code
main: 117: run testdir/test-erasure-code
run: 24: local dir=testdir/test-erasure-code
run: 26: export CEPH_MON=127.0.0.1:7101
run: 26: CEPH_MON=127.0.0.1:7101
run: 27: export CEPH_ARGS
rrun: 28: uuidgen
run: 28: CEPH_ARGS+='--fsid=dd46695b-da94-4a94-9b70-9a30f77dcf6a --auth-supported=none '
run: 29: CEPH_ARGS+='--enable-experimental-unrecoverable-data-corrupting-features=shec '
run: 30: CEPH_ARGS+='--mon-host=127.0.0.1:7101 '
run: 32: setup testdir/test-erasure-code
setup: 21: local dir=testdir/test-erasure-code
setup: 22: teardown testdir/test-erasure-code
teardown: 27: local dir=testdir/test-erasure-code
teardown: 28: kill_daemons testdir/test-erasure-code
kill_daemons: 68: local dir=testdir/test-erasure-code
kkill_daemons: 69: find testdir/test-erasure-code
kkill_daemons: 69: grep '\.pid'
tteardown: 29: stat -f -c %T .
teardown: 29: '[' ext2/ext3 == btrfs ']'
teardown: 32: rm -fr testdir/test-erasure-code
setup: 23: mkdir -p testdir/test-erasure-code
run: 33: run_mon testdir/test-erasure-code a --public-addr 127.0.0.1:7101
run_mon: 36: local dir=testdir/test-erasure-code
run_mon: 37: shift
run_mon: 38: local id=a
run_mon: 39: shift
run_mon: 40: dir+=/a
run_mon: 43: ./ceph-mon --id a --mkfs --mon-data=testdir/test-erasure-code/a --run-dir=testdir/test-erasure-code/a --public-addr 127.0.0.1:7101
2015-05-07 09:53:42.397548 2b61456b4840 -1 WARNING: the following dangerous and experimental features are enabled: shec
2015-05-07 09:53:42.397618 2b61456b4840 -1 WARNING: the following dangerous and experimental features are enabled: shec
./ceph-mon: renaming mon.noname-a 127.0.0.1:7101/0 to mon.a
./ceph-mon: set fsid to dd46695b-da94-4a94-9b70-9a30f77dcf6a
./ceph-mon: created monfs at testdir/test-erasure-code/a for mon.a
run_mon: 49: ./ceph-mon --id a --mon-osd-full-ratio=.99 --mon-data-avail-crit=1 --paxos-propose-interval=0.1 --osd-crush-chooseleaf-type=0 --osd-pool-default-erasure-code-directory=.libs --debug-mon 20 --debug-ms 20 --debug-paxos 20 --chdir= --mon-data=testdir/test-erasure-code/a --log-file=testdir/test-erasure-code/a/log --mon-cluster-log-file=testdir/test-erasure-code/a/log --run-dir=testdir/test-erasure-code/a '--pid-file=testdir/test-erasure-code/a/$name.pid' --public-addr 127.0.0.1:7101
2015-05-07 09:53:42.414740 2b8dd20c4840 -1 WARNING: the following dangerous and experimental features are enabled: shec
2015-05-07 09:53:42.414845 2b8dd20c4840 -1 WARNING: the following dangerous and experimental features are enabled: shec

dzafman 106984 106911 0 09:53 pts/1 00:00:00 ./ceph-mon --id a --mon-osd-full-ratio=.99 --mon-data-avail-crit=1 --paxos-propose-interval=0.1 --osd-crush-chooseleaf-type=0 --osd-pool-default-erasure-code-directory=.libs --debug-mon 20 --debug-ms 20 --debug-paxos 20 --chdir= --mon-data=testdir/test-erasure-code/a --log-file=testdir/test-erasure-code/a/log --mon-cluster-log-file=testdir/test-erasure-code/a/log --run-dir=testdir/test-erasure-code/a --pid-file=testdir/test-erasure-code/a/$name.pid --public-addr 127.0.0.1:7101

Thread 3 (Thread 0x2b8dd574f700 (LWP 106987)):
#0 0x00002b8dd29568ad in recvmsg () at ../sysdeps/unix/syscall-template.S:81
#1 0x00002b8dd2705efd in ustcomm_recv_unix_sock () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#2 0x00002b8dd27092e8 in ?? () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#3 0x00002b8dd294f182 in start_thread (arg=0x2b8dd574f700) at pthread_create.c:312
#4 0x00002b8dd442200d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x2b8dd5950700 (LWP 106988)):
#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x00002b8dd2709d1b in ?? () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#2 0x00002b8dd294f182 in start_thread (arg=0x2b8dd5950700) at pthread_create.c:312
#3 0x00002b8dd442200d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x2b8dd20c4840 (LWP 106984)):
#0 lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00002b8dd2951657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00002b8dd2951480 in __GI___pthread_mutex_lock (mutex=0x2b8dd29466a0) at ../nptl/pthread_mutex_lock.c:79
#3 0x00002b8dd270c74a in ust_before_fork () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#4 0x00002b8dd22a7887 in fork () from /usr/lib/x86_64-linux-gnu/liblttng-ust-fork.so
#5 0x0000000000d9e3f3 in Preforker::prefork (this=0x7fff00bd9c80, err="") at common/Preforker.h:62
#6 0x0000000000d96644 in main (argc=22, argv=0x7fff00bdaba8) at ceph_mon.cc:501
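
In the trace above, Thread 1 is blocked on a mutex in ust_before_fork, reached through the fork() wrapper in liblttng-ust-fork.so from Preforker::prefork, while the other two threads sit inside liblttng-ust. A trace like this can be captured from a hung process and narrowed to the LTTng frames roughly as follows. This is only a sketch: dump_threads and lttng_frames are hypothetical helper names, and it assumes gdb is installed and allowed to attach to the process.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Ceph tree.

# Print a backtrace of every thread in a running process.
# Assumes gdb is installed and the caller may attach to the given pid.
dump_threads() {
    local pid=$1
    gdb -p "$pid" -batch -ex 'thread apply all bt' 2>/dev/null
}

# Read a gdb backtrace on stdin and print only the frames that come from the
# LTTng-UST shared libraries (liblttng-ust.so.0, liblttng-ust-fork.so, ...).
# Prints nothing, and still succeeds, when no such frame is present.
lttng_frames() {
    grep -E 'liblttng-ust[^ ]*\.so' || true
}

# Example usage (commented out so that sourcing this file has no side effects):
#   dump_threads "$(pgrep -n ceph-mon)" | lttng_frames
```

If lttng_frames prints an ust_before_fork frame for the main thread, the process is probably stuck in the same place as in this report.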

#1 Updated by Sage Weil almost 9 years ago

  • Assignee deleted (Sage Weil)

#2 Updated by Sage Weil about 7 years ago

  • Status changed from New to Can't reproduce