Bug #2090
closedmon: assertion failed on shutdown
0%
Description
I was running repeated cycles of the kernel_untar_build.sh workunit
to try to reproduce a problem in the client and hit this problem
instead.
./common/Mutex.h: 89: FAILED assert(nlock 0)
Looks like this is the version:
ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
Full info below (sorry if it's excessive, I'm not sure what's needed).
Here is the YAML file that was driving it:
roles:
- [mon.a, mon.c, osd.0]
- [mon.b, mds.a, osd.1]
- [client.0]
kernel:
mon:
branch: master
client:
branch: testing
overrides:
ceph:
conf:
osd:
osd op complaint time: 120
coverage: true
fs: btrfs
log-whitelist:
- clocks not synchronized
- old request
tasks:
- ceph:
- rbd:
all:
- workunit:
all:
- kernel_untar_build.sh
And here is the information I got at the end of the run:
. . .
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/usr/gen_init_cpio.c'
INFO:teuthology.task.workunit.client.0.out:removed directory: `linux-2.6.33/usr'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/README'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/COPYING'
INFO:teuthology.task.workunit.client.0.out:removed directory: `linux-2.6.33'
INFO:teuthology.task.workunit.client.0.out:removed directory: `t'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33.tar.bz2'
INFO:teuthology.task.rbd:Unmounting rbd images...
INFO:teuthology.task.rbd:Unmapping rbd devices...
INFO:teuthology.task.rbd:Unloading rbd kernel module...
INFO:teuthology.task.rbd:Deleting rbd images...
Removing image: 100% complete...done.
INFO:teuthology.task.ceph:Shutting down mds daemons...
INFO:teuthology.task.ceph.mds.a.err:2012-02-21 22:14:40.784639 7f5f8deb8700 mds.0.1 * got signal Terminated
INFO:teuthology.task.ceph.mds.a:Stopped
INFO:teuthology.task.ceph:Shutting down osd daemons...
INFO:teuthology.task.ceph.osd.1:Stopped
INFO:teuthology.task.ceph.osd.0:Stopped
INFO:teuthology.task.ceph:Shutting down mon daemons...
INFO:teuthology.task.ceph.mon.a.err:2012-02-21 22:14:42.088201 7f51e2e79700 mon.a@1(peon) e1 Got Signal Terminated
INFO:teuthology.task.ceph.mon.a.err:./common/Mutex.h: In function 'Mutex::~Mutex()' thread 7f51e6d16780 time 2012-02-21 22:14:42.091122
INFO:teuthology.task.ceph.mon.a.err:./common/Mutex.h: 89: FAILED assert(nlock 0)
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 2: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 3: (libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 4: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 2: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 3: (_libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 4: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err:terminate called after throwing an instance of 'ceph::FailedAssertion'
INFO:teuthology.task.ceph.mon.a.err:** Caught signal (Aborted)
INFO:teuthology.task.ceph.mon.a.err: in thread 7f51e6d16780
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x5b2931]
INFO:teuthology.task.ceph.mon.a.err: 2: (()+0xfb40) [0x7f51e68f6b40]
INFO:teuthology.task.ceph.mon.a.err: 3: (gsignal()+0x35) [0x7f51e50caba5]
INFO:teuthology.task.ceph.mon.a.err: 4: (abort()+0x180) [0x7f51e50ce6b0]
INFO:teuthology.task.ceph.mon.a.err: 5: (_gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f51e596e6bd]
INFO:teuthology.task.ceph.mon.a.err: 6: (()+0xb9906) [0x7f51e596c906]
INFO:teuthology.task.ceph.mon.a.err: 7: (()+0xb9933) [0x7f51e596c933]
INFO:teuthology.task.ceph.mon.a.err: 8: (()+0xb9a3e) [0x7f51e596ca3e]
INFO:teuthology.task.ceph.mon.a.err: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x841) [0x534871]
INFO:teuthology.task.ceph.mon.a.err: 10: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 11: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 12: (_libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 13: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err:daemon-helper: command crashed with signal 6
ERROR:teuthology.task.ceph:Saw exception from mon.a
Traceback (most recent call last):
File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 817, in run_daemon
daemon.stop()
File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 40, in stop
run.wait([self.proc])
File "/home/elder/ceph/teuthology/teuthology/orchestra/run.py", line 272, in wait
proc.exitstatus.get()
File "/home/elder/ceph/teuthology/virtualenv/local/lib/python2.7/site-packages/gevent/event.py", line 223, in get
raise self._exception
CommandFailedError: Command failed with status 1: '/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-mon -f -i a -c /tmp/cephtest/ceph.conf'
INFO:teuthology.task.ceph.mon.c.err:2012-02-21 22:14:42.332393 7fb5e458c700 mon.c@2(peon) e1 Got Signal Terminated *
INFO:teuthology.task.ceph.mon.c:Stopped
INFO:teuthology.task.ceph.mon.b.err:2012-02-21 22:14:42.428996 7f6586237700 mon.b@0(leader) e1 * Got Signal Terminated **
INFO:teuthology.task.ceph.mon.b:Stopped
INFO:teuthology.task.ceph:Grabbing cluster log from ubuntu@sepia77.ceph.dreamhost.com mon.a...
INFO:teuthology.task.ceph:Checking cluster ceph.log for badness...
INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.0.data on ubuntu@sepia77.ceph.dreamhost.com
INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.1.data on ubuntu@sepia70.ceph.dreamhost.com
INFO:teuthology.task.ceph:Cleaning ceph cluster...
INFO:teuthology.task.ceph:Removing ceph binaries...
INFO:teuthology.task.ceph:Removing shipped files: daemon-helper enable-coredump...
INFO:teuthology.task.ceph:Compressing logs...
ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x2868950>
Traceback (most recent call last):
File "/home/elder/ceph/teuthology/teuthology/run_tasks.py", line 45, in run_tasks
suppress = manager._exit(*exc_info)
File "/usr/lib/python2.7/contextlib.py", line 24, in exit
self.gen.next()
File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 1017, in task
yield
File "/usr/lib/python2.7/contextlib.py", line 24, in exit
self.gen.next()
File "/home/elder/ceph/teuthology/teuthology/contextutil.py", line 35, in nested
if exit(*exc):
File "/usr/lib/python2.7/contextlib.py", line 24, in exit
self.gen.next()
File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 817, in run_daemon
daemon.stop()
File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 40, in stop
run.wait([self.proc])
File "/home/elder/ceph/teuthology/teuthology/orchestra/run.py", line 272, in wait
proc.exitstatus.get()
File "/home/elder/ceph/teuthology/virtualenv/local/lib/python2.7/site-packages/gevent/event.py", line 223, in get
raise self._exception
CommandFailedError: Command failed with status 1: '/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-mon -f -i a -c /tmp/cephtest/ceph.conf'
INFO:teuthology.task.internal:Shutting down syslog monitoring...
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 4452
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 18961
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 25670
INFO:teuthology.task.internal:Checking logs for errors...
INFO:teuthology.task.internal:Compressing syslogs...
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
WARNING:teuthology.task.internal:Found coredumps on ubuntu@sepia77.ceph.dreamhost.com, flagging run as failed
INFO:teuthology.task.internal:Transferring archived files...
INFO:teuthology.task.internal:Removing archive directory...
INFO:teuthology.task.internal:Tidying up after the test...
INFO:teuthology.run:Duration was 773.190049 seconds