Bug #2090

closed

mon: assertion failed on shutdown

Added by Alex Elder about 12 years ago. Updated about 12 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version:
% Done: 0%
Source: Development

Description

I was running repeated cycles of the kernel_untar_build.sh workunit
to try to reproduce a problem in the client and hit this problem
instead.
./common/Mutex.h: 89: FAILED assert(nlock == 0)
Looks like this is the version:
ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
Full info below (sorry if it's excessive, I'm not sure what's needed).
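
For readers unfamiliar with the check, the assertion in `Mutex::~Mutex()` appears to fire when a mutex is destroyed while its lock count is still nonzero (the condition reads as `nlock == 0`). The following Python model is only a sketch of that invariant, not Ceph's code: the class `CountingMutex`, its `destroy()` method, and the helper `shutdown()` are hypothetical names, and the idea that `nlock` counts outstanding holds is an assumption inferred from the assertion text.

```python
import threading


class CountingMutex:
    """Toy model of a mutex that tracks how many holds are outstanding."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nlock = 0  # assumed meaning: number of outstanding holds

    def lock(self):
        self._lock.acquire()
        self.nlock += 1

    def unlock(self):
        if self.nlock <= 0:
            raise AssertionError("unlock without matching lock")
        self.nlock -= 1
        self._lock.release()

    def destroy(self):
        # Stand-in for the C++ destructor: tearing down a held mutex is a bug.
        if self.nlock != 0:
            raise AssertionError("FAILED assert(nlock == 0)")


def shutdown(mutex, release_first):
    """Return True if teardown succeeds, False if the invariant trips."""
    mutex.lock()
    if release_first:
        mutex.unlock()
    try:
        mutex.destroy()
        return True
    except AssertionError:
        return False
```

In this model, `shutdown(CountingMutex(), release_first=False)` trips the check, which mirrors the shape of the failure reported here: a destructor running while the lock count is nonzero. The report itself does not establish which code path left the lock held.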

Here is the YAML file that was driving it:
roles:
- [mon.a, mon.c, osd.0]
- [mon.b, mds.a, osd.1]
- [client.0]
kernel:
  mon:
    branch: master
  client:
    branch: testing
overrides:
  ceph:
    conf:
      osd:
        osd op complaint time: 120
    coverage: true
    fs: btrfs
    log-whitelist:
    - clocks not synchronized
    - old request
tasks:
- ceph:
- rbd:
    all:
- workunit:
    all:
    - kernel_untar_build.sh

And here is the information I got at the end of the run:

. . .
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/usr/gen_init_cpio.c'
INFO:teuthology.task.workunit.client.0.out:removed directory: `linux-2.6.33/usr'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/README'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33/COPYING'
INFO:teuthology.task.workunit.client.0.out:removed directory: `linux-2.6.33'
INFO:teuthology.task.workunit.client.0.out:removed directory: `t'
INFO:teuthology.task.workunit.client.0.out:removed `linux-2.6.33.tar.bz2'
INFO:teuthology.task.rbd:Unmounting rbd images...
INFO:teuthology.task.rbd:Unmapping rbd devices...
INFO:teuthology.task.rbd:Unloading rbd kernel module...
INFO:teuthology.task.rbd:Deleting rbd images...
Removing image: 100% complete...done.
INFO:teuthology.task.ceph:Shutting down mds daemons...
INFO:teuthology.task.ceph.mds.a.err:2012-02-21 22:14:40.784639 7f5f8deb8700 mds.0.1 *** got signal Terminated ***
INFO:teuthology.task.ceph.mds.a:Stopped
INFO:teuthology.task.ceph:Shutting down osd daemons...
INFO:teuthology.task.ceph.osd.1:Stopped
INFO:teuthology.task.ceph.osd.0:Stopped
INFO:teuthology.task.ceph:Shutting down mon daemons...
INFO:teuthology.task.ceph.mon.a.err:2012-02-21 22:14:42.088201 7f51e2e79700 mon.a@1(peon) e1 *** Got Signal Terminated ***
INFO:teuthology.task.ceph.mon.a.err:./common/Mutex.h: In function 'Mutex::~Mutex()' thread 7f51e6d16780 time 2012-02-21 22:14:42.091122
INFO:teuthology.task.ceph.mon.a.err:./common/Mutex.h: 89: FAILED assert(nlock == 0)
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 2: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 3: (__libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 4: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 2: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 3: (__libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 4: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err:terminate called after throwing an instance of 'ceph::FailedAssertion'
INFO:teuthology.task.ceph.mon.a.err:*** Caught signal (Aborted) **
INFO:teuthology.task.ceph.mon.a.err: in thread 7f51e6d16780
INFO:teuthology.task.ceph.mon.a.err: ceph version 0.42-106-g761ecc6 (761ecc69c24856b15531c92b69b1c73c5cc81bfc)
INFO:teuthology.task.ceph.mon.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x5b2931]
INFO:teuthology.task.ceph.mon.a.err: 2: (()+0xfb40) [0x7f51e68f6b40]
INFO:teuthology.task.ceph.mon.a.err: 3: (gsignal()+0x35) [0x7f51e50caba5]
INFO:teuthology.task.ceph.mon.a.err: 4: (abort()+0x180) [0x7f51e50ce6b0]
INFO:teuthology.task.ceph.mon.a.err: 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f51e596e6bd]
INFO:teuthology.task.ceph.mon.a.err: 6: (()+0xb9906) [0x7f51e596c906]
INFO:teuthology.task.ceph.mon.a.err: 7: (()+0xb9933) [0x7f51e596c933]
INFO:teuthology.task.ceph.mon.a.err: 8: (()+0xb9a3e) [0x7f51e596ca3e]
INFO:teuthology.task.ceph.mon.a.err: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x841) [0x534871]
INFO:teuthology.task.ceph.mon.a.err: 10: (Monitor::~Monitor()+0x95c) [0x476d2c]
INFO:teuthology.task.ceph.mon.a.err: 11: (main()+0x301e) [0x461a1e]
INFO:teuthology.task.ceph.mon.a.err: 12: (__libc_start_main()+0xfe) [0x7f51e50b5d8e]
INFO:teuthology.task.ceph.mon.a.err: 13: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e799]
INFO:teuthology.task.ceph.mon.a.err:daemon-helper: command crashed with signal 6
ERROR:teuthology.task.ceph:Saw exception from mon.a
Traceback (most recent call last):
  File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 817, in run_daemon
    daemon.stop()
  File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 40, in stop
    run.wait([self.proc])
  File "/home/elder/ceph/teuthology/teuthology/orchestra/run.py", line 272, in wait
    proc.exitstatus.get()
  File "/home/elder/ceph/teuthology/virtualenv/local/lib/python2.7/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandFailedError: Command failed with status 1: '/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-mon -f -i a -c /tmp/cephtest/ceph.conf'
INFO:teuthology.task.ceph.mon.c.err:2012-02-21 22:14:42.332393 7fb5e458c700 mon.c@2(peon) e1 *** Got Signal Terminated ***
INFO:teuthology.task.ceph.mon.c:Stopped
INFO:teuthology.task.ceph.mon.b.err:2012-02-21 22:14:42.428996 7f6586237700 mon.b@0(leader) e1 *** Got Signal Terminated ***
INFO:teuthology.task.ceph.mon.b:Stopped
INFO:teuthology.task.ceph:Grabbing cluster log from mon.a...
INFO:teuthology.task.ceph:Checking cluster ceph.log for badness...
INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.0.data on
INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.1.data on
INFO:teuthology.task.ceph:Cleaning ceph cluster...
INFO:teuthology.task.ceph:Removing ceph binaries...
INFO:teuthology.task.ceph:Removing shipped files: daemon-helper enable-coredump...
INFO:teuthology.task.ceph:Compressing logs...
ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x2868950>
Traceback (most recent call last):
  File "/home/elder/ceph/teuthology/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 1017, in task
    yield
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/elder/ceph/teuthology/teuthology/contextutil.py", line 35, in nested
    if exit(*exc):
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 817, in run_daemon
    daemon.stop()
  File "/home/elder/ceph/teuthology/teuthology/task/ceph.py", line 40, in stop
    run.wait([self.proc])
  File "/home/elder/ceph/teuthology/teuthology/orchestra/run.py", line 272, in wait
    proc.exitstatus.get()
  File "/home/elder/ceph/teuthology/virtualenv/local/lib/python2.7/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandFailedError: Command failed with status 1: '/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-mon -f -i a -c /tmp/cephtest/ceph.conf'
INFO:teuthology.task.internal:Shutting down syslog monitoring...
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 4452
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 18961
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 25670
INFO:teuthology.task.internal:Checking logs for errors...
INFO:teuthology.task.internal:Compressing syslogs...
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
WARNING:teuthology.task.internal:Found coredumps on , flagging run as failed
INFO:teuthology.task.internal:Transferring archived files...
INFO:teuthology.task.internal:Removing archive directory...
INFO:teuthology.task.internal:Tidying up after the test...
INFO:teuthology.run:Duration was 773.190049 seconds
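
When triaging a long teuthology log like the one above, the two lines that matter most are the `FAILED assert` line and the `daemon-helper` crash line. The sketch below pulls both out with regular expressions and maps the signal number to its name. The function name `summarize` and the patterns are illustrative assumptions based only on the line shapes in this report, not part of teuthology, and the signal lookup assumes a POSIX platform.

```python
import re
import signal

# Patterns are guesses fitted to the log lines in this report.
ASSERT_RE = re.compile(
    r"^INFO:teuthology\.task\.ceph\.(?P<daemon>[\w.]+)\.err:"
    r"(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)$"
)
CRASH_RE = re.compile(
    r"^INFO:teuthology\.task\.ceph\.(?P<daemon>[\w.]+)\.err:"
    r"daemon-helper: command crashed with signal (?P<signum>\d+)$"
)


def summarize(log_lines):
    """Collect (daemon, file, line, expr) asserts and (daemon, signal) crashes."""
    asserts = []
    crashes = []
    for raw in log_lines:
        line = raw.rstrip("\n")
        m = ASSERT_RE.match(line)
        if m:
            asserts.append((m["daemon"], m["file"], int(m["line"]), m["expr"]))
            continue
        m = CRASH_RE.match(line)
        if m:
            # Translate the numeric signal into its symbolic name.
            name = signal.Signals(int(m["signum"])).name
            crashes.append((m["daemon"], name))
    return asserts, crashes


SAMPLE = [
    "INFO:teuthology.task.ceph:Shutting down mon daemons...",
    "INFO:teuthology.task.ceph.mon.a.err:./common/Mutex.h: 89: FAILED assert(nlock == 0)",
    "INFO:teuthology.task.ceph.mon.a.err:daemon-helper: command crashed with signal 6",
    "INFO:teuthology.task.ceph.mon.c:Stopped",
]

if __name__ == "__main__":
    found_asserts, found_crashes = summarize(SAMPLE)
    for daemon, path, lineno, expr in found_asserts:
        print(f"{daemon}: assert({expr}) failed at {path}:{lineno}")
    for daemon, signame in found_crashes:
        print(f"{daemon}: crashed with {signame}")
```

On the sample lines this reports one failed assertion for mon.a at ./common/Mutex.h line 89 and one crash with SIGABRT, which matches the "Caught signal (Aborted)" message in the trace.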
