
Bug #1743

teuthology: not exiting with error when ceph-fuse shutdown fails

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status:
Closed
Priority:
High
Assignee:
-
Category:
teuthology
% Done:
0%
Regression:
No
Severity:
3 - minor

Description

Here is the log tail:


INFO:teuthology.orchestra.run.out:22:41:21 DEBUG| [stdout]    4    111229     3.27 MB/sec  cleanup 602 sec
INFO:teuthology.orchestra.run.out:22:41:22 DEBUG| [stdout]    4    111229     3.27 MB/sec  cleanup 603 sec
INFO:teuthology.orchestra.run.out:22:41:23 DEBUG| [stdout]    4    111229     3.26 MB/sec  cleanup 604 sec
INFO:teuthology.orchestra.run.out:22:41:24 DEBUG| [stdout]    4    111229     3.26 MB/sec  cleanup 605 sec
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| [stdout]    4    111229     3.25 MB/sec  cleanup 606 sec
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| [stdout]
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| [stdout] Throughput 3.2846 MB/sec 4 procs
INFO:teuthology.orchestra.run.out:22:41:25 INFO | post-test siteration sysinfo error:
INFO:teuthology.orchestra.run.out:22:41:25 INFO | Traceback (most recent call last):
INFO:teuthology.orchestra.run.out:22:41:25 INFO |   File "/tmp/cephtest/autotest/client/common_lib/log.py", line 70, in decorated_func
INFO:teuthology.orchestra.run.out:22:41:25 INFO |     fn(*args, **dargs)
INFO:teuthology.orchestra.run.out:22:41:25 INFO |   File "/tmp/cephtest/autotest/client/bin/base_sysinfo.py", line 314, in log_after_each_iteration
INFO:teuthology.orchestra.run.out:22:41:25 INFO |     log.run(logdir)
INFO:teuthology.orchestra.run.out:22:41:25 INFO |   File "/tmp/cephtest/autotest/client/bin/base_sysinfo.py", line 78, in run
INFO:teuthology.orchestra.run.out:22:41:25 INFO |     shutil.copyfile(self.path, os.path.join(logdir, self.logf))
INFO:teuthology.orchestra.run.out:22:41:25 INFO |   File "/usr/lib/python2.6/shutil.py", line 50, in copyfile
INFO:teuthology.orchestra.run.out:22:41:25 INFO |     with open(src, 'rb') as fsrc:
INFO:teuthology.orchestra.run.out:22:41:25 INFO | IOError: [Errno 13] Permission denied: '/proc/slabinfo'
INFO:teuthology.orchestra.run.out:22:41:25 INFO | Test finished after 1 iterations.
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Logging subprocess finished
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Logging subprocess finished
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'gzip -9 "/tmp/cephtest/autotest/client/results/client.0.num0.dbench/dbench/sysinfo/dmesg"'
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'grep MemTotal /proc/meminfo'
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'rpm -qa'
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'dpkg -l'
INFO:teuthology.orchestra.run.out:AUTOTEST_STATUS::             GOOD    dbench dbench   timestamp=1321771285    localtime=Nov 19 22:41:25       completed successfully
INFO:teuthology.orchestra.run.out:22:41:25 INFO |               GOOD    dbench dbench   timestamp=1321771285    localtime=Nov 19 22:41:25       completed successfully
INFO:teuthology.orchestra.run.out:AUTOTEST_STATUS::     END GOOD        dbench dbench   timestamp=1321771285    localtime=Nov 19 22:41:25
INFO:teuthology.orchestra.run.out:22:41:25 INFO |       END GOOD        dbench dbench   timestamp=1321771285    localtime=Nov 19 22:41:25
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Persistent state client._record_indent now set to 1
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Dropping caches
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'sync'
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Running 'echo 3 > /proc/sys/vm/drop_caches'
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| [stderr] /bin/bash: /proc/sys/vm/drop_caches: Permission denied
INFO:teuthology.orchestra.run.out:AUTOTEST_STATUS::END GOOD     ----    ----   timestamp=1321771285     localtime=Nov 19 22:41:25
INFO:teuthology.orchestra.run.out:22:41:25 INFO | END GOOD      ----    ----   timestamp=1321771285     localtime=Nov 19 22:41:25
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Persistent state client._record_indent now set to 0
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Logging subprocess finished
INFO:teuthology.orchestra.run.out:22:41:25 DEBUG| Logging subprocess finished
INFO:teuthology.task.cfuse:Unmounting ceph-fuse clients...
INFO:teuthology.task.cfuse.cfuse.0.err:ceph-fuse[16227]: fuse finished with error 0
INFO:teuthology.task.cfuse.cfuse.0.err:./include/xlist.h: In function 'void xlist<T>::remove(xlist<T>::item*) [with T = Cap*]', in thread '7fd018917700'
INFO:teuthology.task.cfuse.cfuse.0.err:./include/xlist.h: 130: FAILED assert((bool)_front == (bool)_size)
INFO:teuthology.task.cfuse.cfuse.0.err: ceph version 0.38-204-g9920a16 (commit:9920a168c59807083019c62fdf381434edea12e5)
INFO:teuthology.task.cfuse.cfuse.0.err: 1: (Client::remove_cap(Cap*)+0x72e) [0x46be2e]
INFO:teuthology.task.cfuse.cfuse.0.err: 2: (Client::remove_all_caps(Inode*)+0x33) [0x46bef3]
INFO:teuthology.task.cfuse.cfuse.0.err: 3: (Client::put_inode(Inode*, int)+0xfb) [0x46c00b]
INFO:teuthology.task.cfuse.cfuse.0.err: 4: (Client::handle_cap_flush_ack(Inode*, int, Cap*, MClientCaps*)+0x262) [0x47bb22]
INFO:teuthology.task.cfuse.cfuse.0.err: 5: (Client::handle_caps(MClientCaps*)+0x777) [0x498927]
INFO:teuthology.task.cfuse.cfuse.0.err: 6: (Client::ms_dispatch(Message*)+0x45b) [0x4a669b]
INFO:teuthology.task.cfuse.cfuse.0.err: 7: (SimpleMessenger::dispatch_entry()+0x7f9) [0x4f1b49]
INFO:teuthology.task.cfuse.cfuse.0.err: 8: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4599cc]
INFO:teuthology.task.cfuse.cfuse.0.err: 9: (()+0x7971) [0x7fd01cd91971]
INFO:teuthology.task.cfuse.cfuse.0.err: 10: (clone()+0x6d) [0x7fd01b62092d]
INFO:teuthology.task.cfuse.cfuse.0.err: ceph version 0.38-204-g9920a16 (commit:9920a168c59807083019c62fdf381434edea12e5)
INFO:teuthology.task.cfuse.cfuse.0.err: 1: (Client::remove_cap(Cap*)+0x72e) [0x46be2e]
INFO:teuthology.task.cfuse.cfuse.0.err: 2: (Client::remove_all_caps(Inode*)+0x33) [0x46bef3]
INFO:teuthology.task.cfuse.cfuse.0.err: 3: (Client::put_inode(Inode*, int)+0xfb) [0x46c00b]
INFO:teuthology.task.cfuse.cfuse.0.err: 4: (Client::handle_cap_flush_ack(Inode*, int, Cap*, MClientCaps*)+0x262) [0x47bb22]
INFO:teuthology.task.cfuse.cfuse.0.err: 5: (Client::handle_caps(MClientCaps*)+0x777) [0x498927]
INFO:teuthology.task.cfuse.cfuse.0.err: 6: (Client::ms_dispatch(Message*)+0x45b) [0x4a669b]
INFO:teuthology.task.cfuse.cfuse.0.err: 7: (SimpleMessenger::dispatch_entry()+0x7f9) [0x4f1b49]
INFO:teuthology.task.cfuse.cfuse.0.err: 8: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4599cc]
INFO:teuthology.task.cfuse.cfuse.0.err: 9: (()+0x7971) [0x7fd01cd91971]
INFO:teuthology.task.cfuse.cfuse.0.err: 10: (clone()+0x6d) [0x7fd01b62092d]
INFO:teuthology.task.cfuse.cfuse.0.err:terminate called after throwing an instance of 'ceph::FailedAssertion'
INFO:teuthology.task.cfuse.cfuse.0.err:*** Caught signal (Aborted) **
INFO:teuthology.task.cfuse.cfuse.0.err: in thread 7fd018917700
INFO:teuthology.task.cfuse.cfuse.0.err: ceph version 0.38-204-g9920a16 (commit:9920a168c59807083019c62fdf381434edea12e5)
INFO:teuthology.task.cfuse.cfuse.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-fuse() [0x5e1234]
INFO:teuthology.task.cfuse.cfuse.0.err: 2: (()+0xfb40) [0x7fd01cd99b40]
INFO:teuthology.task.cfuse.cfuse.0.err: 3: (gsignal()+0x35) [0x7fd01b56dba5]
INFO:teuthology.task.cfuse.cfuse.0.err: 4: (abort()+0x180) [0x7fd01b5716b0]
INFO:teuthology.task.cfuse.cfuse.0.err: 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd01be116bd]
INFO:teuthology.task.cfuse.cfuse.0.err: 6: (()+0xb9906) [0x7fd01be0f906]
INFO:teuthology.task.cfuse.cfuse.0.err: 7: (()+0xb9933) [0x7fd01be0f933]
INFO:teuthology.task.cfuse.cfuse.0.err: 8: (()+0xb9a3e) [0x7fd01be0fa3e]
INFO:teuthology.task.cfuse.cfuse.0.err: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x39f) [0x4c6dbf]
INFO:teuthology.task.cfuse.cfuse.0.err: 10: (Client::remove_cap(Cap*)+0x72e) [0x46be2e]
INFO:teuthology.task.cfuse.cfuse.0.err: 11: (Client::remove_all_caps(Inode*)+0x33) [0x46bef3]
INFO:teuthology.task.cfuse.cfuse.0.err: 12: (Client::put_inode(Inode*, int)+0xfb) [0x46c00b]
INFO:teuthology.task.cfuse.cfuse.0.err: 13: (Client::handle_cap_flush_ack(Inode*, int, Cap*, MClientCaps*)+0x262) [0x47bb22]
INFO:teuthology.task.cfuse.cfuse.0.err: 14: (Client::handle_caps(MClientCaps*)+0x777) [0x498927]
INFO:teuthology.task.cfuse.cfuse.0.err: 15: (Client::ms_dispatch(Message*)+0x45b) [0x4a669b]
INFO:teuthology.task.cfuse.cfuse.0.err: 16: (SimpleMessenger::dispatch_entry()+0x7f9) [0x4f1b49]
INFO:teuthology.task.cfuse.cfuse.0.err: 17: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4599cc]
INFO:teuthology.task.cfuse.cfuse.0.err: 18: (()+0x7971) [0x7fd01cd91971]
INFO:teuthology.task.cfuse.cfuse.0.err: 19: (clone()+0x6d) [0x7fd01b62092d]
INFO:teuthology.task.cfuse.cfuse.0.err:daemon-helper: command crashed with signal 6
ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x31672d0>
Traceback (most recent call last):
  File "/home/qa/teuthology/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/home/qa/teuthology/teuthology/task/cfuse.py", line 146, in task
    run.wait(cfuse_daemons.itervalues())
  File "/home/qa/teuthology/teuthology/orchestra/run.py", line 272, in wait
    proc.exitstatus.get()
  File "/home/qa/teuthology/virtualenv/lib/python2.6/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandFailedError: Command failed with status 1: '/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper kill /tmp/cephtest/binary/usr/local/bin/ceph-fuse -f --name client.0 -c /tmp/cephtest/ceph.conf /tmp/cephtest/mnt.0'
INFO:teuthology.task.ceph:Shutting down mds daemons...
INFO:teuthology.task.ceph.mds.0:mds.0: Stopped
INFO:teuthology.task.ceph:Shutting down osd daemons...
INFO:teuthology.task.ceph.osd.1:osd.1: Stopped
INFO:teuthology.task.ceph.osd.0:osd.0: Stopped
INFO:teuthology.task.ceph:Shutting down mon daemons...
INFO:teuthology.task.ceph.mon.1:mon.1: Stopped
INFO:teuthology.task.ceph.mon.0:mon.0: Stopped
INFO:teuthology.task.ceph.mon.2:mon.2: Stopped
INFO:teuthology.task.ceph:Checking cluster ceph.log for badness...
INFO:teuthology.task.ceph:Cleaning ceph cluster...
INFO:teuthology.task.ceph:Removing ceph binaries...
INFO:teuthology.task.ceph:Removing shipped files: daemon-helper enable-coredump...
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
INFO:teuthology.orchestra.run.out:kernel.core_pattern = core
WARNING:teuthology.task.internal:Found coredumps on ubuntu@sepia10.ceph.dreamhost.com, flagging run as failed
INFO:teuthology.task.internal:Removing archive directory...
INFO:teuthology.task.internal:Tidying up after the test...
INFO:teuthology.orchestra.run.err:rmdir: failed to remove `/tmp/cephtest': Directory not empty
ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x30b2750>
Traceback (most recent call last):
  File "/home/qa/teuthology/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/home/qa/teuthology/teuthology/task/internal.py", line 43, in base
    wait=False,
  File "/home/qa/teuthology/teuthology/orchestra/run.py", line 272, in wait
    proc.exitstatus.get()
  File "/home/qa/teuthology/virtualenv/lib/python2.6/site-packages/gevent/event.py", line 207, in get
    raise self._exception
CommandFailedError: Command failed with status 1: 'rmdir -- /tmp/cephtest'

... but the calling shell loop (while virtualenv/bin/teuthology ... ; do ... ; done) did not terminate.
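For reference, the calling loop has roughly this shape (a minimal sketch; run_job is a hypothetical stub standing in for the elided teuthology invocation). The loop only stops when a run exits nonzero, so if teuthology swallows the error and exits 0, it spins forever:

```shell
# Sketch of the retry loop from the description; run_job is a stub standing
# in for the elided virtualenv/bin/teuthology command. The loop terminates
# only when a run exits with a nonzero status.
run_job() { return "${FAIL:-0}"; }

count=0
while run_job; do
  count=$((count + 1))
  if [ "$count" -ge 3 ]; then
    FAIL=1   # simulate a failing run after three passing ones
  fi
done
echo "loop stopped after $count passing runs"
```

Because the ceph-fuse crash was logged but not reflected in teuthology's exit status, a loop like this never saw a failure.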

History

#1 Updated by Sage Weil over 9 years ago

  • Position set to 10

#2 Updated by Anonymous over 9 years ago

  • Status changed from New to 12
  • Assignee set to Anonymous

This reproduced it nicely:

diff --git a/teuthology/task/internal.py b/teuthology/task/internal.py
index 58e7f14..7b3e648 100644
--- a/teuthology/task/internal.py
+++ b/teuthology/task/internal.py
@@ -36,7 +36,7 @@ def base(ctx, config):
     run.wait(
         ctx.cluster.run(
             args=[
-                'rmdir',
+                'BORKrmdir',
                 '--',
                 '/tmp/cephtest',
                 ],
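The underlying failure mode, an exception raised while a task's context manager exits that is logged but never reflected in the process exit status, can be sketched in a few lines of Python (flaky_task and this run_tasks are simplified stand-ins, not teuthology's actual code):

```python
from contextlib import contextmanager

@contextmanager
def flaky_task():
    # The task body succeeds; the failure happens during teardown, i.e.
    # the exception is raised from __exit__ when the generator resumes.
    yield
    raise RuntimeError("ceph-fuse shutdown failed")

def run_tasks(managers):
    """Enter all managers, then unwind them; return True if any exit failed."""
    failed = False
    for m in managers:
        m.__enter__()
    for m in reversed(managers):
        try:
            m.__exit__(None, None, None)
        except Exception as exc:
            print("Manager failed:", exc)
            # The bug amounts to dropping this flag: the error was logged,
            # but the process still exited 0, so the calling shell loop
            # never saw a failure.
            failed = True
    return failed
```

The fix referenced in note #3 amounts to carrying this first-error state through to the process exit status (a nonzero sys.exit), so the calling loop terminates.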

#3 Updated by Anonymous over 9 years ago

  • Status changed from 12 to Closed

commit c651c88eacf9c3bbf1f037be3a5dc0425308c730
Author: Tommi Virtanen <>
Date: 2011-11-21 16:00:19 -0800

Properly handle case where first error is inside a context manager exit.
Closes: http://tracker.newdream.net/issues/1743
