Bug #4094

closed

teuthology-nuke hangs if nuking fails

Added by Greg Farnum about 11 years ago. Updated about 10 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Following #4093, teuthology-nuke just hung, without making further progress or erroring out:

gregf@kai:~/src/teuthology [master]$ ./virtualenv/bin/teuthology-nuke -t targets.yaml
INFO:teuthology.nuke:targets:
  ubuntu@plana02.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtjMpSkaJhFqFtpo5AEe3KHygR+ueaWU+gYrrRzPa8YvmR0TCapw0kz77y1Fjcfh8rkTapnevpaYgQSMrMs0Yc34kF5XtNRuQXkpTwrhS8isZJBeNSc1W5XeKjj4KB/UuzBywJq0h/0KbH1DrMy72cGISOzdiP9CMA5KUvJo0m31wv1+MPcPn/5AhZgoWPStfaZdb4TaJUrNLrws0oRXa0yQbUa6WmUBsYhHsw4K1ukJAcJwVjcgAAv1N+GnyuWLVs+pvknBO3Whv1RhjY6EDGjun1MDPw+OE3wJsJX7BRr8eZv2Avi7pRlseWeWJwgsHMJ/j0yhf+SCy1+oSPrD2b
  ubuntu@plana05.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4QLrFgiF5oIdiIeB+l40EWRmssMz9SjaNaS46izEtQn9dzF6wcTaqR8JFz7xPpU5b0IH4m3cCCCUfybVZMeyZRh94Xa0knzohzxGEjJzjmaxBhZVJZ0ecXSK724Vy2fnCTpeXo8aAJdPvng06TUy9Fh4RhT6IMHq0xjVUwb4J4GLKse5bumBJXwmrTFh8EMY3vBJ/6+XkRw8ASWBIVMRGGI+QZwlb/UMdf6pcyaDwJNv4nQDeBea+5nwtlyxCjYV/RGo4GoHU/Uit7pF73xJLGDH9kvmDoXKmXBI08wqGA14hVyt5CEHzkNbAy5fUq3yJdzGf5oBlghyTLjsWmf3Z
  ubuntu@plana07.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJpBygMO1DgQyq6sMOVE6YRX2rs3/biGU3ceT1spqzuIx/XGuBGquTpWzMZ2dqYnEbGclvDZj254r5eIqB47VirwaBXFO1Lzo/yqpSWau/f6mdO8zGoloZXxBr2qRlLHx9luePTsKTRxeXULEpgbKN65+/QkUn4cicwPUG80R8y73cEl0jncxaUCGt3MvhXWsaEfV4EQshnRhasX0zjQNd2I5Gi7AEJK28pQfFM+SvqDVzpPyLBJskWbuOfY+d3IKUricXAElh/rHWjIM/INSC22wlJSRAydemLCwvTyz840JKo5NifRj6d0bLtlug35hRfCuwcT/OPPaDtBZN5afH
INFO:teuthology.nuke:checking console status of plana07.ipmi.sepia.ceph.com
INFO:teuthology.nuke:checking console status of plana05.ipmi.sepia.ceph.com
INFO:teuthology.nuke:checking console status of plana02.ipmi.sepia.ceph.com
INFO:teuthology.nuke:console ready on plana02.ipmi.sepia.ceph.com
INFO:teuthology.task.internal:Checking locks...
INFO:teuthology.task.internal:Opening connections...
INFO:teuthology.nuke:console ready on plana07.ipmi.sepia.ceph.com
INFO:teuthology.task.internal:Checking locks...
INFO:teuthology.task.internal:Opening connections...
INFO:teuthology.nuke:Unmount ceph-fuse and killing daemons...
INFO:teuthology.nuke:Unmount ceph-fuse and killing daemons...
INFO:teuthology.nuke:Waiting for ubuntu@plana02.front.sepia.ceph.com to finish shutdowns...
INFO:teuthology.nuke:Waiting for ubuntu@plana07.front.sepia.ceph.com to finish shutdowns...
INFO:teuthology.orchestra.run.err:fusermount: failed to unmount /tmp/cephtest/gregf@kai-2013-02-11_13-25-29/mnt.0: Device or resource busy
INFO:teuthology.nuke:All daemons killed.
INFO:teuthology.nuke:Looking for kernel mounts to handle...
INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.nuke:All daemons killed.
INFO:teuthology.nuke:Looking for kernel mounts to handle...
INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.nuke:Unmount any osd tmpfs dirs...
INFO:teuthology.nuke:All kernel mounts gone.
INFO:teuthology.nuke:Synchronizing clocks...
INFO:teuthology.nuke:Reseting syslog output locations...
INFO:teuthology.nuke:Waiting for ubuntu@plana02.front.sepia.ceph.com to restart syslog...
INFO:teuthology.nuke:Unmount any osd tmpfs dirs...
INFO:teuthology.nuke:All kernel mounts gone.
INFO:teuthology.nuke:Synchronizing clocks...
INFO:teuthology.nuke:Reseting syslog output locations...
INFO:teuthology.nuke:Waiting for ubuntu@plana07.front.sepia.ceph.com to restart syslog...
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 32616
INFO:teuthology.nuke:Clearing filesystem of test data...
INFO:teuthology.nuke:Waiting for ubuntu@plana02.front.sepia.ceph.com to clear filesystem...
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 9376
INFO:teuthology.nuke:Clearing filesystem of test data...
INFO:teuthology.nuke:Waiting for ubuntu@plana07.front.sepia.ceph.com to clear filesystem...
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.0.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.0.data/snap_41320': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.0.data/snap_41365': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.1.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.1.data/snap_32479': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.1.data/snap_32515': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.2.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.2.data/snap_18962': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.2.data/snap_18994': Operation not permitted
ERROR:teuthology.nuke:Could not nuke all targets in {'ubuntu@plana07.front.sepia.ceph.com': 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJpBygMO1DgQyq6sMOVE6YRX2rs3/biGU3ceT1spqzuIx/XGuBGquTpWzMZ2dqYnEbGclvDZj254r5eIqB47VirwaBXFO1Lzo/yqpSWau/f6mdO8zGoloZXxBr2qRlLHx9luePTsKTRxeXULEpgbKN65+/QkUn4cicwPUG80R8y73cEl0jncxaUCGt3MvhXWsaEfV4EQshnRhasX0zjQNd2I5Gi7AEJK28pQfFM+SvqDVzpPyLBJskWbuOfY+d3IKUricXAElh/rHWjIM/INSC22wlJSRAydemLCwvTyz840JKo5NifRj6d0bLtlug35hRfCuwcT/OPPaDtBZN5afH'}
Traceback (most recent call last):
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 352, in nuke_one
    nuke_helper(ctx, log)
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 423, in nuke_helper
    remove_testing_tree(ctx, log)
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 244, in remove_testing_tree
    proc.exitstatus.get()
  File "/home/gregf/src/teuthology/virtualenv/lib/python2.6/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandFailedError: Command failed with status 1: 'sudo rm -rf /tmp/cephtest'
INFO:teuthology.nuke:console ready on plana05.ipmi.sepia.ceph.com
INFO:teuthology.task.internal:Checking locks...
INFO:teuthology.task.internal:Opening connections...
INFO:teuthology.nuke:Unmount ceph-fuse and killing daemons...
INFO:teuthology.nuke:Waiting for ubuntu@plana05.front.sepia.ceph.com to finish shutdowns...
INFO:teuthology.nuke:All daemons killed.
INFO:teuthology.nuke:Looking for kernel mounts to handle...
INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.nuke:Unmount any osd tmpfs dirs...
INFO:teuthology.nuke:All kernel mounts gone.
INFO:teuthology.nuke:Synchronizing clocks...
INFO:teuthology.nuke:Reseting syslog output locations...
INFO:teuthology.nuke:Waiting for ubuntu@plana05.front.sepia.ceph.com to restart syslog...
INFO:teuthology.orchestra.run.out:rsyslog start/running, process 29146
INFO:teuthology.nuke:Clearing filesystem of test data...
INFO:teuthology.nuke:Waiting for ubuntu@plana05.front.sepia.ceph.com to clear filesystem...
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.3.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.3.data/snap_29990': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.3.data/snap_29995': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.5.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.5.data/snap_31429': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.5.data/snap_31434': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.4.data/current': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.4.data/snap_33783': Operation not permitted
INFO:teuthology.orchestra.run.err:rm: cannot remove `/tmp/cephtest/gregf@kai-2013-02-11_13-25-29/data/osd.4.data/snap_33832': Operation not permitted
ERROR:teuthology.nuke:Could not nuke all targets in {'ubuntu@plana05.front.sepia.ceph.com': 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4QLrFgiF5oIdiIeB+l40EWRmssMz9SjaNaS46izEtQn9dzF6wcTaqR8JFz7xPpU5b0IH4m3cCCCUfybVZMeyZRh94Xa0knzohzxGEjJzjmaxBhZVJZ0ecXSK724Vy2fnCTpeXo8aAJdPvng06TUy9Fh4RhT6IMHq0xjVUwb4J4GLKse5bumBJXwmrTFh8EMY3vBJ/6+XkRw8ASWBIVMRGGI+QZwlb/UMdf6pcyaDwJNv4nQDeBea+5nwtlyxCjYV/RGo4GoHU/Uit7pF73xJLGDH9kvmDoXKmXBI08wqGA14hVyt5CEHzkNbAy5fUq3yJdzGf5oBlghyTLjsWmf3Z'}
Traceback (most recent call last):
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 352, in nuke_one
    nuke_helper(ctx, log)
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 423, in nuke_helper
    remove_testing_tree(ctx, log)
  File "/home/gregf/src/teuthology/teuthology/nuke.py", line 244, in remove_testing_tree
    proc.exitstatus.get()
  File "/home/gregf/src/teuthology/virtualenv/lib/python2.6/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandFailedError: Command failed with status 1: 'sudo rm -rf /tmp/cephtest'
^[[A^[[B

Actions #1

Updated by Sam Lang about 11 years ago

wip-reboot-timeout should fix these hangs. The branch needs to be rebased and merged since Josh's changes to only unmount if reboot isn't requested went into master.
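
For readers unfamiliar with the failure mode, the general idea of bounding each cleanup step with a timeout and reporting failures instead of waiting forever can be sketched as follows. This is a hypothetical illustration only, not the contents of wip-reboot-timeout or of teuthology's nuke.py: the names run_cleanup and exit_code, the command-builder argument, and the 60-second default are all assumptions, and it runs targets one after another with the standard library rather than in parallel under gevent.

```python
import subprocess
import sys


def run_cleanup(targets, build_command, timeout=60.0):
    """Run one cleanup command per target, never waiting longer than `timeout`.

    `build_command` is a hypothetical callable that maps a target name to an
    argv list. Returns a dict of {target: reason} for every target that failed.
    """
    failures = {}
    for target in targets:
        try:
            # check=True turns a non-zero exit status into an exception;
            # timeout=... kills the child and raises if it runs too long.
            subprocess.run(
                build_command(target),
                check=True,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            failures[target] = "timed out after %.0fs" % timeout
        except subprocess.CalledProcessError as exc:
            failures[target] = "exit status %d" % exc.returncode
    return failures


def exit_code(failures):
    """Map the failure dict to a process exit code: 0 if clean, 1 otherwise."""
    return 1 if failures else 0
```

A caller would pass the result to sys.exit(exit_code(failures)) so that a failed step such as the 'sudo rm -rf /tmp/cephtest' above ends the run with an error rather than leaving it waiting.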

Actions #2

Updated by Ian Colle about 11 years ago

  • Assignee set to Sam Lang

Actions #3

Updated by Sage Weil almost 11 years ago

  • Assignee deleted (Sam Lang)

Actions #4

Updated by Zack Cerza about 10 years ago

  • Status changed from New to Closed

Closing due to lack of activity after the 'this should be fixed' comment.
