Bug #7620
BUG: soft lockup - CPU#0 stuck for 23s!
Status: Can't reproduce
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: other
Severity: 3 - minor
Description
Noticed this hung job:
http://pulpito.front.sepia.ceph.com/teuthology-2014-03-04_19:01:51-rbd-dumpling-testing-basic-plana/116699/
http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-04_19:01:51-rbd-dumpling-testing-basic-plana/116699/teuthology.log
2014-03-04T23:09:03.958 INFO:teuthology.task.qemu.client.0.out:[10.214.133.24]: mount: block device /dev/sr0 is write-protected, mounting read-only
2014-03-05T03:23:44.115 INFO:teuthology.task.qemu.client.0.out:[10.214.133.24]: [15292.040016] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:2:11087]
2014-03-05T03:23:44.115 INFO:teuthology.task.qemu.client.0.out:[10.214.133.24]: [15292.040736] Stack:
2014-03-05T03:23:44.115 INFO:teuthology.task.qemu.client.0.out:[10.214.133.24]: [15292.040949] Call Trace:
2014-03-05T03:23:44.115 INFO:teuthology.task.qemu.client.0.out:[10.214.133.24]: [15292.041398] Code: dd fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 01 00 00 48 89 e5 3e 66 0f c1 07 0f b6 d4 38 c2 74 0c 0f 1f 00 f3 90 <0f> b6 07 38 d0 75 f7 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00
I am not killing the job, to give others a chance to investigate.
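To compare lockup reports (which CPU, how long it stalled, which task was running), the kernel message can be split into fields. This is a minimal sketch, assuming the message format seen in the log above, "BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]". The names SoftLockup and parse_soft_lockup are made up for illustration and are not part of teuthology.

```python
import re
from typing import NamedTuple, Optional


class SoftLockup(NamedTuple):
    """Fields of one soft lockup message (hypothetical helper type)."""
    cpu: int
    seconds: int
    comm: str
    pid: int


# Assumed format: "BUG: soft lockup - CPU#<n> stuck for <n>s! [<comm>:<pid>]".
# The task name may itself contain ':' (e.g. "kworker/0:2"), so the greedy
# group takes everything up to the last colon before the closing bracket.
_SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def parse_soft_lockup(line: str) -> Optional[SoftLockup]:
    """Return the parsed fields, or None if the line is not a soft lockup."""
    match = _SOFT_LOCKUP.search(line)
    if match is None:
        return None
    cpu, seconds, comm, pid = match.groups()
    return SoftLockup(int(cpu), int(seconds), comm, int(pid))
```

Applied to the line from the teuthology log above, this would give CPU 0, a 22-second stall, and task kworker/0:2 with PID 11087.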
Updated by Yuri Weinstein about 10 years ago
A test failed with a similar error:
2014-04-25T02:25:14.274 INFO:teuthology.orchestra.run.out:[10.214.138.107]: rsyslog start/running, process 12755
2014-04-25T02:25:14.276 INFO:teuthology.task.internal:Checking logs for errors...
2014-04-25T02:25:14.276 DEBUG:teuthology.task.internal:Checking ubuntu@vpm040.front.sepia.ceph.com
2014-04-25T02:25:14.276 DEBUG:teuthology.orchestra.run:Running [10.214.138.94]: "egrep --binary-files=text '\\bBUG\\b|\\bINFO\\b|\\bDEADLOCK\\b' /home/ubuntu/cephtest/archive/syslog/*.log | grep -v 'task .* blocked for more than .* seconds' | grep -v 'lockdep is turned off' | grep -v 'trying to register non-static key' | grep -v 'DEBUG: fsize' | grep -v CRON | grep -v 'BUG: bad unlock balance detected' | grep -v 'inconsistent lock state' | grep -v '*** DEADLOCK ***' | grep -v 'INFO: possible irq lock inversion dependency detected' | grep -v 'INFO: NMI handler (perf_event_nmi_handler) took too long to run' | grep -v 'INFO: recovery required on readonly' | head -n 1"
2014-04-25T02:25:14.441 DEBUG:teuthology.task.internal:Checking ubuntu@vpm039.front.sepia.ceph.com
2014-04-25T02:25:14.441 DEBUG:teuthology.orchestra.run:Running [10.214.138.96]: "egrep --binary-files=text '\\bBUG\\b|\\bINFO\\b|\\bDEADLOCK\\b' /home/ubuntu/cephtest/archive/syslog/*.log | grep -v 'task .* blocked for more than .* seconds' | grep -v 'lockdep is turned off' | grep -v 'trying to register non-static key' | grep -v 'DEBUG: fsize' | grep -v CRON | grep -v 'BUG: bad unlock balance detected' | grep -v 'inconsistent lock state' | grep -v '*** DEADLOCK ***' | grep -v 'INFO: possible irq lock inversion dependency detected' | grep -v 'INFO: NMI handler (perf_event_nmi_handler) took too long to run' | grep -v 'INFO: recovery required on readonly' | head -n 1"
2014-04-25T02:25:14.616 DEBUG:teuthology.task.internal:Checking ubuntu@vpm038.front.sepia.ceph.com
2014-04-25T02:25:14.617 DEBUG:teuthology.orchestra.run:Running [10.214.138.107]: "egrep --binary-files=text '\\bBUG\\b|\\bINFO\\b|\\bDEADLOCK\\b' /home/ubuntu/cephtest/archive/syslog/*.log | grep -v 'task .* blocked for more than .* seconds' | grep -v 'lockdep is turned off' | grep -v 'trying to register non-static key' | grep -v 'DEBUG: fsize' | grep -v CRON | grep -v 'BUG: bad unlock balance detected' | grep -v 'inconsistent lock state' | grep -v '*** DEADLOCK ***' | grep -v 'INFO: possible irq lock inversion dependency detected' | grep -v 'INFO: NMI handler (perf_event_nmi_handler) took too long to run' | grep -v 'INFO: recovery required on readonly' | head -n 1"
2014-04-25T02:25:19.267 ERROR:teuthology.task.internal:Error in syslog on ubuntu@vpm038.front.sepia.ceph.com: /home/ubuntu/cephtest/archive/syslog/kern.log:2014-04-25T04:36:10.915422+00:00 vpm038 kernel: [ 2080.060036] BUG: soft lockup - CPU#0 stuck for 24s! [rwhod:1265]
2014-04-25T02:25:19.268 INFO:teuthology.task.internal:Compressing syslogs...
2014-04-25T02:25:19.268 DEBUG:teuthology.orchestra.run:Running [10.214.138.107]: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
2014-04-25T02:25:19.271 DEBUG:teuthology.orchestra.run:Running [10.214.138.96]: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
2014-04-25T02:25:19.274 DEBUG:teuthology.orchestra.run:Running [10.214.138.94]: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
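The syslog check that flagged this run is the egrep / grep -v / head -n 1 pipeline visible in the log above: keep lines mentioning BUG, INFO or DEADLOCK as whole words, drop known-benign messages, and report the first remaining line. A rough Python sketch of that filtering logic follows, for reading along only. It is not teuthology's code; the names MATCH, EXCLUDES and first_syslog_error are hypothetical, and the exclusion patterns are my reading of the grep arguments, with the parentheses and asterisks treated as literal characters.

```python
import re
from typing import Iterable, Optional

# Lines of interest: BUG, INFO or DEADLOCK as whole words (the egrep pattern).
MATCH = re.compile(r"\bBUG\b|\bINFO\b|\bDEADLOCK\b")

# Known-benign messages, taken from the chain of `grep -v` filters.
# Assumption: '(', ')' and '*' in the original arguments are meant literally.
EXCLUDES = [
    re.compile(pattern)
    for pattern in (
        r"task .* blocked for more than .* seconds",
        r"lockdep is turned off",
        r"trying to register non-static key",
        r"DEBUG: fsize",
        r"CRON",
        r"BUG: bad unlock balance detected",
        r"inconsistent lock state",
        r"\*\*\* DEADLOCK \*\*\*",
        r"INFO: possible irq lock inversion dependency detected",
        r"INFO: NMI handler \(perf_event_nmi_handler\) took too long to run",
        r"INFO: recovery required on readonly",
    )
]


def first_syslog_error(lines: Iterable[str]) -> Optional[str]:
    """Return the first suspicious line (like `head -n 1`), or None."""
    for line in lines:
        if not MATCH.search(line):
            continue
        if any(exclude.search(line) for exclude in EXCLUDES):
            continue
        return line
    return None
```

With this filter, the "soft lockup" line from kern.log is reported because it contains the word BUG and matches none of the exclusions, whereas a "DEBUG: ..." line is never selected, since BUG is not a whole word there.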
archive_path: /var/lib/teuthworker/archive/teuthology-2014-04-24_20:35:03-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/213298
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml 6-next-mon/monb.yaml 7-workload/radosbench.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_12.04.yaml}
email: null
job_id: '213298'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-04-24_20:35:03-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
  s3tests:
    branch: master
  workunit:
    sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
targets:
  ubuntu@vpm038.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDDFyMM1axZoScxRE3sY8ImoFGCHd+vZG2KaLx/9xlG8JmluSW6yvj6kz2Vg7hPbFppKS+NsIGsDDvtJYDmHbEKilQlzAcAbvBeuPna+VmyEfaaknnz7LzN59+G06itecO3Ix1PfW9c5FbLQiaZ1go1stfkTwpS3gzu1PJG3wPlF0RKyh0Y12FiybXncvciD/Rbhjq5axyZMGbe1vAIuL71/YcpBWdRPSL9Nsom24cIPufNhosOcQKEw7mYfby0/qYgExA2h7wDS90WEQgr7Tx9j0icYF/tqPzzIAoTrUlULJgk0fUycuj6ckPEWHwZJ01P0HDCdaAHrXY9kpLlaxRB
  ubuntu@vpm039.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCw8YQxiNxcHF8Zda6Lbvyu/VixWj2Fh2vDzx8ftawS+h0umt3giu4fwk8XBWybmw65jT2+lEAVDo8vqB72v+94X7Yf05kK0c+BzCtESp7pFhCy0L+kx5eBMQPBy9hwdEJxxgJkzgg6omVOteSxP0QN/E+Q7rHdPKKbai77EuJL8elBDPDOabBpGvJ/WHv5wWRMKxlWbrDWX1Ywr4noJ3/btzBbSnlIcQgOXJYuYIV/3tBSnej5Qn332ZANJT0yFf+5e07HcIT//P6RBlA2+pQmiWBdiihrtmJ9oT6f/DsXvaNZ/RHr0onBVo5IU3w/WT7Iln3vC7NtEHugiLaAR+FZ
  ubuntu@vpm040.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC8lY1htTUcngc7uQ8fqkA4Co8iBWMNytgkKCaENRLmA71VlLrZdqlraeWyOmBmyQ9bkdO1omCvqzf5gshqNrJkEam2Ve7tLFTkvcuZbSyy/TRTdSLiQ3cN8wKRxgQqbl0QMm0VD5fTHkoRWAIa7MwtdtZRH6tb9VPd7K2MGTME/EzlMXv9qutCk1zaVk+p98HKCXpp6wqgypTTf45s/xtUAPCZxiDhQpRkk4F7qGdkn6ws42w4CJ1m9WQAK01k1eesfUzZItooU9f4WVb77YTzHpcLAgVtLm4U+l70eVxvl42ClUhGyN9D3mCQgPs1EqzdM+qtbJU+Nq+dNE7ruVKr
tasks:
- internal.lock_machines:
  - 3
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.30566
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml 6-next-mon/monb.yaml 7-workload/radosbench.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_12.04.yaml}
duration: 20001.275528907776
failure_reason: '''/home/ubuntu/cephtest/archive/syslog/kern.log:2014-04-25T04:36:10.915422+00:00 vpm038 kernel: [ 2080.060036] BUG: soft lockup - CPU#0 stuck for 24s! [rwhod:1265] '' in syslog'
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
Updated by Sage Weil over 9 years ago
- Status changed from New to Can't reproduce