Bug #5077
Closed
nightlies: single node cluster hung waiting for ceph_health to be OK
0%
Description
logs: ubuntu@teuthology:/a/teuthology-2013-05-15_01:00:04-rados-master-testing-basic/13506
2013-05-15T02:04:01.884 INFO:teuthology.task.ceph.mds.a:Started
2013-05-15T02:04:01.884 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph mds set_max_mds 1'
2013-05-15T02:04:01.918 INFO:teuthology.orchestra.run.err:2013-05-15 02:04:23.864714 7f4afec00700 0 -- :/11450 >> 10.214.131.7:6790/0 pipe(0x23684d0 sd=7 :0 s=1 pgs=0 cs=0 l=1).fault
2013-05-15T02:04:01.957 INFO:teuthology.task.ceph.osd.0.out:starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2013-05-15T02:04:01.961 INFO:teuthology.task.ceph.osd.1.out:starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2013-05-15T02:04:01.970 INFO:teuthology.task.ceph.osd.2.out:starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
2013-05-15T02:04:01.977 INFO:teuthology.task.ceph.mds.a.out:starting mds.a at :/0
2013-05-15T02:04:02.296 INFO:teuthology.task.ceph.osd.0.err:2013-05-15 02:04:24.242986 7f4f9c34d780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-05-15T02:04:02.322 INFO:teuthology.task.ceph.osd.2.err:2013-05-15 02:04:24.269058 7ffa9b864780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-05-15T02:04:02.324 INFO:teuthology.task.ceph.mon.1.out:starting mon.1 rank 1 at 10.214.131.7:6790/0 mon_data /var/lib/ceph/mon/ceph-1 fsid 235d3123-2fe8-4f98-b3ac-4e510c530320
2013-05-15T02:04:02.336 INFO:teuthology.task.ceph.osd.1.err:2013-05-15 02:04:24.283103 7f80d13a4780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-05-15T02:04:02.350 INFO:teuthology.task.ceph.mon.0.out:starting mon.0 rank 0 at 10.214.131.7:6789/0 mon_data /var/lib/ceph/mon/ceph-0 fsid 235d3123-2fe8-4f98-b3ac-4e510c530320
2013-05-15T02:04:02.350 INFO:teuthology.task.ceph.mon.2.out:starting mon.2 rank 2 at 10.214.131.7:6791/0 mon_data /var/lib/ceph/mon/ceph-2 fsid 235d3123-2fe8-4f98-b3ac-4e510c530320
2013-05-15T02:04:07.582 INFO:teuthology.orchestra.run.out:max_mds = 1
2013-05-15T02:04:07.584 INFO:teuthology.task.ceph:Waiting until ceph is healthy...
2013-05-15T02:04:07.585 DEBUG:teuthology.misc:with jobid basedir: 13506
2013-05-15T02:04:07.585 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph --concise osd dump --format=json'
2013-05-15T02:04:07.988 DEBUG:teuthology.misc:2 of 3 OSDs are up
2013-05-15T02:04:08.988 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph --concise osd dump --format=json'
2013-05-15T02:04:09.021 DEBUG:teuthology.misc:3 of 3 OSDs are up
2013-05-15T02:04:09.021 DEBUG:teuthology.misc:with jobid basedir: 13506
2013-05-15T02:04:09.021 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:09.114 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 36 pgs stuck inactive; 36 pgs stuck unclean
2013-05-15T02:04:10.115 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:10.144 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 27 pgs peering; 15 pgs stuck inactive; 15 pgs stuck unclean
2013-05-15T02:04:11.144 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:11.173 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 27 pgs peering; 15 pgs stuck inactive; 15 pgs stuck unclean
2013-05-15T02:04:12.174 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:12.197 INFO:teuthology.misc.health.err:2013-05-15 02:04:34.144414 7fb45cbcf700 0 -- :/11838 >> 10.214.131.7:6790/0 pipe(0x13c04a0 sd=7 :0 s=1 pgs=0 cs=0 l=1).fault
2013-05-15T02:04:12.203 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 27 pgs peering; 15 pgs stuck inactive; 15 pgs stuck unclean
2013-05-15T02:04:13.204 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:13.236 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 27 pgs peering; 15 pgs stuck inactive; 15 pgs stuck unclean
2013-05-15T02:04:14.236 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:14.264 INFO:teuthology.misc.health.err:2013-05-15 02:04:36.210838 7f6d7ab52700 0 monclient: hunting for new mon
2013-05-15T02:04:14.268 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:15.269 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:15.299 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:16.299 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:16.329 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:17.329 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:17.359 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:18.358 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:18.387 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:19.388 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:19.421 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:20.421 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:20.452 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:21.453 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:21.482 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:22.483 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:22.512 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:23.513 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:23.543 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:24.543 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:24.572 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:25.573 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:25.603 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:26.604 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:26.633 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:27.634 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:27.663 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:28.663 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:28.692 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:29.692 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:29.721 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:30.721 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:30.749 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:31.749 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:31.778 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:32.778 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:32.807 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
2013-05-15T02:04:33.807 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:04:33.836 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 3 o/s, 1554B/s
... ... ..
2013-05-15T02:10:35.191 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:35.220 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:36.220 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:36.251 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:37.251 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:37.281 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:38.281 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:38.311 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:39.311 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:39.341 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:40.340 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:40.370 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:41.370 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:41.399 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:42.399 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:42.428 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:43.429 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:43.459 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:44.458 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:44.488 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
2013-05-15T02:10:45.488 DEBUG:teuthology.orchestra.run:Running [10.214.131.7]: '/home/ubuntu/cephtest/13506/enable-coredump ceph-coverage /home/ubuntu/cephtest/13506/archive/coverage ceph health --concise'
2013-05-15T02:10:45.517 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN recovery recovering 0 o/s, 64B/s
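The log above polls `ceph health --concise` once per second and mostly repeats the same status, which makes the state transitions hard to spot. A minimal sketch of a helper that collapses consecutive identical `Ceph health:` entries into runs follows. It is not part of teuthology; the name `summarize_health` and the regular expression are assumptions made for illustration, based only on the entry format visible in this report.

```python
import re
from itertools import groupby

# Hypothetical pattern, derived from the entries pasted in this report:
#   <timestamp> DEBUG:teuthology.misc:Ceph health: <status text>
# The status ends where the next ISO timestamp starts, or at end of input,
# so this works whether entries are one per line or run together.
HEALTH_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) "
    r"DEBUG:teuthology\.misc:Ceph health: "
    r"(.*?)(?=\s+\d{4}-\d{2}-\d{2}T|\s*\Z)"
)


def summarize_health(log_text):
    """Return a list of (first_ts, last_ts, count, status) runs.

    Consecutive health entries with the same status text are merged
    into a single run, so each tuple marks one observed health state.
    """
    entries = [(m.group(1), m.group(2).strip())
               for m in HEALTH_RE.finditer(log_text)]
    runs = []
    for status, group in groupby(entries, key=lambda e: e[1]):
        group = list(group)
        runs.append((group[0][0], group[-1][0], len(group), status))
    return runs


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python summarize_health.py teuthology.log
    with open(sys.argv[1]) as f:
        for first, last, count, status in summarize_health(f.read()):
            print(f"{first} .. {last}  x{count}  {status}")
```

Run against the excerpt in this report, a summary of this kind should make it easier to see that the cluster moved from stuck and peering placement groups to a long `recovery recovering` state without ever reaching HEALTH_OK.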
Updated by Tamilarasi muthamizhan almost 11 years ago
ubuntu@teuthology:/a/teuthology-2013-05-15_01:00:04-rados-master-testing-basic/13506$ cat config.yaml
kernel: &id001
  kdb: true
  sha1: 4ebfb52255fd25a987154e0e8847a4155532f760
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 2a441aa28abdffec5dd5f9bdbc219ac41fbc6d89
  s3tests:
    branch: master
  workunit:
    sha1: 2a441aa28abdffec5dd5f9bdbc219ac41fbc6d89
roles:
- - mon.0
  - mon.1
  - mon.2
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - client.0
targets:
  ubuntu@plana33.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZv0weNolvyE+m05bRiGKWyjAbZs/nYL1/5DAdr8QYbmjTSrEjycJ2iZCHVPXHHPTkFN4vWtWQzfEE2Ra8T/Ti0w65C+H6HNwckWvLk0RRYWSFNLjfvTR+0OeCNBbpTcCBaeGpIdJhMcM9k6eek5GGm1djc7ZgG11jepzVe6HiKKbh2roc/EZGuADs8sY/bBf0cRbhsPc/1EJ2sxd8CUmnWXrGfCj5izW/1bAdyBQcAZPpMp5OJkuAT2OznMVYWxkg54JM8TlKKj1T8nccEC5+c01Dbe0vAxuIPCeU2obkxr+VvQN3oJbhUXFDYv9PCNaS0LReuVBKVN9bRYn97xXB
tasks:
- internal.lock_machines:
  - 1
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - cephtool
      - mon/pool_ops.sh
Updated by Tamilarasi muthamizhan almost 11 years ago
also,
ubuntu@teuthology:/a/teuthology-2013-05-15_01:00:04-rados-master-testing-basic/13515$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 4ebfb52255fd25a987154e0e8847a4155532f760
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 500
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 2a441aa28abdffec5dd5f9bdbc219ac41fbc6d89
  s3tests:
    branch: master
  workunit:
    sha1: 2a441aa28abdffec5dd5f9bdbc219ac41fbc6d89
roles:
- - mon.a
  - mon.b
  - mon.c
  - osd.0
  - osd.1
  - mds.0
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- mon_thrash:
    revive_delay: 20
    thrash_delay: 1
- ceph-fuse: null
- workunit:
    clients:
      all:
      - mon/workloadgen.sh
    env:
      DURATION: '600'
      LOADGEN_NUM_OSDS: '5'
      VERBOSE: '1'
Updated by Tamilarasi muthamizhan almost 11 years ago
- Assignee changed from Samuel Just to Joao Eduardo Luis
- Priority changed from Normal to High
It looks like one of the monitors on the single-node cluster went down.
Updated by Sage Weil almost 11 years ago
- Status changed from Fix Under Review to Resolved