Bug #7673 (Closed)
"reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic-plana suite
% Done: 0%
Source: other
Severity: 2 - major
Description
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic-plana/123828/
2014-03-09T12:20:28.624 DEBUG:teuthology.orchestra.run:Running [10.214.132.27]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
2014-03-09T12:20:28.880 INFO:teuthology.task.thrashosds.ceph_manager:{u'election_epoch': 4, u'quorum': [0, 1, 2], u'mdsmap': {u'max': 1, u'epoch': 5, u'by_rank': [{u'status': u'up:active', u'name': u'a', u'rank': 0}], u'up': 1, u'in': 1}, u'monmap': {u'epoch': 1, u'mons': [{u'name': u'b', u'rank': 0, u'addr': u'10.214.131.3:6789/0'}, {u'name': u'a', u'rank': 1, u'addr': u'10.214.132.27:6789/0'}, {u'name': u'c', u'rank': 2, u'addr': u'10.214.132.27:6790/0'}], u'modified': u'2014-03-09 11:19:40.635841', u'fsid': u'257857a3-1b65-4492-bfe3-35d5a54c5acd', u'created': u'2014-03-09 11:19:40.635841'}, u'health': {u'detail': [], u'timechecks': {u'round_status': u'finished', u'epoch': 4, u'round': 26, u'mons': [{u'latency': u'0.000000', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'b'}, {u'latency': u'0.009557', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'a'}, {u'latency': u'0.009445', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'c'}]}, u'health': {u'health_services': [{u'mons': [{u'last_updated': u'2014-03-09 12:20:05.993151', u'name': u'b', u'avail_percent': 92, u'kb_total': 472345880, u'kb_avail': 438195344, u'health': u'HEALTH_OK', u'kb_used': 10133640, u'store_stats': {u'bytes_total': 3999988, u'bytes_log': 983040, u'last_updated': u'0.000000', u'bytes_misc': 65552, u'bytes_sst': 2951396}}, {u'last_updated': u'2014-03-09 12:20:06.124380', u'name': u'a', u'avail_percent': 93, u'kb_total': 472345880, u'kb_avail': 441166052, u'health': u'HEALTH_OK', u'kb_used': 7162932, u'store_stats': {u'bytes_total': 5048606, u'bytes_log': 2031616, u'last_updated': u'0.000000', u'bytes_misc': 65552, u'bytes_sst': 2951438}}, {u'last_updated': u'2014-03-09 12:20:06.124467', u'name': u'c', u'avail_percent': 93, u'kb_total': 472345880, u'kb_avail': 441166052, u'health': u'HEALTH_OK', u'kb_used': 7162932, u'store_stats': {u'bytes_total': 5048606, u'bytes_log': 2031616, u'last_updated': u'0.000000', u'bytes_misc': 65552, u'bytes_sst': 2951438}}]}]}, u'overall_status': u'HEALTH_WARN', u'summary': [{u'severity': u'HEALTH_WARN', u'summary': u'1 pgs recovering'}, {u'severity': u'HEALTH_WARN', u'summary': u'1 pgs stuck unclean'}, {u'severity': u'HEALTH_WARN', u'summary': u'10 requests are blocked > 32 sec'}, {u'severity': u'HEALTH_WARN', u'summary': u'recovery 2691/39593 objects degraded (6.797%)'}, {u'severity': u'HEALTH_WARN', u'summary': u'pool data pg_num 44 > pgp_num 34'}, {u'severity': u'HEALTH_WARN', u'summary': u'pool metadata pg_num 54 > pgp_num 24'}, {u'severity': u'HEALTH_WARN', u'summary': u'pool rbd pg_num 34 > pgp_num 24'}, {u'severity': u'HEALTH_WARN', u'summary': u'pool unique_pool_0 has too few pgs'}]}, u'pgmap': {u'bytes_total': 1499591012352, u'recovering_bytes_per_sec': 9225307, u'degraded_objects': 2691, u'num_pgs': 143, u'recovering_keys_per_sec': 0, u'data_bytes': 55218016448, u'degraded_total': 39593, u'bytes_used': 78067666944, u'recovering_objects_per_sec': 2, u'version': 918, u'pgs_by_state': [{u'count': 142, u'state_name': u'active+clean'}, {u'count': 1, u'state_name': u'active+recovering'}], u'degraded_ratio': u'6.797', u'bytes_avail': 1421523345408}, u'quorum_names': [u'b', u'a', u'c'], u'osdmap': {u'osdmap': {u'full': u'false', u'nearfull': u'false', u'num_osds': 6, u'num_up_osds': 6, u'epoch': 61, u'num_in_osds': 3}}, u'fsid': u'257857a3-1b65-4492-bfe3-35d5a54c5acd'}
2014-03-09T12:20:28.881 DEBUG:teuthology.orchestra.run:Running [10.214.132.27]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format=json'
2014-03-09T12:20:29.119 INFO:teuthology.orchestra.run.err:[10.214.132.27]: dumped all in format json
2014-03-09T12:20:29.350 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.132.27]: 2014-03-09 12:20:29.350115 min lat: 0.11649 max lat: 2003.8 avg lat: 3.265
2014-03-09T12:20:29.351 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.132.27]: sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
2014-03-09T12:20:29.351 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.132.27]: 3600 4 11935 11931 13.2547 0 - 3.265
2014-03-09T12:20:29.533 ERROR:teuthology.run_tasks:Manager failed: radosbench
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/radosbench.py", line 80, in task
    run.wait(radosbench.itervalues(), timeout=timeout)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 349, in wait
    check_time()
  File "/home/teuthworker/teuthology-firefly/teuthology/contextutil.py", line 125, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: reached maximum tries (600) after waiting for 3600 seconds
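The MaxWhileTries at the bottom of the traceback comes from teuthology's bounded polling pattern: run.wait checks the remote radosbench processes on a fixed interval and gives up after a set number of tries. A minimal sketch of that pattern, not teuthology's actual code (the function name and the 6-second default interval are assumptions, chosen so that 600 tries * 6 s matches the 3600 seconds in the error message):

```python
import time


class MaxWhileTries(Exception):
    """Raised when a polling loop exhausts its allotted tries."""


def wait_until(done, tries=600, interval=6.0):
    """Poll done() up to `tries` times, `interval` seconds apart.

    Returns the number of polls it took for done() to become true;
    raises MaxWhileTries if the budget is exhausted, mirroring the
    "reached maximum tries (600) after waiting for 3600 seconds"
    failure in the log above.
    """
    for attempt in range(tries):
        if done():
            return attempt + 1
        time.sleep(interval)
    raise MaxWhileTries(
        'reached maximum tries (%d) after waiting for %d seconds'
        % (tries, int(tries * interval)))
```

With this shape, any task whose real runtime exceeds tries * interval fails even though nothing is hung, which is consistent with the diagnosis in the comments below.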
archive_path: /var/lib/teuthworker/archive/teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic-plana/123828
description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/ec-radosbench.yaml}
email: null
job_id: '123828'
kernel: &id001
  kdb: true
  sha1: f31a96afabfad92cb917fd52a421b23275cdb6da
last_in_suite: false
machine_type: plana
name: teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject internal delays: 0.002
        ms inject socket failures: 2500
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug ms: 1
        debug osd: 20
        osd sloppy crc: true
    fs: xfs
    log-whitelist:
    - slow request
    sha1: a4cbb192ab9e1b2a997e3a831e58648a30e16e59
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: a4cbb192ab9e1b2a997e3a831e58648a30e16e59
  s3tests:
    branch: master
  workunit:
    sha1: a4cbb192ab9e1b2a997e3a831e58648a30e16e59
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
targets:
  ubuntu@plana37.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC8FPsPKV1KVlb89QL2k0kNMTM3mIenC2wHxnVb9EgA7MGjC/gJFv4FoYFtTn0SadJl2hZNJ8kk7HjBsgCQG3f+LL3l7DPlqSJG8zFFXW6LCzjk0YQX/JX7X6nK33HdxzzOZVecglaQnTSWKbPDp8ofd9EQX4gN7mPb/C0/FUtT0Hjrb97QBYqDDVWEMBo7BCT4YdsisPBkCFpQ1Khl2K89e9uhfw4wvVvqveLnU3NEAULbEhMeLg0LMsSlmK2gfiyJbyxweApXo4VqfuNd6DnUqUzilAM0VJL3KgJqJGW46IYC76VPMSHPKD66kgrYiyBm12iLEy70kODNVaNe3wnX
  ubuntu@plana51.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDgNXP3p/sw2sy34ARorzUh9QvDPit80IHDKQ71BGtytSAQL5ijlSpjJRjGT0HB9xHvR6v8115ikzmot1HgVeJSnC07UQKWp3CfVIUHZOtbMgw0exON14083tSlvn2djTA/bphuwag5u9y+0XkufOXBNrY4aBlQS9vNXnsW0PQwlgJ6YqK3W2e1qirpvfMamLugAFLdycCXXmjriXFuAxvHqbFrJYVEvNbsK8Bt+cRE5l0gcBin+5wJmz4iKagwYVAVqW7i1lZM1F0QdffYwuUrQ110/iz9AcnNvu6dSU+3g7agjBKvWCA+DVEn0RWbaRJ7M+FCl2PmLULnjvK44Qsp
tasks:
- internal.lock_machines:
  - 2
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 3
    chance_pgpnum_fix: 1
    timeout: 1200
- radosbench:
    clients:
    - client.0
    ec_pool: true
    time: 1800
    unique_pool: true
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.11186
description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/ec-radosbench.yaml}
duration: 7009.944432973862
failure_reason: reached maximum tries (600) after waiting for 3600 seconds
flavor: basic
mon.a-kernel-sha1: f31a96afabfad92cb917fd52a421b23275cdb6da
mon.b-kernel-sha1: f31a96afabfad92cb917fd52a421b23275cdb6da
owner: scheduled_teuthology@teuthology
success: false
Updated by Yuri Weinstein about 10 years ago
- Severity changed from 3 - minor to 2 - major
There seem to be several of these, so 'major'.
Updated by Samuel Just about 10 years ago
- Assignee set to Samuel Just
- Priority changed from Normal to Urgent
Updated by Samuel Just about 10 years ago
The non-EC ones are probably just an inadequate timeout; the cleanup is likely to take longer than the writeout. The EC ones are probably bug #7649.
Updated by Samuel Just about 10 years ago
I've turned up the timeout in the tests.
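The fix amounts to giving the wait loop headroom beyond the benchmark's write phase, since object cleanup after the timed run can take as long as the writeout itself. A hedged sketch of sizing such a timeout (the helper name, the 2x cleanup factor, and the fixed slack are illustrative assumptions, not the values used in the actual change):

```python
def radosbench_timeout(bench_time, cleanup_factor=2.0, slack=300):
    """Estimate a wait timeout (seconds) for a radosbench job.

    The write phase runs for `bench_time` seconds, and cleanup of the
    benchmark objects can take comparably long, or longer while OSDs
    are being thrashed, so budget a multiple of the runtime plus some
    fixed slack. The factor and slack here are illustrative guesses.
    """
    return int(bench_time * (1 + cleanup_factor)) + slack
```

For the job above (time: 1800), this yields 5700 seconds, comfortably above the 3600-second budget that the run exhausted.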