Bug #6118
failed to recover before timeout expired on radosbench, rados api tests
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
ubuntu@teuthology:/a/teuthology-2013-08-25_09:23:30-rados-master-testing-basic-plana/4753
History
#1 Updated by Sage Weil over 10 years ago
- Subject changed from "failed to recover before timeout expired on radosbench" to "failed to recover before timeout expired on radosbench, rados api tests"
4 objects degraded, 1 pg stuck in recovery_wait
{u'election_epoch': 6, u'quorum': [0, 1, 2], u'mdsmap': {u'max': 1, u'epoch': 5, u'by_rank': [{u'status': u'up:active', u'name': u'a', u'rank': 0}], u'up': 1, u'in': 1}, u'monmap': {u'epoch': 1, u'mons': [{u'name': u'b', u'rank': 0, u'addr': u'10.214.131.10:6789/0'}, {u'name': u'a', u'rank': 1, u'addr': u'10.214.132.34:6789/0'}, {u'name': u'c', u'rank': 2, u'addr': u'10.214.132.34:6790/0'}], u'modified': u'2013-09-03 02:37:54.713075', u'fsid': u'1934cbfb-2bc2-4a63-a87e-edf7f443e025', u'created': u'2013-09-03 02:37:54.713075'}, u'health': {u'detail': [], u'timechecks': {u'round_status': u'finished', u'epoch': 6, u'round': 16, u'mons': [{u'latency': u'0.000000', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'b'}, {u'latency': u'0.045938', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'a'}, {u'latency': u'0.125255', u'skew': u'0.000000', u'health': u'HEALTH_OK', u'name': u'c'}]}, u'health': {u'health_services': [{u'mons': [{u'last_updated': u'2013-09-03 03:14:09.386691', u'name': u'b', u'avail_percent': 91, u'kb_total': 472345880, u'kb_avail': 430895876, u'health': u'HEALTH_OK', u'kb_used': 17433108}, {u'last_updated': u'2013-09-03 03:14:10.489290', u'name': u'a', u'avail_percent': 92, u'kb_total': 472345880, u'kb_avail': 437662924, u'health': u'HEALTH_OK', u'kb_used': 10666060}, {u'last_updated': u'2013-09-03 03:14:09.490316', u'name': u'c', u'avail_percent': 92, u'kb_total': 472345880, u'kb_avail': 437662924, u'health': u'HEALTH_OK', u'kb_used': 10666060}]}]}, u'overall_status': u'HEALTH_WARN', u'summary': [{u'severity': u'HEALTH_WARN', u'summary': u'1 pgs recovery_wait'}]}, u'pgmap': {u'bytes_total': 3000647172096, u'degraded_objects': 4, u'num_pgs': 212, u'data_bytes': 43201, u'degraded_total': 402, u'bytes_used': 684716032, u'version': 755, u'pgs_by_state': [{u'count': 211, u'state_name': u'active+clean'}, {u'count': 1, u'state_name': u'active+recovery_wait'}], u'degrated_ratio': u'0.995', u'bytes_avail': 2993456123904}, u'quorum_names': [u'b', u'a', u'c'], u'osdmap': {u'osdmap': {u'full': u'false', u'nearfull': u'false', u'num_osds': 6, u'num_up_osds': 6, u'epoch': 523, u'num_in_osds': u'6'}}, u'fsid': u'1934cbfb-2bc2-4a63-a87e-edf7f443e025'}
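The relevant part of the dump above is `pgmap.pgs_by_state`: 211 PGs are `active+clean` and one is `active+recovery_wait`. As a minimal sketch (the helper name `unclean_pg_states` is hypothetical and is not taken from teuthology), a check of that field could look like this, using a trimmed copy of the values reported above:

```python
def unclean_pg_states(status):
    """Return {state_name: count} for every PG state other than active+clean.

    Hypothetical helper for illustration; `status` is a dict shaped like
    the status dump in this report.
    """
    return {
        entry["state_name"]: entry["count"]
        for entry in status["pgmap"]["pgs_by_state"]
        if entry["state_name"] != "active+clean"
    }


# Trimmed copy of the pgmap section of the dump in this report.
status = {
    "pgmap": {
        "num_pgs": 212,
        "degraded_objects": 4,
        "pgs_by_state": [
            {"count": 211, "state_name": "active+clean"},
            {"count": 1, "state_name": "active+recovery_wait"},
        ],
    }
}

stuck = unclean_pg_states(status)
print(stuck)  # {'active+recovery_wait': 1}
```

An empty result would mean every PG is `active+clean`; here the single `recovery_wait` PG is what keeps the cluster at HEALTH_WARN in this snapshot.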
ubuntu@teuthology:/a/teuthology-2013-09-02_20:00:14-rados-dumpling-testing-basic-plana/18001$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 263cbbcaf605e359a46e30889595d82629f82080
machine_type: plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: dumpling
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: a708c8ab52e5b1476405a1f817c23b8845fbaab3
    valgrind:
      mds:
      - --tool=memcheck
      mon:
      - --tool=memcheck
      - --leak-check=full
      - --show-reachable=yes
      osd:
      - --tool=memcheck
  ceph-deploy:
    branch:
      dev: dumpling
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      flavor: notcmalloc
      sha1: a708c8ab52e5b1476405a1f817c23b8845fbaab3
  s3tests:
    branch: master
  workunit:
    sha1: a708c8ab52e5b1476405a1f817c23b8845fbaab3
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- workunit:
    clients:
      client.0:
      - rados/test.sh
teuthology_branch: dumpling
#3 Updated by Sage Weil over 10 years ago
another one with full logs: ubuntu@teuthology:/a/teuthology-2013-09-07_13:39:47-rados-dumpling-testing-basic-plana/25183
#4 Updated by Ian Colle over 10 years ago
- Assignee set to Samuel Just
#5 Updated by Samuel Just over 10 years ago
Seems actually to have been a hung ceph status. ceph.log seems to indicate that the pgs went clean.
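The distinction drawn here (the status query hanging versus the PGs genuinely not recovering) can be made visible in a wait loop. The following is only a sketch under stated assumptions, not the teuthology implementation: `StatusHung`, `is_clean`, `wait_for_clean`, and the injected `fetch_status` callable are all hypothetical names, and the fetcher is assumed to raise `StatusHung` when the status query itself does not return in time.

```python
import time


class StatusHung(Exception):
    """Hypothetical: raised by a fetcher when the status query did not return."""


def is_clean(status):
    """True when every PG state reported in pgs_by_state is active+clean."""
    states = status["pgmap"]["pgs_by_state"]
    return all(s["state_name"] == "active+clean" for s in states)


def wait_for_clean(fetch_status, timeout, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until clean or until `timeout` seconds pass.

    Returns "clean", "not recovered", or "status query hung", so that a
    hung status query is not reported as a recovery failure.
    """
    deadline = clock() + timeout
    while True:
        hung = False
        try:
            if is_clean(fetch_status()):
                return "clean"
        except StatusHung:
            hung = True  # the query failed; the cluster state is unknown
        if clock() >= deadline:
            return "status query hung" if hung else "not recovered"
        sleep(interval)


# Demo with a fake fetcher: one recovery_wait reading, then clean.
readings = iter([
    {"pgmap": {"pgs_by_state": [{"count": 1, "state_name": "active+recovery_wait"}]}},
    {"pgmap": {"pgs_by_state": [{"count": 212, "state_name": "active+clean"}]}},
])
print(wait_for_clean(lambda: next(readings), timeout=60, interval=0))  # clean
```

Injecting the fetcher, clock, and sleep function keeps the loop testable without a cluster; how a real harness detects a hung query is left open here.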
#6 Updated by Samuel Just over 10 years ago
Much of the code has been replaced as part of 5992, so it might be worth closing this for now.
#7 Updated by Samuel Just over 10 years ago
- Status changed from New to Can't reproduce