Bug #13359
closed
giant stale requests
Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Regression: No
Severity: 3 - minor
ceph-qa-suite: upgrade/hammer
Description
upgrade:hammer/older/{0-cluster/start.yaml 1-install/latest_giant_release.yaml 2-workload/testrados.yaml 3-upgrade-sequence/upgrade-osd-mon-mds.yaml 4-final/{monthrash.yaml osdthrash.yaml testrados.yaml} distros/centos_6.5.yaml}
2015-10-04T20:51:17.163 INFO:tasks.rados.rados.0.vpm159.stdout:1899: expect (ObjNum 694 snap 234 seq_num 694)
2015-10-04T20:51:19.456 INFO:tasks.ceph.osd.0.vpm186.stdout:starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2015-10-04T20:51:20.148 INFO:tasks.ceph.osd.0.vpm186.stderr:2015-10-04 23:51:20.132469 7f77517ae800 -1 filestore(/var/lib/ceph/osd/ceph-0) FileStore::mount : stale version stamp detected: 3. Proceeding, do_update is set, performing disk format upgrade.
2015-10-04T20:51:20.777 DEBUG:teuthology.misc:6 of 6 OSDs are up
2015-10-04T20:51:20.778 INFO:teuthology.orchestra.run.vpm186:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-10-04T20:51:20.807 INFO:tasks.ceph.osd.0.vpm186.stderr:2015-10-04 23:51:20.791634 7f77517ae800 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2015-10-04T20:51:21.380 INFO:tasks.ceph.osd.0.vpm186.stderr:2015-10-04 23:51:21.365391 7f77517ae800 -1 osd.0 518 PGs are upgrading
2015-10-04T20:51:22.215 INFO:tasks.ceph.osd.0.vpm186.stderr:2015-10-04 23:51:22.198976 7f77517ae800 -1 osd.0 518 log_to_monitors {default=true}
2015-10-04T20:51:22.291 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 7 requests are blocked > 32 sec
2015-10-04T20:51:25.787 INFO:tasks.rados.rados.0.vpm159.stdout:1893: finishing write tid 1 to vpm1599889-37
2015-10-04T20:51:25.787 INFO:tasks.rados.rados.0.vpm159.stdout:1893: finishing write tid 2 to vpm1599889-37
...
2015-10-04T20:52:24.722 INFO:tasks.rados.rados.0.vpm159.stdout:1996: finishing write tid 7 to vpm1599889-32
2015-10-04T20:52:24.722 INFO:tasks.rados.rados.0.vpm159.stdout:update_object_version oid 32 v 1118 (ObjNum 767 snap 260 seq_num 767) dirty exists
2015-10-04T20:52:24.723 INFO:tasks.rados.rados.0.vpm159.stdout:1996: done (1 left)
2015-10-04T20:52:24.723 INFO:tasks.rados.rados.0.vpm159.stdout:1999: done (0 left)
2015-10-04T20:52:24.836 INFO:tasks.rados.rados.0.vpm159.stderr:0 errors.
2015-10-04T20:52:24.836 INFO:tasks.rados.rados.0.vpm159.stderr:
2015-10-04T20:52:24.953 INFO:tasks.ceph.ceph_manager:removing pool_name unique_pool_0
2015-10-04T20:52:24.953 INFO:teuthology.orchestra.run.vpm186:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados rmpool unique_pool_0 unique_pool_0 --yes-i-really-really-mean-it'
2015-10-04T20:52:26.013 INFO:teuthology.orchestra.run.vpm186.stdout:successfully deleted pool unique_pool_0
2015-10-04T20:52:26.015 DEBUG:teuthology.parallel:result is None
2015-10-04T20:52:30.825 INFO:teuthology.orchestra.run.vpm186:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-10-04T20:52:31.178 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 6 requests are blocked > 32 sec
2015-10-04T20:52:38.179 INFO:teuthology.orchestra.run.vpm186:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-10-04T20:52:38.591 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 6 requests are blocked > 32 sec
...
2015-10-04T21:11:24.688 INFO:teuthology.orchestra.run.vpm186:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-10-04T21:11:25.042 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 6 requests are blocked > 32 sec
2015-10-04T21:11:26.042 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/sequential.py", line 48, in task
    mgr.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_hammer/tasks/ceph.py", line 1066, in restart
    healthy(ctx=ctx, config=None)
  File "/var/lib/teuthworker/src/ceph-qa-suite_hammer/tasks/ceph.py", line 972, in healthy
    remote=mon0_remote,
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 876, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds
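The final error comes from a bounded polling loop: teuthology repeatedly runs `ceph health` and gives up once a maximum number of tries is exhausted, so a cluster that never clears the `HEALTH_WARN ... requests are blocked` state eventually raises `MaxWhileTries`. A minimal sketch of that pattern (hypothetical names and signature, not teuthology's actual implementation) might look like:

```python
import time


class MaxWhileTries(Exception):
    """Raised when the polled condition never becomes true."""


def wait_until(check, tries=150, sleep=6):
    """Poll check() until it returns True; give up after `tries` attempts.

    With tries=150 and sleep=6 the loop abandons after roughly 900
    seconds, matching the timeout reported in the log above.
    Returns the (1-based) attempt on which the check succeeded.
    """
    for attempt in range(tries):
        if check():
            return attempt + 1
        time.sleep(sleep)
    raise MaxWhileTries(
        "reached maximum tries (%d) after waiting for %d seconds"
        % (tries, tries * sleep))
```

In this failure the health check never succeeded: every poll over the 15-minute window returned the same `HEALTH_WARN 6 requests are blocked > 32 sec`, so the loop exhausted its budget and the traceback above was raised.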
Updated by Loïc Dachary over 8 years ago
- Subject changed from wait_until_healthy: to wait_until_healthy: HEALTH_WARN 6 requests are blocked > 32 sec
- Description updated (diff)
Updated by Samuel Just over 8 years ago
- Subject changed from wait_until_healthy: HEALTH_WARN 6 requests are blocked > 32 sec to wait_until_healthy: timed out
Updated by Samuel Just over 8 years ago
I think /a/loic-2015-10-05_01:41:20-upgrade:hammer-hammer-backports---basic-vps/1088508/remote is the actual path -- the one above seems to be another run.
Updated by Samuel Just over 8 years ago
- Subject changed from wait_until_healthy: timed out to giant stale requests
- Assignee deleted (Samuel Just)
- Priority changed from Urgent to High
I think this is a bug in giant: some old requests appear to be stuck in the op tracker even though I believe they actually completed. Anyway, not high priority.