Bug #7497 (closed): timeout waiting to go clean

Added by Samuel Just about 10 years ago. Updated about 10 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2014-02-20T07:02:16.913 INFO:teuthology.task.radosbench.radosbench.0.err:[10.214.132.16]: 2014-02-20 07:02:16.911765 7f51a019f700 0 -- 10.214.132.16:0/1026039 >> 10.214.132.16:6801/23497 pipe(0x7f518c01fe10 sd=7 :37922 s=2 pgs=28 cs=1 l=1 c=0x7f518c020070).injecting socket failure
2014-02-20T07:02:20.628 INFO:teuthology.task.radosbench.radosbench.0.err:[10.214.132.16]: 2014-02-20 07:02:20.627091 7f51a019f700 0 -- 10.214.132.16:0/1026039 >> 10.214.132.16:6801/23497 pipe(0x7f518c0115e0 sd=7 :37924 s=2 pgs=30 cs=1 l=1 c=0x7f518c00b270).injecting socket failure
2014-02-20T07:02:22.749 INFO:teuthology.task.radosbench.radosbench.0.err:[10.214.132.16]: 2014-02-20 07:02:22.748616 7f51927f3700 0 -- 10.214.132.16:0/1026039 >> 10.214.132.16:6801/23497 pipe(0x7f518c01d0d0 sd=7 :37928 s=2 pgs=33 cs=1 l=1 c=0x7f518c0250a0).injecting socket failure
2014-02-20T07:02:23.585 INFO:teuthology.task.thrashosds.ceph_manager:creating pool_name unique_pool_0
2014-02-20T07:02:23.585 DEBUG:teuthology.orchestra.run:Running [10.214.132.16]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados rmpool unique_pool_0 unique_pool_0 --yes-i-really-really-mean-it'
2014-02-20T07:02:24.040 INFO:teuthology.orchestra.run.out:[10.214.132.16]: successfully deleted pool unique_pool_0
2014-02-20T07:02:24.042 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2014-02-20T07:02:24.043 INFO:teuthology.task.thrashosds:joining thrashosds
2014-02-20T07:02:24.043 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-master/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-master/teuthology/task/thrashosds.py", line 172, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-master/teuthology/task/ceph_manager.py", line 117, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
AssertionError: failed to recover before timeout expired
2014-02-20T07:02:24.101 DEBUG:teuthology.run_tasks:Unwinding manager ceph
2014-02-20T07:02:24.101 DEBUG:teuthology.orchestra.run:Running [10.214.132.16]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2014-02-20T07:02:24.338 INFO:teuthology.orchestra.run.err:[10.214.132.16]: dumped all in format json
2014-02-20T07:02:25.353 INFO:teuthology.task.ceph:Scrubbing osd osd.0

In ceph.log, looks like we did go clean:
2014-02-20 07:01:52.215490 mon.0 10.214.132.11:6789/0 885 : [INF] pgmap v776: 203 pgs: 25 active+recovery_wait, 168 active+clean, 10 active+recovering; 20536 MB data, 16959 MB used, 2311 GB / 2327 GB avail; 0 B/s wr, 9 op/s; -3248/10384 objects degraded (-31.279%); 37089 kB/s, 9 objects/s recovering
2014-02-20 07:01:56.874698 mon.0 10.214.132.11:6789/0 886 : [INF] pgmap v777: 203 pgs: 24 active+recovery_wait, 170 active+clean, 9 active+recovering; 20444 MB data, 16871 MB used, 2311 GB / 2327 GB avail; 0 B/s wr, 7 op/s; -3058/10338 objects degraded (-29.580%); 29613 kB/s, 7 objects/s recovering
2014-02-20 07:02:01.875773 mon.0 10.214.132.11:6789/0 887 : [INF] pgmap v778: 203 pgs: 20 active+recovery_wait, 176 active+clean, 7 active+recovering; 20180 MB data, 16655 MB used, 2311 GB / 2327 GB avail; 0 B/s wr, 9 op/s; -2379/10206 objects degraded (-23.310%); 16821 kB/s, 4 objects/s recovering
2014-02-20 07:02:06.877967 mon.0 10.214.132.11:6789/0 888 : [INF] pgmap v779: 203 pgs: 18 active+recovery_wait, 176 active+clean, 9 active+recovering; 20024 MB data, 16534 MB used, 2311 GB / 2327 GB avail; 0 B/s wr, 10 op/s; -2431/10128 objects degraded (-24.003%); 17609 kB/s, 4 objects/s recovering
2014-02-20 07:02:11.878641 mon.0 10.214.132.11:6789/0 889 : [INF] pgmap v780: 203 pgs: 18 active+recovery_wait, 176 active+clean, 9 active+recovering; 19896 MB data, 16476 MB used, 2311 GB / 2327 GB avail; 0 B/s wr, 7 op/s; -2486/10064 objects degraded (-24.702%); 21706 kB/s, 5 objects/s recovering
2014-02-20 07:02:16.882638 mon.0 10.214.132.11:6789/0 890 : [INF] pgmap v781: 203 pgs: 203 active+clean; 18528 MB data, 15944 MB used, 2312 GB / 2327 GB avail; 0 B/s wr, 37 op/s; 13923 kB/s, 4 objects/s recovering
2014-02-20 07:02:21.849278 mon.0 10.214.132.11:6789/0 891 : [INF] pgmap v782: 203 pgs: 203 active+clean; 15412 MB data, 12661 MB used, 2315 GB / 2327 GB avail; 0 B/s wr, 112 op/s; 2457 kB/s, 1 objects/s recovering
2014-02-20 07:02:24.007638 mon.0 10.214.132.11:6789/0 892 : [INF] osdmap e84: 6 osds: 6 up, 5 in
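
For context, the check that timed out is essentially a poll of the PG dump (the same "ceph pg dump --format json" command visible in the log above) until every PG reports active+clean, with an assertion once the deadline passes. The following is a minimal illustrative sketch of that kind of wait loop, not the actual teuthology/ceph_manager code; it assumes the ceph CLI is on PATH and that the JSON dump exposes a top-level "pg_stats" list with a "state" field per PG, as it did around this release.

#!/usr/bin/env python
# Illustrative only: a simplified wait-for-clean poll in the spirit of the
# check that raised "failed to recover before timeout expired" above.
# Assumes `ceph pg dump --format json` prints JSON with a top-level
# "pg_stats" list to stdout (the "dumped all in format json" line goes
# to stderr, so stdout is clean JSON).
import json
import subprocess
import time


def all_pgs_clean():
    out = subprocess.check_output(['ceph', 'pg', 'dump', '--format', 'json'])
    stats = json.loads(out).get('pg_stats', [])
    # Simplified: the real check is more nuanced about PG states.
    return bool(stats) and all(pg['state'] == 'active+clean' for pg in stats)


def wait_for_clean(timeout=300, interval=5):
    """Poll until every PG is active+clean or the timeout expires."""
    start = time.time()
    while not all_pgs_clean():
        assert time.time() - start < timeout, \
            'failed to recover before timeout expired'
        time.sleep(interval)


if __name__ == '__main__':
    wait_for_clean(timeout=300)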

#1

Updated by Samuel Just about 10 years ago

We might just increase the timeout for rados bench runs.
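
If that route is taken, the change amounts to passing a larger recovery timeout down to the wait-for-clean check that the thrasher joins on (the exact option name in the thrashosds task config is not shown in this log). In terms of the illustrative sketch above, it would look something like:

# Hypothetical: give rados bench runs a longer window to go clean;
# 1200 seconds is an arbitrary example value, not a recommendation.
wait_for_clean(timeout=1200)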

#2

Updated by Samuel Just about 10 years ago

ubuntu@teuthology:/a/teuthology-2014-02-19_23:00:21-rados-master-testing-basic-plana/91236

#3

Updated by Sage Weil about 10 years ago

  • Status changed from New to Need More Info
#4

Updated by Ian Colle about 10 years ago

  • Status changed from Need More Info to Can't reproduce