Bug #15519
closed
failed to recover before timeout expired
Description
http://pulpito.ceph.com/dzafman-2016-04-14_10:27:09-rados:thrash-jewel---basic-smithi/129483/
All 6 OSDs are up and in.
u'osdmap': {u'osdmap': {u'full': False, u'nearfull': False, u'num_osds': 6, u'num_up_osds': 6, u'epoch': 613, u'num_in_osds': 6, u'num_remapped_pgs': 8}}
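The "up and in" claim can be checked mechanically from the status fragment above. The following is a minimal sketch, not part of the test suite: the helper name `all_osds_up_and_in` is hypothetical, and it assumes the fragment is a Python 2 `repr` that becomes a valid dict literal once wrapped in braces.

```python
import ast

# The status fragment quoted in this report (Python 2 repr with u'' strings).
fragment = ("u'osdmap': {u'osdmap': {u'full': False, u'nearfull': False, "
            "u'num_osds': 6, u'num_up_osds': 6, u'epoch': 613, "
            "u'num_in_osds': 6, u'num_remapped_pgs': 8}}")


def all_osds_up_and_in(fragment):
    """Return True if every OSD in the quoted osdmap summary is up and in.

    Hypothetical helper: wraps the fragment in braces so it parses as a
    dict literal, then compares the three OSD counters.
    """
    status = ast.literal_eval('{' + fragment + '}')
    osdmap = status['osdmap']['osdmap']
    return osdmap['num_osds'] == osdmap['num_up_osds'] == osdmap['num_in_osds']


print(all_osds_up_and_in(fragment))
```

Note that `num_remapped_pgs` is still 8 in the same fragment, so the OSDs being up and in does not by itself mean the PGs have finished recovering.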
There is a massive teuthology.log because the thrasher background threads didn't stop after this error:
2016-04-14T12:03:42.553 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/var/lib/teuthworker/src/ceph-qa-suite_wip-8885/tasks/ceph_manager.py", line 657, in wrapper
    return func(self)
  File "/var/lib/teuthworker/src/ceph-qa-suite_wip-8885/tasks/ceph_manager.py", line 765, in do_thrash
    timeout=self.config.get('timeout')
  File "/var/lib/teuthworker/src/ceph-qa-suite_wip-8885/tasks/ceph_manager.py", line 1713, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired
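The traceback shows `wait_for_recovery` asserting once the configured timeout passes without the cluster recovering. The general pattern probably looks something like the sketch below. This is not the actual `ceph_manager.py` code: the `is_recovered` callback, `poll_interval`, and the injectable `clock` and `sleep` parameters are assumptions made for illustration.

```python
import time


def wait_for_recovery(is_recovered, timeout=None, poll_interval=3.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll is_recovered() until it returns True.

    Illustrative sketch only. If timeout (seconds) is set and elapses
    first, raise AssertionError with the message seen in this report.
    """
    start = clock()
    while not is_recovered():
        if timeout is not None:
            assert clock() - start < timeout, \
                'failed to recover before timeout expired'
        sleep(poll_interval)


# Usage: a stand-in check that reports recovery on the third poll.
polls = {'n': 0}


def recovered_on_third_poll():
    polls['n'] += 1
    return polls['n'] >= 3


wait_for_recovery(recovered_on_third_poll, timeout=10, poll_interval=0)
```

In a loop of this shape, an assertion that fires inside a background thread only ends that one call. Whether the thrasher itself stops depends on how the caller handles the exception.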
1.1a 5 0 0 10 0 14483201 129 129 active+remapped+wait_backfill 2016-04-14 18:43:24.346042 213'400 546:191 [1,4] 1 [1,2] 1 113'240 2016-04-14 18:40:48.438370 0'0 2016-04-14 18:38:20.120679
1.18 0 0 3 0 0 495203 83 83 active+recovery_wait+degraded 2016-04-14 18:43:19.074773 206'83 545:47 [2,5] 2 [2,5] 2 70'34 2016-04-14 18:39:45.187599 0'0 2016-04-14 18:38:20.120661
1.17 4 0 0 16 0 9283748 113 113 active+remapped+wait_backfill 2016-04-14 18:43:30.487653 140'217 548:122 [4,0] 4 [1,3] 1 0'0 2016-04-14 18:38:20.120650 0'0 2016-04-14 18:38:20.120650
1.9 0 0 2 0 0 1307115 80 80 active+recovery_wait+degraded 2016-04-14 18:43:24.342528 185'80 546:87 [1,4] 1 [1,4] 1 0'0 2016-04-14 18:38:20.120670 0'0 2016-04-14 18:38:20.120670
1.e 3 0 2 0 0 8132857 176 176 active+recovery_wait+degraded 2016-04-14 18:43:23.516688 183'176 534:126 [1,4] 1 [1,4] 1 0'0 2016-04-14 18:38:20.120733 0'0 2016-04-14 18:38:20.120733
1.11 4 0 2 0 0 8218136 41 41 active+recovery_wait+degraded 2016-04-14 18:43:32.878493 214'241 549:214 [1,0] 1 [1,0] 1 176'237 2016-04-14 18:42:12.089802 70'119 2016-04-14 18:39:42.471949
1.1e 3 0 9 0 0 7162000 165 165 active+recovery_wait+degraded 2016-04-14 18:43:32.874862 185'165 549:29 [3,0] 3 [3,0] 3 0'0 2016-04-14 18:38:20.120733 0'0 2016-04-14 18:38:20.120733
1.22 4 0 4 12 0 10612650 133 133 undersized+degraded+remapped+wait_backfill+peered 2016-04-14 18:43:25.645207 182'275 546:143 [2,5] 2 [4] 4 64'140 2016-04-14 18:39:36.187705 0'0 2016-04-14 18:38:20.120599
1.26 2 0 4 0 0 6800651 166 166 active+recovery_wait+degraded 2016-04-14 18:43:25.642740 208'166 546:24 [1,4] 1 [1,4] 1 69'79 2016-04-14 18:39:47.155539 0'0 2016-04-14 18:38:20.120641
1.2d 6 0 0 6 0 15427476 121 121 active+remapped+backfilling 2016-04-14 18:43:16.464681 216'448 545:338 [1,5] 1 [1,3] 1 67'194 2016-04-14 18:39:39.106077 0'0 2016-04-14 18:38:20.120720
1.2c 5 0 5 5 0 10473475 12 12 undersized+degraded+remapped+wait_backfill+peered 2016-04-14 18:43:25.649851 185'224 546:139 [3,4] 3 [3] 3 69'131 2016-04-14 18:39:49.157256 0'0 2016-04-14 18:38:20.120704
1.35 3 0 0 6 0 7792057 68 68 active+remapped+wait_backfill 2016-04-14 18:43:17.895333 215'255 545:194 [3,5] 3 [3,2] 3 67'136 2016-04-14 18:39:38.190855 0'0 2016-04-14 18:38:20.120631
1.37 3 0 0 6 0 14833959 230 230 active+remapped+wait_backfill 2016-04-14 18:43:24.343021 185'334 546:161 [4,1] 4 [1,3] 1 0'0 2016-04-14 18:38:20.120650 0'0 2016-04-14 18:38:20.120650
1.3a 5 0 0 10 0 10886470 130 130 active+remapped+wait_backfill 2016-04-14 18:43:24.340794 210'401 546:188 [1,4] 1 [1,2] 1 113'240 2016-04-14 18:40:48.438370 0'0 2016-04-14 18:38:20.120679
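To see at a glance which PGs are holding up recovery, the dump rows can be tallied by state flag. The sketch below assumes whitespace-separated rows in which field 0 is the PG id, field 9 the state, field 14 the up set, and field 16 the acting set. That layout is inferred from the rows in this report, not taken from a header. The function names are hypothetical, and only three of the rows are included as sample data.

```python
from collections import Counter

# Three rows copied from the dump in this report, one PG per line.
ROWS = """\
1.1a 5 0 0 10 0 14483201 129 129 active+remapped+wait_backfill 2016-04-14 18:43:24.346042 213'400 546:191 [1,4] 1 [1,2] 1 113'240 2016-04-14 18:40:48.438370 0'0 2016-04-14 18:38:20.120679
1.18 0 0 3 0 0 495203 83 83 active+recovery_wait+degraded 2016-04-14 18:43:19.074773 206'83 545:47 [2,5] 2 [2,5] 2 70'34 2016-04-14 18:39:45.187599 0'0 2016-04-14 18:38:20.120661
1.2d 6 0 0 6 0 15427476 121 121 active+remapped+backfilling 2016-04-14 18:43:16.464681 216'448 545:338 [1,5] 1 [1,3] 1 67'194 2016-04-14 18:39:39.106077 0'0 2016-04-14 18:38:20.120720
"""


def parse_osd_set(text):
    """Turn a set like '[1,4]' into [1, 4]."""
    return [int(x) for x in text.strip('[]').split(',') if x]


def parse_pg_row(line):
    """Extract a few fields from one row (assumed column positions)."""
    f = line.split()
    return {
        'pgid': f[0],
        'state': f[9],
        'up': parse_osd_set(f[14]),
        'acting': parse_osd_set(f[16]),
    }


def count_state_flags(text):
    """Count how many PGs carry each individual state flag."""
    flags = Counter()
    for line in text.splitlines():
        if line.strip():
            flags.update(parse_pg_row(line)['state'].split('+'))
    return flags


flags = count_state_flags(ROWS)
for flag, n in sorted(flags.items()):
    print(flag, n)
```

A row whose `up` and `acting` sets differ, such as 1.1a here, is one that is still remapped and waiting on backfill.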
Updated by Sage Weil over 7 years ago
- Status changed from New to Can't reproduce