Bug #36073

closed

failed to recover before timeout expired -- premerge+peered PGs?

Added by Ilya Dryomov over 5 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

First appeared between 93748a325cd8 ("Merge pull request #23944 from ceph/wip-s3a-update-mirror") and 5a3344f0e52c ("Merge pull request #23895 from xiexingguo/wip-more-async-fixes"). PG merging (https://github.com/ceph/ceph/pull/20469) went in within that range. In the pg dump below, the only PGs that are not active+clean (1.11 and 1.1) are both premerge+peered.

2018-09-10T14:39:18.920 INFO:tasks.ceph.ceph_manager.ceph:version 1921
stamp 2018-09-10 14:39:18.476763
last_osdmap_epoch 0
last_pg_scan 0
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES     LOG  DISK_LOG STATE           STATE_STAMP                VERSION   REPORTED  UP    UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP           SNAPTRIMQ_LEN 
1.11        146                  0        0         0       0 600054443 2226     2226 premerge+peered 2018-09-10 14:18:59.876713 430'64653 856:69171 [3,6]          3  [3,6]              3        0'0 2018-09-10 14:04:00.462578             0'0 2018-09-10 14:04:00.462578             0 
1.10        113                  0        0         0       0 453292048 1950     1950    active+clean 2018-09-10 14:19:07.919719 433'63433 856:68267 [1,5]          1  [1,5]              1  433'63433 2018-09-10 14:19:07.919662             0'0 2018-09-10 14:04:00.462578             0 
1.5         225                  0        0         0       0 925523968 3024     3024    active+clean 2018-09-10 14:19:02.405614 445'59368 856:62501 [2,6]          2  [2,6]              2  445'59368 2018-09-10 14:19:02.405562       445'59368 2018-09-10 14:19:02.405562             0 
1.4         224                  0        0         0       0 912334848 2761     2761    active+clean 2018-09-10 14:18:54.001434 433'61454 856:63685 [7,0]          7  [7,0]              7  433'61454 2018-09-10 14:18:54.001376             0'0 2018-09-10 14:04:00.462578             0 
1.3         207                  0        0         0       0 829743104 2893     2893    active+clean 2018-09-10 14:18:52.119741 433'66936 856:69519 [4,1]          4  [4,1]              4  433'66936 2018-09-10 14:18:52.119679             0'0 2018-09-10 14:04:00.462578             0 
1.2         214                  0        0         0       0 874147840 2950     2950    active+clean 2018-09-10 14:19:06.197593 432'58939 856:61954 [6,2]          6  [6,2]              6  432'58939 2018-09-10 14:19:06.197542             0'0 2018-09-10 14:04:00.462578             0 
1.0         102                  0        0         0       0 419954688 2929     2929    active+clean 2018-09-10 14:18:51.945868 433'63260 856:68095 [1,5]          1  [1,5]              1  433'63260 2018-09-10 14:18:51.945811             0'0 2018-09-10 14:04:00.462578             0 
1.1         146                  0        0         0       0 600054442 1727     1727 premerge+peered 2018-09-10 14:18:43.709546 430'64662 856:68871 [7,2]          7  [7,2]              7        0'0 2018-09-10 14:04:00.462578             0'0 2018-09-10 14:04:00.462578             0 
1.6         216                  0        0         0       0 887095296 3057     3057    active+clean 2018-09-10 14:19:08.910623 432'58912 856:62247 [3,6]          3  [3,6]              3  432'58912 2018-09-10 14:19:08.910511             0'0 2018-09-10 14:04:00.462578             0 
1.7         213                  0        0         0       0 869560320 3010     3010    active+clean 2018-09-10 14:19:04.132453 430'66844 856:71392 [6,4]          6  [6,4]              6  430'66844 2018-09-10 14:19:04.132399             0'0 2018-09-10 14:04:00.462578             0 
1.8         234                  0        0         0       0 949211136 3062     3062    active+clean 2018-09-10 14:18:57.001312 430'63485 856:68319 [1,5]          1  [1,5]              1  430'63485 2018-09-10 14:18:57.001249             0'0 2018-09-10 14:04:00.462578             0 
1.9         233                  0        0         0       0 947957760 2807     2807    active+clean 2018-09-10 14:19:17.983556 433'65431 856:70095 [7,5]          7  [7,5]              7  433'65431 2018-09-10 14:19:17.983472             0'0 2018-09-10 14:04:00.462578             0 
1.a         213                  0        0         0       0 858841107 2439     2439    active+clean 2018-09-10 14:18:52.996082 433'59512 856:63171 [0,4]          0  [0,4]              0  433'59512 2018-09-10 14:18:52.996021             0'0 2018-09-10 14:04:00.462578             0 
1.b         226                  0        0         0       0 916500480 3098     3098    active+clean 2018-09-10 14:18:55.173662 432'66098 856:68680 [4,1]          4  [4,1]              4  432'66098 2018-09-10 14:18:55.173576             0'0 2018-09-10 14:04:00.462578             0 
1.c         226                  0        0         0       0 911437824 3018     3018    active+clean 2018-09-10 14:19:09.053686 433'62149 856:64379 [7,0]          7  [7,0]              7  433'62149 2018-09-10 14:19:09.053609             0'0 2018-09-10 14:04:00.462578             0 
1.d         187                  0        0         0       0 760508416 2498     2498    active+clean 2018-09-10 14:19:07.847682 430'57843 856:60975 [2,6]          2  [2,6]              2  430'57843 2018-09-10 14:19:07.847606             0'0 2018-09-10 14:04:00.462578             0 
1.e         212                  0        0         0       0 875585536 3017     3017    active+clean 2018-09-10 14:19:17.923133 432'59602 856:62940 [3,6]          3  [3,6]              3  432'59602 2018-09-10 14:19:17.923073             0'0 2018-09-10 14:04:00.462578             0 
1.f         223                  0        0         0       0 899743744 2919     2919    active+clean 2018-09-10 14:19:10.920017 435'67385 856:72625 [1,6]          1  [1,6]              1  435'67385 2018-09-10 14:19:10.919945             0'0 2018-09-10 14:04:00.462578             0 

1 3560 0 0 0 0 14491547000 49385 49385 

sum 3560 0 0 0 0 14491547000 49385 49385 
OSD_STAT USED    AVAIL   TOTAL   HB_PEERS        PG_SUM PRIMARY_PG_SUM 
7        3.6 GiB  86 GiB  89 GiB [0,1,2,3,4,5,6]      4              4 
6        6.3 GiB  83 GiB  89 GiB [0,1,2,3,4,5,7]      8              2 
1        4.4 GiB  85 GiB  89 GiB [0,2,3,4,5,6,7]      6              4 
0        2.7 GiB  87 GiB  89 GiB [1,2,3,4,5,6,7]      3              1 
2        3.4 GiB  86 GiB  89 GiB [0,1,3,4,5,6,7]      4              2 
3        2.2 GiB  87 GiB  89 GiB [0,1,2,4,5,6,7]      3              3 
4        3.5 GiB  86 GiB  89 GiB [0,1,2,3,5,6,7]      4              2 
5        2.8 GiB  87 GiB  89 GiB [0,1,2,3,4,6,7]      4              0 
sum       29 GiB 686 GiB 715 GiB                                       
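
The two PGs of interest can be picked out of a dump like the one above by filtering on the STATE column. This is a minimal sketch, not code from qa/tasks/ceph_manager.py: the helper name `not_clean` is hypothetical, and the (PG_STAT, STATE) pairs are a hand-copied subset of the dump.

```python
# Hypothetical helper for illustration; not part of qa/tasks/ceph_manager.py.
def not_clean(pg_states):
    """Return the (pgid, state) pairs whose state is not exactly active+clean."""
    return [
        (pgid, state)
        for pgid, state in pg_states
        # PG states are '+'-joined flags, so compare them as a set.
        if set(state.split("+")) != {"active", "clean"}
    ]


# A few (PG_STAT, STATE) pairs copied by hand from the dump above.
sample = [
    ("1.11", "premerge+peered"),
    ("1.10", "active+clean"),
    ("1.1", "premerge+peered"),
    ("1.0", "active+clean"),
]

stuck = not_clean(sample)
print(stuck)
```

For this subset the filter leaves only 1.11 and 1.1, matching what the full dump shows: every other PG in pool 1 is active+clean.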

2018-09-10T14:39:18.920 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph_master/qa/tasks/ceph_manager.py", line 883, in wrapper
    return func(self)
  File "/home/teuthworker/src/git.ceph.com_ceph_master/qa/tasks/ceph_manager.py", line 1005, in do_thrash
    timeout=self.config.get('timeout')
  File "/home/teuthworker/src/git.ceph.com_ceph_master/qa/tasks/ceph_manager.py", line 2242, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired
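
The assertion in the traceback is the kind raised by a poll-until-timeout loop. The sketch below shows that generic pattern only; it is not the actual `wait_for_recovery` implementation, and `wait_until`, `predicate`, `interval`, `clock` and `sleep` are hypothetical names introduced for illustration.

```python
import time


# Generic poll-with-timeout pattern; an illustrative sketch, NOT the real
# wait_for_recovery from qa/tasks/ceph_manager.py.
def wait_until(predicate, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True; assert if timeout elapses first."""
    start = clock()
    while not predicate():
        # Same message as in the traceback above.
        assert clock() - start < timeout, \
            'failed to recover before timeout expired'
        sleep(interval)


# A condition that is already true returns immediately.
wait_until(lambda: True, timeout=5)
```

With such a loop, a PG that stays in premerge+peered and never reaches active+clean would presumably keep the condition false until the timeout fires, which is consistent with the failure reported here.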

http://pulpito.ceph.com/teuthology-2018-09-08_03:20:02-krbd-master-testing-basic-smithi/2991338
http://pulpito.ceph.com/teuthology-2018-09-10_03:20:02-krbd-master-testing-basic-smithi/2999776
http://pulpito.ceph.com/dis-2018-09-17_20:52:59-krbd-master-distro-basic-smithi/3034817
http://pulpito.ceph.com/dis-2018-09-18_10:05:25-krbd-master-distro-basic-smithi/3037567

#1 Updated by Sage Weil over 5 years ago

  • Status changed from New to In Progress
  • Assignee set to Sage Weil

#2 Updated by Neha Ojha over 5 years ago

  • Status changed from In Progress to Resolved