Bug #40154

open

nautilus: failed to become clean before timeout expired

Added by Neha Ojha almost 5 years ago. Updated almost 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES    OMAP_BYTES* OMAP_KEYS* LOG  DISK_LOG STATE                                             STATE_STAMP                VERSION REPORTED UP        UP_PRIMARY ACTING       ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP           SNAPTRIMQ_LEN
2.f        6331                  0     1834         0       0 51863552           0          0 3093     3093 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.091582 34'7493  34:7581 [0,3,2,1]          0 [0,3,2,NONE]              0        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.e        6232                  0     1885         0       0 51042652           0          0 3058     3058 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.114224 34'7558  34:7684 [2,3,1,0]          2 [2,3,NONE,0]              2        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.d        6243                  0     1908         0       0 51129542           0          0 3016     3016 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.084241 34'7516  34:7641 [1,0,3,2]          1 [NONE,0,3,2]              0        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.c        6308                  0     1904         0       0 51675136           0          0 3054     3054 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.152287 34'7554  34:7656 [2,0,3,1]          2 [2,0,3,NONE]              2        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.b        6287                  0     1939         0       0 51493212           0          0 3052     3052 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.167984 34'7552  34:7673 [3,2,0,1]          3 [3,2,0,NONE]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.a        6376                  0      731         0       0 52222300           0          0 3005     3005    active+recovering+undersized+degraded+remapped 2019-05-31 18:51:30.600716 34'7605 34:10087 [1,3,0,2]          1 [NONE,3,0,2]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.9        6156                  0     1808         0       0 50420060           0          0 3096     3096 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.222252 34'7396  34:7504 [1,2,0,3]          1 [NONE,2,0,3]              2        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
2.8        6209                  0        0         0       0 50859182           0          0 3078     3078                                      active+clean 2019-05-31 18:51:32.525046 34'7478 34:11286 [0,2,3,1]          0    [0,2,3,1]              0        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.5           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:47:54.530962     0'0    33:45     [2,3]          2        [2,3]              2        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.6        6263                  0     1889         0       0 51291658           0          0 3010     3010 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.119380 34'7510  34:7632 [1,0,2,3]          1 [NONE,0,2,3]              0        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.4           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:47:54.534616     0'0    33:36     [3,0]          3        [3,0]              3        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.7        6332                  0     1850         0       0 51866798           0          0 3007     3007 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.149315 34'7507  34:7630 [3,0,2,1]          3 [3,0,2,NONE]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.3           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:48:34.783467     0'0    33:25     [1,2]          1        [1,2]              1        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.0        6176                  0     1856         0       0 50583900           0          0 3034     3034 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.221755 34'7434  34:7550 [3,1,2,0]          3 [3,NONE,2,0]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.2           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:48:34.790627     0'0    33:33     [0,1]          0        [0,1]              0        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.1        6240                  0        0         0       0 51108188           0          0 3057     3057                                      active+clean 2019-05-31 18:50:11.399213 34'7557 34:11552 [2,3,0,1]          2    [2,3,0,1]              2        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.0           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:48:34.785707     0'0    33:44     [1,0]          1        [1,0]              1        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.3        6315                  0     1880         0       0 51722588           0          0 3036     3036 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.166173 34'7536  34:7657 [3,2,0,1]          3 [3,2,0,NONE]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.1           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:47:54.532618     0'0    33:36     [3,0]          3        [3,0]              3        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.2        6160                  0     1872         0       0 50452828           0          0 3038     3038 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.197786 34'7438  34:7565 [3,1,0,2]          3 [3,NONE,0,2]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.6           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:47:54.533937     0'0    33:36     [3,0]          3        [3,0]              3        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.5        6313                  0     1925         0       0 51706204           0          0 3071     3071 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.200270 34'7571  34:7705 [3,0,1,2]          3 [3,0,NONE,2]              3        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0
1.7           0                  0        0         0       0        0           0          0    0        0                                      active+clean 2019-05-31 18:48:34.791988     0'0    33:39     [1,3]          1        [1,3]              1        0'0 2019-05-31 18:47:17.646112             0'0 2019-05-31 18:47:17.646112             0
2.4        6240                  0     1900         0       0 51098296           0          0 3008     3008 active+recovery_wait+undersized+degraded+remapped 2019-05-31 18:48:36.159731 34'7508  34:7628 [1,0,2,3]          1 [NONE,0,2,3]              0        0'0 2019-05-31 18:47:29.711807             0'0 2019-05-31 18:47:29.711807             0

Some PGs are stuck in active+recovery_wait+undersized+degraded+remapped.

rados/singleton/{all/ec-lost-unfound.yaml msgr-failures/many.yaml msgr/random.yaml objectstore/bluestore-bitmap.yaml rados.yaml supported-random-distro$/{ubuntu_latest.yaml}}

/a/yuriw-2019-05-31_15:41:54-rados-wip-yuri4-testing-2019-05-30-2109-nautilus-distro-basic-smithi/3993655/

Actions #1

Updated by David Zafman almost 5 years ago

With osd_max_backfills at its default of 1 and all recovery targeting OSD.1, every other PG is waiting for PG 2.a to finish recovering. However, the OSD log for OSD.1 doesn't match the teuthology.log information: it contains different PG numbers and does not include PG 2.a.
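
The claim that all recovery targets OSD.1 can be checked against the pg dump in the description. The following is a minimal sketch, not part of any Ceph tooling: the function name `recovery_targets` and the trimmed sample rows are illustrative assumptions. It also assumes that positions in UP and ACTING correspond to each other (as they do for erasure-coded shards), so an OSD listed in UP where ACTING shows NONE is treated as the recovery target.

```python
import re
from collections import Counter

# Matches bracketed OSD sets such as [0,3,2,1] or [0,3,2,NONE].
SET_RE = re.compile(r"\[([0-9A-Z,]+)\]")


def recovery_targets(pg_dump_rows):
    """Count, per OSD, the PGs that list it in UP but show NONE in the same ACTING slot.

    Assumes the first bracketed set on a row is UP and the second is ACTING,
    which matches the column order of the dump in the description.
    """
    targets = Counter()
    for row in pg_dump_rows:
        sets = SET_RE.findall(row)
        if len(sets) < 2:
            continue  # header line or a row without both sets
        up = sets[0].split(",")
        acting = sets[1].split(",")
        for up_osd, acting_osd in zip(up, acting):
            if acting_osd == "NONE":
                targets[int(up_osd)] += 1
    return targets


# Rows trimmed from the dump above to PG id, state, UP, UP_PRIMARY, ACTING, ACTING_PRIMARY.
rows = [
    "2.f active+recovery_wait+undersized+degraded+remapped [0,3,2,1] 0 [0,3,2,NONE] 0",
    "2.a active+recovering+undersized+degraded+remapped [1,3,0,2] 1 [NONE,3,0,2] 3",
    "2.8 active+clean [0,2,3,1] 0 [0,2,3,1] 0",
]

print(dict(recovery_targets(rows)))  # {1: 2}
```

Run over the full dump, a single key in the result would be consistent with every degraded PG waiting on the same OSD.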
