Bug #38069

upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)

Added by Yuri Weinstein over 5 years ago. Updated about 4 years ago.

Status: New
Priority: Low
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/jewel-x
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/yuriw-2019-01-24_16:20:56-upgrade:jewel-x-luminous-distro-basic-smithi/
Jobs: '3501809', '3501807', '3501805'
Logs: http://qa-proxy.ceph.com/teuthology/yuriw-2019-01-24_16:20:56-upgrade:jewel-x-luminous-distro-basic-smithi/3501809/teuthology.log

0.d           0                  0        0         0       0         0   0        0                             active+clean 2019-01-24 19:53:29.637818     0'0  600:542 [2,4]          2  [2,4]              2        0'0 2019-01-24 19:47:53.515232             0'0 2019-01-24 19:47:53.515232             0
2.c         112                  0        0         0       0 469762048  18       18                             active+clean 2019-01-24 19:53:58.515563 196'118  600:698 [2,5]          2  [2,5]              2        0'0 2019-01-24 19:53:02.382625             0'0 2019-01-24 19:53:02.382625             0
0.e           0                  0        0         0       0         0   0        0                             active+clean 2019-01-24 19:53:56.997141     0'0  600:557 [2,5]          2  [2,5]              2        0'0 2019-01-24 19:47:53.515235             0'0 2019-01-24 19:47:53.515235             0
2.d         100                  0      100         0       0 415236120  10       10               active+undersized+degraded 2019-01-24 19:55:15.467466 196'110  600:596   [2]          2    [2]              2        0'0 2019-01-24 19:53:02.382625             0'0 2019-01-24 19:53:02.382625             0
0.f           0                  0        0         0       0         0   0        0                             active+clean 2019-01-24 19:53:57.095939     0'0   497:19 [4,5]          4  [4,5]              4        0'0 2019-01-24 19:47:53.515239             0'0 2019-01-24 19:47:53.515239             0

2 1743 94 1079 90 68 7306477592 779 779
0    0  0    0  0  0          0   0   0

sum 1743 94 1079 90 68 7306477592 779 779
OSD_STAT USED    AVAIL   TOTAL   HB_PEERS    PG_SUM PRIMARY_PG_SUM
5        3.10GiB 86.3GiB 89.4GiB     [2,3,4]     38             25
4        4.01GiB 85.4GiB 89.4GiB     [2,3,5]     43             20
0        1.48GiB  445GiB  447GiB [1,2,3,4,5]      0              0
1        4.00GiB  443GiB  447GiB   [2,3,4,5]      0              0
2        2.76GiB  444GiB  447GiB     [3,4,5]     48             45
3             0B      0B      0B          []      1              0
sum      15.3GiB 1.47TiB 1.48TiB
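
When triaging a dump like the one above, it can help to list only the PGs that are not active+clean (here, 2.d is active+undersized+degraded). The following is a minimal sketch, not part of teuthology. The helper name `non_clean_pgs` and the assumption that the state sits in the tenth whitespace-separated column are taken from the layout of this excerpt only and may not hold for other releases or output formats.

```python
import re

# A PG id looks like "<pool>.<hex>", e.g. "2.d" (assumption based on the excerpt).
PGID = re.compile(r"^\d+\.[0-9a-f]+$")
# Assumption: the state is the 10th whitespace-separated field, as in the excerpt.
STATE_COLUMN = 9


def non_clean_pgs(dump_text):
    """Return (pgid, state) pairs for PG rows whose state is not active+clean."""
    flagged = []
    for line in dump_text.splitlines():
        fields = line.split()
        # Skip pool sums, OSD stats, and anything else that is not a PG row.
        if len(fields) <= STATE_COLUMN or not PGID.match(fields[0]):
            continue
        state = fields[STATE_COLUMN]
        if state != "active+clean":
            flagged.append((fields[0], state))
    return flagged


# Two rows copied from the excerpt above.
SAMPLE = """\
2.c 112 0 0 0 0 469762048 18 18 active+clean 2019-01-24 19:53:58.515563 196'118 600:698 [2,5] 2 [2,5] 2 0'0 2019-01-24 19:53:02.382625 0'0 2019-01-24 19:53:02.382625 0
2.d 100 0 100 0 0 415236120 10 10 active+undersized+degraded 2019-01-24 19:55:15.467466 196'110 600:596 [2] 2 [2] 2 0'0 2019-01-24 19:53:02.382625 0'0 2019-01-24 19:53:02.382625 0
"""

if __name__ == "__main__":
    for pgid, state in non_clean_pgs(SAMPLE):
        print(pgid, state)
```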

2019-01-24T20:15:52.286 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph_luminous/qa/tasks/ceph_manager.py", line 917, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph_luminous/qa/tasks/ceph_manager.py", line 1033, in do_thrash
    timeout=self.config.get('timeout')
  File "/home/teuthworker/src/github.com_ceph_ceph_luminous/qa/tasks/ceph_manager.py", line 2234, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired
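
The traceback shows the thrasher giving up after waiting for recovery. The general pattern, polling a condition until a deadline passes, can be sketched as follows. This is an illustrative stand-in and not the code in ceph_manager.py: `wait_until`, its parameters, and the default interval are hypothetical, and only the error message is taken from the log.

```python
import time


def wait_until(predicate, timeout=None, interval=3.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or the timeout expires.

    Hypothetical helper: clock and sleep are injectable so the loop can be
    exercised without real waiting.
    """
    start = clock()
    while not predicate():
        # With timeout=None the loop waits indefinitely.
        if timeout is not None and clock() - start >= timeout:
            raise AssertionError("failed to recover before timeout expired")
        sleep(interval)


if __name__ == "__main__":
    # A condition that never becomes true, such as a PG stuck in
    # active+undersized+degraded, ends in the same error as in the log.
    try:
        wait_until(lambda: False, timeout=0.2, interval=0.05)
    except AssertionError as exc:
        print(exc)
```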

Per @Neha: "so technically this bug was always there and can happen in jewel to luminous split upgrades, under boundary conditions", and:
"(12:25:21 PM) neha: This got highlighted due to the new tests I added
(12:26:29 PM) neha: see https://github.com/ceph/ceph/pull/25949#issuecomment-454171639"
