Bug #49988

closed

Global Recovery Event never completes

Added by Sage Weil about 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Urgent
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  services:
    mon: 3 daemons, quorum a,b,c (age 29m)
    mgr: x(active, since 29m)
    osd: 6 osds: 6 up (since 17m), 6 in (since 29m)

  data:
    pools:   3 pools, 160 pgs
    objects: 0 objects, 0 B
    usage:   6.1 GiB used, 600 GiB / 606 GiB avail
    pgs:     160 active+clean

  progress:
    Global Recovery Event (5m)
      [==================..........] (remaining: 2m)


This is an OSD=6 vstart cluster that just ran qa/workunits/rados/test_python.sh, but I've seen this in other cases too. All 160 PGs are active+clean, yet the Global Recovery Event stays in progress.

Related issues 2 (0 open, 2 closed)

Has duplicate: mgr - Bug #50243: test_turn_off_module (tasks.mgr.test_progress.TestProgress) AssertionError: False is not true (Duplicate, Kamoltat (Junior) Sirivadhna)
Copied to: RADOS - Backport #51215: pacific: Global Recovery Event never completes (Resolved, Kamoltat (Junior) Sirivadhna)
Actions #1

Updated by Neha Ojha about 3 years ago

  • Assignee set to Kamoltat (Junior) Sirivadhna
Actions #2

Updated by Kamoltat (Junior) Sirivadhna about 3 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 40480

The problem was that I did not subtract the PGs that I skip (because reported_epoch_of_pg < start_epoch_of_event) from total_pg_num. As a result, active_clean_pg < total_pg_num always holds, so the event never reaches completion.
The fix is simply to subtract the skipped PGs from total_pg_num.
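
To illustrate the accounting described above, here is a minimal sketch. It is not the actual mgr progress module code: the `PgStat` class, the `recovery_progress` function, and the `subtract_skipped` flag are hypothetical names made up for this example, and only the variable names total_pg_num, active_clean_pg, and the epoch comparison come from the comment.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PgStat:
    """Hypothetical stand-in for one PG's reported state."""
    state: str            # e.g. "active+clean"
    reported_epoch: int   # epoch at which the PG last reported


def recovery_progress(pgs: Iterable[PgStat],
                      start_epoch_of_event: int,
                      subtract_skipped: bool = True) -> float:
    """Return the fraction of counted PGs that are active+clean.

    PGs whose report predates the event are skipped. With
    subtract_skipped=False the skipped PGs stay in the denominator,
    which reproduces the behaviour described in this bug.
    """
    total_pg_num = 0
    active_clean_pg = 0
    skipped_pg = 0
    for pg in pgs:
        total_pg_num += 1
        if pg.reported_epoch < start_epoch_of_event:
            # Stale report: do not count it towards progress.
            skipped_pg += 1
            continue
        states = pg.state.split("+")
        if "active" in states and "clean" in states:
            active_clean_pg += 1
    if subtract_skipped:
        total_pg_num -= skipped_pg
    if total_pg_num == 0:
        # Nothing left to wait for.
        return 1.0
    return active_clean_pg / total_pg_num


pgs = [
    PgStat("active+clean", 12),
    PgStat("active+clean", 11),
    PgStat("active+clean", 10),
    PgStat("active+clean", 5),   # reported before the event started
]
buggy = recovery_progress(pgs, start_epoch_of_event=10, subtract_skipped=False)
fixed = recovery_progress(pgs, start_epoch_of_event=10)
print(buggy, fixed)
```

In this sketch, one skipped PG keeps the unfixed ratio below 1 even though every PG is active+clean, which matches the stuck progress bar in the description. Once the skipped PG is removed from the denominator, the ratio reaches 1.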

Actions #3

Updated by Neha Ojha almost 3 years ago

  • Related to Bug #50243: test_turn_off_module (tasks.mgr.test_progress.TestProgress) AssertionError: False is not true added
Actions #4

Updated by Kefu Chai almost 3 years ago

  • Related to deleted (Bug #50243: test_turn_off_module (tasks.mgr.test_progress.TestProgress) AssertionError: False is not true)
Actions #5

Updated by Kefu Chai almost 3 years ago

  • Has duplicate Bug #50243: test_turn_off_module (tasks.mgr.test_progress.TestProgress) AssertionError: False is not true added
Actions #6

Updated by Kefu Chai almost 3 years ago

  • Status changed from Fix Under Review to Resolved
Actions #7

Updated by Neha Ojha almost 3 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to pacific
Actions #8

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #51215: pacific: Global Recovery Event never completes added
Actions #10

Updated by Loïc Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
