Bug #63613

open

[rgw][lc] using custom lc schedule (work time) may cause lc processing to stall

Added by Oguzhan Ozmen 6 months ago. Updated 5 months ago.

Status:
Pending Backport
Priority:
Normal
Assignee:
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
rgw lifecycle backport_processed
Backport:
quincy reef pacific
Regression:
No
Severity:
4 - irritation
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We use different LC processing time windows for our different clusters, configured through the rgw_lifecycle_work_time option (https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/common/options/rgw.yaml.in#L358).

We noticed this issue on one of our clusters when we tried to start LC processing at 2 PM local time and let it run for 24 hours, which translates into a work time of "14:00-13:59". After we applied this setting, LC processing stalled completely for several days.

After analyzing the extended logs, we realized that the function that decides whether LC should run at the current time (the "should_work" function at https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/rgw/rgw_lc.cc#L2431) does not account for a window whose end time falls on the next day. As a result, any custom work time "XY:TW-AB:CD" whose end hour precedes its start hour (AB < XY) breaks LC processing.
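
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the Ceph C++ code: the function names parse_work_time, in_window_same_day, and in_window_wrapping are hypothetical, and the same-day check is only an assumption about the kind of comparison that ignores the "next date" case. It shows why a window such as "14:00-13:59" never matches under a same-day comparison, and one possible wrap-aware alternative.

```python
def parse_work_time(spec):
    """Parse an 'HH:MM-HH:MM' string into (start, end) minutes since midnight."""
    start_text, end_text = spec.split("-")

    def to_minutes(hhmm):
        hours, minutes = hhmm.split(":")
        return int(hours) * 60 + int(minutes)

    return to_minutes(start_text), to_minutes(end_text)


def in_window_same_day(now_min, start, end):
    """Assumed naive check: only correct when start <= end (same calendar day)."""
    return start <= now_min <= end


def in_window_wrapping(now_min, start, end):
    """Wrap-aware check: a window with end < start is treated as crossing midnight."""
    if start <= end:
        return start <= now_min <= end
    # The window spans midnight: it covers [start, 24:00) and [00:00, end].
    return now_min >= start or now_min <= end
```

With "14:00-13:59", the same-day check is false for every minute of the day, which would match the observed stall, while the wrap-aware check accepts every minute. For a window that does not cross midnight, the two checks agree.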


Related issues 3 (2 open, 1 closed)

Copied to rgw - Backport #63776: quincy: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall - In Progress - Mykola Golub
Copied to rgw - Backport #63777: reef: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall - In Progress - Mykola Golub
Copied to rgw - Backport #63787: pacific: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall - Rejected - Mykola Golub
Actions #1

Updated by Oguzhan Ozmen 6 months ago

Added the PR https://github.com/ceph/ceph/pull/54622 addressing this issue.

Actions #2

Updated by Casey Bodley 5 months ago

  • Status changed from New to Fix Under Review
  • Backport set to quincy reef
  • Pull request ID set to 54622
Actions #3

Updated by Casey Bodley 5 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Assignee set to Oguzhan Ozmen
  • Target version deleted (v18.2.2)
Actions #4

Updated by Backport Bot 5 months ago

  • Copied to Backport #63776: quincy: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Actions #5

Updated by Backport Bot 5 months ago

  • Copied to Backport #63777: reef: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Actions #6

Updated by Backport Bot 5 months ago

  • Tags changed from rgw lifecycle to rgw lifecycle backport_processed
Actions #7

Updated by Mykola Golub 5 months ago

  • Backport changed from quincy reef to quincy reef pacific

Adding pacific to the Backport list: even if it is not merged before the release, I would like to prepare a backport PR just in case. We can close it at any time.

Actions #8

Updated by Mykola Golub 5 months ago

  • Tags changed from rgw lifecycle backport_processed to rgw lifecycle
Actions #9

Updated by Backport Bot 5 months ago

  • Copied to Backport #63787: pacific: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Actions #10

Updated by Backport Bot 5 months ago

  • Tags changed from rgw lifecycle to rgw lifecycle backport_processed
Actions #11

Updated by Oguzhan Ozmen 5 months ago

I cannot change the status, but I don't think a backport to Pacific or Quincy is needed for this change.
