Bug #63613
[rgw][lc] using custom lc schedule (work time) may cause lc processing to stall
Description
We use different LC processing time windows for our clusters, configured through the rgw_lifecycle_work_time option (https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/common/options/rgw.yaml.in#L358).
We noticed this issue at one of our clusters when we tried to start LC processing at 2 PM local time and let it run for 24 hours, which translates into the work time "14:00-13:59". After we applied this setting, LC processing stalled completely for several days.
After analyzing the extended logs, we realized that the function that decides whether LC should run at the current time (the "should_work" function at https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/rgw/rgw_lc.cc#L2431) does not account for a window that ends on the next day. As a result, any custom work time "XY:TW-AB:CD" whose end time is earlier than its start time (AB < XY) breaks LC processing.
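To illustrate the failure mode, here is a minimal Python sketch of a same-day window check next to a wrap-aware one. It is not the Ceph code and not the fix from the PR. The function names (parse_work_time, naive_should_work, wrap_aware_should_work) are hypothetical, and the sketch assumes the work time is written as "HH:MM-HH:MM" and compared in minutes since midnight.

```python
def parse_work_time(spec: str) -> tuple[int, int]:
    """Parse 'HH:MM-HH:MM' into (start, end) as minutes since midnight."""
    start_s, end_s = spec.split("-")
    start_h, start_m = (int(p) for p in start_s.split(":"))
    end_h, end_m = (int(p) for p in end_s.split(":"))
    return start_h * 60 + start_m, end_h * 60 + end_m


def naive_should_work(spec: str, hour: int, minute: int) -> bool:
    """Same-day comparison only (assumed shape of the buggy check)."""
    start, end = parse_work_time(spec)
    now = hour * 60 + minute
    # If end < start, no value of `now` can satisfy both conditions.
    return start <= now <= end


def wrap_aware_should_work(spec: str, hour: int, minute: int) -> bool:
    """Treat end < start as a window that continues into the next day."""
    start, end = parse_work_time(spec)
    now = hour * 60 + minute
    if start <= end:
        return start <= now <= end
    # Window wraps past midnight: late part of today or early part of tomorrow.
    return now >= start or now <= end
```

With the work time "14:00-13:59", naive_should_work returns False for every time of day, which matches the stall described above, while wrap_aware_should_work returns True for every minute of the day.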
Updated by Oguzhan Ozmen 6 months ago
Added the PR https://github.com/ceph/ceph/pull/54622 addressing this issue.
Updated by Casey Bodley 5 months ago
- Status changed from New to Fix Under Review
- Backport set to quincy reef
- Pull request ID set to 54622
Updated by Casey Bodley 5 months ago
- Status changed from Fix Under Review to Pending Backport
- Assignee set to Oguzhan Ozmen
- Target version deleted (v18.2.2)
Updated by Backport Bot 5 months ago
- Copied to Backport #63776: quincy: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Updated by Backport Bot 5 months ago
- Copied to Backport #63777: reef: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Updated by Backport Bot 5 months ago
- Tags changed from rgw lifecycle to rgw lifecycle backport_processed
Updated by Mykola Golub 5 months ago
- Backport changed from quincy reef to quincy reef pacific
Adding pacific to the Backport list: even if it is not merged before the release, I would like to prepare a backport PR just in case. We can close it at any time.
Updated by Mykola Golub 5 months ago
- Tags changed from rgw lifecycle backport_processed to rgw lifecycle
Updated by Backport Bot 5 months ago
- Copied to Backport #63787: pacific: [rgw][lc] using custom lc schedule (work time) may cause lc processing to stall added
Updated by Backport Bot 5 months ago
- Tags changed from rgw lifecycle to rgw lifecycle backport_processed
Updated by Oguzhan Ozmen 5 months ago
I cannot change the status, but I don't think a backport to Pacific or Quincy is needed for this change.