Bug #63613
open[rgw][lc] using custom lc schedule (work time) may cause lc processing to stall
0%
Description
We use different LC processing time windows on our different clusters, configured through the rgw_lifecycle_work_time option ([[https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/common/options/rgw.yaml.in#L358]]).
We noticed this issue at one of our clusters when we tried to start LC processing at 2PM local time and allow it to run for 24 hours, which translates into a work time of "14:00-13:59". After we applied this setting, LC processing stalled completely for several days.
After analyzing the extended logs, we realized that the function that decides whether LC should run at the current time (i.e., the "should_work" function at https://bbgithub.dev.bloomberg.com/ceph/ceph/blob/e877333f07d0eeb574572674cbcdefc9f07e231a/src/rgw/rgw_lc.cc#L2431) does not account for a window that ends on the next day. As a result, any custom work time "XY:TW-AB:CD" breaks LC processing when AB < XY, that is, when the end time falls earlier in the day than the start time.
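To illustrate the failure mode, here is a minimal standalone sketch. It is not the actual Ceph code. It assumes the check compares the current time, in minutes since midnight, against both ends of the window within a single day, as the behavior described above suggests. The names parse_work_time, in_window_same_day and in_window_wrap_aware are hypothetical and exist only for this example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: parse "HH:MM-HH:MM" into minutes since midnight.
// Returns false if the string does not match the expected format.
static bool parse_work_time(const std::string& s, int& start_min, int& end_min) {
  int sh = 0, sm = 0, eh = 0, em = 0;
  if (std::sscanf(s.c_str(), "%d:%d-%d:%d", &sh, &sm, &eh, &em) != 4) {
    return false;
  }
  start_min = sh * 60 + sm;
  end_min = eh * 60 + em;
  return true;
}

// Assumed shape of a same-day-only check: it can only be true when
// start <= now <= end, so a window such as "14:00-13:59" never matches.
bool in_window_same_day(const std::string& work_time, int now_min) {
  int start = 0, end = 0;
  if (!parse_work_time(work_time, start, end)) {
    return false;
  }
  return now_min >= start && now_min <= end;
}

// Wrap-aware variant: when the end is earlier in the day than the start,
// the window spans midnight, so "now" matches if it is after the start
// OR before the end.
bool in_window_wrap_aware(const std::string& work_time, int now_min) {
  int start = 0, end = 0;
  if (!parse_work_time(work_time, start, end)) {
    return false;
  }
  if (start <= end) {
    return now_min >= start && now_min <= end;
  }
  return now_min >= start || now_min <= end;
}
```

With "14:00-13:59", the same-day check returns false for every minute of the day, which matches the stall we observed, while the wrap-aware variant returns true for every minute. An actual fix in should_work may of course take a different form.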