Bug #56090

closed

[rbd_support] a schedule may get lost due to load vs add race

Added by Ilya Dryomov almost 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
octopus,pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If load_schedules() (i.e. periodic refresh) races with add_schedule() invoked by the user for a fresh image, that image's schedule may get lost until the next rebuild (not refresh!) of the queue:

1. periodic refresh invokes load_schedules()
2. load_schedules() creates a new Schedules instance and loads schedules from rbd_mirror_snapshot_schedule object
3. add_schedule() is invoked for a new image (an image that isn't present in self.images) by the user
4. before load_schedules() can grab self.lock, add_schedule() commits the new schedule to rbd_mirror_snapshot_schedule object and adds it to self.schedules
5. load_schedules() grabs self.lock and reassigns self.schedules with Schedules instance that is now stale
6. periodic refresh invokes load_pool_images() which discovers the new image; eventually it is added to self.images
7. periodic refresh invokes refresh_queue() which attempts to enqueue() the new image; this fails because a matching schedule isn't present
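
The interleaving in steps 1-7 can be replayed deterministically with a toy model. This is only a sketch, not the rbd_support code: `ToyHandler`, `load_begin()`, `load_finish()` and the `"pool/img"` / `"1h"` values are hypothetical, a plain dict stands in for the rbd_mirror_snapshot_schedule object, and the locking is left out because the point is the ordering of the read and the assignment.

```python
class ToyHandler:
    """Toy stand-in for the schedule handler; not the real rbd_support code."""

    def __init__(self):
        self.store = {}      # plays the role of the rbd_mirror_snapshot_schedule object
        self.schedules = {}  # in-memory schedules (self.schedules)
        self.images = set()  # images already known to the handler (self.images)
        self.queue = []      # images that were successfully enqueued

    def load_begin(self):
        # Step 2: read the persisted schedules into a fresh container.
        return dict(self.store)

    def load_finish(self, loaded):
        # Step 5: replace the in-memory schedules with what was read earlier.
        self.schedules = loaded

    def add_schedule(self, image, interval):
        # Steps 3-4: commit to the store and update the in-memory copy.
        self.store[image] = interval
        self.schedules[image] = interval

    def enqueue(self, image):
        # Enqueueing only succeeds if a matching schedule is present.
        if image in self.schedules:
            self.queue.append(image)

    def refresh_queue(self, discovered):
        # Steps 6-7: only newly discovered images are considered.
        for image in sorted(set(discovered) - self.images):
            self.enqueue(image)
        self.images = set(discovered)


def replay_race():
    h = ToyHandler()
    stale = h.load_begin()            # steps 1-2: refresh starts loading
    h.add_schedule("pool/img", "1h")  # steps 3-4: user adds a schedule
    h.load_finish(stale)              # step 5: stale snapshot wins
    h.refresh_queue({"pool/img"})     # steps 6-7: enqueue finds no schedule
    return "pool/img" in h.store, "pool/img" in h.schedules, h.queue
```

In this model `replay_race()` returns `(True, False, [])`: the schedule is persisted, but it is missing from memory and the image is never enqueued.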

The next periodic refresh recovers the discarded schedule from the rbd_mirror_snapshot_schedule object, but no attempt to enqueue() that image is made because the image is already "known" (present in self.images) at that point. Despite the schedule being in place, no snapshots are created until the queue is rebuilt from scratch or the rbd_support module is reloaded.
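
The difference between a refresh and a rebuild can be illustrated starting from the state the race leaves behind. Again, this is a sketch under assumptions rather than the module's implementation: `ToyHandler`, `rebuild_queue()` and the sample values are hypothetical names chosen for illustration.

```python
class ToyHandler:
    """Toy model of the post-race state; not the real rbd_support code."""

    def __init__(self, store, schedules, images):
        self.store = dict(store)          # persisted schedules
        self.schedules = dict(schedules)  # in-memory schedules
        self.images = set(images)         # images already "known"
        self.queue = []

    def load_schedules(self):
        # A later refresh re-reads the store, so the schedule comes back.
        self.schedules = dict(self.store)

    def enqueue(self, image):
        if image in self.schedules:
            self.queue.append(image)

    def refresh_queue(self, discovered):
        # Refresh: only images that were not known before are enqueued.
        for image in sorted(set(discovered) - self.images):
            self.enqueue(image)
        self.images = set(discovered)

    def rebuild_queue(self):
        # Rebuild: start from an empty queue and reconsider every known image.
        self.queue = []
        for image in sorted(self.images):
            self.enqueue(image)


def refresh_then_rebuild():
    # State after the race: schedule persisted, missing in memory, image known.
    h = ToyHandler(store={"pool/img": "1h"}, schedules={}, images={"pool/img"})
    h.load_schedules()
    h.refresh_queue({"pool/img"})
    after_refresh = ("pool/img" in h.schedules, list(h.queue))
    h.rebuild_queue()
    after_rebuild = list(h.queue)
    return after_refresh, after_rebuild
```

Here `refresh_then_rebuild()` returns `((True, []), ["pool/img"])`: the refresh restores the schedule without enqueueing the image, and only the rebuild puts it on the queue.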


Related issues 3 (0 open, 3 closed)

Copied to rbd - Backport #56141: octopus: [rbd_support] a schedule may get lost due to load vs add race (Resolved, Ilya Dryomov)
Copied to rbd - Backport #56142: quincy: [rbd_support] a schedule may get lost due to load vs add race (Resolved, Ilya Dryomov)
Copied to rbd - Backport #56143: pacific: [rbd_support] a schedule may get lost due to load vs add race (Resolved, Ilya Dryomov)
Actions #1

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Backport set to octopus,pacific,quincy
  • Pull request ID set to 46734
Actions #2

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #3

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56141: octopus: [rbd_support] a schedule may get lost due to load vs add race added
Actions #4

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56142: quincy: [rbd_support] a schedule may get lost due to load vs add race added
Actions #5

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56143: pacific: [rbd_support] a schedule may get lost due to load vs add race added
Actions #6

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Pending Backport to Resolved