Bug #34534

closed

Blacklisted client might not notice it lost the lock

Added by Jason Dillaman over 5 years ago. Updated over 5 years ago.

Status: Resolved
Priority: High
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After the lock owner is blacklisted, if the blacklist entry is removed after 30 seconds, the watch on the RBD image header should be marked as failed, and librbd should detect that the lock was lost when it attempts to re-acquire it. However, during an iSCSI test in which IO was incorrectly sent to the previously blacklisted lock owner, the IO improperly succeeded when it should have failed with -EROFS.
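
The expected behavior can be illustrated with a small Python model. This is a toy sketch and not librbd code: `ToyImage`, `owns_lock`, and `write` are hypothetical names chosen for illustration, and only the -EROFS result is taken from the report.

```python
import errno


class ToyImage:
    """Hypothetical stand-in for an image opened by a single client."""

    def __init__(self):
        # The client currently believes it holds the exclusive lock.
        self.owns_lock = True
        self.data = bytearray(16)

    def write(self, offset, payload):
        # Expected behavior per the report: IO from a client that has
        # lost the lock is rejected with -EROFS and never reaches the image.
        if not self.owns_lock:
            return -errno.EROFS
        self.data[offset:offset + len(payload)] = payload
        return len(payload)


image = ToyImage()
ok = image.write(0, b"abcd")        # lock held: the write goes through
image.owns_lock = False             # the lock loss has been detected
rejected = image.write(4, b"efgh")  # lock lost: the write is refused
```

The bug is that the step modeled here as `image.owns_lock = False` never happened on the blacklisted client, so the write was accepted.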


Related issues: 2 (0 open, 2 closed)

Copied to rbd - Backport #36143: luminous: Blacklisted client might not notice it lost the lock (Resolved, Jason Dillaman)
Copied to rbd - Backport #36144: mimic: Blacklisted client might not notice it lost the lock (Resolved, Jason Dillaman)
Actions #1

Updated by Jason Dillaman over 5 years ago

Couple bugs:

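(1) Upon blacklist, the watcher does not re-establish the watch (the rewatch request reports that the client is blacklisted), and the lock reacquire is then aborted because the watch handle is invalid. Since nothing updates the lock state, the lock is left in a "locked" state internally: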

2018-08-30 18:01:09.090044 7fbae17fa700 -1 librbd::ImageWatcher: 0x146fb00 image watch failed: 21530400, (107) Transport endpoint is not connected
2018-08-30 18:01:09.090077 7fbae17fa700 -1 librbd::Watcher: 0x146fb00 handle_error: handle=21530400: (107) Transport endpoint is not connected
2018-08-30 18:01:09.090692 7fbae17fa700 -1 librbd::watcher::RewatchRequest: 0x7fbad0000f60 handle_unwatch client blacklisted
2018-08-30 18:01:09.090726 7fbae0ff9700 -1 librbd::ManagedLock: 0x14c5ca0 send_reacquire_lock: aborting reacquire due to invalid watch handle
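
The sequence in this log can be sketched as a toy state model. Everything here is an assumption made for illustration: `ToyManagedLock`, `LockState`, and the `drop_lock_on_abort` flag are hypothetical and do not correspond to the real librbd classes. The flag shows one conceivable remedy, not necessarily the fix that was merged.

```python
from enum import Enum


class LockState(Enum):
    LOCKED = "locked"
    UNLOCKED = "unlocked"


class ToyManagedLock:
    """Hypothetical model of a client-side lock state machine."""

    def __init__(self, drop_lock_on_abort=False):
        self.state = LockState.LOCKED
        self.watch_valid = True
        # Hypothetical switch: mark the lock as lost when reacquire aborts.
        self.drop_lock_on_abort = drop_lock_on_abort

    def handle_watch_error(self, blacklisted):
        # The watch failed. In this toy model a rewatch succeeds unless
        # the client is blacklisted, in which case the handle stays invalid.
        self.watch_valid = not blacklisted
        return self.reacquire_lock()

    def reacquire_lock(self):
        if not self.watch_valid:
            # Corresponds to "aborting reacquire due to invalid watch handle".
            if self.drop_lock_on_abort:
                self.state = LockState.UNLOCKED
            return False
        return True

    def is_lock_owner(self):
        return self.state is LockState.LOCKED


buggy = ToyManagedLock()
buggy.handle_watch_error(blacklisted=True)
stale_owner = buggy.is_lock_owner()   # still claims ownership after blacklist

fixed = ToyManagedLock(drop_lock_on_abort=True)
fixed.handle_watch_error(blacklisted=True)
```

In the first instance the abort path returns without touching `state`, which is the stale "locked" condition described in (1).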

(2) Attempting to blacklist another peer while in this state results in the lock_break API method failing with -EBUSY, since the client thinks it owns the lock and does not check whether the blacklist target is itself:

2018-08-30 17:49:15.396290 7f7c9affd700 10 librbd::ManagedLock: 0x7f7c940588d0 break_lock
2018-08-30 17:49:15.396295 7f7c9affd700 20 librbd::ManagedLock: 0x7f7c940588d0 is_lock_owner=1
2018-08-30 17:49:15.396296 7f7c9affd700 -1 librbd: failed to break lock: (16) Device or resource busy
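
A minimal sketch of the check described in (2), under one reading of the report: a client that believes it owns the lock should refuse with -EBUSY only when the target of the break is itself. The function `break_lock`, its parameters, and the `check_target` flag are hypothetical and are not the librbd API.

```python
import errno


def break_lock(believes_owner, self_id, target_id, check_target=False):
    """Toy model of a lock-break request; returns 0 or a negative errno."""
    if believes_owner:
        # Reported behavior (check_target=False): refuse unconditionally.
        # Assumed remedy (check_target=True): refuse only if the target
        # of the break is this client itself.
        if not check_target or target_id == self_id:
            return -errno.EBUSY
    return 0  # the break would proceed


# Client A is in the stale "locked" state and targets peer B.
reported = break_lock(True, "client.A", "client.B")
with_check = break_lock(True, "client.A", "client.B", check_target=True)
self_target = break_lock(True, "client.A", "client.A", check_target=True)
```

Here `reported` reproduces the -EBUSY failure from the log, while `with_check` lets the break proceed because the target is a different client.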
Actions #2

Updated by Jason Dillaman over 5 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #3

Updated by Jason Dillaman over 5 years ago

  • Backport set to luminous,mimic
Actions #4

Updated by Jason Dillaman over 5 years ago

Actions #5

Updated by Mykola Golub over 5 years ago

  • Status changed from In Progress to Fix Under Review
Actions #6

Updated by Mykola Golub over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #36143: luminous: Blacklisted client might not notice it lost the lock added
Actions #8

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #36144: mimic: Blacklisted client might not notice it lost the lock added
Actions #9

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved