Bug #9650

closed

RWTimer cancel_event is racy

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

(in safe mode) we hold the rwlock for the callback, but we use a separate mutex to protect the events, so we can hit this interleaving (A is the timer thread, B is a caller of cancel_event):

B- take rwlock.read
...
A- take rwlock.read
A- dequeue an event
A- drop the mutex
B- cancel_event (no-op, not queued)
A- do event callback

This is triggering #9582.
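
The interleaving above can be replayed deterministically in a small single-threaded model. This is only a sketch: ToyTimer, RaceResult and replay_race are hypothetical names made up for illustration, not the RWTimer code, and the rwlock is left out because both sides hold it in read mode and so it never blocks either of them.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Toy model only: ToyTimer and its members are hypothetical names,
// not Ceph's RWTimer implementation.
struct ToyTimer {
  std::mutex lock;                              // protects `events` only
  std::map<int, std::function<void()>> events;  // queued callbacks by id

  void add_event(int id, std::function<void()> cb) {
    std::lock_guard<std::mutex> g(lock);
    events[id] = std::move(cb);
  }

  // Timer-thread step: remove the event under the mutex and hand back the
  // callback, so that it runs after the mutex has been dropped.
  std::function<void()> dequeue(int id) {
    std::lock_guard<std::mutex> g(lock);
    auto it = events.find(id);
    assert(it != events.end());
    std::function<void()> cb = std::move(it->second);
    events.erase(it);
    return cb;
  }

  // Returns true only if the event was still queued.
  bool cancel_event(int id) {
    std::lock_guard<std::mutex> g(lock);
    return events.erase(id) > 0;
  }
};

struct RaceResult {
  bool cancel_found_event;         // did cancel_event remove anything?
  bool callback_ran_after_cancel;  // did the callback run after B "cancelled"?
};

// Replays the steps from the description in a fixed order.
RaceResult replay_race() {
  ToyTimer timer;
  bool cancelled = false;         // B's view: "my event can no longer fire"
  bool ran_after_cancel = false;

  timer.add_event(1, [&] { ran_after_cancel = cancelled; });

  std::function<void()> cb = timer.dequeue(1);  // A: dequeue, drop the mutex
  bool found = timer.cancel_event(1);           // B: no-op, event not queued
  cancelled = true;                             // B: carries on as if cancelled
  cb();                                         // A: runs the callback anyway

  return RaceResult{found, ran_after_cancel};
}
```

In this model replay_race() reports that cancel_event found nothing to remove and that the callback still ran afterwards, which is the window the description is about. If B had torn down state the callback uses, the callback would be touching freed data at that point.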


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #9582: librados: segmentation fault on timeout (Resolved, Sage Weil, 09/24/2014)

Actions #1

Updated by Sage Weil over 9 years ago

  • Description updated (diff)
  • Priority changed from Normal to Urgent
  • Source changed from other to Q/A
Actions #2

Updated by Sage Weil over 9 years ago

The issue is that we execute events under a shared (read) lock, and we allow you to cancel them under a shared (read) lock. Those two things are fundamentally racy.
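
To make the point concrete, here is a bookkeeping-only sketch of reader-writer lock admission. ToyRWLock and cancel_can_overlap_callback are hypothetical names, the model is single-threaded and does no real synchronization, and it is not meant to describe what the fix branch does. It only shows that a canceller in read mode is admitted while a callback holds the read lock, whereas a canceller that needed the lock exclusively would not be.

```cpp
#include <cassert>

// Toy bookkeeping model of a reader-writer lock (hypothetical, single-threaded;
// it only tracks who would be admitted, it does not block or synchronize).
struct ToyRWLock {
  int readers = 0;
  bool writer = false;

  // Shared mode: admitted unless a writer holds the lock.
  bool try_read() {
    if (writer) return false;
    ++readers;
    return true;
  }

  // Exclusive mode: admitted only if nobody holds the lock at all.
  bool try_write() {
    if (writer || readers > 0) return false;
    writer = true;
    return true;
  }

  void unlock_read() { assert(readers > 0); --readers; }
  void unlock_write() { assert(writer); writer = false; }
};

// Returns true if a canceller taking the lock in the given mode would be
// admitted while a callback is executing under a read lock.
bool cancel_can_overlap_callback(bool cancel_takes_write) {
  ToyRWLock rw;
  bool callback_admitted = rw.try_read();  // timer thread: callback in progress
  assert(callback_admitted);
  (void)callback_admitted;                 // silence unused warning under NDEBUG
  return cancel_takes_write ? rw.try_write() : rw.try_read();
}
```

Here cancel_can_overlap_callback(false) is true and cancel_can_overlap_callback(true) is false: with both sides in shared mode, nothing orders the cancel against the callback.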

Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from New to Fix Under Review

wip-rwtimer

Actions #4

Updated by Sage Weil over 9 years ago

  • Status changed from Fix Under Review to 7
Actions #5

Updated by Sage Weil over 9 years ago

  • Status changed from 7 to Resolved