Bug #2476: osd: watch timeout depends on operations to an object - Ceph

closed

Added by Josh Durgin almost 12 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Urgent
Assignee: Samuel Just
Category: OSD
Source: Community (user)

Description

The watch timeout is in-memory state that is local to the primary OSD. If the primary changes, the timer that ends the watch is not started until the object context for the watched object is loaded. That normally happens only when an operation is performed on the watched object, so the watch timeout can be delayed without bound. The result is errors like the following, even when clients have not accessed the image for much longer than 30s:

# rbd rm postgresql -p winnie-test
Removing image: 99% complete...failed.
delete error: image still has watchers
This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.
2012-05-24 15:25:44.647532 7f35b8849760 -1 librbd: error removing header: (16) Device or resource busy
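
To make the dependency concrete, here is a toy model of the behavior described above. It is only a sketch under stated assumptions: `ToyPrimary`, `ObjectContext`, `do_op`, and the `eager` switch are hypothetical names invented for illustration and do not mirror the actual OSD code. The `eager` mode shows one conceivable alternative, not necessarily how this bug was resolved. Time is passed in explicitly as integer seconds so the example is deterministic.

```python
# Toy model only: every name here is hypothetical, not taken from Ceph's OSD code.
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

WATCH_TIMEOUT = 30  # seconds, matching the 30s mentioned in the rbd error message


@dataclass
class ObjectContext:
    """In-memory state for one object on the primary."""
    timer_started_at: int


@dataclass
class ToyPrimary:
    """A new primary: it knows which objects have watches, but holds no timers yet."""
    became_primary_at: int
    persisted_watches: Set[str]
    eager: bool = False  # assumption: start all watch timers at takeover
    contexts: Dict[str, ObjectContext] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.eager:
            for name in self.persisted_watches:
                self.load_context(name, self.became_primary_at)

    def load_context(self, name: str, now: int) -> ObjectContext:
        # The timeout timer starts only when the context is first loaded.
        if name not in self.contexts:
            self.contexts[name] = ObjectContext(timer_started_at=now)
        return self.contexts[name]

    def do_op(self, name: str, now: int) -> None:
        # Any operation on the object loads its context as a side effect.
        self.load_context(name, now)

    def watch_expires_at(self, name: str) -> Optional[int]:
        ctx = self.contexts.get(name)
        if ctx is None:
            return None  # no timer running: the stale watch never times out
        return ctx.timer_started_at + WATCH_TIMEOUT


lazy = ToyPrimary(became_primary_at=0, persisted_watches={"header-object"})
print(lazy.watch_expires_at("header-object"))  # no timer until the first op
lazy.do_op("header-object", now=1000)
print(lazy.watch_expires_at("header-object"))  # timer counted from the first op

eager = ToyPrimary(became_primary_at=0, persisted_watches={"header-object"}, eager=True)
print(eager.watch_expires_at("header-object"))  # timer counted from takeover
```

In the lazy case the expiry time is measured from the first operation after the primary change rather than from the change itself, which is why the delay is unbounded when nothing touches the object.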

Updated by Sage Weil almost 12 years ago

Updated by Sage Weil almost 12 years ago

Updated by Maciej Galkiewicz over 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Ian Colle about 11 years ago

Updated by Greg Farnum about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago