Bug #10784
librbd: image has watchers - not removing
Status:
Resolved
Priority:
Urgent
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2015-02-05T12:45:10.552 INFO:tasks.workunit.client.0.plana31.stderr:blacklisting 10.214.131.9:0/1008981 until 2015-02-05 13:45:09.563070 (3600 sec)
2015-02-05T12:45:10.566 INFO:tasks.workunit.client.0.plana31.stderr:+ wait 8981
2015-02-05T12:45:10.566 INFO:tasks.workunit.client.0.plana31.stderr:+ rbdrw_exitcode=108
2015-02-05T12:45:10.566 INFO:tasks.workunit.client.0.plana31.stderr:+ '[' 108 '!=' 108 ']'
2015-02-05T12:45:10.566 INFO:tasks.workunit.client.0.plana31.stderr:+ echo 'rbdrw stopped with ESHUTDOWN'
2015-02-05T12:45:10.567 INFO:tasks.workunit.client.0.plana31.stderr:+ set -e
2015-02-05T12:45:10.567 INFO:tasks.workunit.client.0.plana31.stderr:+ ceph osd blacklist rm 10.214.131.9:0/1008981
2015-02-05T12:45:10.567 INFO:tasks.workunit.client.0.plana31.stdout:rbdrw stopped with ESHUTDOWN
2015-02-05T12:45:12.342 INFO:tasks.workunit.client.0.plana31.stderr:10.214.131.9:0/1008981 isn't blacklisted
2015-02-05T12:45:12.355 INFO:tasks.workunit.client.0.plana31.stderr:+ rbd lock remove rbdrw-image rbdrw client.4120
2015-02-05T12:45:12.503 INFO:tasks.workunit.client.0.plana31.stderr:+ sleep 30
2015-02-05T12:45:42.505 INFO:tasks.workunit.client.0.plana31.stderr:+ rbd rm rbdrw-image
2015-02-05T12:45:42.645 INFO:tasks.workunit.client.0.plana31.stderr:2015-02-05 12:45:42.655600 7fd3e3acc840 -1 librbd: image has watchers - not removing
2015-02-05T12:45:42.674 INFO:tasks.workunit.client.0.plana31.stderr: Removing image: 0% complete...failed.
2015-02-05T12:45:42.674 INFO:tasks.workunit.client.0.plana31.stderr:rbd: error: image still has watchers
2015-02-05T12:45:42.674 INFO:tasks.workunit.client.0.plana31.stderr:This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.
Associated revisions
osd: Update object state after removing watch from object info
Fixes: #10784
Signed-off-by: David Zafman <dzafman@redhat.com>
History
#1 Updated by Jason Dillaman about 9 years ago
- Project changed from rbd to Ceph
Upon a client crash (or blacklist), the OSD doesn't appear to ever remove the dead client from the list of watchers. Easy to reproduce: run "rbd watch <image>", kill the client, then run "rbd status <image>" to see the current watchers.
#2 Updated by Samuel Just about 9 years ago
- Priority changed from Normal to Urgent
#3 Updated by Sage Weil about 9 years ago
- Project changed from Ceph to rbd
#4 Updated by Sage Weil about 9 years ago
- Project changed from rbd to Ceph
#5 Updated by Sage Weil about 9 years ago
- Assignee set to David Zafman
#6 Updated by David Zafman about 9 years ago
- Status changed from New to Fix Under Review
#7 Updated by David Zafman about 9 years ago
Caused by Sage's change 1c6944f79: osd/ReplicatedPG: do watch effects only when change commits
Before that change, handle_watch_timeout() modified obs.oi directly:
obc->obs.oi.watchers.erase(make_pair(watch->get_cookie(), watch->get_entity()));
After that change, it modifies new_obs.oi instead:
object_info_t& oi = ctx->new_obs.oi;
oi.watchers.erase(make_pair(watch->get_cookie(),
watch->get_entity()));
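To illustrate why erasing from new_obs.oi alone can leave a stale watcher visible, here is a minimal toy model. It is only a sketch under stated assumptions: the structs below are simplified stand-ins that reuse the names from the snippets above, the publish_new_state flag and the cached_watchers_after_timeout() helper are hypothetical, and the copy-back step is a guess at the kind of update the associated revision's title describes, not the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy stand-ins for the OSD types named above; the real Ceph classes
// carry far more state. Everything here is illustrative only.
struct object_info_t {
  // Keyed by (cookie, entity), mirroring the erase() calls quoted above.
  std::map<std::pair<uint64_t, std::string>, std::string> watchers;
};

struct ObjectState {
  object_info_t oi;
};

struct ObjectContext {
  ObjectState obs;  // cached state that later lookups consult
};

struct OpContext {
  ObjectContext* obc;
  ObjectState new_obs;  // working copy that the operation modifies
  explicit OpContext(ObjectContext* o) : obc(o), new_obs(o->obs) {}
};

// Simulates a watch timeout. publish_new_state is a hypothetical switch:
// false leaves the change only in the working copy, true copies the
// working copy back into the cached object state.
void handle_watch_timeout(ObjectContext& obc, uint64_t cookie,
                          const std::string& entity,
                          bool publish_new_state) {
  OpContext ctx(&obc);
  object_info_t& oi = ctx.new_obs.oi;
  oi.watchers.erase(std::make_pair(cookie, entity));
  if (publish_new_state) {
    ctx.obc->obs = ctx.new_obs;  // assumed shape of the update, not the real patch
  }
}

// Registers one watcher, times it out, and reports how many watchers
// the cached state still lists afterwards.
std::size_t cached_watchers_after_timeout(bool publish_new_state) {
  ObjectContext obc;
  const std::string entity = "client.4120";  // illustrative entity name
  obc.obs.oi.watchers[std::make_pair(uint64_t{1}, entity)] = "dead client";
  handle_watch_timeout(obc, 1, entity, publish_new_state);
  return obc.obs.oi.watchers.size();
}
```

In this model, cached_watchers_after_timeout(false) still reports the dead client, which matches the "image has watchers" symptom, while cached_watchers_after_timeout(true) reports none.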
#8 Updated by David Zafman about 9 years ago
- Status changed from Fix Under Review to 7
#9 Updated by David Zafman about 9 years ago
- Status changed from 7 to Resolved
#10 Updated by Loïc Dachary about 9 years ago
- Status changed from Resolved to Pending Backport
- Backport set to firefly
- Severity changed from 3 - minor to 1 - critical
firefly backport https://github.com/ceph/ceph/pull/3830
#11 Updated by Loïc Dachary about 9 years ago
- Status changed from Pending Backport to Resolved
- Backport deleted (firefly)
- Severity changed from 1 - critical to 3 - minor
Wrong interpretation on my part; reverting to the previous state.