Bug #12254

closed

krbd image watch

Added by Xinze Chi almost 9 years ago. Updated almost 8 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

There are 3 OSDs. We create an rbd image named "image_xxx" in the rbd pool.
After running "rbd map image_xxx -p rbd", rados listwatchers image_xxx -p rbd reports that there is a watcher for image_xxx.
Then we stop the primary OSD for the object image_xxx.rbd, say osd.a.

At this point rados listwatchers image_xxx -p rbd still returns the same watcher as above.

Then we restart osd.a and wait for about 30s; rados listwatchers image_xxx -p rbd now returns empty.

Summary: if we stop the primary osd.a and start it again after a few seconds, the image watcher is deleted by osd.a.


Related issues 1 (0 open, 1 closed)

Related to Linux kernel client - Feature #9779: libceph: sync up with objecter (Resolved, Ilya Dryomov, 10/14/2014)

Actions #1

Updated by Zheng Yan almost 9 years ago

  • Assignee set to Ilya Dryomov

__kick_linger_request() is triggered by osd_reset(). Let's assume the rbd header object is mapped to [osd0, osd1].

When osd0 is stopped, the kclient's connection to it is reset, so the kclient calls osd_reset(). osd_reset() calls __kick_linger_request() to re-send the watch request to osd1.

When osd0 is restarted, the connection between the kclient and osd1 is not reset, so the kclient does not re-send the watch to osd0. As a result, the corresponding watcher on osd0 times out.
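The sequence above can be walked through with a small toy model. This is only a sketch, not kernel code: the class names, the event methods and the timestamps are hypothetical, and WATCH_TIMEOUT is assumed to be 30 seconds to match the "about 30s" in the description.

```python
WATCH_TIMEOUT = 30  # seconds; assumed value, chosen to match the ~30s in the report


class ToyPrimary:
    """An OSD acting as primary for the header object (hypothetical model)."""

    def __init__(self, name, inherited_at):
        self.name = name
        self.inherited_at = inherited_at  # time it took over the watcher entry
        self.registered = False           # has the client (re)sent its watch here?

    def list_watchers(self, now):
        # A watcher survives only if the client re-registered with this OSD,
        # or if the timeout since takeover has not elapsed yet.
        if self.registered or now - self.inherited_at < WATCH_TIMEOUT:
            return ["client"]
        return []


class ToyClient:
    """Re-sends its watch only when the connection it is using is reset."""

    def __init__(self):
        self.session = None  # name of the OSD the watch was last sent to

    def send_watch(self, primary):
        primary.registered = True
        self.session = primary.name

    def on_osd_down(self, down_name, new_primary):
        # Models osd_reset() -> __kick_linger_request(): the connection in use
        # went away, so the watch is re-sent to the new primary.
        if self.session == down_name:
            self.send_watch(new_primary)

    def on_osd_up(self, new_primary):
        # The existing session (to the other OSD) is still alive, so nothing
        # is reset and nothing is re-sent. This is the gap described above.
        pass


def run_scenario():
    client = ToyClient()
    osd0 = ToyPrimary("osd0", inherited_at=0)
    client.send_watch(osd0)
    results = {"initial": osd0.list_watchers(now=0)}

    # t=10: osd0 stops, osd1 becomes primary and the client re-sends the watch.
    osd1 = ToyPrimary("osd1", inherited_at=10)
    client.on_osd_down("osd0", osd1)
    results["osd0 down"] = osd1.list_watchers(now=10)

    # t=20: osd0 restarts and is primary again, but the client stays silent.
    osd0 = ToyPrimary("osd0", inherited_at=20)
    client.on_osd_up(osd0)
    results["osd0 back"] = osd0.list_watchers(now=25)
    results["30s later"] = osd0.list_watchers(now=20 + WATCH_TIMEOUT)
    return results
```

In this model run_scenario() keeps reporting the watcher through the stop and the restart of osd0, and only the last lookup, taken WATCH_TIMEOUT seconds after osd0 is back, comes out empty.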

Actions #2

Updated by Ilya Dryomov almost 9 years ago

  • Status changed from New to Closed

Yeah, that's because we are not subscribing to new osdmaps in all the cases we should. If you do any I/O after you bring osd0 back up we will get a new osdmap and reestablish the watch on osd0. If your client is idle, the same will happen after userspace resets the connection after idle timeout. I'm working on #9779 which will resolve this - linking this ticket as another test case.
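To illustrate why a new osdmap is what re-establishes the watch, here is a rough sketch of a client that re-targets its watch whenever it learns of a map in which the primary has changed. It is a toy model and not the libceph implementation: ToyLingerClient, subscribe_to_maps and the method names are all made up for illustration, and the epochs are arbitrary.

```python
class ToyLingerClient:
    """Hypothetical client that re-sends its watch when a newer map moves the primary."""

    def __init__(self, subscribe_to_maps):
        self.subscribe_to_maps = subscribe_to_maps
        self.epoch = 0
        self.watch_target = None
        self.sent = []  # log of OSDs the watch request was sent to

    def handle_osdmap(self, epoch, primary):
        # Ignore maps that are not newer than what the client already has.
        if epoch <= self.epoch:
            return
        self.epoch = epoch
        # Re-send the watch only if the primary for the object changed.
        if primary != self.watch_target:
            self.watch_target = primary
            self.sent.append(primary)

    def map_published(self, epoch, primary):
        # The cluster publishes a new map; in this model only a subscribed
        # client sees it without doing anything else.
        if self.subscribe_to_maps:
            self.handle_osdmap(epoch, primary)

    def do_io(self, epoch, primary):
        # Any I/O makes the client learn the current map as a side effect.
        self.handle_osdmap(epoch, primary)


# Idle client that is not subscribed: it misses the map in which osd0 is back.
idle = ToyLingerClient(subscribe_to_maps=False)
idle.handle_osdmap(1, "osd1")   # state after osd0 was stopped
idle.map_published(2, "osd0")   # osd0 restarted and is primary again
target_while_idle = idle.watch_target
idle.do_io(2, "osd0")           # first I/O pulls in the new map

# Subscribed client: it re-targets as soon as the map is published.
subscribed = ToyLingerClient(subscribe_to_maps=True)
subscribed.handle_osdmap(1, "osd1")
subscribed.map_published(2, "osd0")
```

Here the idle client keeps its watch pointed at osd1 until do_io() runs, while the subscribed client moves the watch to osd0 without any I/O.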

Actions #3

Updated by Ilya Dryomov almost 8 years ago

Fixed with #9779 in 4.7.
