Bug #3859

closed

osd_client: define ceph_osdc_clear_request_linger()

Added by Alex Elder over 11 years ago. Updated almost 11 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There is a ceph_osdc_set_request_linger() function that
sets a flag on a request and takes an additional reference.
There is no corresponding ceph_osdc_clear_request_linger()
function.

Instead, there is only ceph_osdc_unregister_linger_request(),
which clears the flag and drops the reference, but also
unregisters the request.

A linger request only gets registered after it has completed
(and in fact, not until the "safe" ONDISK completion).

OK, a few observations:
- There is no simple interface that allows one to wait for
  the ONDISK completion event; I think there should be a
  ceph_osdc_wait_request_safe() function, and that is what
  registering a watch request should be using.
- There is no ceph_osdc_clear_request_linger(), and there
  should be. Callers of the corresponding set routine should
  be able to use it to clean up state until the request is
  known to be complete (the wait succeeds); from that point
  on, ceph_osdc_unregister_linger_request() can be used.
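
The set/clear pairing proposed above can be sketched in a small
userspace model. This is only an illustration under assumptions, not
the libceph code: struct model_request and the model_* functions are
hypothetical stand-ins, and the reference count is a plain int rather
than the kernel's real reference counting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an osd request. */
struct model_request {
	int refcount;     /* models the request's reference count */
	bool linger;      /* models the linger flag */
	bool registered;  /* true once on the (modeled) linger list */
};

/* As described in this report: set the flag, take one extra reference. */
void model_set_request_linger(struct model_request *req)
{
	if (!req->linger) {
		req->linger = true;
		req->refcount++;
	}
}

/* Proposed inverse: clear the flag and drop that reference, nothing else. */
void model_clear_request_linger(struct model_request *req)
{
	assert(!req->registered);  /* only meaningful before registration */
	if (req->linger) {
		req->linger = false;
		assert(req->refcount > 0);
		req->refcount--;
	}
}

/* Existing style of teardown: additionally unregisters the request. */
void model_unregister_linger_request(struct model_request *req)
{
	req->registered = false;
	if (req->linger) {
		req->linger = false;
		assert(req->refcount > 0);
		req->refcount--;
	}
}
```

In this model, calling model_set_request_linger() and then
model_clear_request_linger() leaves the request exactly as it started,
which is the property the missing function would provide.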

I started implementing this, and it should have been easy,
but ceph_osdc_wait_request() does some things I don't
think it should when wait_for_completion_interruptible()
gets interrupted (http://tracker.newdream.net/issues/3858)
and I didn't want to go that far with it right now.

Actions #1

Updated by Ian Colle about 11 years ago

  • Project changed from Linux kernel client to rbd

Actions #2

Updated by Alex Elder almost 11 years ago

  • Priority changed from Normal to High

Actions #3

Updated by Alex Elder almost 11 years ago

  • Assignee set to Alex Elder

Actions #4

Updated by Alex Elder almost 11 years ago

I have implemented a change in rbd that waits for an
indication that a WATCH request (as well as a "normal"
data write request) has safely completed before
considering it done:
http://tracker.ceph.com/issues/5146

Because of the way I implemented it, there's no need
in rbd for a special "ceph_osdc_wait_request_safe()"
function.

Actions #5

Updated by Alex Elder almost 11 years ago

As described initially, it's not really valid to
call ceph_osdc_unregister_linger_request() until
after the original request marked to linger has
completed successfully. That is, if an error is
returned by rbd_obj_request_wait() (because an
interrupt occurred before the request completed)
we really need to unset the linger flag as part
of error processing.

The reason this matters is ceph_osdc_set_request_linger()
takes a reference to the request.
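
To make the reference accounting concrete, here is a minimal userspace
sketch of the interrupted-wait error path. It is an assumption-laden
model, not rbd or libceph code: model_request, model_get(),
model_put() and model_interrupted_watch() are hypothetical names, and
the interrupted wait is simply assumed rather than simulated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an osd request. */
struct model_request {
	int refcount;
	bool linger;
};

void model_get(struct model_request *req)
{
	req->refcount++;
}

void model_put(struct model_request *req)
{
	assert(req->refcount > 0);
	req->refcount--;
}

/*
 * Models a watch setup whose wait is interrupted before completion.
 * Returns the reference count left after the caller has dropped its
 * own reference; a nonzero result means the request was leaked.
 */
int model_interrupted_watch(bool clear_on_error)
{
	struct model_request req = { .refcount = 1, .linger = false };

	/* Marking the request to linger takes an extra reference. */
	req.linger = true;
	model_get(&req);

	/* ... the wait is interrupted here; the request never registers ... */

	if (clear_on_error) {
		/* Error processing unsets the flag and drops that reference. */
		req.linger = false;
		model_put(&req);
	}

	model_put(&req);  /* the caller's own reference */
	return req.refcount;
}
```

In this model, skipping the cleanup leaves one reference outstanding,
while unsetting the linger flag in the error path brings the count
back to zero.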

Actions #6

Updated by Alex Elder almost 11 years ago

Once again, rather than doing what I thought might work,
I've decided on a better fix.

Right now the osd client takes a reference to a request
when it's asked to mark it as a lingering request. But
it doesn't actually refer to the request at that point.

Instead, take the reference when the request is actually
getting registered, after it's known the osd has acknowledged
that the lingering (watch) request is complete.
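
The approach described in this comment can be sketched as follows.
This is a simplified userspace model under assumptions, not the
committed patch: struct model_request and the model_* functions are
hypothetical, and registration is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an osd request. */
struct model_request {
	int refcount;
	bool linger;
	bool registered;
};

/* Marking a request to linger only sets the flag; no reference is taken. */
void model_set_request_linger(struct model_request *req)
{
	req->linger = true;
}

/*
 * Called once the osd has acknowledged the lingering request.
 * The reference is taken here, when the client starts tracking it.
 */
void model_register_linger_request(struct model_request *req)
{
	if (req->linger && !req->registered) {
		req->registered = true;
		req->refcount++;
	}
}

/* Unregistering drops the reference that registration took. */
void model_unregister_linger_request(struct model_request *req)
{
	if (req->registered) {
		req->registered = false;
		assert(req->refcount > 0);
		req->refcount--;
	}
	req->linger = false;
}
```

With the reference tied to registration, a request that is marked to
linger but never completes holds no extra reference in this model, so
an interrupted wait needs no special cleanup.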

Actions #7

Updated by Alex Elder almost 11 years ago

  • Status changed from New to In Progress

Actions #8

Updated by Alex Elder almost 11 years ago

  • Status changed from In Progress to Fix Under Review

The following patch has been posted for review. It's one of three
new patches available in the "review/wip-rbd" branch of the
ceph-client git repository.

[PATCH] libceph: add lingering request reference when registered

Actions #9

Updated by Alex Elder almost 11 years ago

  • Status changed from Fix Under Review to Resolved

The following has been committed to the ceph-client
"testing" branch:

ebd8324 libceph: add lingering request reference when registered