Bug #63560
closedretry_raced_bucket_write considerations
0%
Description
Updating bucket's metadata concurrently by two or more threads is allowed in radosgw.
There is a retry mechanism: retry_raced_bucket_write(), that expects the bucket references to fetch the latest bucket's state from the persistent store and check that it is equals to the cached bucket's state.
In the case the cached and the stored states are not equal the updating operation wrapped by the retry_raced_bucket_write() is retried.
the wrapped operation returns -ECANCELED to inform retry_raced_bucket_write() to retry.
the two methods used by this mechanism are:
- merge_and_store_attrs
- put_info
merge_and_store_attrs() can confront the cached state of the bucket with the stored one because the modifications are passed with the second parameter: rgw::sal::Attrs& attrs.
So the method can make a confrontation of the current state in memory before applying the delta (attrs) with the stored state.
put_info() on the contrary, can't, because it hasn't a RGWBucketInfo param with an associated "delta".
The get_info() method is used to access the current RGWBucketInfo property of the bucket and that is used to make modifications over that property.
In my opinion the put_info() can never say if its image in memory is good to be flushed into the persistent storage because that comparison will always result different.
I'd guess that put_info() should have an interface like merge_and_store_attrs() with a parameter containing the delta (the new RGWBucketInfo).
What do you think about this?
Updated by Daniel Gryniewicz 5 months ago
In principle, you are probably correct. However, all the cases where put_info() uses retry_raced_bucket_write(), the lambda does all the modifications to obj->info before calling put_info(), so it works correctly. In addition, none of the caller have (or have ever had) a separate place to store the bucket_info, and I'm not sure adding such storage will gain us much. So, I think this is a fine way of doing this. Is there a case where this doesn't work?
Updated by Daniel Gryniewicz 5 months ago
If you'd prefer to discuss this in person, feel free to come to one of the Wed. refactoring meetings (on the community calendar) and we can discuss it on the call.
Updated by Giuseppe Baccini 5 months ago
Thanks Daniel to have looked into this; unfortunately I'm currently in the position to not be able to follow this issue.
I cannot say now, if I will have time in the future to follow this.
Up to you, we can leave this opened or we can close.
Updated by Casey Bodley 5 months ago
- Status changed from New to Can't reproduce
thanks Giuseppe
Daniel Gryniewicz wrote:
In principle, you are probably correct. However, all the cases where put_info() uses retry_raced_bucket_write(), the lambda does all the modifications to obj->info before calling put_info(), so it works correctly. In addition, none of the caller have (or have ever had) a separate place to store the bucket_info, and I'm not sure adding such storage will gain us much. So, I think this is a fine way of doing this. Is there a case where this doesn't work?
based on Dan's summary, it sounds like retry_raced_bucket_write()
is being used correctly and there's nothing to fix