Bug #5197
Bucket shows up when listing buckets but does not exist anywhere else.
Description
There is a bucket which shows up when buckets are listed through the API but exists nowhere else. We need to get this bucket properly deleted and, ideally, isolate how it got into this state so we can prevent it in the future.
Updated by Greg Farnum almost 11 years ago
This was an empty bucket created under argonaut. It was deleted normally while an argonaut->bobtail upgrade was "in progress", most likely with a bobtail RADOS cluster but an argonaut RGW daemon.
When the user discovered that the bucket was still showing up in listings, he ran "radosgw-admin bucket rm", but the bucket kept showing up. He then tried "radosgw-admin user check", which noticed the issue, and ran it again with the fix option. That didn't resolve it either.
Updated by Greg Farnum almost 11 years ago
- Status changed from New to In Progress
Looking at the cluster confirms that there is indeed an orphaned omap entry on the <user>.buckets object that doesn't correspond to an existing bucket object. It should be easy enough to clean up by removing the omap entry, but I'd like to go through the code paths a bit to figure out why it didn't get cleaned up, and to look for likely causes.
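To illustrate what "orphaned" means here, the following is a minimal in-memory sketch, not RGW code. It assumes the user's bucket list can be modelled as a dict of omap keys and the existing bucket objects as a set of names; the function name find_orphaned_entries and the sample bucket names are hypothetical.

```python
def find_orphaned_entries(user_bucket_omap, existing_bucket_objects):
    """Return omap keys that name a bucket with no backing bucket object."""
    return sorted(
        name for name in user_bucket_omap
        if name not in existing_bucket_objects
    )


# Hypothetical state: the user's bucket list has two entries,
# but only one of them still has a bucket object behind it.
user_bucket_omap = {
    "photos": b"entry-for-photos",
    "ghost": b"entry-for-ghost",
}
existing_bucket_objects = {"photos"}

orphans = find_orphaned_entries(user_bucket_omap, existing_bucket_objects)

# Manual cleanup in this model: drop each orphaned key from the list.
for name in orphans:
    del user_bucket_omap[name]
```

In this model the cleanup is just deleting the stale key, which matches the idea of removing the omap entry by hand.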
Updated by Greg Farnum almost 11 years ago
- Status changed from In Progress to Resolved
Okay, so the bucket rm didn't work because the bucket object is not on disk, so the initial stat fails. The radosgw-admin tool also has no way to look up the owner of the bucket, so it never really gets a chance to remove the omap entry.
The user fix failed to remove the bucket because it grabs the bucket info via RGWRados::get_bucket_info(), and that function actually lies if the bucket object doesn't exist (it default-initializes the bucket object using the passed-in name). This is fixed in a wip branch right now; I'm not sure whether we can backport it, but I'll take a look. So the user fix "fixed" it by setting the values in the index for that bucket to whatever an rgw_bucket object is default-initialized to.
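The effect of a lookup that "lies" can be sketched with a small in-memory model. This is not the RGW implementation: BucketInfo, lying_lookup, strict_lookup and fix_user_index are hypothetical names, and the fields are invented for illustration. The only behaviour taken from the description above is that a missing bucket yields a default-initialized record carrying the passed-in name.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    name: str = ""
    owner: str = ""
    size: int = 0


def lying_lookup(store, name):
    """Missing bucket: return a default-initialized record with the given name."""
    return store.get(name, BucketInfo(name=name))


def strict_lookup(store, name):
    """Missing bucket: return None so the caller can tell it does not exist."""
    return store.get(name)


def fix_user_index(index, store, lookup):
    """Rewrite each index entry from the lookup; drop entries reported missing."""
    for name in list(index):
        info = lookup(store, name)
        if info is None:
            del index[name]
        else:
            index[name] = info
    return index


# Only "photos" has a real bucket object; "ghost" is a stale index entry.
store = {"photos": BucketInfo("photos", "alice", 10)}


def stale_index():
    return {
        "photos": BucketInfo("photos", "alice", 10),
        "ghost": BucketInfo("ghost", "alice", 0),
    }


lying_index = fix_user_index(stale_index(), store, lying_lookup)
strict_index = fix_user_index(stale_index(), store, strict_lookup)
```

With the lying lookup the stale entry survives the fix pass and is merely overwritten with default values; with the strict lookup the fix pass can see that the bucket is gone and remove the entry.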
I haven't looked for other failure modes, but bucket removal is not an atomic process and we don't have any guards for it. I've created feature request #5218 for that.
Updated by Greg Farnum almost 11 years ago
And #5219 covers the "user check" not cleaning up.