Bug #5197

Bucket shows up when listing buckets but does not exist anywhere else.

Added by JuanJose Galvez almost 11 years ago. Updated almost 11 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Support
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There is a bucket that appears when buckets are listed through the API but exists nowhere else. We need to delete this bucket properly and, ideally, determine how it got into this state so we can prevent it in the future.

Actions #1

Updated by Anonymous almost 11 years ago

  • Priority changed from Normal to High
Actions #2

Updated by Yehuda Sadeh almost 11 years ago

  • Assignee set to Greg Farnum
Actions #3

Updated by Greg Farnum almost 11 years ago

This was an empty bucket created under argonaut. It was deleted normally while an argonaut->bobtail upgrade was "in progress", most likely with a bobtail RADOS cluster but an argonaut RGW daemon.
When the user discovered that the bucket still showed up in listings, he ran "radosgw-admin bucket rm", but it kept showing up. He then ran "radosgw-admin user check", which noticed the issue, and ran it again with the fix option. That didn't resolve it either.

Actions #4

Updated by Greg Farnum almost 11 years ago

  • Status changed from New to In Progress

Looking at the cluster confirms that there is an orphaned omap entry on the <user>.buckets object that doesn't correspond to an existing bucket object. It should be easy enough to clean up by removing the omap entry, but I'd like to go through the code paths a bit to figure out why it didn't get cleaned up, and to look for likely causes.
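
For illustration only, the orphan check described above amounts to a set difference between the keys in the user's bucket index and the bucket objects that actually exist. This is a minimal in-memory sketch, not RGW code: the function name and the sample bucket names are hypothetical, and it assumes both lists have already been read from the cluster by some other means.

```python
def find_orphaned_entries(index_keys, existing_buckets):
    """Return index keys that have no matching bucket object, sorted."""
    existing = set(existing_buckets)
    return sorted(key for key in index_keys if key not in existing)


# Hypothetical data: keys read from the user's bucket index, and the
# bucket objects that actually exist in the cluster.
index_keys = ["logs", "photos", "stale-bucket"]
existing_buckets = ["logs", "photos"]

orphans = find_orphaned_entries(index_keys, existing_buckets)
print(orphans)  # ['stale-bucket']
```

Any key reported here is a candidate for manual removal from the index, after confirming that the bucket object really is gone.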

Actions #5

Updated by Yehuda Sadeh almost 11 years ago

  • Project changed from Ceph to rgw
Actions #6

Updated by Yehuda Sadeh almost 11 years ago

  • Description updated (diff)
Actions #7

Updated by Greg Farnum almost 11 years ago

  • Status changed from In Progress to Resolved

Okay, so the bucket rm didn't work because the bucket object isn't on disk, so the initial stat fails; the radosgw-admin tool also has no way to look up the owner of the bucket, so it never gets a chance to remove the omap entry.

The user fix failed to remove the bucket because it grabs the bucket info via RGWRados::get_bucket_info(), and that function actually lies if the bucket object doesn't exist (it default-initializes the bucket object using the passed-in name). So the user fix "fixed" it by setting the values in the index for that bucket to whatever an rgw_bucket object is default-initialized to. This is fixed in a wip branch right now; I'm not sure whether we can backport it, but I'll take a look.
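
To make the failure mode concrete, here is a rough Python model of the pattern, not the actual RGW code. Everything in it is a stand-in: BucketInfo, lookup_lenient, lookup_strict and repair_index are hypothetical names, the store and index are plain dicts, and the sample values are made up. The lenient lookup mimics a function that returns a default-initialized record for a missing bucket; the strict one reports the miss so the repair pass can drop the entry.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    name: str
    bucket_id: str = ""   # empty when default-initialized
    owner: str = ""


class BucketNotFound(Exception):
    pass


def lookup_lenient(store, name):
    """Never fails: a missing bucket yields a default-initialized record."""
    return store.get(name, BucketInfo(name=name))


def lookup_strict(store, name):
    """Raises BucketNotFound when the bucket object does not exist."""
    try:
        return store[name]
    except KeyError:
        raise BucketNotFound(name) from None


def repair_index(index, store, lookup):
    """Refresh every index entry from the store; drop entries whose lookup fails."""
    for name in list(index):
        try:
            index[name] = lookup(store, name)
        except BucketNotFound:
            del index[name]


# Hypothetical sample data: "ghost" is in the index but not in the store.
store = {"photos": BucketInfo("photos", bucket_id="b1", owner="alice")}

index = {"photos": None, "ghost": None}
repair_index(index, store, lookup_lenient)
print(sorted(index))  # ['ghost', 'photos'] -- the stale entry survives

index = {"photos": None, "ghost": None}
repair_index(index, store, lookup_strict)
print(sorted(index))  # ['photos'] -- the stale entry is removed
```

With the lenient lookup the repair pass cannot tell a missing bucket from a real one, so it rewrites the stale entry with default values instead of deleting it.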

I haven't looked for other failure modes, but bucket removal is not an atomic process and we don't have any guards for it. I've created feature request #5218 for that.
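
As a toy illustration of why a non-atomic removal can leave this kind of state behind: if removal is two separate steps and something fails between them, one side is updated and the other is not. The sketch below is purely hypothetical; the step ordering, the function name, and the simulated failure are assumptions for the example and are not taken from the RGW code.

```python
class SimulatedFailure(Exception):
    pass


def remove_bucket(bucket_objects, user_index, name, fail_between_steps=False):
    """Two-step removal with no guard: each step can succeed independently."""
    bucket_objects.discard(name)      # step 1: remove the bucket object
    if fail_between_steps:
        raise SimulatedFailure(name)  # stand-in for a crash or error here
    user_index.discard(name)          # step 2: remove the index entry


bucket_objects = {"photos", "tmp"}
user_index = {"photos", "tmp"}

try:
    remove_bucket(bucket_objects, user_index, "tmp", fail_between_steps=True)
except SimulatedFailure:
    pass

print(sorted(bucket_objects))  # ['photos']
print(sorted(user_index))      # ['photos', 'tmp'] -- orphaned index entry
```

A guard would need to make the two steps either both happen or both be retried, which is what the feature request is asking for.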

Actions #8

Updated by Greg Farnum almost 11 years ago

And #5219 covers the "user check" not cleaning up.
