Bug #36344

radosgw index has been inconsistent with reality

Added by Yang Yang 2 months ago. Updated about 2 months ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version:
Start date: 10/08/2018
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Background:

Ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
Index pool is on ssd.
There is one very large bucket with more than 10 million objects and 500 TB of data.
Ceph health is OK.

Problem:

When we list the bucket with the S3 list_object() call, some uploaded objects are not listed, and some uploaded objects show an old lastModified time.
At the same time, we can still get these objects by their exact key. And if I put a new object into this bucket, it is listed.
It seems that the index entries written during a certain period of time have been lost.
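
The symptom above (an object that can be fetched by exact key but does not show up in a listing) could be checked per key with something like the following. This is only a sketch under assumptions: `client` is assumed to behave like a boto3 S3 client pointed at the radosgw endpoint, and the function names are hypothetical helpers, not part of any Ceph tooling. The listing is restricted with `Prefix` so a bucket with millions of objects is not fully enumerated for each check.

```python
def is_listed(client, bucket, key):
    """Return True if a listing restricted to `key` as prefix contains `key` itself."""
    # Assumes a boto3-style client; "list_objects" is the S3 v1 listing call.
    paginator = client.get_paginator("list_objects")
    for page in paginator.paginate(Bucket=bucket, Prefix=key):
        for entry in page.get("Contents", []):
            if entry["Key"] == key:
                return True
    return False


def exists_by_key(client, bucket, key):
    """Return True if a HEAD request on the exact key succeeds."""
    try:
        client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # boto3 raises botocore.exceptions.ClientError, which carries a
        # `response` dict; anything that is not a "not found" is re-raised.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
    return True


def unlisted_but_present(client, bucket, keys):
    """Keys that can be fetched directly but are absent from the bucket listing."""
    return [
        key for key in keys
        if exists_by_key(client, bucket, key) and not is_listed(client, bucket, key)
    ]
```

Any key returned by `unlisted_but_present` has object data that is reachable but no visible index entry, which matches the behaviour described in this report.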

Other observations:

I found that one bucket can have many index instances, and we can list them with "radosgw-admin metadata list bucket.instance | grep {bucket name}". But I cannot find any documentation that describes this feature. We can also use "radosgw-admin bucket stats --bucket {bucket_name}" to get the id, which is the active instance id.
I ran "rados listomapkeys" on the active index to get all objects in that index, and the entries really are missing. But when I ran "rados listomapkeys" on another index instance that is not the active one, I found the missing index entries there.
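
The comparison described above can be expressed as a set difference over the output of "rados listomapkeys". The following is a minimal sketch; the helper names are hypothetical, and it assumes the command prints one omap key per line. Note that a key found only in a non-active index may also belong to an object that was deleted later, so each candidate would still need to be checked by exact key.

```python
def parse_listomapkeys(output):
    """Turn `rados listomapkeys` output (assumed one key per line) into a set."""
    return {line for line in output.splitlines() if line}


def missing_from_active(active_keys, stale_keys):
    """Keys present in a non-active index instance but absent from the active one."""
    return sorted(set(stale_keys) - set(active_keys))


# Usage with made-up key names for illustration only.
active = parse_listomapkeys("obj-001\nobj-003\n")
stale = parse_listomapkeys("obj-001\nobj-002\nobj-003\n")
candidates = missing_from_active(active, stale)
```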

Questions:

Why were my index entries lost?
How can I recover them?
Why does radosgw keep many index instances for one bucket?


Related issues

Duplicated by rgw - Bug #36343: radosgw index has been inconsistent with reality Duplicate 10/08/2018

History

#1 Updated by Nathan Cutler 2 months ago

  • Duplicated by Bug #36343: radosgw index has been inconsistent with reality added

#2 Updated by Abhishek Lekshmanan 2 months ago

With 10 million objects it is likely that the bucket has been resharded, so listomapkeys on the old index/shard may not return the desired key. Are you unable to see an object via the end-user API, or is it purely from rados listomapkeys that the requisite key is not found (in case of resharding, the key may be on another index object)? Can you also try asking this question on the mailing lists?

#3 Updated by Abhishek Lekshmanan 2 months ago

  • Status changed from New to Need More Info

#4 Updated by Yang Yang about 2 months ago

Abhishek Lekshmanan wrote:

With 10 million objects it is likely that the bucket has been resharded, so listomapkeys on the old index/shard may not return the desired key. Are you unable to see an object via the end-user API, or is it purely from rados listomapkeys that the requisite key is not found (in case of resharding, the key may be on another index object)? Can you also try asking this question on the mailing lists?

I have already taken resharding into account. By listomapkeys I mean running it on all shards (more than 300).
As I understand it, a big bucket has one latest index and many old indexes, and every index has many shards. So listomapkeys on an index means listomapkeys on all of its shards.
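
Running listomapkeys over every shard of one index instance could be scripted roughly as follows. This is a sketch under assumptions: it assumes the shard objects are named ".dir.<instance id>.<shard number>" in the index pool, and that the pool name, instance id and shard count are supplied by the caller (for example from "radosgw-admin bucket stats" and the bucket instance metadata). The function names are hypothetical.

```python
import subprocess


def shard_objects(instance_id, num_shards):
    """Build the assumed RADOS object names of all shards of one index instance."""
    return [f".dir.{instance_id}.{shard}" for shard in range(num_shards)]


def list_index_keys(pool, instance_id, num_shards, run=subprocess.run):
    """Union of the omap keys of every shard object of one index instance."""
    keys = set()
    for obj in shard_objects(instance_id, num_shards):
        # Equivalent to: rados -p <pool> listomapkeys <obj>
        result = run(
            ["rados", "-p", pool, "listomapkeys", obj],
            capture_output=True, text=True, check=True,
        )
        keys.update(line for line in result.stdout.splitlines() if line)
    return keys
```

The `run` parameter exists only so the command runner can be replaced when no cluster is available; by default the real `rados` binary is invoked once per shard.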

I will try describing this question on the mailing list.
