Bug #40700

memory usage of: radosgw-admin bucket rm

Added by Harald Staub 2 months ago. Updated 4 days ago.

Status: Pending Backport
Priority: High
Assignee:
Target version:
Start date: 07/09/2019
Due date:
% Done: 0%
Source:
Tags:
Backport: nautilus,mimic,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Cluster is Nautilus 14.2.1, 500 OSDs with BlueStore. Both of the RadosGW pools that are involved here (for data and for index) are replicated and without SSDs.

Steps that led to the problem:

1. There is a bucket $BIG_BUCKET with about 60 M objects, with 1024 shards.

2. radosgw-admin bucket rm --bucket=$BIG_BUCKET --bypass-gc --purge-objects

3. After several hours, the removal command was killed by the out-of-memory killer. The graphs show a continuous increase in memory usage for this process, about +24 GB per day. The removal rate is about 3 M objects per day.

So for this bucket with 60 M objects, the removal would take about 20 days and need about 480 GB of RAM to complete.
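The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the constants are the figures reported in this issue, the function names are made up for illustration, and the model assumes that both the removal rate and the memory growth stay constant over the whole run.

```cpp
#include <cassert>

// Figures reported in this issue; the constant-rate model is an assumption.
constexpr double kTotalObjects   = 60e6;  // objects in $BIG_BUCKET
constexpr double kRemovedPerDay  = 3e6;   // observed removal rate
constexpr double kGrowthGbPerDay = 24.0;  // observed memory growth

// Days needed to remove all objects at a constant removal rate.
double days_to_finish(double total_objects, double removed_per_day) {
  assert(removed_per_day > 0);
  return total_objects / removed_per_day;
}

// Projected memory in GB at the end of the run, assuming linear growth.
double projected_peak_gb(double total_objects, double removed_per_day,
                         double growth_gb_per_day) {
  return days_to_finish(total_objects, removed_per_day) * growth_gb_per_day;
}

// Approximate bytes retained per removed object (taking 1 GB as 1e9 bytes).
double bytes_per_object(double growth_gb_per_day, double removed_per_day) {
  assert(removed_per_day > 0);
  return growth_gb_per_day * 1e9 / removed_per_day;
}
```

With the constants above, `projected_peak_gb(kTotalObjects, kRemovedPerDay, kGrowthGbPerDay)` gives roughly 480 GB over about 20 days, and `bytes_per_object(kGrowthGbPerDay, kRemovedPerDay)` suggests that on the order of 8 KB stays allocated for every object removed.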

Expected behaviour:

Bucket removal with radosgw-admin should work with a bounded amount of memory, even for buckets with lots of objects.

Some additional information:

The killed remove command can simply be run again, but it will be killed again before it finishes. Also, after each restart it has to run for some time before it actually resumes removing objects. This "wait time" keeps increasing: last time, with about 16 M objects already removed, it was nearly 9 hours. Memory also ramps up during this time, but less steeply.

Harry


Related issues

Copied to rgw - Backport #41858: nautilus: memory usage of: radosgw-admin bucket rm New
Copied to rgw - Backport #41859: mimic: memory usage of: radosgw-admin bucket rm New
Copied to rgw - Backport #41860: luminous: memory usage of: radosgw-admin bucket rm New

History

#1 Updated by Casey Bodley 2 months ago

  • Priority changed from Normal to High

#2 Updated by Paul Emmerich 2 months ago

I've also got two clusters here with this problem: one is running 14.2.1 (50M objects in a bucket) and one is running 13.2.5 (450M objects in a bucket).

Looks like radosgw-admin uses libc malloc, so it's hard to say what the memory is being used for.

#3 Updated by Casey Bodley 2 months ago

  • Status changed from New to Verified
  • Assignee set to Eric Ivancich

#4 Updated by Casey Bodley about 2 months ago

  • Assignee changed from Eric Ivancich to Mark Kogan

#5 Updated by Mark Kogan about 2 months ago

While investigating this issue, I found that it is possible to alleviate the "wait time" that increases after each iteration of

radosgw-admin bucket rm --bucket=$BIG_BUCKET --bypass-gc --purge-objects

by running

radosgw-admin bucket check --bucket=$BIG_BUCKET --fix

between iterations of the radosgw-admin bucket rm operation.

#6 Updated by Eric Ivancich about 2 months ago

That's interesting, Mark!

So the bucket index is left in an unsynchronized state (i.e., original state) when bucket removal is terminated part-way through. And then when bucket removal is restarted, it begins by trying to re-remove those same objects at the head of the bucket index all over again, causing a delay before forward progress is made.

Since the bucket removal is generally expected to complete, there "should" be no need to update the bucket index at "check-points" during the bucket removal process.

If terminating bucket removal is semi-expected (possibly through manual admin intervention), it seems that updating the index after every 100,000 to 1,000,000 removed objects would mitigate this without creating a lot of overhead.
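The check-pointing idea could look roughly like the sketch below. It is not taken from the RGW code: `remove_with_checkpoints` and the `remove_one` and `checkpoint` callbacks are hypothetical placeholders for "delete one object" and "bring the bucket index up to date", and the interval is whatever value in the range above turns out to be cheap enough.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Removes `total` objects one by one and calls `checkpoint` after every
// `interval` removals, plus once at the end if a partial batch remains.
// Both callbacks are hypothetical stand-ins, not RGW functions.
// Returns the number of checkpoints taken.
std::size_t remove_with_checkpoints(
    std::size_t total, std::size_t interval,
    const std::function<void(std::size_t)>& remove_one,
    const std::function<void(std::size_t)>& checkpoint) {
  assert(interval > 0);
  std::size_t checkpoints = 0;
  for (std::size_t i = 0; i < total; ++i) {
    remove_one(i);
    if ((i + 1) % interval == 0) {
      checkpoint(i + 1);  // argument: objects removed so far
      ++checkpoints;
    }
  }
  if (total % interval != 0) {
    checkpoint(total);  // flush the final partial batch
    ++checkpoints;
  }
  return checkpoints;
}
```

If the process is killed, at most `interval` removals are lost from the index's point of view, so a restart would have to redo at most that much work before making forward progress.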

And would there be any benefit to removing the objects from back to front in the bucket index? In other words, is there an easy way to truncate the index of its tail members, making the update of the bucket index quick?

#7 Updated by Mark Kogan 14 days ago

  • Pull request ID set to 30174

#8 Updated by Mark Kogan 14 days ago

Update: found the source of the memory growth:

src/rgw/rgw_rados.cc

RGWObjState *RGWObjectCtx::get_state(const rgw_obj& obj) {
  RGWObjState *result;
  typename std::map<rgw_obj, RGWObjState>::iterator iter;
  lock.lock_shared();
  assert (!obj.empty());
  iter = objs_state.find(obj);
  if (iter != objs_state.end()) {
    result = &iter->second;
    lock.unlock_shared();
  } else {
    lock.unlock_shared();
    lock.lock();
    result = &objs_state[obj];    // <-- memory growth: operator[] inserts a new map entry for each object not yet cached
    lock.unlock();
  }
  return result;
}

Submitted a PR with a proposed fix.
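The growth pattern can be reproduced in isolation. The sketch below uses simplified, hypothetical stand-ins (`ObjState`, `ObjectCtx`, `drop_state`, `remove_all`) for the real RGW types, and the mitigation it shows (erasing the cached state once an object has been processed) is only one possible approach under those assumptions, not necessarily what the submitted PR does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for RGWObjState.
struct ObjState {
  bool exists = false;
  std::size_t size = 0;
};

// Hypothetical, simplified stand-in for RGWObjectCtx (no locking).
class ObjectCtx {
 public:
  // Like the quoted get_state(): operator[] inserts an entry on a miss.
  ObjState* get_state(const std::string& key) { return &states_[key]; }
  // Hypothetical helper: forget the cached state of one object.
  void drop_state(const std::string& key) { states_.erase(key); }
  std::size_t cached() const { return states_.size(); }

 private:
  std::map<std::string, ObjState> states_;
};

// Simulates removing `n` objects through one long-lived context and
// returns how many state entries are still cached afterwards.
std::size_t remove_all(ObjectCtx& ctx, std::size_t n, bool drop_after_use) {
  for (std::size_t i = 0; i < n; ++i) {
    const std::string key = "obj-" + std::to_string(i);
    ObjState* state = ctx.get_state(key);  // inserts a new entry
    state->exists = false;                 // pretend the object was deleted
    if (drop_after_use) {
      ctx.drop_state(key);  // keeps the map from growing with n
    }
  }
  return ctx.cached();
}
```

Without the erase, the map ends up holding one entry per removed object, which matches the steady ramp described in this issue; with it, the map stays empty between removals.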

#9 Updated by Eric Ivancich 14 days ago

  • Status changed from Verified to Need Test
  • Target version set to v15.0.0
  • Backport set to nautilus,mimic,luminous

#10 Updated by Abhishek Lekshmanan 7 days ago

  • Status changed from Need Test to Testing

#11 Updated by Eric Ivancich 4 days ago

  • Status changed from Testing to Pending Backport

#12 Updated by Nathan Cutler 3 days ago

  • Copied to Backport #41858: nautilus: memory usage of: radosgw-admin bucket rm added

#13 Updated by Nathan Cutler 3 days ago

  • Copied to Backport #41859: mimic: memory usage of: radosgw-admin bucket rm added

#14 Updated by Nathan Cutler 3 days ago

  • Copied to Backport #41860: luminous: memory usage of: radosgw-admin bucket rm added
