Bug #18331 (closed)

RGW leaking data

Added by Matas Tvarijonas over 7 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
RGW leak data
Backport:
jewel, kraken
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello, we are leaking data at about 50 GB/hour on our Ceph 10.2.2 cluster.

We use this command to calculate the real usage:
  1. radosgw-admin bucket stats | grep '"size_kb":' | awk '{print $2}' | sed 's/.$//' | paste -sd+ | bc

24912659802

That is about 28 TB of used storage, and our replication factor is 3, so it should come to roughly 90 TB of RAW storage.
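
For context, a minimal sketch of how that sum is turned into an expected RAW figure (this just repeats the pipeline above; it assumes size_kb is KiB and that the 3x multiplier matches the pool's replication factor):

  # sum "size_kb" across all buckets (as above), then convert KiB -> TiB
  # and multiply by the replication factor (3 here) to get the expected RAW usage
  radosgw-admin bucket stats \
    | grep '"size_kb":' | awk '{print $2}' | sed 's/.$//' | paste -sd+ | bc \
    | awk '{printf "logical: %.1f TiB, expected RAW (x3): %.1f TiB\n", $1/2^30, 3*$1/2^30}'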

And we check the pool usage via:

  1. ceph df detail
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED OBJECTS
    507T 194T 312T 61.63 41756k
    POOLS:
    NAME ID CATEGORY QUOTA OBJECTS QUOTA BYTES USED %USED MAX AVAIL OBJECTS DIRTY READ WRITE RAW USED
    data 0 - N/A N/A 240G 0.14 15492G 61613 61613 411k 194k 722G
    .rgw.root 13 - N/A N/A 3397 0 15492G 16 15 26007k 928 10191
    .rgw.control 14 - N/A N/A 0 0 15492G 8 3 0 0 0
    .rgw 15 - N/A N/A 297k 0 15492G 1273 1273 106M 21640k 891k
    .rgw.gc 16 - N/A N/A 0 0 15492G 512 492 374M 450M 0
    .users.uid 17 - N/A N/A 31695 0 15492G 145 145 72228k 22828k 95085
    .users 18 - N/A N/A 2425 0 15492G 80 79 81628 224 7275
    .users.email 19 - N/A N/A 1861 0 15492G 49 49 21 88 5583
    .rgw.buckets.index 20 - N/A N/A 0 0 15492G 979 979 1595M 744M 0
    .log 21 - N/A N/A 2423 0 15492G 321 320 39877k 103M 7269
    .rgw.buckets 22 - N/A N/A 74569G 43.09 15492G 34667278 33854k 4485M 1336M 218T
    cinder_backups 25 - N/A N/A 0 0 15492G 0 0 0 0 0
    cinder_volumes 26 - N/A N/A 18664G 10.78 15492G 4720798 4610k 15087M 335G 55992G
    nova_root 30 - N/A N/A 9349G 5.40 15492G 1259549 1230k 27544M 161G 28047G
    glance_images 33 - N/A N/A 3102G 1.79 15492G 397580 388k 58006k 947k 9306G
    migration 35 - N/A N/A 136 0 15492G 2 2 5 2 408
    .users.swift 37 - N/A N/A 107 0 15492G 8 8 6 14 321
    default.rgw.meta 38 - N/A N/A 475M 0 15492G 1648259 1609k 0 5494k 1425M
    default.rgw.buckets.non-ec 39 - N/A N/A 0 0 15492G 15 15 18064 315k 0
    .rgw.root.161220 40 - N/A N/A 7315 0 15492G 21 21 0 21 21945

So:

.rgw.buckets 22 - N/A N/A 74569G 43.09 15492G 34667278 33854k 4485M 1336M 218T

That is 218 TB of used RAW space, while the bucket stats predict only ~90 TB, so we are missing ~130 TB of RAW space.
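
As a rough sanity check of that gap, using only the figures quoted above (a sketch, not an exact accounting):

  # expected RAW = bucket data (~28 TB) x replication factor 3, compared with the
  # 218 TB that .rgw.buckets reports, giving the ~130 TB discrepancy
  echo "expected: $(echo '28 * 3' | bc) TB  reported: 218 TB  missing: $(echo '218 - 28 * 3' | bc) TB"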

We are currently leaking about 50 GB per hour.

A very similar case is discussed here: http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-November/045037.html

Please look into this case.
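
For reference, a hedged sketch of the orphan scan that is typically used on jewel to enumerate RADOS objects in the RGW data pool that are no longer referenced by any bucket index (the pool name comes from the output above; the job id is only an example):

  # scan the RGW data pool for unreferenced RADOS objects
  # (--pool and --job-id are example values; adjust to the actual data pool)
  radosgw-admin orphans find --pool=.rgw.buckets --job-id=leak-check
  # list scan jobs and clean up the intermediate scan data afterwards
  radosgw-admin orphans list-jobs
  radosgw-admin orphans finish --job-id=leak-check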


Related issues (3 total: 0 open, 3 closed)

Has duplicate: rgw - Bug #18258: rgw: radosgw-admin orphan find goes into infinite loop (Duplicate, Matt Benjamin, 12/15/2016)

Copied to: rgw - Backport #18827: jewel: RGW leaking data (Resolved, Matt Benjamin)
Copied to: rgw - Backport #19047: kraken: RGW leaking data (Resolved, Nathan Cutler)