Bug #44731: Space leak in Bluestore - bluestore - Ceph

Added by Vitaliy Filippov about 4 years ago. Updated over 3 years ago.

Status: Closed

Priority: Normal

Severity: 2 - major

Affected Versions: Ceph - v14.2.8

Description

Hi.

I'm experiencing some kind of space leak in Bluestore. I use EC, compression and snapshots. At first I thought the leak was caused by "virtual clones" (issue #38184). However, I then got rid of most of the snapshots and the problem persisted.

I first suspected something was wrong when I added a new disk to the cluster and the cluster's free space didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from osd11,6,0 to osd6,0,7 and then back to osd11,6,0.

It ate +59 GB after the first move and +51 GB after the second. As I understand it, this proves that it's not #38184: devirtualization of virtual clones couldn't eat additional space after the SECOND rebalance of the same PG.

The PG has ~39000 objects, it is EC 2+1 and compression is enabled. The compression ratio is ~2.7 in my setup, so the PG should use ~90 GB of raw space.
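The ~90 GB estimate above can be reproduced with a back-of-the-envelope calculation. The sketch below is illustrative only: the 4 MiB object size is an assumption (it is not stated in this report), and `expected_raw_bytes` is a hypothetical helper, not a Ceph API.

```python
def expected_raw_bytes(logical_bytes: float, compression_ratio: float,
                       k: int, m: int) -> float:
    """Estimate raw usage of erasure-coded (k+m) data after compression."""
    compressed = logical_bytes / compression_ratio
    # EC k+m stores (k + m) / k bytes of raw data per byte of payload.
    return compressed * (k + m) / k


GB = 10 ** 9
MIB = 1024 ** 2

objects = 39_000           # from the report
object_size = 4 * MIB      # ASSUMPTION: not stated in the report
logical = objects * object_size

with_compression = expected_raw_bytes(logical, 2.7, k=2, m=1)
without_compression = expected_raw_bytes(logical, 1.0, k=2, m=1)

print(f"raw usage with compression:    {with_compression / GB:.1f} GB")
print(f"raw usage without compression: {without_compression / GB:.1f} GB")
```

Under these assumptions the compressed estimate comes out at roughly 91 GB, in line with the ~90 GB figure, while the same data stored without compression would need roughly 245 GB.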

Before and after moving the PG I stopped osd0, mounted it with ceph-objectstore-tool with debug bluestore = 20/20 and opened the 34.1a***/all directory. It seems to dump all object extents into the log in that case, so I now have two logs with all allocated extents for osd0 (I hope all extents are there). I parsed both logs and added up all compressed blob sizes ("get_ref Blob ... 0x20000 -> 0x... compressed"). They add up to ~39 GB before the first rebalance (34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second move (34.1as2), which doesn't indicate a leak.

But the raw space usage still exceeds the initial usage by a lot, so it's clear that there's a leak somewhere.
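The log summation described above could be sketched roughly as follows. This is not the script that was actually used: the regular expression is an assumption built only from the fragment quoted above ("get_ref Blob ... 0x20000 -> 0x... compressed"), the real log lines contain more fields, and the sample lines are made up for illustration.

```python
import re
from typing import Iterable, Tuple

# ASSUMED pattern: a line mentioning get_ref, then "0x<logical> -> 0x<compressed>",
# then the word "compressed". Adjust it to the real log format.
BLOB_RE = re.compile(
    r"get_ref\b.*?\b0x([0-9a-fA-F]+) -> 0x([0-9a-fA-F]+)\b.*\bcompressed\b"
)


def sum_compressed_blobs(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total logical bytes, total compressed bytes) over matching lines."""
    logical_total = 0
    compressed_total = 0
    for line in lines:
        match = BLOB_RE.search(line)
        if match is None:
            continue  # not a compressed-blob line
        logical_total += int(match.group(1), 16)
        compressed_total += int(match.group(2), 16)
    return logical_total, compressed_total


# Made-up sample lines, shaped like the quoted fragment only.
sample = [
    "get_ref Blob ... 0x20000 -> 0x7000 compressed",
    "get_ref Blob ... 0x20000 -> 0x9000 compressed",
    "some unrelated log line",
]
logical_total, compressed_total = sum_compressed_blobs(sample)
print(logical_total, compressed_total)
```

For a real log, the same function can be fed an open file object instead of the sample list. If one blob can show up in more than one matching line, the totals would overcount; this sketch does not check for that.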

What additional details can I provide for you to identify the bug?

Updated by Greg Farnum about 4 years ago

Project changed from Ceph to RADOS

Updated by Neha Ojha about 4 years ago

Project changed from RADOS to bluestore

Updated by Vitaliy Filippov over 3 years ago

An update here: this was caused by broken compression in Ubuntu builds around 14.2.6 through 14.2.8 or so. Old data was compressed, but it was rewritten uncompressed when buggy OSDs rebalanced it. This resulted in an apparent "leak". Now the issue is gone.

Updated by Neha Ojha over 3 years ago

Status changed from New to Closed

Doesn't look like a Ceph issue.
