Bug #8438 (closed)

erasure code: objects are not cleaned up

Added by Sébastien Han almost 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: Firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While playing with erasure coding, I noticed that erasure-coded objects are not cleaned up after deletion.
My setup:

  • Debian Wheezy 7.4 with Kernel 3.10
  • Ceph 0.80.1

How to reproduce:

$ sudo ceph osd pool create ec 1024 1024 erasure default
$ dd if=/dev/zero of=ec bs=100M count=1
$ sudo ceph -s
cluster 92775394-e114-40f5-8789-cd96220579f0
health HEALTH_OK
monmap e1: 1 mons at {ceph001=10.143.114.185:6789/0}, election epoch 2, quorum 0 ceph001
mdsmap e4: 1/1/1 up {0=ceph002=up:active}, 4 up:standby
osdmap e50: 29 osds: 29 up, 29 in
pgmap v118: 2240 pgs, 5 pools, 1884 bytes data, 20 objects
1765 MB used, 26771 GB / 26773 GB avail
2240 active+clean
$ sudo ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
26773G 26771G 1765M 0
POOLS:
NAME ID USED %USED OBJECTS
data 0 0 0 0
metadata 1 1884 0 20
rbd 2 0 0 0
leseb 3 0 0 0
ec 4 0 0 0
$ rados -p ec put ec ec

$ sudo ceph -s
cluster 92775394-e114-40f5-8789-cd96220579f0
health HEALTH_OK
monmap e1: 1 mons at {ceph001=10.143.114.185:6789/0}, election epoch 2, quorum 0 ceph001
mdsmap e4: 1/1/1 up {0=ceph002=up:active}, 4 up:standby
osdmap e50: 29 osds: 29 up, 29 in
pgmap v119: 2240 pgs, 5 pools, 100 MB data, 21 objects
1796 MB used, 26771 GB / 26773 GB avail
2240 active+clean
client io 359 kB/s wr, 0 op/s
$ sudo ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
26773G 26771G 1796M 0
POOLS:
NAME ID USED %USED OBJECTS
data 0 0 0 0
metadata 1 1884 0 20
rbd 2 0 0 0
leseb 3 0 0 0
ec 4 102400k 0 1

Everything is fine.
Now deleting the object:

$ rados -p ec rm ec

$ sudo ceph -s
cluster 92775394-e114-40f5-8789-cd96220579f0
health HEALTH_OK
monmap e1: 1 mons at {ceph001=10.143.114.185:6789/0}, election epoch 2, quorum 0 ceph001
mdsmap e4: 1/1/1 up {0=ceph002=up:active}, 4 up:standby
osdmap e50: 29 osds: 29 up, 29 in
pgmap v126: 2240 pgs, 5 pools, 1884 bytes data, 20 objects
2322 MB used, 26770 GB / 26773 GB avail
2240 active+clean
client io 0 B/s wr, 0 op/s

$ sudo ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
26773G 26770G 2322M 0
POOLS:
NAME ID USED %USED OBJECTS
data 0 0 0 0
metadata 1 1884 0 20
rbd 2 0 0 0
leseb 3 0 0 0
ec 4 0 0 0

Everything looks good, however:

$ sudo ceph osd map ec ec
osdmap e50 pool 'ec' (4) object 'ec' -> pg 4.f79165da (4.1da) -> up ([28,26,11], p28) acting ([28,26,11], p28)

root@ceph003:/var/lib/ceph/osd/ceph-11/current/4.1das2_head# ls -alh
total 51M
drwxr-xr-x 2 root root 38 May 26 16:48 .
drwxr-xr-x 245 root root 8.0K May 26 16:04 ..
-rw-r--r-- 1 root root 50M May 26 16:46 ec__head_F79165DA__4_80_2

root@ceph006:/var/lib/ceph/osd/ceph-28/current/4.1das0_head# ls -lah
total 51M
drwxr-xr-x 2 root root 38 May 26 16:48 .
drwxr-xr-x 248 root root 8.0K May 26 16:04 ..
-rw-r--r-- 1 root root 50M May 26 16:46 ec__head_F79165DA__4_80_0

root@ceph005:/var/lib/ceph/osd/ceph-26/current/4.1das1_head# ls -alh
total 51M
drwxr-xr-x 2 root root 38 May 26 16:48 .
drwxr-xr-x 231 root root 8.0K May 26 16:04 ..
-rw-r--r-- 1 root root 50M May 26 16:46 ec__head_F79165DA__4_80_1
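
For what it's worth, the three ~50 MB shard files are consistent with the default erasure-code profile for this pool (two data chunks plus one coding chunk for a 100 MB object); the profile used at pool creation can be checked with something like:

$ sudo ceph osd erasure-code-profile get default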

Am I missing something?
Btw I tried with a replicated pool and I don't have this problem.

Cheers.


Related issues 1 (0 open, 1 closed)

Related to Ceph - Feature #8389: osd: clean up old ec objects more aggressively (Resolved, Samuel Just)

Actions #1

Updated by Sébastien Han almost 10 years ago

Just tried to run a deep-scrub but this didn't trigger any deletion...

$ sudo ceph pg deep-scrub 4.1da
instructing pg 4.1da on osd.28 to deep-scrub

Actions #2

Updated by Samuel Just almost 10 years ago

This may be correct-ish behavior. The ec__head_F79165DA__4_80_1 file/object will be removed when the log entry at version 80 is removed from the pg log. Unfortunately, pgs by default keep 3k log entries, so these old object versions tend to stick around for a while.

We can probably trim these entries up to last_complete instead of log_tail.
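
For reference, the current pg log length and the relevant defaults can be inspected with something like the following (osd.28 is the primary from the osd map output above; the daemon commands have to be run on that OSD's host):

$ sudo ceph pg dump pgs | grep '^4\.1da'   # the log column is the number of pg log entries
$ sudo ceph daemon osd.28 config get osd_min_pg_log_entries
$ sudo ceph daemon osd.28 config get osd_max_pg_log_entries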

Actions #3

Updated by Samuel Just almost 10 years ago

  • Status changed from New to 12
Actions #4

Updated by Samuel Just almost 10 years ago

  • Status changed from 12 to In Progress
  • Assignee set to Samuel Just

Oops, int

Actions #5

Updated by Sébastien Han almost 10 years ago

Why is the behaviour different with replicated pools? The object's deletion is instantaneous.

OK, so let's imagine I remove 10 GB of objects and they stay around for a while. I believe I could end up in a situation where the disk is near-full or full, but Ceph thinks there is still space available.

Is this assumption correct?

Is it a problem to decrease this value? https://github.com/ceph/ceph/blob/firefly/src/common/config_opts.h#L527
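
Assuming the option behind that link is osd_min_pg_log_entries, I guess it could be lowered at runtime with something like this (500 is just an example value, not a recommendation):

$ sudo ceph tell osd.* injectargs '--osd_min_pg_log_entries 500'

or persistently in ceph.conf:

[osd]
osd min pg log entries = 500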

Actions #6

Updated by Samuel Just almost 10 years ago

  • Status changed from In Progress to 7
Actions #7

Updated by Samuel Just almost 10 years ago

  • Status changed from 7 to Fix Under Review
Actions #8

Updated by Samuel Just almost 10 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #9

Updated by Samuel Just almost 10 years ago

  • Backport set to Firefly
Actions #10

Updated by Sage Weil over 9 years ago

  • Status changed from Pending Backport to Resolved