Bug #21809
Raw used space is 70x higher than the space actually used (possibly orphaned objects from pool deletion)
Description
Hi,
I had a pool named vm-0d29db27 which used roughly 3 TB of storage.
I wanted to purge this pool and adjust its PG count, so I deleted the entire pool.
After a few minutes I recreated the pool and checked the cluster usage:
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
17472G 16731G 740G 4.24
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
vm-0d29db27 3 8317M 0.05 5149G 2086
os 4 1259M 0 5149G 323
As you can see, 740 GB are reported as raw used space, but in fact only about 10 GB are used by the pools.
It seems to me that the deletion was somehow aborted.
Now - how can I free up this orphaned space?
I'm using:
CentOS Linux release 7.4.1708 (Core)
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
I'm using BlueStore, with the WAL and block.db on a separate SSD for each OSD disk.
No erasure coding is used.
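For completeness, the deletion and recreation described above were presumably done along these lines with the standard CLI (a sketch; the exact pg_num is omitted here, and on Luminous mon_allow_pool_delete must be enabled for the delete to go through):

ceph osd pool delete vm-0d29db27 vm-0d29db27 --yes-i-really-really-mean-it
ceph osd pool create vm-0d29db27 <pg_num>
ceph df    # re-check cluster usage afterwards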
Here is some additional information:
Pools:
3 vm-0d29db27
4 os
Cluster:
cluster:
id: ***
health: HEALTH_OK
services:
mon: 3 daemons, quorum inf-d7a3ca,inf-30d985,inf-0a38f9
mgr: inf-0a38f9(active), standbys: inf-d7a3ca, inf-30d985
osd: 6 osds: 6 up, 6 in
rbd-mirror: 1 daemon active
data:
pools: 2 pools, 128 pgs
objects: 2409 objects, 9577 MB
usage: 740 GB used, 16731 GB / 17472 GB avail
pgs: 128 active+clean
OSD Usage:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
1 hdd 2.91499 1.00000 2984G 196G 2788G 6.59 1.55 69
2 hdd 2.91499 1.00000 2984G 195G 2789G 6.56 1.55 59
4 hdd 2.81070 1.00000 2878G 91998M 2788G 3.12 0.74 67
5 hdd 2.81070 1.00000 2878G 91325M 2789G 3.10 0.73 61
0 hdd 2.80579 1.00000 2873G 86100M 2789G 2.93 0.69 56
3 hdd 2.80579 1.00000 2873G 86986M 2788G 2.96 0.70 72
TOTAL 17472G 740G 16731G 4.24
MIN/MAX VAR: 0.69/1.55 STDDEV: 1.68
OSD Pool Mapping:
pool : 4 3 | SUM
--------------------------------
osd.4 34 33 | 67
osd.5 30 31 | 61
osd.0 26 30 | 56
osd.1 33 36 | 69
osd.2 31 28 | 59
osd.3 38 34 | 72
--------------------------------
SUM : 192 192 |
Regards,
Yves
History
#1 Updated by Sage Weil over 6 years ago
- Status changed from New to Need More Info
Deleting old PGs is asynchronous; unfortunately, I don't think there are metrics reported that let you observe this globally. I suspect that is the actual bug here. Can you check again and see whether the raw used space has dropped now?
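For reference, the usage figures quoted in this ticket come from the standard status commands, so re-checking comes down to re-running them:

ceph df        # GLOBAL raw used vs. per-pool USED, as pasted above
ceph osd df    # per-OSD USE/AVAIL and variance
ceph -s        # overall cluster usage summary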
#2 Updated by Yves Vogl over 6 years ago
Hi Sage,
unfortunately, the raw used space has not dropped. It's at 2775 GB while the two pools together hold about 700 GB. The pools use a replica count of 2 and there are no snapshots, so no more than roughly 1400 GB should be in use.
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
17472G 14697G 2775G 15.89
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
vm-0d29db27 3 687G 4.94 4405G 176441
os 4 1267M 0 4405G 328
cluster:
id: **
health: HEALTH_OK
services:
mon: 3 daemons, quorum inf-d7a3ca,inf-30d985,inf-0a38f9
mgr: inf-0a38f9(active), standbys: inf-d7a3ca, inf-30d985
osd: 6 osds: 6 up, 6 in
rbd-mirror: 1 daemon active
data:
pools: 2 pools, 128 pgs
objects: 172k objects, 688 GB
usage: 2775 GB used, 14697 GB / 17472 GB avail
pgs: 128 active+clean
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
1 hdd 2.91499 1.00000 2984G 577G 2407G 19.35 1.22 69
2 hdd 2.91499 1.00000 2984G 493G 2491G 16.53 1.04 59
4 hdd 2.81070 1.00000 2878G 440G 2437G 15.30 0.96 67
5 hdd 2.81070 1.00000 2878G 416G 2461G 14.49 0.91 61
0 hdd 2.80579 1.00000 2873G 402G 2470G 14.01 0.88 56
3 hdd 2.80579 1.00000 2873G 444G 2428G 15.48 0.97 72
TOTAL 17472G 2775G 14697G 15.89
MIN/MAX VAR: 0.88/1.22 STDDEV: 1.75
#3 Updated by Sage Weil over 6 years ago
Can you go into the /var/lib/ceph/osd/ceph-NNN/current directory on one of the OSDs, run 'du -hs *', and attach the output? Also please share 'ceph osd dump' so we can see how many OSDs there are and what the pool IDs are. Thanks!
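Spelled out, the requested commands are the following (NNN is the OSD id; note that the 'current' directory only exists on FileStore OSDs, which is what the next reply points out):

cd /var/lib/ceph/osd/ceph-NNN/current && du -hs *
ceph osd dump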
#4 Updated by Yves Vogl over 6 years ago
I'm not sure if this helps, as I'm using BlueStore:
[root@inf-30d985 ceph-mirror-4]# ls -al
total 56
drwxr-xr-x. 2 ceph ceph 271 Oct 14 11:31 .
drwxr-x---. 4 ceph ceph 48 Oct 14 11:32 ..
-rw-r--r--. 1 root root 438 Oct 14 11:31 activate.monmap
-rw-r--r--. 1 ceph ceph 3 Oct 14 11:31 active
lrwxrwxrwx. 1 ceph ceph 58 Oct 14 11:29 block -> /dev/disk/by-partuuid/6ad2178e-ed6b-489a-a567-5cd8899c3456
lrwxrwxrwx. 1 ceph ceph 9 Oct 14 11:29 block.db -> /dev/sda4
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:29 block.db_uuid
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:29 block_uuid
-rw-r--r--. 1 ceph ceph 2 Oct 14 11:31 bluefs
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:29 ceph_fsid
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:29 fsid
-rw-------. 1 ceph ceph 56 Oct 14 11:31 keyring
-rw-r--r--. 1 ceph ceph 8 Oct 14 11:31 kv_backend
-rw-r--r--. 1 ceph ceph 21 Oct 14 11:29 magic
-rw-r--r--. 1 ceph ceph 4 Oct 14 11:31 mkfs_done
-rw-r--r--. 1 ceph ceph 6 Oct 14 11:31 ready
-rw-r--r--. 1 ceph ceph 0 Oct 14 11:31 systemd
-rw-r--r--. 1 ceph ceph 10 Oct 14 11:29 type
-rw-r--r--. 1 ceph ceph 2 Oct 14 11:31 whoami
[root@inf-30d985 ceph-mirror-4]# du -hs *
4.0K activate.monmap
4.0K active
0 block
0 block.db
4.0K block.db_uuid
4.0K block_uuid
4.0K bluefs
4.0K ceph_fsid
4.0K fsid
4.0K keyring
4.0K kv_backend
4.0K magic
4.0K mkfs_done
4.0K ready
0 systemd
4.0K type
4.0K whoami
[root@inf-30d985 ceph-mirror-4]# cd ..
[root@inf-30d985 osd]# cd ceph-mirror-5/
[root@inf-30d985 ceph-mirror-5]# ls -al
total 56
drwxr-xr-x. 2 ceph ceph 271 Oct 14 11:32 .
drwxr-x---. 4 ceph ceph 48 Oct 14 11:32 ..
-rw-r--r--. 1 root root 438 Oct 14 11:31 activate.monmap
-rw-r--r--. 1 ceph ceph 3 Oct 14 11:32 active
lrwxrwxrwx. 1 ceph ceph 58 Oct 14 11:30 block -> /dev/disk/by-partuuid/b85aeb01-09b1-44fe-9da2-3cdafd680019
lrwxrwxrwx. 1 ceph ceph 9 Oct 14 11:30 block.db -> /dev/sdb4
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:30 block.db_uuid
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:30 block_uuid
-rw-r--r--. 1 ceph ceph 2 Oct 14 11:31 bluefs
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:30 ceph_fsid
-rw-r--r--. 1 ceph ceph 37 Oct 14 11:30 fsid
-rw-------. 1 ceph ceph 56 Oct 14 11:31 keyring
-rw-r--r--. 1 ceph ceph 8 Oct 14 11:31 kv_backend
-rw-r--r--. 1 ceph ceph 21 Oct 14 11:30 magic
-rw-r--r--. 1 ceph ceph 4 Oct 14 11:32 mkfs_done
-rw-r--r--. 1 ceph ceph 6 Oct 14 11:32 ready
-rw-r--r--. 1 ceph ceph 0 Oct 14 11:32 systemd
-rw-r--r--. 1 ceph ceph 10 Oct 14 11:30 type
-rw-r--r--. 1 ceph ceph 2 Oct 14 11:31 whoami
[root@inf-30d985 ceph-mirror-5]# du -hs *
4.0K activate.monmap
4.0K active
0 block
0 block.db
4.0K block.db_uuid
4.0K block_uuid
4.0K bluefs
4.0K ceph_fsid
4.0K fsid
4.0K keyring
4.0K kv_backend
4.0K magic
4.0K mkfs_done
4.0K ready
0 systemd
4.0K type
4.0K whoami
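Since a BlueStore OSD has no per-PG directories to run du against, a rough equivalent is to query the OSD's admin socket for its BlueStore space counters; a minimal sketch, assuming the Luminous counter names bluestore_allocated and bluestore_stored, run on the host of the OSD in question:

# bytes allocated on the block device vs. bytes of user data actually stored
ceph daemon osd.4 perf dump | grep -E '"bluestore_(allocated|stored)"'

Summing these across the OSDs and comparing against the per-pool USED figures should show whether the unaccounted space is really sitting at the BlueStore level.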
#5 Updated by Yves Vogl over 6 years ago
To be clear: I'm not sure whether the listing above helps, given that these are BlueStore OSDs.
#6 Updated by Greg Farnum over 6 years ago
- Project changed from RADOS to bluestore
- Category deleted (Performance/Resource Usage)
#7 Updated by Sage Weil about 6 years ago
- Status changed from Need More Info to Can't reproduce