Bug #23994
mds: OSD space is not reclaimed until MDS is restarted
Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: mimic,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
With my Luminous test cluster on Ubuntu I ran into a situation where I filled up an OSD by putting files on CephFS, and deleting that data did not free the OSD space. Only restarting the MDS helped.
After filling the CephFS mount up until "no space left on device", I got:
# ceph status -w
  cluster:
    id:     f59e49e7-5539-42fd-8706-9b937a517c5a
    health: HEALTH_ERR
            1 full osd(s)
            7 pool(s) full
            Degraded data redundancy: 1094/1391724 objects degraded (0.079%), 25 pgs degraded
            Degraded data redundancy (low space): 26 pgs recovery_toofull
            mons ceph2,ceph3 are low on available space

  services:
    mon: 3 daemons, quorum ceph2,ceph3,ceph1
    mgr: ceph1(active), standbys: ceph3, ceph2
    mds: testfs-1/1/1 up {0=ceph3=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in
    rgw: 1 daemon active

  data:
    pools:   7 pools, 188 pgs
    objects: 453k objects, 5763 MB
    usage:   24897 MB used, 5616 MB / 30513 MB avail
    pgs:     1094/1391724 objects degraded (0.079%)
             162 active+clean
             25  active+recovery_toofull+degraded
             1   active+recovery_toofull
# ceph df
GLOBAL:
    SIZE       AVAIL     RAW USED     %RAW USED
    30513M     5616M     24897M       81.59
POOLS:
    NAME                    ID     USED       %USED      MAX AVAIL     OBJECTS
    .rgw.root               1      1446       100.00     0             5
    default.rgw.control     2      0          0          0             8
    default.rgw.meta        3      0          0          0             0
    default.rgw.log         4      0          0          0             207
    mytest                  5      180M       100.00     0             14
    cephfs_data             6      5522M      100.00     0             463177
    cephfs_metadata         7      62153k     100.00     0             497
# ceph health detail
HEALTH_ERR 1 full osd(s); 7 pool(s) full; Degraded data redundancy: 1094/1391724 objects degraded (0.079%), 25 pgs degraded; Degraded data redundancy (low space): 26 pgs recovery_toofull; mons ceph2,ceph3 are low on available space
OSD_FULL 1 full osd(s)
    osd.2 is full
POOL_FULL 7 pool(s) full
    pool '.rgw.root' is full (no space)
    pool 'default.rgw.control' is full (no space)
    pool 'default.rgw.meta' is full (no space)
    pool 'default.rgw.log' is full (no space)
    pool 'mytest' is full (no space)
    pool 'cephfs_data' is full (no space)
    pool 'cephfs_metadata' is full (no space)
PG_DEGRADED Degraded data redundancy: 1094/1391724 objects degraded (0.079%), 25 pgs degraded
    pg 1.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 1.5 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 1.7 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 2.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 2.1 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.3 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 2.4 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.5 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 2.6 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.7 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 5.0 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 5.1 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 5.2 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 5.3 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 5.4 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 5.5 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 5.6 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 7.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 7.2 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.3 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 7.f is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.10 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.11 is active+recovery_toofull+degraded, acting [2,1,0]
    pg 7.12 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 7.13 is active+recovery_toofull+degraded, acting [0,1,2]
PG_DEGRADED_FULL Degraded data redundancy (low space): 26 pgs recovery_toofull
    pg 1.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 1.5 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 1.7 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 2.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 2.1 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.3 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 2.4 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.5 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 2.6 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 2.7 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 5.0 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 5.1 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 5.2 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 5.3 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 5.4 is active+recovery_toofull+degraded, acting [2,0,1]
    pg 5.5 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 5.6 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 7.0 is active+recovery_toofull+degraded, acting [1,2,0]
    pg 7.2 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.3 is active+recovery_toofull+degraded, acting [0,1,2]
    pg 7.4 is active+recovery_toofull, acting [0,1,2]
    pg 7.f is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.10 is active+recovery_toofull+degraded, acting [0,2,1]
    pg 7.11 is active+recovery_toofull+degraded, acting [2,1,0]
    pg 7.12 is active+recovery_toofull+degraded, acting [1,0,2]
    pg 7.13 is active+recovery_toofull+degraded, acting [0,1,2]
MON_DISK_LOW mons ceph2,ceph3 are low on available space
    mon.ceph2 has 16% avail
    mon.ceph3 has 27% avail
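The health output reports 25 degraded PGs but 26 PGs in recovery_toofull, so one PG must be too full without being degraded. A minimal Python sketch for picking that PG out of pasted `ceph health detail` text might look like the following. The helper names (`parse_pg_states`, `toofull_not_degraded`) are hypothetical and not part of any Ceph tool, and the regular expression assumes the `pg X is STATE, acting [...]` line shape seen in this report.

```python
import re
from collections import Counter

# Three entries copied from the PG_DEGRADED_FULL section of this report.
SAMPLE = """\
pg 7.3 is active+recovery_toofull+degraded, acting [0,1,2]
pg 7.4 is active+recovery_toofull, acting [0,1,2]
pg 7.f is active+recovery_toofull+degraded, acting [0,2,1]
"""

# Assumed line shape: "pg <pgid> is <flag+flag+...>, acting [<osd>,<osd>,...]"
PG_LINE = re.compile(r"pg (\S+) is (\S+), acting \[([\d,]+)\]")


def parse_pg_states(text):
    """Map each PG id to the set of state flags reported for it."""
    return {m.group(1): set(m.group(2).split("+")) for m in PG_LINE.finditer(text)}


def toofull_not_degraded(states):
    """Return PG ids that are recovery_toofull but not degraded, sorted."""
    return sorted(
        pg
        for pg, flags in states.items()
        if "recovery_toofull" in flags and "degraded" not in flags
    )


if __name__ == "__main__":
    states = parse_pg_states(SAMPLE)
    # Tally how many PGs share each combination of flags.
    print(Counter("+".join(sorted(flags)) for flags in states.values()))
    print(toofull_not_degraded(states))  # ['7.4']
```

Because the pattern is matched with `finditer`, the same function also works on output that has been pasted as one long line.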
Running `rm -r` on the CephFS mount to delete all the data did not change that situation.
Only after a restart of the MDS service was the OSD no longer full and I got:
# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    30513M     21507M     9006M        29.52
POOLS:
    NAME                    ID     USED       %USED     MAX AVAIL     OBJECTS
    .rgw.root               1      1446       0         5140M         5
    default.rgw.control     2      0          0         5140M         8
    default.rgw.meta        3      0          0         5140M         0
    default.rgw.log         4      0          0         5140M         207
    mytest                  5      180M       0         5140M         14
    cephfs_data             6      206M       3.86      5140M         461526
    cephfs_metadata         7      58136k     0.58      5140M         496
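To put numbers on the difference between the two `ceph df` outputs, a small Python sketch can parse the CephFS pool rows and subtract them. The rows below are copied from this report; the function names are hypothetical, and treating the `k`/`M` suffixes as 1024-based multipliers is an assumption about how this Ceph release prints sizes.

```python
# Pool rows (NAME ID USED %USED MAX_AVAIL OBJECTS) copied from this report.
BEFORE = """\
cephfs_data 6 5522M 100.00 0 463177
cephfs_metadata 7 62153k 100.00 0 497
"""
AFTER = """\
cephfs_data 6 206M 3.86 5140M 461526
cephfs_metadata 7 58136k 0.58 5140M 496
"""

# Assumption: size suffixes are binary multiples.
UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value):
    """Convert a size such as '5522M', '62153k' or '1446' to bytes."""
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def parse_pools(text):
    """Return {pool name: {'used': bytes, 'objects': count}} for six-column rows."""
    pools = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip headers and anything that is not a pool row
        name, _pool_id, used, _pct_used, _max_avail, objects = fields
        pools[name] = {"used": to_bytes(used), "objects": int(objects)}
    return pools


def reclaimed(before, after):
    """Per-pool drop in used bytes and object count between two snapshots."""
    return {
        name: {
            "used": before[name]["used"] - after[name]["used"],
            "objects": before[name]["objects"] - after[name]["objects"],
        }
        for name in before
        if name in after
    }


if __name__ == "__main__":
    diff = reclaimed(parse_pools(BEFORE), parse_pools(AFTER))
    for name, d in diff.items():
        print(f"{name}: {d['used'] // 1024**2} MiB freed, {d['objects']} fewer objects")
```

On these figures, cephfs_data drops by 5316M of used space after the MDS restart, while its object count falls by only 1651 (463177 to 461526).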