Bug #277: "No space left on device", while fs is not full
Status: Closed
Description
While doing a rsync of kernel.org i got the message that there was no space left on the device:
receiving incremental file list
rsync: opendir "/RCS" (in pub) failed: Permission denied (13)
linux/kernel/v2.6/testing/v2.6.31/
linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz
           0   0%    0.00kB/s    0:00:00
rsync: write failed on "/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(302) [receiver=3.0.7]
rsync: connection unexpectedly closed (182 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]

root@client02:~# touch^C
root@client02:~# dd if=/dev/zero of=/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin bs=1024k count=100
dd: writing `/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin': No space left on device
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00167821 s, 0.0 kB/s

root@client02:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.2G  2.6G  6.2G  30% /
none                  1.5G  160K  1.5G   1% /dev
none                  1.5G     0  1.5G   0% /dev/shm
none                  1.5G   60K  1.5G   1% /var/run
none                  1.5G     0  1.5G   0% /var/lock
none                  1.5G     0  1.5G   0% /lib/init/rw
/dev/sda5              65G  634M   61G   2% /var/log/ceph
[2001:16f8:10:2::c3c3:3f9b],[2001:16f8:10:2::c3c3:2e5c],[109.72.85.37]:/
                      6.3T  1.5T  4.9T  23% /mnt/ceph

root@client02:~# ceph -s
10.07.14_17:01:25.969799 7fe6254b1710 monclient(hunting): found mon1
10.07.14_17:01:26.001553    pg v39725: 7952 pgs: 7952 active+clean; 485 GB data, 1463 GB used, 4920 GB / 6383 GB avail
10.07.14_17:01:26.027460   mds e33: 1/1/1 up {0=up:active}, 1 up:standby(laggy or crashed), 1 up:standby
10.07.14_17:01:26.027555   osd e251: 30 osds: 30 up, 30 in
10.07.14_17:01:26.027896   log 10.07.14_16:41:03.588470 mds0 2001:16f8:10:2::c3c3:3f9b:6800/22606 9636 : [WRN] client22076 released lease on dn 100000738ec/linux-2.6.31-rc5.tar.bz2 which dne
10.07.14_17:01:26.028042   mon e1: 2 mons at 2001:16f8:10:2::c3c3:3f9b:6789/0 2001:16f8:10:2::c3c3:2e5c:6789/0
root@client02:~#

root@client02:~# find /mnt/ceph -type f|wc -l
447424
root@client02:~#
As you can see, there is more than enough space left.
Tried unmounting, removing the module, and mounting again, no success.
Even tried a restart of the MDSes, no success either; after a successful recovery, I still got the message. Same goes for the monitors.
Tried removing a file and running a "sync" afterwards, but that failed: the sync blocked and went into state "D".
Haven't got any log files to go on.
During these tests my replication level was set to 3 for the "data" and "metadata" pools.
While the filesystem reported that it was full, RADOS didn't: creating new RADOS objects was still possible (through the S3 gateway).
Updated by Yehuda Sadeh almost 14 years ago
It would be nice to have the kernel log to see where this error code is coming from. Might it be that one of the OSDs got filled up?
Updated by Wido den Hollander almost 14 years ago
I checked dmesg, but there were no messages about that (actually, no messages at all).
The disk space of the OSDs is one thing I checked, and that was sufficient too; the only one that got filled up pretty far is osd20 (see #279), but even that one still had some free space.
Updated by Sage Weil almost 14 years ago
Can you dump your osdmap (ceph osd getmap -o /tmp/foo) and post it here? There are 'fs is full' and 'fs is almost full' flags in the osdmap that may be set...
Updated by Greg Farnum almost 14 years ago
Also, the full flag is (at present) set once a single OSD reaches 95% of its disk space. Actually letting an OSD fill up completely could be very bad, so we leave a little margin.
And indeed it appears the RADOS interface just ignores those flags when doing writes, while the filesystem clients check them. Really this just illustrates that we need to come up with a coherent strategy for handling full disks.
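The flag behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not Ceph's actual code: the function name and data layout are hypothetical, and only the "one OSD past 95% marks the cluster full" rule comes from the comment.

```python
# Simplified sketch of the cluster "full" flag logic described above.
# Hypothetical names and structure; only the 95% per-OSD rule is from
# the discussion. Not Ceph's actual implementation.

FULL_RATIO = 0.95  # one OSD past this ratio marks the whole cluster full


def cluster_full(osd_usage):
    """osd_usage: dict mapping osd id -> (used_bytes, total_bytes)."""
    return any(used / total >= FULL_RATIO for used, total in osd_usage.values())


# osd20 from this report: 52 GB used of a 54 GB disk (~96%), so the
# flag gets set even though the other OSDs have plenty of free space.
usage = {
    20: (52 * 2**30, 54 * 2**30),
    21: (48 * 2**30, 462 * 2**30),
    22: (60 * 2**30, 462 * 2**30),
}
print(cluster_full(usage))  # True: osd20 alone trips the flag
```

This also matches the symptom in the report: the flag is cluster-wide, so the filesystem client refuses writes even though `df` shows terabytes of aggregate free space.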
Updated by Wido den Hollander almost 14 years ago
Since my cluster is in bad shape at the moment, my osdmap won't be worth much, but here it is:
root@node14:~# osdmaptool --print osdmap
osdmaptool: osdmap file 'osdmap'
epoch 8213
fsid aff59923-6265-6097-e493-1880c0c451e3
created 10.07.13_09:53:03.670862
modifed 10.07.15_23:26:00.033949
pg_pool 0 'data' pg_pool(rep pg_size 3 crush_ruleset 0 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 76 owner 0)
pg_pool 1 'metadata' pg_pool(rep pg_size 3 crush_ruleset 1 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 19 owner 0)
pg_pool 2 'casdata' pg_pool(rep pg_size 2 crush_ruleset 2 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 3 'rbd' pg_pool(rep pg_size 2 crush_ruleset 3 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 4 '.rgw' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 9 owner 0)
pg_pool 5 '.users' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 11 owner 0)
pg_pool 6 '.users.email' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 13 owner 0)
pg_pool 7 'wido' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 15 owner 0)
max_osd 30
osd0 in weight 1 up (up_from 7803 up_thru 8208 down_at 7786 last_clean 7698-7802) 2001:16f8:10:2::c3c3:8f6b:6800/28700 2001:16f8:10:2::c3c3:8f6b:6801/28700
osd1 in weight 1 up (up_from 7803 up_thru 8207 down_at 7786 last_clean 7697-7802) 2001:16f8:10:2::c3c3:8f6b:6802/28791 2001:16f8:10:2::c3c3:8f6b:6803/28791
osd2 in weight 1 down (up_from 8173 up_thru 8173 down_at 8194 last_clean 7700-7801)
osd3 in weight 1 up (up_from 7803 up_thru 8203 down_at 7786 last_clean 7705-7802) 2001:16f8:10:2::c3c3:8f6b:6806/28981 2001:16f8:10:2::c3c3:8f6b:6807/28981
osd4 in weight 1 down (up_from 7815 up_thru 8189 down_at 8198 last_clean 7748-7812)
osd5 out down (up_from 7677 up_thru 7713 down_at 7735 last_clean 6753-7671)
osd6 in weight 1 down (up_from 7813 up_thru 8186 down_at 8198 last_clean 7754-7812)
osd7 out down (up_from 7783 up_thru 7786 down_at 7809 last_clean 7753-7775)
osd8 in weight 1 down (up_from 7912 up_thru 8178 down_at 8194 last_clean 7865-7888)
osd9 in weight 1 down (up_from 7876 up_thru 8154 down_at 8194 last_clean 6792-7841)
osd10 in weight 1 down (up_from 7915 up_thru 8183 down_at 8198 last_clean 6847-7851)
osd11 in weight 1 down (up_from 7952 up_thru 8178 down_at 8198 last_clean 7908-7940)
osd12 out down (up_from 6895 up_thru 6906 down_at 7669 last_clean 6868-6878)
osd13 in weight 1 down (up_from 7964 up_thru 8184 down_at 8194 last_clean 7921-7947)
osd14 in weight 1 up (up_from 7982 up_thru 8206 down_at 7929 last_clean 6896-7911) 2001:16f8:10:2::c3c3:3b6a:6800/15052 2001:16f8:10:2::c3c3:3b6a:6801/15052
osd15 in weight 1 down (up_from 7978 up_thru 8124 down_at 8203 last_clean 6894-7959)
osd16 in weight 1 down (up_from 8160 up_thru 8071 down_at 8178 last_clean 8018-8154)
osd17 in weight 1 down (up_from 8191 up_thru 8157 down_at 8194 last_clean 8000-8186)
osd18 in weight 1 down (up_from 8022 up_thru 8180 down_at 8196 last_clean 6946-8001)
osd19 in weight 1 down (up_from 8019 up_thru 8178 down_at 8196 last_clean 6945-7972)
osd20 out down (up_from 979 up_thru 989 down_at 1036 last_clean 3-976)
osd21 out down (up_from 7380 up_thru 7186 down_at 7381 last_clean 7008-7315)
osd22 out down (up_from 7604 up_thru 7605 down_at 7669 last_clean 7010-7602)
osd23 out down (up_from 7205 up_thru 7201 down_at 7218 last_clean 7009-7128)
osd24 in weight 1 up (up_from 8207 up_thru 8207 down_at 8194 last_clean 8036-8206) 2001:16f8:10:2::c3c3:2e56:6800/8827 2001:16f8:10:2::c3c3:2e56:6801/8827
osd25 in weight 1 down (up_from 8040 up_thru 8180 down_at 8203 last_clean 7017-8023)
osd26 in weight 1 up (up_from 8071 up_thru 8207 down_at 8059 last_clean 8045-8068) 2001:16f8:10:2::c3c3:2bfe:6800/6153 2001:16f8:10:2::c3c3:2bfe:6801/6153
osd27 in weight 1 down (up_from 8046 up_thru 8184 down_at 8196 last_clean 7040-8029)
osd28 in weight 1 down (up_from 8058 up_thru 8184 down_at 8203 last_clean 7065-8036)
osd29 in weight 1 up (up_from 8058 up_thru 8207 down_at 8048 last_clean 7064-8039) 2001:16f8:10:2::c3c3:ab76:6802/32524 2001:16f8:10:2::c3c3:ab76:6803/32524
pg_temp 0.64 [9,29]
pg_temp 0.7b [0,15]
pg_temp 0.df [11,14]
pg_temp 0.18f [27,18]
pg_temp 0.1ab [0,29]
pg_temp 0.206 [8,26]
pg_temp 0.293 [18,28]
pg_temp 0.2d2 [1,8]
pg_temp 0.488 [8,26]
pg_temp 0.632 [28,18]
pg_temp 0.662 [26,8]
pg_temp 0.681 [29,18]
pg_temp 0.698 [9]
pg_temp 0.6ad [26,16]
pg_temp 1.63 [9,29]
pg_temp 1.205 [8,26]
pg_temp 1.292 [18,28]
pg_temp 1.2d1 [1,8]
pg_temp 1.487 [8,26]
pg_temp 1.5a0 [18,29]
pg_temp 1.680 [29,18]
pg_temp 1.697 [9]
pg_temp 6.7 [8]
pg_temp 7.6 [8]
blacklist 2001:16f8:10:2::c3c3:3f9b:6800/5194 expires 10.07.15_23:27:54.282270
blacklist 2001:16f8:10:2::c3c3:2e5c:6800/3704 expires 10.07.15_23:26:57.044402
But I think osd20 is the "bad guy" here:
root@node09:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              14G  4.0G  9.2G  31% /
none                  742M  196K  742M   1% /dev
none                  748M     0  748M   0% /dev/shm
none                  748M   68K  748M   1% /var/run
none                  748M     0  748M   0% /var/lock
none                  748M     0  748M   0% /lib/init/rw
/dev/sda7              54G   52G  1.7G  97% /srv/ceph/osd20
/dev/sdb2             462G   48G  415G  11% /srv/ceph/osd21
/dev/sdc2             462G   60G  403G  13% /srv/ceph/osd22
/dev/sdd2             146G   45G  102G  31% /srv/ceph/osd23
root@node09:~#
It has the smallest disk of the whole cluster and is over 95% of its capacity. Could this be the one triggering the error?
If so, is this behaviour intended? We should be able to mix multiple disk sizes, shouldn't we? Is a crushmap modification needed, changing the weight according to an OSD's size?
Updated by Sage Weil almost 14 years ago
- Status changed from New to Closed
Yeah, that's the problem. You should weight the OSDs in CRUSH based on their disk size, or take this one out of the mix entirely.
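The suggested fix, making CRUSH weights proportional to disk size, can be illustrated with a small sketch. The disk sizes are taken from the df output on node09 above; normalizing against the largest disk is an assumed convention for the example, not something Ceph requires, and the dict layout is purely illustrative.

```python
# Sketch: per-OSD CRUSH weights proportional to disk size, so the
# smallest disk (osd20) receives proportionally fewer placement groups
# and no longer fills up first. Normalizing to the largest disk
# (weight 1.0) is an illustrative choice; any consistent proportional
# scheme has the same effect.

disk_gb = {"osd20": 54, "osd21": 462, "osd22": 462, "osd23": 146}

largest = max(disk_gb.values())
weights = {osd: round(size / largest, 3) for osd, size in disk_gb.items()}

print(weights)
# osd20 ends up with roughly 0.117 of the weight of the 462 GB disks,
# matching its share of the raw capacity.
```

With equal weights, as in this cluster, CRUSH treats a 54 GB disk the same as a 462 GB one, so the small disk hits the 95% full threshold long before the cluster as a whole is anywhere near full.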