Bug #277
"No space left on device", while fs is not full
Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Description
While rsyncing kernel.org, I got a message that there was no space left on the device:
receiving incremental file list
rsync: opendir "/RCS" (in pub) failed: Permission denied (13)
linux/kernel/v2.6/testing/v2.6.31/
linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz
           0   0%    0.00kB/s    0:00:00
rsync: write failed on "/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(302) [receiver=3.0.7]
rsync: connection unexpectedly closed (182 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]
root@client02:~# touch^C
root@client02:~# dd if=/dev/zero of=/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin bs=1024k count=100
dd: writing `/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin': No space left on device
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00167821 s, 0.0 kB/s
root@client02:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       9.2G  2.6G  6.2G  30% /
none            1.5G  160K  1.5G   1% /dev
none            1.5G     0  1.5G   0% /dev/shm
none            1.5G   60K  1.5G   1% /var/run
none            1.5G     0  1.5G   0% /var/lock
none            1.5G     0  1.5G   0% /lib/init/rw
/dev/sda5        65G  634M   61G   2% /var/log/ceph
[2001:16f8:10:2::c3c3:3f9b],[2001:16f8:10:2::c3c3:2e5c],[109.72.85.37]:/ 6.3T 1.5T 4.9T 23% /mnt/ceph
root@client02:~# ceph -s
10.07.14_17:01:25.969799 7fe6254b1710 monclient(hunting): found mon1
10.07.14_17:01:26.001553 pg v39725: 7952 pgs: 7952 active+clean; 485 GB data, 1463 GB used, 4920 GB / 6383 GB avail
10.07.14_17:01:26.027460 mds e33: 1/1/1 up {0=up:active}, 1 up:standby(laggy or crashed), 1 up:standby
10.07.14_17:01:26.027555 osd e251: 30 osds: 30 up, 30 in
10.07.14_17:01:26.027896 log 10.07.14_16:41:03.588470 mds0 2001:16f8:10:2::c3c3:3f9b:6800/22606 9636 : [WRN] client22076 released lease on dn 100000738ec/linux-2.6.31-rc5.tar.bz2 which dne
10.07.14_17:01:26.028042 mon e1: 2 mons at 2001:16f8:10:2::c3c3:3f9b:6789/0 2001:16f8:10:2::c3c3:2e5c:6789/0
root@client02:~#
root@client02:~# find /mnt/ceph -type f|wc -l
447424
root@client02:~#
As you can see, there is more than enough space left.
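For anyone trying to reproduce this, the mismatch between what the filesystem reports and what a write actually does can be captured in one step. The following is only a minimal sketch using the Python standard library; `probe_mount` and the keys of the dictionary it returns are hypothetical names chosen for illustration, not part of Ceph or of the original diagnostics.

```python
import errno
import os
import tempfile


def probe_mount(path, probe_bytes=1024 * 1024):
    """Compare the free space statvfs reports with the result of a real write.

    `probe_mount` is a hypothetical helper for illustration only.
    """
    st = os.statvfs(path)
    report = {
        # Space available to unprivileged users, in bytes.
        "avail_bytes": st.f_bavail * st.f_frsize,
        # Free inodes as reported by the filesystem (may be 0 on some filesystems).
        "free_inodes": st.f_ffree,
        "write_ok": True,
        "write_errno": None,
    }
    try:
        # Write a temporary file inside the target directory and flush it to disk.
        # The file is deleted automatically when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=path) as tmp:
            tmp.write(b"\0" * probe_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())
    except OSError as exc:
        report["write_ok"] = False
        # Translate the numeric errno into its symbolic name, e.g. "ENOSPC".
        report["write_errno"] = errno.errorcode.get(exc.errno, exc.errno)
    return report
```

Calling `probe_mount("/mnt/ceph")` on the affected client would be expected to show a large `avail_bytes` value next to `write_errno` set to `"ENOSPC"`, which is the contradiction described in this report. Note that the `fsync` call could block on a mount where `sync` hangs.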
I tried unmounting, removing the kernel module and mounting again, without success.
I also tried restarting the MDSes, without success either: after a successful recovery I still got the message. The same goes for the monitors.
I then tried removing a file and running "sync" afterwards, but that failed as well: the sync blocked and went into state "D" (uninterruptible sleep).
Haven't got any logfiles to go on.
During these tests my replication level was set to 3 for the "data" and "metadata" pools.
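As a rough cross-check, the replication level is consistent with the figures in the "ceph -s" output above. The sketch below only redoes that arithmetic with the numbers from the report; the variable names are illustrative.

```python
# Figures taken from the "ceph -s" output in this report (all in GB).
data_gb = 485      # logical data stored
used_gb = 1463     # raw space used across all OSDs
avail_gb = 4920    # raw space still available
total_gb = 6383    # raw cluster capacity
replication = 3    # replication level of the "data" and "metadata" pools

# With 3 replicas, the logical data should occupy about three times its size.
expected_raw_gb = data_gb * replication
overhead_gb = used_gb - expected_raw_gb
used_pct = 100 * used_gb / total_gb

print(f"expected raw usage: {expected_raw_gb} GB")
print(f"reported raw usage: {used_gb} GB (difference: {overhead_gb} GB)")
print(f"cluster usage: {used_pct:.1f}%")
```

The expected raw usage (1455 GB) is within a few GB of the reported 1463 GB, and the resulting usage of roughly 23% matches the "df -h" line for /mnt/ceph, so the cluster-wide accounting itself gives no hint of a full filesystem.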
While the filesystem reported that it was full, RADOS did not: creating new RADOS objects was still possible (through the S3 gateway).