Bug #277

"No space left on device", while fs is not full

Added by Wido den Hollander almost 14 years ago. Updated over 13 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%


Description

While doing an rsync of kernel.org, I got the message that there was no space left on the device:

receiving incremental file list
rsync: opendir "/RCS" (in pub) failed: Permission denied (13)
linux/kernel/v2.6/testing/v2.6.31/
linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz
           0   0%    0.00kB/s    0:00:00
rsync: write failed on "/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/linux-2.6.31-rc6.tar.gz": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(302) [receiver=3.0.7]
rsync: connection unexpectedly closed (182 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]
root@client02:~# touch^C
root@client02:~# dd if=/dev/zero of=/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin bs=1024k count=100
dd: writing `/mnt/ceph/static/kernel/linux/kernel/v2.6/testing/v2.6.31/test.bin': No space left on device
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00167821 s, 0.0 kB/s
root@client02:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.2G  2.6G  6.2G  30% /
none                  1.5G  160K  1.5G   1% /dev
none                  1.5G     0  1.5G   0% /dev/shm
none                  1.5G   60K  1.5G   1% /var/run
none                  1.5G     0  1.5G   0% /var/lock
none                  1.5G     0  1.5G   0% /lib/init/rw
/dev/sda5              65G  634M   61G   2% /var/log/ceph
[2001:16f8:10:2::c3c3:3f9b],[2001:16f8:10:2::c3c3:2e5c],[109.72.85.37]:/
                      6.3T  1.5T  4.9T  23% /mnt/ceph
root@client02:~# ceph -s
10.07.14_17:01:25.969799 7fe6254b1710 monclient(hunting): found mon1
10.07.14_17:01:26.001553    pg v39725: 7952 pgs: 7952 active+clean; 485 GB data, 1463 GB used, 4920 GB / 6383 GB avail
10.07.14_17:01:26.027460   mds e33: 1/1/1 up {0=up:active}, 1 up:standby(laggy or crashed), 1 up:standby
10.07.14_17:01:26.027555   osd e251: 30 osds: 30 up, 30 in
10.07.14_17:01:26.027896   log 10.07.14_16:41:03.588470 mds0 2001:16f8:10:2::c3c3:3f9b:6800/22606 9636 : [WRN] client22076 released lease on dn 100000738ec/linux-2.6.31-rc5.tar.bz2 which dne
10.07.14_17:01:26.028042   mon e1: 2 mons at 2001:16f8:10:2::c3c3:3f9b:6789/0 2001:16f8:10:2::c3c3:2e5c:6789/0
root@client02:~#
root@client02:~# find /mnt/ceph -type f|wc -l
447424
root@client02:~#

As you can see, there is more than enough space left.

I tried unmounting, removing the kernel module and mounting again, with no success.

I even tried restarting the MDSes; no success either: after a successful recovery I still got the message. The same goes for the monitors.

I tried removing a file and running a "sync" afterwards; that failed as well. The sync blocked and the process went into state "D".

I haven't got any logfiles to go on.

During these tests my replication level was set to 3 for both the "data" and "metadata" pools.

While the filesystem reported that it was full, RADOS did not; creating new RADOS objects was still possible (through the S3 gateway).
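
For reference, a minimal way to reproduce that asymmetry; the pool name "data" and the rados put invocation are assumptions on my part, not something verified on this cluster:

# Write through the kernel client: fails with ENOSPC once the full flag is set
dd if=/dev/zero of=/mnt/ceph/full-test.bin bs=1024k count=1

# Write the same data directly into RADOS: succeeds, since the raw object
# interface does not check the osdmap full flag
dd if=/dev/zero of=/tmp/full-test.bin bs=1024k count=1
rados -p data put full-test.bin /tmp/full-test.bin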

Actions #1

Updated by Yehuda Sadeh almost 14 years ago

It would be nice to have the kernel log to see where this error code is coming from. Could it be that one of the OSDs got filled up?
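
A quick way to check both of those (a sketch; the /srv/ceph/osd* data paths are taken from the df output later in this ticket and may differ on other nodes):

# On the client: look for ceph messages around the time of the ENOSPC
dmesg | grep -iE 'ceph|libceph' | tail -n 20

# On each OSD host: check how full the OSD data partitions are
df -h /srv/ceph/osd*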

Actions #2

Updated by Wido den Hollander almost 14 years ago

I checked dmesg, but there were no messages about that (actually, no messages at all).

The disk space of the OSDs is one thing I checked, and that was sufficient too; the only one that got filled up pretty far is osd20 (see #279), but even that one still had some free space.

Actions #3

Updated by Sage Weil almost 14 years ago

Can you dump your osdmap (ceph osd getmap -o /tmp/foo) and post it here? There are 'fs is full' and 'fs is almost full' flags in the osdmap that may be set...
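
For reference, a minimal sketch of doing that and looking for those flags; 'ceph osd dump' is an assumption here and its output format may differ between versions:

# Grab the current osdmap from the monitors and decode it
ceph osd getmap -o /tmp/foo
osdmaptool --print /tmp/foo | grep -i full

# Alternatively, dump the decoded map directly and look for full/nearfull flags
ceph osd dump | grep -i full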

Actions #4

Updated by Greg Farnum almost 14 years ago

Also, the full flag is (at present) set once a single OSD reaches 95% of its disk space. It could potentially be very bad to actually run a disk completely full, so we leave a little margin.

And indeed, it appears the RADOS interface simply ignores those flags when doing writes, while the filesystem clients check them. Really this just illustrates that we need to come up with a coherent strategy for handling full disks.
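
For what it's worth, that threshold is configurable; a hedged sketch of the relevant ceph.conf options (option names as used in later Ceph releases, so they may not apply to the version in this ticket):

[mon]
        ; ratio of used space at which an OSD is flagged full and client writes are blocked
        mon osd full ratio = 0.95
        ; ratio at which a near-full warning is raised
        mon osd nearfull ratio = 0.85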

Actions #5

Updated by Wido den Hollander almost 14 years ago

Since my cluster is in bad shape at the moment, my osdmap won't be worth much, but here it is:

root@node14:~# osdmaptool --print osdmap 
osdmaptool: osdmap file 'osdmap'
epoch 8213
fsid aff59923-6265-6097-e493-1880c0c451e3
created 10.07.13_09:53:03.670862
modifed 10.07.15_23:26:00.033949

pg_pool 0 'data' pg_pool(rep pg_size 3 crush_ruleset 0 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 76 owner 0)
pg_pool 1 'metadata' pg_pool(rep pg_size 3 crush_ruleset 1 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 19 owner 0)
pg_pool 2 'casdata' pg_pool(rep pg_size 2 crush_ruleset 2 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 3 'rbd' pg_pool(rep pg_size 2 crush_ruleset 3 object_hash rjenkins pg_num 1920 pgp_num 1920 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 4 '.rgw' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 9 owner 0)
pg_pool 5 '.users' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 11 owner 0)
pg_pool 6 '.users.email' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 13 owner 0)
pg_pool 7 'wido' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 15 owner 0)

max_osd 30
osd0 in weight 1 up   (up_from 7803 up_thru 8208 down_at 7786 last_clean 7698-7802) 2001:16f8:10:2::c3c3:8f6b:6800/28700 2001:16f8:10:2::c3c3:8f6b:6801/28700
osd1 in weight 1 up   (up_from 7803 up_thru 8207 down_at 7786 last_clean 7697-7802) 2001:16f8:10:2::c3c3:8f6b:6802/28791 2001:16f8:10:2::c3c3:8f6b:6803/28791
osd2 in weight 1 down (up_from 8173 up_thru 8173 down_at 8194 last_clean 7700-7801)
osd3 in weight 1 up   (up_from 7803 up_thru 8203 down_at 7786 last_clean 7705-7802) 2001:16f8:10:2::c3c3:8f6b:6806/28981 2001:16f8:10:2::c3c3:8f6b:6807/28981
osd4 in weight 1 down (up_from 7815 up_thru 8189 down_at 8198 last_clean 7748-7812)
osd5 out down (up_from 7677 up_thru 7713 down_at 7735 last_clean 6753-7671)
osd6 in weight 1 down (up_from 7813 up_thru 8186 down_at 8198 last_clean 7754-7812)
osd7 out down (up_from 7783 up_thru 7786 down_at 7809 last_clean 7753-7775)
osd8 in weight 1 down (up_from 7912 up_thru 8178 down_at 8194 last_clean 7865-7888)
osd9 in weight 1 down (up_from 7876 up_thru 8154 down_at 8194 last_clean 6792-7841)
osd10 in weight 1 down (up_from 7915 up_thru 8183 down_at 8198 last_clean 6847-7851)
osd11 in weight 1 down (up_from 7952 up_thru 8178 down_at 8198 last_clean 7908-7940)
osd12 out down (up_from 6895 up_thru 6906 down_at 7669 last_clean 6868-6878)
osd13 in weight 1 down (up_from 7964 up_thru 8184 down_at 8194 last_clean 7921-7947)
osd14 in weight 1 up   (up_from 7982 up_thru 8206 down_at 7929 last_clean 6896-7911) 2001:16f8:10:2::c3c3:3b6a:6800/15052 2001:16f8:10:2::c3c3:3b6a:6801/15052
osd15 in weight 1 down (up_from 7978 up_thru 8124 down_at 8203 last_clean 6894-7959)
osd16 in weight 1 down (up_from 8160 up_thru 8071 down_at 8178 last_clean 8018-8154)
osd17 in weight 1 down (up_from 8191 up_thru 8157 down_at 8194 last_clean 8000-8186)
osd18 in weight 1 down (up_from 8022 up_thru 8180 down_at 8196 last_clean 6946-8001)
osd19 in weight 1 down (up_from 8019 up_thru 8178 down_at 8196 last_clean 6945-7972)
osd20 out down (up_from 979 up_thru 989 down_at 1036 last_clean 3-976)
osd21 out down (up_from 7380 up_thru 7186 down_at 7381 last_clean 7008-7315)
osd22 out down (up_from 7604 up_thru 7605 down_at 7669 last_clean 7010-7602)
osd23 out down (up_from 7205 up_thru 7201 down_at 7218 last_clean 7009-7128)
osd24 in weight 1 up   (up_from 8207 up_thru 8207 down_at 8194 last_clean 8036-8206) 2001:16f8:10:2::c3c3:2e56:6800/8827 2001:16f8:10:2::c3c3:2e56:6801/8827
osd25 in weight 1 down (up_from 8040 up_thru 8180 down_at 8203 last_clean 7017-8023)
osd26 in weight 1 up   (up_from 8071 up_thru 8207 down_at 8059 last_clean 8045-8068) 2001:16f8:10:2::c3c3:2bfe:6800/6153 2001:16f8:10:2::c3c3:2bfe:6801/6153
osd27 in weight 1 down (up_from 8046 up_thru 8184 down_at 8196 last_clean 7040-8029)
osd28 in weight 1 down (up_from 8058 up_thru 8184 down_at 8203 last_clean 7065-8036)
osd29 in weight 1 up   (up_from 8058 up_thru 8207 down_at 8048 last_clean 7064-8039) 2001:16f8:10:2::c3c3:ab76:6802/32524 2001:16f8:10:2::c3c3:ab76:6803/32524

pg_temp 0.64 [9,29]
pg_temp 0.7b [0,15]
pg_temp 0.df [11,14]
pg_temp 0.18f [27,18]
pg_temp 0.1ab [0,29]
pg_temp 0.206 [8,26]
pg_temp 0.293 [18,28]
pg_temp 0.2d2 [1,8]
pg_temp 0.488 [8,26]
pg_temp 0.632 [28,18]
pg_temp 0.662 [26,8]
pg_temp 0.681 [29,18]
pg_temp 0.698 [9]
pg_temp 0.6ad [26,16]
pg_temp 1.63 [9,29]
pg_temp 1.205 [8,26]
pg_temp 1.292 [18,28]
pg_temp 1.2d1 [1,8]
pg_temp 1.487 [8,26]
pg_temp 1.5a0 [18,29]
pg_temp 1.680 [29,18]
pg_temp 1.697 [9]
pg_temp 6.7 [8]
pg_temp 7.6 [8]
blacklist 2001:16f8:10:2::c3c3:3f9b:6800/5194 expires 10.07.15_23:27:54.282270
blacklist 2001:16f8:10:2::c3c3:2e5c:6800/3704 expires 10.07.15_23:26:57.044402

But I think osd20 is the "bad guy" here:

root@node09:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              14G  4.0G  9.2G  31% /
none                  742M  196K  742M   1% /dev
none                  748M     0  748M   0% /dev/shm
none                  748M   68K  748M   1% /var/run
none                  748M     0  748M   0% /var/lock
none                  748M     0  748M   0% /lib/init/rw
/dev/sda7              54G   52G  1.7G  97% /srv/ceph/osd20
/dev/sdb2             462G   48G  415G  11% /srv/ceph/osd21
/dev/sdc2             462G   60G  403G  13% /srv/ceph/osd22
/dev/sdd2             146G   45G  102G  31% /srv/ceph/osd23
root@node09:~#

It has the smallest disk of the whole cluster and is over 95% of its capacity. Could this be the one triggering the error?

If so, is this behaviour intended? We should be able to mix multiple disk sizes, shouldn't we? Is a crushmap modification needed, i.e. changing the weight according to an OSD's size?

Actions #6

Updated by Sage Weil almost 14 years ago

  • Status changed from New to Closed

Yeah, that's the problem. You should weight the OSD in CRUSH based on its disk size, or take it out of the mix entirely.
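
A sketch of both options, using the standard crush tooling; exact command syntax may differ in this Ceph version, so treat it as illustrative rather than verified:

# Option 1: lower osd20's CRUSH weight to match its smaller (54G) disk
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# ... edit crushmap.txt and reduce the weight of device osd20 ...
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

# Option 2: take osd20 out of data placement entirely
ceph osd out 20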
