Feature #16562
rados put: use the FULL_TRY flag to report errors when the cluster is full
Status: New
Priority: Normal
Assignee: -
Category: ceph cli
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:
Description
Hello,
I'm brand new to Ceph. I have three OSDs (slow Pine64 devices using USB thumb drives; just some testing I'm doing to learn Ceph). I created a single pool with a 1 GB quota (ceph osd pool set-quota ubuntublockdev max_bytes 1073741824), then attempted to copy a 500 MB data file into the pool three times, curious what the response would be when the quota was reached. I expected some sort of error message, similar to what mv/cp/rsync give when a disk is out of space. What I got instead was a hanging process.
root@pine1:~/my-cluster# time rados put 500mbdata1 /root/500mb.data --pool=ubuntublockdev

real    4m37.586s
user    0m0.480s
sys     0m1.740s

root@pine1:~/my-cluster# time rados put 500mbdata1 /root/500mb.data --pool=ubuntublockdev

real    4m34.522s
user    0m0.620s
sys     0m1.600s

root@pine1:~/my-cluster# time rados put 500mbdata2 /root/500mb.data --pool=ubuntublockdev
<has not returned in over 14 hours>

root@pine2:/var/log/ceph# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01176 root default
-2 0.00269     host pine4
 0 0.00269         osd.0       up  1.00000          1.00000
-3 0.00639     host pine3
 1 0.00639         osd.1       up  1.00000          1.00000
-4 0.00269     host pine2
 2 0.00269         osd.2       up  1.00000          1.00000

root@pine2:/var/log/ceph# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0 0.00269  1.00000  2814M 34696k  2513M  1.20 0.07  67
 1 0.00639  1.00000  6676M  1035M  5375M 15.51 0.92 113
 2 0.00269  1.00000  2814M  1003M  1545M 35.63 2.12  76
              TOTAL 12306M  2072M  9434M 16.84
MIN/MAX VAR: 0.07/2.12  STDDEV: 14.13

root@pine2:/var/log/ceph# ceph -s
    cluster 34c9464a-c307-4fa5-bbe8-0f654813982e
     health HEALTH_WARN
            pool 'ubuntublockdev' is full
     monmap e4: 4 mons at {pine1=10.10.10.21:6789/0,pine2=10.10.10.14:6789/0,pine3=10.10.10.13:6789/0,pine4=10.10.10.23:6789/0}
            election epoch 10, quorum 0,1,2,3 pine3,pine2,pine1,pine4
     osdmap e19: 3 osds: 3 up, 3 in
            flags sortbitwise
      pgmap v436: 128 pgs, 2 pools, 1032 MB data, 3 objects
            2072 MB used, 9434 MB / 12306 MB avail
                 128 active+clean
Logs:
2016-06-30 05:23:32.599870 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v418: 128 pgs: 128 active+clean; 1000 MB data, 2108 MB used, 9398 MB / 12306 MB avail
2016-06-30 05:23:33.667407 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v419: 128 pgs: 128 active+clean; 1004 MB data, 2108 MB used, 9398 MB / 12306 MB avail; 82068 B/s wr, 0 op/s
2016-06-30 05:23:37.598989 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v420: 128 pgs: 128 active+clean; 1004 MB data, 2108 MB used, 9398 MB / 12306 MB avail; 819 kB/s wr, 0 op/s
2016-06-30 05:23:38.674872 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v421: 128 pgs: 128 active+clean; 1016 MB data, 2108 MB used, 9398 MB / 12306 MB avail; 2458 kB/s wr, 0 op/s
2016-06-30 05:23:42.616913 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v422: 128 pgs: 128 active+clean; 1016 MB data, 2116 MB used, 9390 MB / 12306 MB avail; 2457 kB/s wr, 0 op/s
2016-06-30 05:23:43.682803 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v423: 128 pgs: 128 active+clean; 1024 MB data, 2128 MB used, 9378 MB / 12306 MB avail; 1632 kB/s wr, 0 op/s
2016-06-30 05:23:45.023717 7fa84073e0  0 log_channel(cluster) log [WRN] : pool 'ubuntublockdev' is full (reached quota's max_bytes: 1024M)
2016-06-30 05:23:45.092333 7fa94073e0  1 mon.pine3@0(leader).osd e19 e19: 3 osds: 3 up, 3 in
2016-06-30 05:23:45.118943 7fa94073e0  0 log_channel(cluster) log [INF] : osdmap e19: 3 osds: 3 up, 3 in
2016-06-30 05:23:45.180069 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v424: 128 pgs: 128 active+clean; 1024 MB data, 2128 MB used, 9378 MB / 12306 MB avail; 3206 kB/s wr, 0 op/s
2016-06-30 05:23:47.593415 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v425: 128 pgs: 128 active+clean; 1024 MB data, 2140 MB used, 9366 MB / 12306 MB avail
2016-06-30 05:23:48.650748 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v426: 128 pgs: 128 active+clean; 1032 MB data, 2040 MB used, 9466 MB / 12306 MB avail; 2354 kB/s wr, 0 op/s
2016-06-30 05:23:52.545141 7fa94073e0  0 log_channel(cluster) log [INF] : pgmap v427: 128 pgs: 128 active+clean; 1032 MB data, 2040 MB used, 9466 MB / 12306 MB avail; 1638 kB/s wr, 0 op/s
2016-06-30 05:23:53.529835 7fa84073e0  0 log_channel(cluster) log [INF] : HEALTH_WARN; pool 'ubuntublockdev' is full
The HEALTH_WARN line repeats every hour, on the hour. Again, the original rados put command has still neither returned nor errored.
root@pine2:/var/log/ceph# ceph -v
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
root@pine2:/var/log/ceph# uname -a
Linux pine2 3.10.102-0-pine64-longsleep #7 SMP PREEMPT Fri Jun 17 21:30:48 CEST 2016 aarch64 aarch64 aarch64 GNU/Linux
root@pine2:/var/log/ceph# cat /etc/debian_version
stretch/sid