Bug #42313

Updated by Jason Dillaman over 4 years ago

We are seeing many jobs fail because of this issue, and it is blocking point-release testing.

Here are some examples:


 http://pulpito.ceph.com/yuriw-2019-10-11_12:58:22-rbd-wip-yuri6-testing-2019-10-10-2057-mimic-distro-basic-smithi/ 
 http://pulpito.ceph.com/yuriw-2019-10-11_19:41:35-rbd-wip-yuri8-testing-2019-10-11-1347-luminous-distro-basic-smithi/
 http://qa-proxy.ceph.com/teuthology/yuriw-2019-10-09_15:42:09-rbd-wip-yuri5-testing-2019-10-08-2016-luminous-distro-basic-smithi/4371741/teuthology.log

<pre>
2019-10-13T09:31:17.097 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:17.095566 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check failed: 4 full osd(s) (OSD_FULL)
2019-10-13T09:31:22.117 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:22.115362 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check update: 1 full osd(s) (OSD_FULL)
2019-10-13T09:31:27.632 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:27.630483 7f189f9b9700 -1 log_channel(cluster) log [ERR] : Health check failed: mon b is very low on available space (MON_DISK_CRIT)
2019-10-13T09:31:28.672 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:28.670676 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check failed: 4 full osd(s) (OSD_FULL)
2019-10-13T09:31:32.980 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:32.978369 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check update: mons a,b,c are very low on available space (MON_DISK_CRIT)
2019-10-13T09:31:34.061 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:34.053935 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check update: 2 full osd(s) (OSD_FULL)
2019-10-13T09:31:40.153 INFO:tasks.ceph.osd.0.smithi167.stderr:2019-10-13 09:31:40.151323 7fd400753700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 97% full
2019-10-13T09:31:40.328 INFO:tasks.ceph.mon.a.smithi167.stderr:2019-10-13 09:31:40.326090 7f18a21be700 -1 log_channel(cluster) log [ERR] : Health check failed: 1 full osd(s) (OSD_FULL)
2019-10-13T09:31:40.494 INFO:tasks.ceph.osd.3.smithi167.stderr:2019-10-13 09:31:40.492169 7fda29d9c700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 97% full
2019-10-13T09:31:40.900 INFO:tasks.ceph.osd.1.smithi167.stderr:2019-10-13 09:31:40.898979 7f10785bb700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full
2019-10-13T09:31:41.279 INFO:tasks.ceph.osd.4.smithi174.stderr:2019-10-13 09:31:41.277991 7f8651337700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full
2019-10-13T09:31:41.525 INFO:tasks.ceph.osd.2.smithi167.stderr:2019-10-13 09:31:41.523075 7f1df905a700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full
2019-10-13T09:31:42.493 INFO:tasks.ceph.osd.7.smithi174.stderr:2019-10-13 09:31:42.491132 7febd19f8700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full
2019-10-13T09:31:43.281 INFO:tasks.ceph.osd.5.smithi174.stderr:2019-10-13 09:31:43.279310 7f9c80f21700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full
2019-10-13T09:31:44.399 INFO:tasks.ceph.osd.6.smithi174.stderr:2019-10-13 09:31:44.398004 7f4c2bf61700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 98% full

2019-10-10T01:10:27.806 INFO:teuthology.orchestra.run.smithi136.stdout:    Error writing to output file - write (28: No space left on device) [IP: 91.189.91.26 80]
</pre>
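
To check whether another run hit the same condition, something like the following could be used to scan a downloaded teuthology.log. This is only a rough sketch: the function name `summarize` and both regular expressions are my own, written against the message shapes quoted above, and it assumes each log entry sits on a single line as in the raw log file.

```python
import re
from collections import Counter

# Cluster-log health checks; captures the trailing code such as OSD_FULL
# or MON_DISK_CRIT. Pattern is an assumption based on the excerpt above.
HEALTH_CODE = re.compile(
    r"\[ERR\] : Health check (?:failed|update): .*\((?P<code>[A-Z_]+)\)\s*$"
)

# Per-OSD failsafe message; captures the daemon name and reported fullness.
FAILSAFE = re.compile(
    r"INFO:tasks\.ceph\.(?P<daemon>osd\.\d+)\..*"
    r"full status failsafe engaged, dropping updates, now (?P<pct>\d+)% full"
)


def summarize(lines):
    """Return (health-code counts, highest fullness percent seen per OSD)."""
    codes = Counter()
    fullness = {}
    for line in lines:
        match = HEALTH_CODE.search(line)
        if match:
            codes[match.group("code")] += 1
            continue
        match = FAILSAFE.search(line)
        if match:
            daemon = match.group("daemon")
            pct = int(match.group("pct"))
            fullness[daemon] = max(fullness.get(daemon, 0), pct)
    return codes, fullness
```

Calling `summarize(open("teuthology.log", errors="replace"))` on a local copy of the log returns the two summaries; a non-zero count for OSD_FULL or MON_DISK_CRIT suggests the job ran out of disk space in the same way.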
