Ceph rbd - Bug #57534: trash purge stuck and remove images hang when the pool quota is full
https://tracker.ceph.com/issues/57534?journal_id=225665 2022-09-15T08:08:29Z kevin huang
<p>Steps to reproduce:</p>
<pre>
[root@ceph-node01 ~]# ceph osd pool create test1 16
pool 'test1' created
[root@ceph-node01 ~]# rbd pool init test1
[root@ceph-node01 ~]# ceph osd pool set-quota test1 max_bytes $((1 * 1024 * 1024 * 1024))
set-quota max_bytes = 1073741824 for pool test1
[root@ceph-node01 ~]# rbd create --size 600 test1/img1g001 --thick-provision
Thick provisioning: 100% complete...done.
[root@ceph-node01 ~]# rbd create --size 600 test1/img1g002 --thick-provision
Thick provisioning: 100% complete...done.
[root@ceph-node01 ~]# rbd list test1 -l
NAME      SIZE     PARENT  FMT  PROT  LOCK
img1g001  600 MiB          2
img1g002  600 MiB          2
[root@ceph-node01 ~]# ceph health detail
HEALTH_WARN 1 pool(s) full
[WRN] POOL_FULL: 1 pool(s) full
    pool 'test1' is full (running out of quota)
[root@ceph-node01 ~]# rbd rm test1/img1g001
CTRL+C
</pre>
<p>The rm command hangs and has to be interrupted with CTRL+C.</p>
https://tracker.ceph.com/issues/57534?journal_id=225685 2022-09-15T10:09:52Z Ilya Dryomov
<p>Hi Kevin,</p>
<p>What version of Ceph is installed on the client side, i.e. on the node where you are running this test? What is the output of "rbd --version"?</p>
<p>The reason I ask is that this was fixed in 16.2.8 and later releases, with the caveat that the "problematic" remove should not be the first remove in that pool; see <a class="external" href="https://tracker.ceph.com/issues/52734">https://tracker.ceph.com/issues/52734</a>. Your test case passes for me with that slight modification:<br /><pre>
$ ceph osd pool create test1 16
pool 'test1' created
$ rbd pool init test1
$ ceph osd pool set-quota test1 max_bytes $((1 * 1024 * 1024 * 1024))
set-quota max_bytes = 1073741824 for pool test1
$ rbd create --size 1 test1/dummy <------
$ rbd rm test1/dummy <------
Removing image: 100% complete...done.
$ rbd create --size 600 test1/img1g001 --thick-provision
Thick provisioning: 100% complete...done.
$ rbd create --size 600 test1/img1g002 --thick-provision
Thick provisioning: 100% complete...done.
$ ceph health detail
HEALTH_WARN 1 pool(s) full
[WRN] POOL_FULL: 1 pool(s) full
pool 'test1' is full (running out of quota)
$ rbd rm test1/img1g001
Removing image: 100% complete...done.
$ rbd rm test1/img1g002
Removing image: 100% complete...done.
$ ceph health detail
HEALTH_OK
</pre><br />The need for the "dummy" remove is just an oversight -- it is only needed in the "reached quota" case, not when the pool actually becomes full. In practice, people tend to run into ENOSPC ("No space left on device") far more often than they run into EDQUOT ("Disk quota exceeded"), possibly because pool quota is not a widely used feature. Nevertheless, I'm going to address it ASAP.</p>
https://tracker.ceph.com/issues/57534?journal_id=225686 2022-09-15T10:10:13Z Ilya Dryomov
<ul><li><strong>Tags</strong> deleted (<del><i>stuck pool quota full</i></del>)</li></ul>
https://tracker.ceph.com/issues/57534?journal_id=225687 2022-09-15T10:16:17Z Ilya Dryomov
<p>kevin huang wrote:</p>
<blockquote>
<p>[root@ceph-node01 ~]# ceph osd pool set-quota test1 max_bytes $((1 * 1024 * 1024 * 1024))<br />set-quota max_bytes = 1073741824 for pool test1<br />[root@ceph-node01 ~]# rbd create --size 600 test1/img1g001 --thick-provision<br />Thick provisioning: 100% complete...done.<br />[root@ceph-node01 ~]# rbd create --size 600 test1/img1g002 --thick-provision<br />Thick provisioning: 100% complete...done.</p>
</blockquote>
<p>Also note that you may need to CTRL+C this command as well. Pool quota enforcement is not precise -- it lags behind by a few seconds, so sometimes you will be able to write 600M + 600M = 1.2G into a pool with max_bytes set to 1G, and sometimes not.</p>
https://tracker.ceph.com/issues/57534?journal_id=225692 2022-09-15T10:31:03Z Ilya Dryomov
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Need More Info</i></li></ul>
https://tracker.ceph.com/issues/57534?journal_id=231325 2023-02-13T11:31:00Z Ilya Dryomov
<ul><li><strong>Target version</strong> deleted (<del><i>v16.2.11</i></del>)</li></ul>
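
Two checks come up in the thread: whether the client is at 16.2.8 or later, and whether the thick-provisioned images add up to more than max_bytes. A minimal sketch of both is below. The helper names `version_ge` and `overshoot_bytes` are hypothetical and not part of Ceph. The version comparison assumes GNU `sort -V` is available, and the commented usage assumes `rbd --version` prints a line of the form "ceph version X.Y.Z ...".

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical, not part of Ceph.

# Return 0 if dotted version $1 is >= $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print by how many bytes the given sizes exceed the quota (0 if they fit).
# Usage: overshoot_bytes QUOTA_BYTES SIZE_BYTES [SIZE_BYTES...]
overshoot_bytes() {
    local quota="$1" total=0 size
    shift
    for size in "$@"; do
        total=$((total + size))
    done
    if [ "$total" -gt "$quota" ]; then
        echo $((total - quota))
    else
        echo 0
    fi
}

# Example usage (commented out because it needs a Ceph client):
#   ver="$(rbd --version | awk '{ print $3 }')"   # assumed output format
#   version_ge "$ver" 16.2.8 || echo "client older than 16.2.8"
#
# Numbers from the reproducer: 1 GiB quota, two 600 MiB images.
#   overshoot_bytes $((1024 * 1024 * 1024)) $((600 * 1024 * 1024)) $((600 * 1024 * 1024))
```

With the reproducer's numbers the two images total 1200 MiB against a 1024 MiB quota, so the second create only completes when the quota check lags behind the writes.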