https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2019-05-02T20:07:09ZCeph Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1359532019-05-02T20:07:09ZJeff Laytonjlayton@redhat.com
<ul></ul><p>This problem is reliably reproducible too.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1360382019-05-03T12:26:15ZJeff Laytonjlayton@redhat.com
<ul></ul><p>This reproduces on a stock v5.0.5 kernel too, so I'm pretty sure none of my patches broke it. I'm working on nailing down the specific sequence of things that happen to trigger this. It reliably reproduces under xfstests when I run this:</p>
<pre>
$ sudo ./check generic/452
</pre>
<p>...however, if I run the test itself directly it does not reproduce:</p>
<pre>
$ sudo ./tests/generic/452
</pre>
<p>...if I run it a second time, then the message does pop. That leads me to believe that this is happening when cleaning up the scratch directory from a previous run.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1360402019-05-03T12:43:49ZJeff Laytonjlayton@redhat.com
<ul><li><strong>File</strong> <a href="/attachments/download/4135/452.filtered">452.filtered</a> added</li></ul><p>Filtered strace of the test, showing all syscalls that touch /mnt/scratch or /mnt/scratch/ls_on_scratch.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1361452019-05-06T15:57:23ZJeff Laytonjlayton@redhat.com
<ul></ul><p>With some printk debugging, I found that the leftover inode is the one for /mnt/scratch/ls_on_scratch. I'm still not able to reproduce this by hand though, so I wonder if there is some raciness involved.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1362252019-05-07T12:28:17ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Single shell-script reproducer:</p>
<pre>
#!/bin/bash
mount /mnt/scratch
rm -r /mnt/scratch/*
umount /mnt/scratch
mount /mnt/scratch
umount /mnt/scratch
mount /mnt/scratch
ls /mnt/scratch
umount /mnt/scratch
mount /mnt/scratch
cp /usr/bin/ls /mnt/scratch/ls_on_scratch
/mnt/scratch/ls_on_scratch /mnt/scratch/ls_on_scratch
mount -o remount,ro /mnt/scratch
/mnt/scratch/ls_on_scratch /mnt/scratch/ls_on_scratch
umount /mnt/scratch
</pre>
<p>...with this in fstab:</p>
<pre>
192.168.XXX.YYY:40527:/scratch /mnt/scratch ceph noauto,context="system_u:object_r:root_t:s0",acl 0 0
</pre>
<p>Some of these steps are probably unnecessary, so I'll try whittling this down next.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1362262019-05-07T12:50:43ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Slimmed-down reproducer:</p>
<pre>
#!/bin/bash
mount /mnt/scratch
cp /usr/bin/ls /mnt/scratch/ls_on_scratch
mount -o remount,ro /mnt/scratch
umount /mnt/scratch
</pre>
<p>...also, the extra mount options don't matter.</p>
<p>More interestingly, after the last umount, the ls_on_scratch file in cephfs seems to be the correct length, but it's completely zero-filled. I wonder if the remount,ro is occurring before we have a chance to flush the cache, and then that prevents writeback from succeeding?</p>
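<p>One way to test the writeback hypothesis is to force a flush with sync between the copy and the remount. The following is a minimal sketch, assuming the same /mnt/scratch fstab entry as above. The MNT and DRY_RUN variables and the run and repro_with_sync helpers are illustrative names, not part of the original reproducer. By default the script only prints the commands; set DRY_RUN=0 (as root, on a test machine) to actually run them.</p>

```shell
#!/bin/bash
# Sketch only: MNT, DRY_RUN, run, and repro_with_sync are illustrative names
# introduced for this example; they are not part of the original report.
MNT="${MNT:-/mnt/scratch}"
DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Slimmed-down reproducer with an explicit sync before the read-only remount.
repro_with_sync() {
    run mount "$MNT"
    run cp /usr/bin/ls "$MNT/ls_on_scratch"
    run sync                          # force writeback while still read-write
    run mount -o remount,ro "$MNT"
    run umount "$MNT"
}

repro_with_sync
```

<p>If the leak and the zero-filled file disappear with the sync in place, that points at dirty data not being flushed before the filesystem goes read-only.</p>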
<p>EDIT: calling sync after the copy, but before the remount, works around the issue.</p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1362272019-05-07T13:26:55ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Found the problem. ceph didn't have a remount_fs operation, so we just need to add one and have it call sync_filesystem(). Patch posted:</p>
<p><a class="external" href="https://marc.info/?l=ceph-devel&m=155723567025589&w=2">https://marc.info/?l=ceph-devel&m=155723567025589&w=2</a></p> Linux kernel client - Bug #39571: xfstest generic/452 exposes inode refcount leakhttps://tracker.ceph.com/issues/39571?journal_id=1365942019-05-11T13:27:47ZJeff Laytonjlayton@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li></ul>