Bug #36317
fallocate implementation on the kernel cephfs client
Description
I remember seeing a comment somewhere (mailing list?) about this but couldn't find any reference to the issue so I decided to open a bug.
The problem: fallocate doesn't seem to be doing what it's supposed to do. I haven't been able to spend time looking at the code to understand the details, but here's a summary of the issue, on a very small test cluster:
node5:~ # df -Th /mnt
Filesystem         Type  Size  Used Avail Use% Mounted on
192.168.122.101:/  ceph   14G  228M   14G   2% /mnt
So, I have ~14G available and fallocate a big file:
node5:/mnt # xfs_io -f -c "falloc 0 1T" hugefile
node5:/mnt # ls -lh
total 1.0T
-rw------- 1 root root 1.0T Oct  4 14:17 hugefile
drwxr-xr-x 2 root root    6 Oct  4 14:17 mydir
I would expect this to fail, but it succeeds, and the available space hasn't changed:
node5:/mnt # df -Th /mnt
Filesystem         Type  Size  Used Avail Use% Mounted on
192.168.122.101:/  ceph   14G  228M   14G   2% /mnt
Anyway, a successful call to fallocate(2) should mean that "subsequent writes into the range specified by offset and len are guaranteed not to fail because of lack of disk space", which isn't going to be the case in the example above.
I guess that a fix for this would require a CEPH_MSG_STATFS to the monitors to get the actual free space. But as I said, I haven't spent too much time looking at the problem.
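The free-space check suggested above can be illustrated from user space. This is only a sketch: it uses os.statvfs() as a stand-in for the CEPH_MSG_STATFS round trip, the helper name would_overcommit is made up for illustration, and it says nothing about how the kernel client is actually structured.

```python
import os


def would_overcommit(path: str, length: int) -> bool:
    """Return True if preallocating `length` bytes at `path` exceeds free space.

    Hypothetical helper: os.statvfs() stands in for asking the cluster
    how much space is actually available.
    """
    st = os.statvfs(path)
    # f_bavail counts the blocks available to unprivileged users;
    # f_frsize is the unit those block counts are expressed in.
    available = st.f_bavail * st.f_frsize
    return length > available


if __name__ == "__main__":
    one_tib = 1 << 40
    print(would_overcommit(".", one_tib))
```

On the 14G mount shown earlier, would_overcommit("/mnt", 1 << 40) would be expected to return True, which is the case where a preallocating fallocate(2) ought to fail with ENOSPC.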
History
#1 Updated by Xiaoxi Chen over 5 years ago
Well, but since CephFS potentially shares the same hardware with other pools, I'm not sure how this kind of reservation could happen unless we zeroed the range.
#2 Updated by Zheng Yan over 5 years ago
- Assignee set to Zheng Yan
#3 Updated by Patrick Donnelly over 5 years ago
I think the way forward is to only support punch hole in fallocate for kcephfs. Same for ceph-fuse.
#4 Updated by Luis Henriques over 5 years ago
Patrick Donnelly wrote:
I think the way forward is to only support punch hole in fallocate for kcephfs. Same for ceph-fuse.
This would mean that a patch (for the kernel client) will be pretty easy to put together. I'll send something out to the mailing list soon. Oh, and I guess this patch should be tagged for stable kernels as well.
Update: I've just sent out a patch [1] to drop support for all fallocate(2) operations but FALLOC_FL_PUNCH_HOLE.
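For illustration, the behaviour described in the update can be modelled as a simple mode filter. This is a sketch, not the patch itself: the function name check_fallocate_mode and the choice of EOPNOTSUPP as the error returned are assumptions. The flag values are the ones defined in <linux/falloc.h>.

```python
import errno

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def check_fallocate_mode(mode: int) -> int:
    """Return 0 if `mode` is accepted, or a negative errno otherwise.

    Illustrative model only; not the kernel client's code.
    """
    # fallocate(2) requires PUNCH_HOLE to be ORed with KEEP_SIZE,
    # so that combination is the only mode accepted here.
    if mode != (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE):
        return -errno.EOPNOTSUPP
    return 0
```

Under this model a plain preallocation (mode 0, as issued by the "falloc 0 1T" command in the description) is rejected instead of silently succeeding without reserving any space.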
#5 Updated by Patrick Donnelly over 5 years ago
- Project changed from CephFS to Linux kernel client
- Status changed from New to 7
#6 Updated by Zheng Yan about 5 years ago
- Status changed from 7 to Resolved
commit bddff633ab7bc60a18a86ac8b322695b6f8594d0