Bug #36317

fallocate implementation on the kernel cephfs client

Added by Luis Henriques about 1 year ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature:

Description

I remember seeing a comment somewhere (mailing list?) about this but couldn't find any reference to the issue so I decided to open a bug.

The problem: fallocate doesn't seem to be doing what it's supposed to do. I haven't been able to spend time looking at the code to understand the details, but here's a summary of the issue, on a very small test cluster:

node5:~ # df -Th /mnt
Filesystem        Type  Size  Used Avail Use% Mounted on
192.168.122.101:/ ceph   14G  228M   14G   2% /mnt

So, I have ~14G available and fallocate a big file:

node5:/mnt # xfs_io -f -c "falloc 0 1T" hugefile
node5:/mnt # ls -lh
total 1.0T
-rw------- 1 root root 1.0T Oct  4 14:17 hugefile
drwxr-xr-x 2 root root    6 Oct  4 14:17 mydir

I would expect this to fail. Instead it succeeds, and the available space hasn't changed:
node5:/mnt # df -Th /mnt
Filesystem        Type  Size  Used Avail Use% Mounted on
192.168.122.101:/ ceph   14G  228M   14G   2% /mnt

Anyway, a successful call to fallocate(2) should mean that "subsequent writes into the range specified by offset and len are guaranteed not to fail because of lack of disk space", which isn't going to be the case in the above example.

I guess that a fix for this would require a CEPH_MSG_STATFS to the monitors to get the actual free space. But as I said, I haven't spent too much time looking at the problem.
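To make the mismatch concrete, here is a minimal userspace sketch of the kind of free-space comparison described above. It is not what the kernel client does: it uses statvfs(3) from userspace rather than a CEPH_MSG_STATFS request, and the helper names `free_bytes_at` and `request_fits` are hypothetical. Even a passing check would not be a real reservation, since the free space can change right after it is sampled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper: bytes available to unprivileged users on the
 * filesystem containing `path`. Returns 0 on success, -1 on error
 * (errno is set by statvfs). */
static int free_bytes_at(const char *path, uint64_t *out)
{
    struct statvfs st;

    assert(out != NULL);
    if (statvfs(path, &st) != 0)
        return -1;
    /* f_bavail is counted in units of f_frsize. */
    *out = (uint64_t)st.f_bavail * (uint64_t)st.f_frsize;
    return 0;
}

/* Hypothetical helper: 1 if a preallocation of `len` bytes fits into
 * `free_bytes`, 0 otherwise. */
static int request_fits(uint64_t free_bytes, uint64_t len)
{
    return len <= free_bytes;
}

/* Usage with the numbers from this report: ~14G available, 1T requested. */
static void example_usage(void)
{
    assert(request_fits(14ULL << 30, 1ULL << 40) == 0);
}
```

With the figures from the df output above, the 1T request does not fit into roughly 14G, so a check of this kind would have rejected the falloc call instead of letting it succeed.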

History

#1 Updated by Xiaoxi Chen about 1 year ago

Well, but since cephfs potentially shares the same hardware with other pools, I'm not sure how this kind of reservation can happen unless we zero the range.

#2 Updated by Zheng Yan about 1 year ago

  • Assignee set to Zheng Yan

#3 Updated by Patrick Donnelly about 1 year ago

I think the way forward is to only support punch hole in fallocate for kcephfs. Same for ceph-fuse.

#4 Updated by Luis Henriques about 1 year ago

Patrick Donnelly wrote:

I think the way forward is to only support punch hole in fallocate for kcephfs. Same for ceph-fuse.

This would mean that a patch (for the kernel client) will be pretty easy to put together. I'll send out something to the mailing list soon. Oh, and I guess this patch should be tagged for stable kernels as well.

Update: I've just sent out a patch [1] to drop support for all fallocate(2) operations but FALLOC_FL_PUNCH_HOLE.

[1] https://marc.info/?l=ceph-devel&m=153910762409314&w=2
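As a rough illustration of what "only support punch hole" means for callers, the following userspace sketch filters fallocate(2) modes. It is an assumption about the shape of such a check, not a copy of the patch; the helper name `check_fallocate_mode` is hypothetical. It relies on the fact, documented in fallocate(2), that FALLOC_FL_PUNCH_HOLE must be ORed with FALLOC_FL_KEEP_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <linux/falloc.h>

/* Hypothetical helper: returns 0 if `mode` is exactly a punch-hole
 * request, -EOPNOTSUPP for every other fallocate mode (including
 * plain preallocation, mode 0). */
static int check_fallocate_mode(int mode)
{
    if (mode != (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE))
        return -EOPNOTSUPP;
    return 0;
}

/* Usage: plain preallocation is refused, punch hole is accepted. */
static void example_usage(void)
{
    assert(check_fallocate_mode(0) == -EOPNOTSUPP);
    assert(check_fallocate_mode(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE) == 0);
}
```

Under a check like this, the `falloc 0 1T` request from the description (mode 0) would fail with EOPNOTSUPP rather than appearing to reserve space that does not exist.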

#5 Updated by Patrick Donnelly about 1 year ago

  • Project changed from fs to Linux kernel client
  • Status changed from New to 7

#6 Updated by Zheng Yan 11 months ago

  • Status changed from 7 to Resolved

commit bddff633ab7bc60a18a86ac8b322695b6f8594d0
