Bug #44813
Sendfile on CephFS results in 0 bytes of data on the other node
Description
Hi,
I use the sendfile function to write data on CephFS (e.g. https://github.com/pijewski/sendfile-example/). I need to write data from one file descriptor to another without copying any data into user space.
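For reference, a minimal sketch of this kind of descriptor-to-descriptor copy on Linux. It is not the code from the linked repository; copy_with_sendfile is a hypothetical helper name, and the 0644 output mode is an assumption made for the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst with sendfile(2), so the data
 * never passes through a user-space buffer.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_with_sendfile(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    if (fstat(in, &st) < 0) {
        close(in);
        return -1;
    }

    /* Assumption: create or truncate the destination with mode 0644. */
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    off_t offset = 0; /* sendfile advances this as it reads from 'in' */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out, in, &offset, (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted, retry */
            break;
        }
        if (n == 0)
            break; /* source ended earlier than expected */
    }

    close(in);
    if (close(out) < 0)
        return -1;

    return offset == st.st_size ? (ssize_t)offset : -1;
}
```

Calling copy_with_sendfile("10m-file", "10m-file.out") would correspond roughly to the ./sendfile invocation in the transcript that follows.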
On ceph node1 (mgr, mon, osd, mds) the file is 10 MB in size: OK
[root@node1 sendfile-example]# ls
10m-file Makefile README.md sendfile sendfile.c
[root@node1 sendfile-example]# ./sendfile 10m-file 10m-file.out $(( 10 * 1024 * 1024 ))
Sent 10240 KiB over sendfile(3EXT) of 10240 KiB requested
[root@node1 sendfile-example]# ll
total 20492
-rw-r--r-- 1 root root 10485760 Mar 27 11:24 10m-file
------x--- 1 root root 10485760 Mar 30 14:14 10m-file.out
On node2 (mgr, mon, osd, mds) the file is 0 bytes in size: not OK
[root@node2 sendfile-example]# ll
total 10252
-rw-r--r-- 1 root root 10485760 Mar 27 11:24 10m-file
------x--- 1 root root 0 Mar 30 14:14 10m-file.out
I am using CentOS 7.7 with the latest kernel update (3.10.x).
To force the data to be updated on the ceph client, I need to run touch on the file.
It is reproducible on ceph 12.x, 14.x and 15.x.
To work around the problem I installed kernel-lt from ELRepo, which is at version 4.4. It seems to work fine with that kernel version.
I suspect that ceph.ko or libceph.ko does not work properly in kernel 3.10.
1/ Is it possible to confirm the bug?
2/ Is there anything on the roadmap to update the ceph client in kernel 3.10? I cannot use the ELRepo 4.4 kernel because this is a production environment.
Thanks for your help.
Best regards
History
#1 Updated by Nicolas Gaston almost 4 years ago
Possibly the same root cause as https://tracker.ceph.com/issues/39938
#2 Updated by Nicolas Gaston almost 4 years ago
Hi, I have just tried to reproduce the issue on CentOS 8.1, and the bug is fixed with kernel 4.18.
It only occurs on ceph clients running CentOS 7.x with the 3.10 kernel.
I can't upgrade my production systems to CentOS 8 for the moment. Is there any plan to fix the bug in kernel 3.10? In CentOS 7.8? 7.9?
Thanks
Regards
#3 Updated by Mikael Öhman almost 4 years ago
Hi Nicolas,
I encountered this a while back and asked on the mailing list. Luckily, Jeff Layton @ Red Hat provided a patch that was slated for inclusion in RHEL 7.9.
I'm running a patched 3.10.0-1062.18.1 kernel on CentOS 7.7 now, and it fixes the issue for my clusters.
To quote the patch:
This patch is RHEL-only because ceph_file_splice_write() is unique to
RHEL7 kernel. Upstream kernels uses new interface for splice write.
#4 Updated by Jeff Layton almost 4 years ago
- Status changed from New to Resolved
- Assignee set to Jeff Layton
Yes, should be fixed in latest RHEL release. See: https://bugzilla.redhat.com/show_bug.cgi?id=1710751
#5 Updated by Nicolas Gaston almost 4 years ago
Yes, I have tested it on CentOS 7.8 CR and it works fine.
Thank you very much