Bug #39938
Closed
Issues with CephFS kernel driver
Description
I described the problem pretty thoroughly at https://marc.info/?l=ceph-devel&m=155786104524387&w=2 but I'm not sure it reached the right audience.
I'll open a bug here as well.
// Patrik
Updated by Ilya Dryomov almost 5 years ago
- Project changed from Ceph to Linux kernel client
- Category changed from ceph cli to fs/ceph
- Assignee set to Zheng Yan
- Priority changed from Normal to Urgent
Updated by Zheng Yan almost 5 years ago
Thanks for reporting this. This seems like a splice write bug.
ssize_t ceph_file_splice_write(struct pipe_inode_info *pipe,
                               struct file *out, loff_t *ppos,
                               size_t len, unsigned int flags)
{
    ssize_t ret;
    struct inode *inode = file_inode(out);
    struct ceph_inode_info *ci = ceph_inode(inode);
    struct ceph_file_info *fi = out->private_data;
    int got, want;

    if (fi->fmode & CEPH_FILE_MODE_LAZY)
        want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO;
    else
        want = CEPH_CAP_FILE_BUFFER;

    ret = ceph_get_caps(ci, CEPH_CAP_FILE_WR, want, *ppos + len, &got, NULL);
    if (ret < 0)
        return ret;

    if (!(got & want)) {
        ceph_put_cap_refs(ci, got);
        return default_file_splice_write(pipe, out, ppos, len, flags);
    }

    ret = generic_file_splice_write(pipe, out, ppos, len, flags);
    ceph_put_cap_refs(ci, got);
    return ret;
}
This function needs to call __ceph_mark_dirty_caps() before ceph_put_cap_refs(). Try configuring GitLab not to use splice write, or use an upstream kernel.
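For anyone who wants to exercise the splice write path from user space, the following is a minimal sketch, not the reporter's reproducer. It pushes a small buffer through a pipe into a file with splice(2), then reads the file back and compares. The helper name splice_roundtrip and the 4096-byte limit are assumptions made for illustration. Whether a same-client read-back like this is enough to show the reported behaviour is not established in this thread.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes of `buf` to `path` via a pipe and
 * splice(2), then read the file back and compare.
 * Returns 0 if the contents match, -1 on any error or mismatch.
 */
int splice_roundtrip(const char *path, const char *buf, size_t len)
{
    char back[4096];
    int pfd[2];
    int fd;
    int rc = -1;
    size_t done = 0;

    if (len > sizeof(back))
        return -1;
    if (pipe(pfd) < 0)
        return -1;

    fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        goto out_pipe;

    /* Queue the payload in the pipe; it is small enough not to block. */
    if (write(pfd[1], buf, len) != (ssize_t)len)
        goto out_file;

    /* Move pipe -> file; splice() may move fewer bytes than requested. */
    while (done < len) {
        ssize_t n = splice(pfd[0], NULL, fd, NULL, len - done, 0);
        if (n <= 0)
            goto out_file;
        done += (size_t)n;
    }
    if (fsync(fd) < 0)
        goto out_file;
    close(fd);

    /* Reopen and read the data back through the ordinary read path. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto out_pipe;
    done = 0;
    while (done < len) {
        ssize_t n = read(fd, back + done, len - done);
        if (n <= 0)
            goto out_file;
        done += (size_t)n;
    }
    if (memcmp(back, buf, len) == 0)
        rc = 0;

out_file:
    close(fd);
out_pipe:
    close(pfd[0]);
    close(pfd[1]);
    return rc;
}
```

To test a CephFS mount, call the helper with a path on that mount, for example splice_roundtrip("/mnt/cephfs/testfile", "hello", 5), where the mount point is a placeholder.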
Updated by Patrik Martinsson almost 5 years ago
Zheng Yan wrote:
Thanks for reporting this. This seems like a splice write bug.
[...]
This function needs to call __ceph_mark_dirty_caps() before ceph_put_cap_refs(). Try configuring GitLab not to use splice write, or use an upstream kernel.
I see. Hm, seems quite serious.
Try configuring GitLab not to use splice write
That seems like a "low-level" thing. I'm pretty sure you can't configure the application to do so. Nonetheless, splice() will need to work correctly, no?
using upstream kernel.
We are using RHEL 7.6, and I've tried the ELRepo kernels (kernel-lt-4.4.178-1.el7.elrepo.x86_64.rpm, kernel-ml-5.1.1-1.el7.elrepo.x86_64.rpm), but then I get the following on the client: "mon0 xx: feature set mismatch, my XXXXXX < server's XXXXXX, missing 40000000".
Are you saying that this is fixed in a newer kernel driver than the one included in RHEL 7.6?
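The "missing" value in a feature set mismatch message can be turned into bit positions to see which feature the client lacks. The sketch below assumes the value is a hexadecimal bit mask; the helper name feature_bits is hypothetical. Mapping a bit position to a feature name still requires looking it up in the feature definitions (include/linux/ceph/ceph_features.h in the kernel tree, or src/include/ceph_features.h in the Ceph tree).

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse a hexadecimal feature mask and store the index
 * of every set bit in `bits` (at most `max` entries).
 * Returns the number of bit indices stored, or -1 if `hex` is not a valid
 * hexadecimal number.
 */
int feature_bits(const char *hex, int *bits, int max)
{
    char *end;
    unsigned long long mask;
    int count = 0;

    mask = strtoull(hex, &end, 16);
    if (end == hex || *end != '\0')
        return -1;

    for (int bit = 0; bit < 64 && count < max; bit++) {
        if (mask & (1ULL << bit))
            bits[count++] = bit;
    }
    return count;
}
```

Under that assumption, the mask 40000000 from the message above has a single bit set, bit 30.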
Updated by Zheng Yan almost 5 years ago
That's strange. The 5.1 kernel is much newer than Luminous, so it should support all Ceph features.
Updated by Jeff Layton over 2 years ago
- Status changed from New to Resolved
I believe this has been fixed upstream for quite some time.