Bug #64467

open

cephadm deployed NFS Ganesha clusters should disable attribute and dir caching

Added by Wes Dillingham 3 months ago. Updated 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: cephadm
Target version: -
% Done: 0%
Source:
Tags: cephadm
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I believe cephadm should, by default, create the following stanza in the NFS Ganesha config:

MDCACHE {
    # Size the dirent cache down as small as possible.
    Dir_Chunk = 0;
}

This config parameter was found to be missing in a cephadm-deployed NFS cluster running 17.2.5.
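One way to check a deployment for this setting is to scan the text of the Ganesha config the daemon actually uses. Below is a minimal Python sketch. The function name `mdcache_dir_chunk` is hypothetical, and it assumes the MDCACHE block contains no nested blocks and that comments start with `#`. Case-insensitive matching is a convenience of the sketch, not a statement about how Ganesha parses its config.

```python
import re


def mdcache_dir_chunk(conf_text):
    """Return the Dir_Chunk value set inside an MDCACHE block, or None if absent.

    Hypothetical helper: assumes MDCACHE has no nested blocks.
    """
    # Drop '#' comments so a commented-out setting is not counted.
    stripped = re.sub(r"#.*", "", conf_text)
    block = re.search(r"\bMDCACHE\s*\{(.*?)\}", stripped, re.DOTALL | re.IGNORECASE)
    if block is None:
        return None
    setting = re.search(r"\bDir_Chunk\s*=\s*(\d+)\s*;", block.group(1), re.IGNORECASE)
    return int(setting.group(1)) if setting else None


if __name__ == "__main__":
    sample = """
    MDCACHE {
        # Size the dirent cache down as small as possible.
        Dir_Chunk = 0;
    }
    """
    print(mdcache_dir_chunk(sample))
```

A return value of `None` means the stanza (or the parameter inside it) is not present in the text that was passed in.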

The docs (https://docs.ceph.com/en/quincy/cephfs/nfs/#nfs-ganesha-configuration) point to a sample Ganesha config: https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ceph.conf

That sample gives the following comment about the config option (Dir_Chunk = 0):

# The libcephfs client will aggressively cache information while it
# can, so there is little benefit to ganesha actively caching the same
# objects. Doing so can also hurt cache coherency. Here, we disable
# as much attribute and directory caching as we can.

I experienced this first-hand on a CentOS 7 NFS client, which was logging the following error:

Feb 14 13:01:23 c2518 kernel: NOTICE  NFS: directory conn_OCD_ABCD_test.qlog/240205150453696 contains a readdir loop.Please contact your server vendor.  The file: node.0564240205150453696.mat has duplicate cookie 1150492533365669891
Feb 14 13:01:39 c2307 kernel: NOTICE  NFS: directory conn_OCD_ABCD_test.qlog/240205150453696 contains a readdir loop.Please contact your server vendor.  The file: node.1080240205150453696.sh has duplicate cookie 1148856723398721539
Feb 14 13:01:39 c2307 kernel: NOTICE  NFS: directory conn_OCD_ABCD_test.qlog/240205150453696 contains a readdir loop.Please contact your server vendor.  The file: node.1080240205150453696.sh has duplicate cookie 1148856723398721539
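To see which directories and files are affected across many clients, notices like the ones above can be tallied. The following is a rough Python sketch, not a vetted tool. The names `READDIR_LOOP` and `duplicate_cookies` are hypothetical, and the regular expression assumes the kernel message has exactly the wording shown in the lines above.

```python
import re
from collections import Counter

# Matches the kernel notice wording seen in the log lines above (assumed stable).
READDIR_LOOP = re.compile(
    r"NFS: directory (?P<directory>\S+) contains a readdir loop\."
    r".*?The file: (?P<name>\S+) has duplicate cookie (?P<cookie>\d+)"
)


def duplicate_cookies(log_lines):
    """Count (directory, file, cookie) triples reported in readdir-loop notices."""
    hits = Counter()
    for line in log_lines:
        match = READDIR_LOOP.search(line)
        if match:
            key = (match["directory"], match["name"], int(match["cookie"]))
            hits[key] += 1
    return hits
```

Passing the lines of a syslog file to `duplicate_cookies` returns a `Counter` keyed by directory, file name, and cookie, so repeated reports of the same entry show up as counts greater than one.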

While trying to diagnose the issue, I found the following comment in a gluster bug: https://bugzilla.redhat.com/show_bug.cgi?id=1701016#c7

In /etc/ganesha/ganesha.conf

Disabling readdir chunking will make readdir work again

I believe this parameter has both functional implications (the readdir loop errors above) and performance implications.

Actions #1

Updated by John Mulligan 3 months ago

  • Project changed from Ceph to Orchestrator
  • Category set to cephadm

Actions #2

Updated by John Mulligan 2 months ago

I've been told by one of the ganesha team members that this isn't a good idea, but I don't have an explanation why. I'm still waiting for more feedback on that end, FWIW.
