Bug #8140

closed

0.79: MDS / CephFS: unable to read directory

Added by Dmitry Smirnov about 10 years ago. Updated about 10 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other

Description

With the kernel client, I got the following error when I attempted to list the files in a directory containing 1021 files:

ls -l /mnt/ceph/home/user/.config/chromium/Default/Local\ Storage
ls: reading directory /mnt/ceph/home/user/.config/chromium/Default/Local Storage: Cannot allocate memory

With the FUSE client, a similar command hung indefinitely without returning any output.

Restarting the active MDS fixed the issue.

MDS configuration:

    mds cache size             = 400000
    mds mem max                = 4194304

I found no errors in the logs.

Linux-3.13.7 x86_64.

Actions #1

Updated by Zheng Yan about 10 years ago

  • Tracker changed from Bug to Feature
  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Status changed from New to Resolved

This issue should be fixed by commit 54008399 (ceph: preallocate buffer for readdir reply). On older kernels, you can avoid this issue by using the readdir_max_entries mount option (for example: mount -t ceph 1.2.3.4:/ /mnt/ceph -o readdir_max_entries=64).
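To confirm that the workaround is actually in effect on a mounted client, one option is to look for readdir_max_entries in the mount's option string. The following is a minimal sketch, not part of the original report: the helper name readdir_max_entries_of is hypothetical, and it assumes the kernel lists the option in /proc/mounts when it is set to a non-default value.

```shell
#!/bin/sh
# Hypothetical helper: print the value of readdir_max_entries found in a
# comma-separated mount option string, or print nothing if it is absent.
readdir_max_entries_of() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^readdir_max_entries=//p'
}

# Example with a literal option string.
readdir_max_entries_of "rw,relatime,readdir_max_entries=64"

# On a live system the option string could be taken from /proc/mounts
# (assumption: the option is reported there), for example:
#   opts=$(awk '$2 == "/mnt/ceph" { print $4 }' /proc/mounts)
#   readdir_max_entries_of "$opts"
```

If the helper prints nothing for the CephFS mount point, the option was probably not applied and the mount command is worth rechecking.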

Actions #2

Updated by Dmitry Smirnov about 10 years ago

Which kernel version contains this fix?

Actions #3

Updated by Dmitry Smirnov about 10 years ago

Sorry, that can't be right. First of all, I can't find this commit. Could you please give the correct commit ID? I'd like to test this fix.
Also, how do you explain that restarting the MDS fixed the issue for both the FUSE and the kernel client? Even if the mount option fixes this issue for the kernel client, the FUSE/MDS problem won't be affected...

Actions #4

Updated by Dmitry Smirnov about 10 years ago

I found the commit in ceph-client: https://github.com/ceph/ceph-client/commit/54008399dc0ce511a07b87f1af3d1f5c791982a4

I see it is a kernel fix, but I think we have a problem in the MDS here: when I got this error with the kernel client, I tried to read the same directory using the FUSE client, and I couldn't until I restarted the MDS. Therefore I doubt this problem is resolved...

Actions #5

Updated by Zheng Yan about 10 years ago

  • Tracker changed from Feature to Bug

When -ENOMEM happens, the kclient does not properly release (cache-coherence-related) resources. That's why ceph-fuse hangs when accessing the same directory.

Actions #6

Updated by Dmitry Smirnov about 10 years ago

Makes sense, thank you for explaining.

Actions #7

Updated by Zheng Yan about 10 years ago

The fix is in kernel 3.15. For now, please use the readdir_max_entries mount option.
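A quick way to decide whether a given client still needs the workaround is to compare its kernel version against 3.15. This is only a sketch under stated assumptions: the function name kernel_has_readdir_fix is hypothetical, it relies on the -V (version sort) option of GNU sort, and it looks only at the version number, so it cannot tell whether a distribution kernel has the fix backported.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given kernel version
# is 3.15 or newer, the release said in this thread to contain the fix.
kernel_has_readdir_fix() {
    ver=${1%%-*}    # strip a local suffix such as "-amd64"
    # Version-sort the two strings; if 3.15 sorts first, ver is >= 3.15.
    lowest=$(printf '%s\n%s\n' "3.15" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "3.15" ]
}

# Example: the reporter's kernel (3.13.7) versus the fixed release.
for v in 3.13.7 3.15; do
    if kernel_has_readdir_fix "$v"; then
        echo "$v: fix expected, workaround should not be needed"
    else
        echo "$v: keep using readdir_max_entries"
    fi
done
```

On a running client the argument would typically be the output of uname -r.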

Actions #8

Updated by Dmitry Smirnov about 10 years ago

Already using it. :) Thanks for the useful advice. Very helpful.
