Bug #62664

open

ceph-fuse: failed to remount for kernel dentry trimming; quitting!

Added by Rodrigo Arias 8 months ago. Updated 4 days ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport: reef,squid
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

While #62604 is being addressed I wanted to try the ceph-fuse client. I'm using the same setup with kernel 6.4.11 and ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable).

The ceph-fuse client fails to remount the FS to trim dentries:

hut# ceph-fuse -f -n client.user -m 10.0.40.40 /ceph2
2023-08-31T14:08:31.479+0200 7fcc01e9e3c0 -1 init, newargv = 0x55777ee84dc0 newargc=16
ceph-fuse[1342578]: starting ceph client
ceph-fuse[1342578]: starting fuse
mount: /ceph2: mount point not mounted or bad option.
       dmesg(1) may have more information after failed mount system call.
2023-08-31T14:08:31.501+0200 7fcbd7fff6c0 -1 client.494146 failed to remount (to trim kernel dentries): return code = 32
2023-08-31T14:08:31.501+0200 7fcbd7fff6c0 -1 client.494146 failed to remount for kernel dentry trimming; quitting!
mount: /ceph2: mount point not mounted or bad option.
       dmesg(1) may have more information after failed mount system call.
2023-08-31T14:08:32.512+0200 7fcbd7fff6c0 -1 client.494146 failed to remount (to trim kernel dentries): return code = 32
2023-08-31T14:08:32.512+0200 7fcbd7fff6c0 -1 client.494146 failed to remount for kernel dentry trimming; quitting!
mount: /ceph2: mount point not mounted or bad option.
       dmesg(1) may have more information after failed mount system call.
2023-08-31T14:08:33.522+0200 7fcbd7fff6c0 -1 client.494146 failed to remount (to trim kernel dentries): return code = 32
2023-08-31T14:08:33.522+0200 7fcbd7fff6c0 -1 client.494146 failed to remount for kernel dentry trimming; quitting!
mount: /ceph2: mount point not mounted or bad option.
       dmesg(1) may have more information after failed mount system call.
2023-08-31T14:08:34.533+0200 7fcbd7fff6c0 -1 client.494146 failed to remount (to trim kernel dentries): return code = 32
2023-08-31T14:08:34.533+0200 7fcbd7fff6c0 -1 client.494146 failed to remount for kernel dentry trimming; quitting!
mount: /ceph2: mount point not mounted or bad option.
       dmesg(1) may have more information after failed mount system call.
2023-08-31T14:08:35.543+0200 7fcbd7fff6c0 -1 client.494146 failed to remount (to trim kernel dentries): return code = 32
2023-08-31T14:08:35.543+0200 7fcbd7fff6c0 -1 client.494146 failed to remount for kernel dentry trimming; quitting!
ceph-fuse[1342578]: fuse failed dentry invalidate/remount test with error (32) Broken pipe, stopping
/build/ceph-18.2.0/src/ceph_fuse.cc: In function 'virtual void* main(int, const char**, const char**)::RemountTest::entry()' thread 7fcbd7fff6c0 time 2023-08-31T14:08:36.554371+0200
/build/ceph-18.2.0/src/ceph_fuse.cc: 243: ceph_abort_msg("abort() called")
...

I attach the full strace log. The failure seems to be in the mount command, in the fsconfig() call with the "allow_other" option:

1343504 access("/run/mount/utab", R_OK|W_OK) = 0
1343504 fspick(3, "", FSPICK_NO_AUTOMOUNT|FSPICK_EMPTY_PATH) = 4
1343504 fsconfig(4, FSCONFIG_SET_FLAG, "allow_other", NULL, 0) = -1 EINVAL (Invalid argument)
1343504 close(4)                        = 0
1343504 close(3)                        = 0

Here is my /etc/fuse.conf:

#user_allow_other
mount_max = 1000

The same thing happens if I enable user_allow_other.

Setting this option in /etc/ceph.conf:

[client]
client_die_on_failed_dentry_invalidate=false

allows me to mount the FS anyway and run some I/O benchmarks with no apparent problems, but I don't know the implications of that option.


Files

ceph-fuse.log (629 KB) Rodrigo Arias, 08/31/2023 12:24 PM
Actions #1

Updated by Xiubo Li 5 months ago

  • Project changed from Linux kernel client to CephFS
  • Category deleted (fs/ceph)
  • Assignee set to Xiubo Li
Actions #2

Updated by Ilya Dryomov 4 months ago

  • Target version deleted (v18.2.0)
Actions #3

Updated by Jakob Haufe 6 days ago

This is related to https://github.com/util-linux/util-linux/issues/2576 and will happen on any system with util-linux/libmount 2.38 or later and a kernel supporting the new fsconfig() system call, which has been available since 5.2.

The current workaround is to set the LIBMOUNT_FORCE_MOUNT2 environment variable to always.
This could be added to remount_cb in src/client/fuse_ll.cc.

Note that this (currently) affects all versions of ceph-fuse.
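
To illustrate the idea, here is a minimal sketch of how a remount callback could build its command line with the variable set. This is not the actual Ceph code: build_remount_command is a hypothetical helper, and the "mount -i -o remount" invocation is an assumption about what the callback runs. Only the LIBMOUNT_FORCE_MOUNT2=always setting comes from the workaround above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Ceph): returns a shell command that a
// remount callback could hand to system(). The "mount -i -o remount"
// invocation is an assumption made for illustration.
std::string build_remount_command(const std::string& mountpoint) {
  assert(!mountpoint.empty());
  // Prefixing the assignment sets the variable only for the mount process,
  // asking libmount to use the classic mount(2) path instead of fsconfig().
  // Note: the mountpoint is wrapped in single quotes; embedded single
  // quotes are not escaped in this sketch.
  return "LIBMOUNT_FORCE_MOUNT2=always mount -i -o remount '" + mountpoint + "'";
}
```

For the mount point from the report, build_remount_command("/ceph2") yields a string that could be passed to system(). Exporting the variable in the environment before starting ceph-fuse should have a similar effect, provided the mount command is run as a child process that inherits that environment.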

Actions #4

Updated by Jakob Haufe 5 days ago

I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170

Is there any Jenkins test that runs on a base system new enough to be affected by this issue?

Actions #5

Updated by Venky Shankar 4 days ago

  • Category set to Correctness/Safety
  • Status changed from New to Fix Under Review
  • Target version set to v20.0.0
  • Backport set to reef,squid
  • Pull request ID set to 57170
  • Component(FS) ceph-fuse added

Jakob Haufe wrote in #note-4:

I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170

Is there any Jenkins test that runs on a base system new enough to be affected by this issue?

Thanks for the PR. I'll have to check the systems we run our fs suites on w.r.t. the versions of the libs used.
