Bug #58394

open

nofail option in fstab not supported

Added by Brian Woods over 1 year ago. Updated 8 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags: backport_processed
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There are several old bug reports on this from 2019, but they should all have been resolved. However, testing on a 17.2.4 cluster shows the same issue.

Line from fstab:

none    /CephFS  fuse.ceph ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail,defaults  0 0

With the "nofail" option removed, everything works as expected.

Full debug output from mount:

# LIBMOUNT_DEBUG=0xffff  mount CephFS/
2491540: libmount:     INIT: library debug mask: 0xffff
2491540: libmount:     INIT: library version: 2.34.0
2491540: libmount:     INIT:     feature: selinux
2491540: libmount:     INIT:     feature: smack
2491540: libmount:     INIT:     feature: btrfs
2491540: libmount:     INIT:     feature: namespaces
2491540: libmount:     INIT:     feature: assert
2491540: libmount:     INIT:     feature: debug
Available "LIBMOUNT_DEBUG=<name>[,...]|<mask>" debug masks:
   all      [0xffff] : info about all subsystems
   cache    [0x0004] : paths and tags cache
   cxt      [0x0200] : library context (handler)
   diff     [0x0400] : mountinfo changes tracking
   fs       [0x0040] : FS abstraction
   help     [0x0001] : this help
   locks    [0x0010] : mtab and utab locking
   loop     [0x2000] : loop devices routines
   options  [0x0008] : mount options parsing
   tab      [0x0020] : fstab, mtab, mountinfo routines
   update   [0x0080] : mtab, utab updates
   utils    [0x0100] : misc library utils
   monitor  [0x0800] : mount tables monitor
   btrfs    [0x1000] : btrfs specific routines
2491540: libmount:      CXT: [0x564bdbc0da30]: ----> allocate 
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: preparing
2491540: libmount:      CXT: [0x564bdbc0da30]: use default optsmode
2491540: libmount:      CXT: [0x564bdbc0da30]: OPTSMODE: ignore=0, append=0, prepend=1, replace=0, force=0, fstab=1, mtab=1
2491540: libmount:      CXT: [0x564bdbc0da30]: trying to apply fstab (src=(null), target=CephFS/)
2491540: libmount:      TAB: [0x564bdbc0dc60]: alloc
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: alloc
2491540: libmount:      TAB: [0x564bdbc0dc60]: /etc/fstab: start parsing [entries=0, filter=not]
2491540: libmount:      TAB: [0x564bdbc0dc60]: add entry: /dev/disk/by-id/dm-uuid-LVM-wjhfyutui679RTcte75I5qxq5hTAxnlunnWAT5lws0eJg0IBAWCf0yBaJN2acA35 /
2491540: libmount:      TAB: [0x564bdbc0dc60]: add entry: /swap.img none
2491540: libmount:      TAB: [0x564bdbc0dc60]: add entry: UUID=1c515a96-53b5-47dc-8e90-8fa9eaa7c4da /boot
2491540: libmount:      TAB: [0x564bdbc0dc60]: add entry: UUID=AF3B-BEA8 /boot/efi
2491540: libmount:      TAB: [0x564bdbc0dc60]: add entry: none /CephFS
2491540: libmount:       FS: [0x564bdbc0f690]: free [refcount=0]
2491540: libmount:      TAB: [0x564bdbc0dc60]: /etc/fstab: stop parsing (5 entries)
2491540: libmount:      TAB: [0x564bdbc0dc60]: parsing done [filename=/etc/fstab, rc=0]
2491540: libmount:      TAB: [0x564bdbc0dc60]: lookup TARGET: 'CephFS/'
2491540: libmount:      TAB: [0x564bdbc0dc60]: lookup absolute TARGET: '//CephFS/'
2491540: libmount:      CXT: [0x564bdbc0da30]: apply entry:
2491540: libmount:      CXT: ------ fs:
source: none
target: /CephFS
fstype: fuse.ceph
optstr: ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail,defaults
FS-opstr: ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf
user-optstr: _netdev,nofail
2491540: libmount:      CXT: [0x564bdbc0da30]: merging mount flags
2491540: libmount:      CXT: [0x564bdbc0da30]: final flags: VFS=00000000 user=00000480
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: evaluating permissions
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: fixing optstr
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: fixing vfs optstr
2491540: libmount:      CXT: applying 0x00000000 flags to '(null)'
2491540: libmount:      CXT: new optstr 'rw'
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: fixing user optstr
2491540: libmount:      CXT: applying 0x00001152 flags to '_netdev,nofail'
2491540: libmount:      CXT: new optstr '_netdev,nofail'
2491540: libmount:      CXT: [0x564bdbc0da30]: fixed options [rc=0]: vfs: 'rw' fs: 'ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf' user: '_netdev,nofail', optstr: 'rw,ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail'
2491540: libmount:      CXT: [0x564bdbc0da30]: preparing source path
2491540: libmount:      CXT: [0x564bdbc0da30]: srcpath 'none'
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: canonicalize path none
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: add entry [ 1] (path): none: none
2491540: libmount:      CXT: [0x564bdbc0da30]: final srcpath 'none'
2491540: libmount:      CXT: [0x564bdbc0da30]: preparing target path
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: canonicalize path /CephFS
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: add entry [ 2] (path): /CephFS: /CephFS
2491540: libmount:      CXT: [0x564bdbc0da30]: final target '/CephFS'
2491540: libmount:      CXT: [0x564bdbc0da30]: preparing fstype
2491540: libmount:      CXT: [0x564bdbc0da30]: FS type: fuse.ceph [rc=0]
2491540: libmount:      CXT: [0x564bdbc0da30]: /sbin/mount.fuse.ceph     ... found
2491540: libmount:      CXT: [0x564bdbc0da30]: prepare update
2491540: libmount:      CXT: [0x564bdbc0da30]: utab path initialized to: /run/mount/utab
2491540: libmount:      CXT: [0x564bdbc0da30]: checking for writable tab files
2491540: libmount:    UTILS: utab: /run/mount/utab
2491540: libmount:    UTILS: try write /run/mount/utab dir: (null)
2491540: libmount:    UTILS:  access OK
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: allocate
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: resetting FS [target=(null), flags=0x00000000]
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: FS template:
2491540: libmount:   UPDATE: ------ fs:
source: none
target: /CephFS
fstype: fuse.ceph
optstr: rw,ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail
VFS-optstr: rw
FS-opstr: ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf
user-optstr: _netdev,nofail
2491540: libmount:   UPDATE: prepare utab entry
2491540: libmount:   UPDATE: setting FS root
2491540: libmount:      TAB: lookup fs-root for 'none'
2491540: libmount:      TAB: FS root result: /
2491540: libmount:   UPDATE: utab entry OK
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: ready
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: do mount
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: executing helper /sbin/mount.fuse.ceph
2491540: libmount:      CXT: [0x564bdbc0da30]: mount: generate helper mount options
2491541: libmount:      CXT: [0x564bdbc0da30]: argv[0] = "/sbin/mount.fuse.ceph" 
2491541: libmount:      CXT: [0x564bdbc0da30]: argv[1] = "none" 
2491541: libmount:      CXT: [0x564bdbc0da30]: argv[2] = "/CephFS" 
2491541: libmount:      CXT: [0x564bdbc0da30]: argv[3] = "-o" 
2491541: libmount:      CXT: [0x564bdbc0da30]: argv[4] = "rw,ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail" 
2023-01-09T02:49:14.733+0000 7f5a8651f200 -1 init, newargv = 0x55c50b6c86d0 newargc=17
ceph-fuse[2491552]: starting ceph client
fuse: unknown option `nofail'
ceph-fuse[2491552]: fuse failed to start
2023-01-09T02:49:14.745+0000 7f5a8651f200 -1 fuse_lowlevel_new failed
Mount failed with status code: 33
2491540: libmount:      CXT: [0x564bdbc0da30]: /sbin/mount.fuse.ceph executed [status=0, rc=0]
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: /run/mount/utab: checking for previous update
2491540: libmount:    LOCKS: [0x564bdbc107f0]: alloc: default linkfile=/run/mount/utab~.2491540, lockfile=/run/mount/utab~
2491540: libmount:    LOCKS: [0x564bdbc107f0]: signals: BLOCKED
2491540: libmount:    LOCKS: [0x564bdbc107f0]: flock: ENABLED
2491540: libmount:    LOCKS: [0x564bdbc107f0]: new lock filename: '/run/mount/utab.lock'
2491540: libmount:    LOCKS: [0x564bdbc107f0]: /run/mount/utab.lock: locking
2491540: libmount:      TAB: [0x564bdbc10890]: alloc
2491540: libmount:      TAB: [0x564bdbc10890]: new tab for file: /run/mount/utab
2491540: libmount:      TAB: [0x564bdbc10890]: /run/mount/utab: start parsing [entries=0, filter=not]
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop2 /snap/core20/1738
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop3 /snap/snapd/17883
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop4 /snap/lxd/23991
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop6 /snap/lxd/24061
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop7 /snap/core18/2654
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop0 /snap/core18/2667
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: /dev/loop5 /snap/core20/1778
2491540: libmount:      TAB: [0x564bdbc10890]: add entry: none /CephFS
2491540: libmount:       FS: [0x564bdbc123e0]: free [refcount=0]
2491540: libmount:      TAB: [0x564bdbc10890]: /run/mount/utab: stop parsing (8 entries)
2491540: libmount:      TAB: [0x564bdbc10890]: parsing done [filename=/run/mount/utab, rc=0]
2491540: libmount:    LOCKS: [0x564bdbc107f0]: (2491540) unlocking
2491540: libmount:    LOCKS: [0x564bdbc107f0]: /run/mount/utab.lock: unflocking
2491540: libmount:    LOCKS: [0x564bdbc107f0]: restoring sigmask
2491540: libmount:      TAB: [0x564bdbc10890]: lookup SOURCE: none TARGET: /CephFS
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: /run/mount/utab: found none /CephFS
2491540: libmount:      TAB: [0x564bdbc10890]: reset
2491540: libmount:       FS: [0x564bdbc108f0]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc11a40]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc11ba0]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc11d00]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc11e60]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc11fc0]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc12120]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc12280]: free [refcount=0]
2491540: libmount:      TAB: [0x564bdbc10890]: free [refcount=0]
2491540: libmount:    LOCKS: [0x564bdbc107f0]: free
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: /run/mount/utab: previous update check done [rc=1]
2491540: libmount:      CXT: [0x564bdbc0da30]: don't update: error evaluate or already updated
2491540: libmount:      CXT: [0x564bdbc0da30]: excode: rc=0 message="" 
2491540: libmount:      CXT: [0x564bdbc0da30]: <---- reset [status=1] ---->
2491540: libmount:       FS: [0x564bdbc0db60]: free [refcount=0]
2491540: libmount:      TAB: [0x564bdbc0dc60]: reset
2491540: libmount:       FS: [0x564bdbc0dd00]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc0ef60]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc0f0e0]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc0f2d0]: free [refcount=0]
2491540: libmount:       FS: [0x564bdbc0f470]: free [refcount=0]
2491540: libmount:      TAB: [0x564bdbc0dc60]: free [refcount=0]
2491540: libmount:    CACHE: [0x564bdbc0dcc0]: free [refcount=0]
2491540: libmount:   UPDATE: [0x564bdbc0ec00]: free
2491540: libmount:       FS: [0x564bdbc0ec80]: free [refcount=0]
2491540: libmount:      CXT: [0x564bdbc0da30]: Setting (null) as target namespace
2491540: libmount:      CXT: [0x564bdbc0da30]: <---- free


Related issues 3 (2 open, 1 closed)

Copied to CephFS - Backport #62425: reef: nofail option in fstab not supported (Fix Under Review, Leonid Usov)
Copied to CephFS - Backport #62426: quincy: nofail option in fstab not supported (Fix Under Review, Leonid Usov)
Copied to CephFS - Backport #62427: pacific: nofail option in fstab not supported (Resolved, Leonid Usov)
Actions #1

Updated by Brian Woods over 1 year ago

Results from a non-debug mount:

# mount CephFS/
2023-01-09T02:56:58.517+0000 7f2e7c185200 -1 init, newargv = 0x5615e993a6d0 newargc=17
ceph-fuse[2492343]: starting ceph client
fuse: unknown option `nofail'
ceph-fuse[2492343]: fuse failed to start
2023-01-09T02:56:58.529+0000 7f2e7c185200 -1 fuse_lowlevel_new failed
Mount failed with status code: 33

Actions #2

Updated by Venky Shankar over 1 year ago

FWIW, there was an attempt to fix a similar issue for the kclient (https://github.com/ceph/ceph/pull/26992), but all that change does is ignore the nofail option.

If we want the `nofail` option to be handled, we would want to fix it both in the mount helper (for kclient) and in ceph-fuse.

Actions #3

Updated by Venky Shankar over 1 year ago

  • Category set to Correctness/Safety
  • Assignee set to Dhairya Parmar
  • Target version set to v18.0.0
  • Backport set to pacific,quincy

Dhairya, please take this one. I think we would want to get the semantics of the `nofail` option right for both ceph-fuse and the mount helper.

Actions #4

Updated by Venky Shankar over 1 year ago

  • Status changed from New to Triaged
Actions #5

Updated by Brian Woods about 1 year ago

Interesting update: I noticed that even without nofail, the system will continue to boot even if it can't complete the mount.

Actions #6

Updated by Venky Shankar about 1 year ago

Brian Woods wrote:

Interesting update: I noticed that even without nofail, the system will continue to boot even if it can't complete the mount.

Were you able to check why?

However, we still need this bit to be fixed in ceph.

Actions #7

Updated by Venky Shankar 9 months ago

  • Assignee changed from Dhairya Parmar to Leonid Usov
  • Target version changed from v18.0.0 to v19.0.0
  • Backport changed from pacific,quincy to reef,quincy,pacific
  • Labels (FS) crash added
Actions #8

Updated by Brian Woods 9 months ago

Venky Shankar wrote:

Were you able to check why?

Not sure what you are asking. Check why the ceph mount allowed the system to continue booting? Not sure I can answer that.

The volume was failing to mount because it was in a catch 22 from a poor configuration. That is not a ceph problem though. :)

Actions #9

Updated by Venky Shankar 9 months ago

Brian Woods wrote:

Venky Shankar wrote:

Were you able to check why?

Not sure what you are asking. Check why the ceph mount allowed the system to continue booting? Not sure I can answer that.

Without `nofail`, the system should not continue booting, should it? Or maybe it's that `nofail` just means errors are not reported, so it's expected that the system would continue booting as usual.

Anyhow, the mount helpers should understand the `nofail` option.

Actions #10

Updated by Leonid Usov 9 months ago

Okay, so the research so far shows that

  • The lack of nofail support by fuse has been reported multiple times as an issue across multiple different domains that rely on fuse.
  • Fuse3 has added support for nofail
  • The mount.fuse.ceph mount helper has a precedent of filtering out options that fuse doesn't like:
    def fs_options(opts, ceph_opts):
        # strip out noauto and _netdev options; libfuse doesn't like it
        strip_opts = ['defaults', 'noauto', '_netdev']
        return ','.join(list(set(opts) - set(ceph_opts) - set(strip_opts)))
    

So, firstly, it appears to be a trivial change to add nofail to the list of stripped options above.
However, I wonder whether this approach is correct. Ceph clearly states that it relies on the fuse kernel module and userspace library, and the host system provides those as a prerequisite. If there are any limitations inherent to the version of the fuse lib that a particular host provides, then the host administrator should be responsible for providing a compatible list of settings to be forwarded to fuse.
If Ceph intervenes, it makes the setup not future-compatible or customizable, so IMO the approach is problematic also for the other options stripped so far.

Another point is the user experience. If we strip the option, it will look as if it's supported. It could be that with nofail in particular the expected behavior matches this incidentally, but I'm not sure about that. The point is, I think that we can afford to strip options that would have no effect because the behavior already matches. If it's an option that happens to be unsupported by the particular version of the fuse library, we should probably let it fail so that the user can handle it, either by removing it if they think it's not required or by upgrading the fuse library if they need the functionality.

WDYT?
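
To see why the option reaches libfuse at all, the filtering quoted above can be run on the option string from the debug log (argv[4]). This is only a minimal sketch: `fs_options` is copied from the snippet above, while the way `ceph_opts` is derived here (anything prefixed with `ceph.`) is an assumption made for illustration.

```python
def fs_options(opts, ceph_opts):
    # Same filtering as the helper snippet quoted above.
    strip_opts = ['defaults', 'noauto', '_netdev']
    return ','.join(list(set(opts) - set(ceph_opts) - set(strip_opts)))


# Option string the helper received in the debug log (argv[4]).
opts = "rw,ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail".split(',')

# Assumption: ceph-specific options are the ones prefixed with "ceph.".
ceph_opts = [o for o in opts if o.startswith('ceph.')]

# The set arithmetic does not preserve order, so compare as a set.
remaining = set(fs_options(opts, ceph_opts).split(','))
print(sorted(remaining))  # ['nofail', 'rw']
```

`_netdev` is removed but `nofail` is passed through, which matches the `fuse: unknown option` error in the report.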

Actions #11

Updated by Leonid Usov 9 months ago

https://github.com/libfuse/libfuse/blob/master/ChangeLog.rst#libfuse-322-2018-03-31

libfuse 3.2.2 (2018-03-31)
Added example fuse.conf file.
Added "support" for -o nofail mount option (the option is accepted and ignored).
Various small bugfixes.

Actions #12

Updated by Leonid Usov 9 months ago

A suggestion for Debian, for example, is to apt install fuse3

Actions #13

Updated by Leonid Usov 9 months ago

Here is the commit that introduces the "support" for the option to fuse3: https://github.com/libfuse/libfuse/commit/a83cd72f641671b71b8268b1765e449cae071f3e?diff=unified

I put "support" in quotes because they just ignore the option. This could be an argument for implementing the same on our side. However, I'm still reluctant to do that. I will continue my investigation, and if I find out that the reason they ignore it is that user space file systems are by definition nofail as far as the kernel and the user are concerned, I'll submit a patch that does the same on our side, for versions of fuse < 3.2.2.
UPD: we should be stripping the option unconditionally, even with libfuse 3.2.2+, since the change in libfuse only affects users of their standard mount helper, mount.fuse.

Until then, even if fuse themselves just ignore the option, I would prefer to leave it their responsibility, not ours.

Actions #14

Updated by Leonid Usov 9 months ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 52834

After further research I've arrived at the conclusion that stripping the option in the mount.fuse.ceph helper is the correct approach. Pull request submitted, more info there.
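
A minimal sketch of what stripping `nofail` in the helper could look like. This is not the code from the pull request, which may differ; `STRIP_OPTS` and the order-preserving filter are illustrative choices, and treating options prefixed with `ceph.` as ceph-specific is an assumption.

```python
# Hypothetical strip list: the helper's existing entries plus 'nofail'.
STRIP_OPTS = ('defaults', 'noauto', '_netdev', 'nofail')


def fs_options(opts, ceph_opts, strip_opts=STRIP_OPTS):
    """Return the options to forward to libfuse as a comma-separated string.

    Unlike the set-based original, this variant keeps the fstab order.
    """
    drop = set(ceph_opts) | set(strip_opts)
    return ','.join(o for o in opts if o not in drop)


# Option string from the debug log in the description (argv[4]).
opts = "rw,ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,nofail".split(',')
# Assumption: ceph-specific options are the ones prefixed with "ceph.".
ceph_opts = [o for o in opts if o.startswith('ceph.')]

forwarded = fs_options(opts, ceph_opts)
print(forwarded)  # rw
```

With `nofail` in the strip list, only `rw` is forwarded, so libfuse no longer sees an option it does not recognize.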

Actions #15

Updated by Venky Shankar 8 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #16

Updated by Backport Bot 8 months ago

  • Copied to Backport #62425: reef: nofail option in fstab not supported added
Actions #17

Updated by Backport Bot 8 months ago

  • Copied to Backport #62426: quincy: nofail option in fstab not supported added
Actions #18

Updated by Backport Bot 8 months ago

  • Copied to Backport #62427: pacific: nofail option in fstab not supported added
Actions #19

Updated by Backport Bot 8 months ago

  • Tags set to backport_processed