Bug #23421

closed

ceph-fuse: stop ceph-fuse if no root permissions?

Added by Jos Collin about 6 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I think it would be a good idea to prevent ceph-fuse from proceeding if it does not have the appropriate permissions.
Could someone validate the following?

Create mountpoint with non-root ownership

[jcollin@stratocaster build]$ ll /mnt/
total 0
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 09:26 cephfs
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 08:52 myfs

Mount without sudo

[jcollin@stratocaster build]$ ./bin/ceph-fuse -m localhost:40548 /mnt/cephfs/
2018-03-20 09:40:17.649 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.674 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.696 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.697 7f4d07710cc0 -1 init, newargv = 0x560b647003c0 newargc=7
ceph-fuse[7058]: starting ceph client
ceph-fuse[7058]: starting fuse

df and ls /mnt/ hang for around 15 minutes

[jcollin@stratocaster build]$ df
^C
[jcollin@stratocaster build]$ ls /mnt/
^C

Later on, I found that ceph-fuse itself hangs.

[jcollin@stratocaster build]$ ps -A | grep ceph-fuse
 7058 pts/0    00:00:00 ceph-fuse

So kill ceph-fuse

[jcollin@stratocaster build]$ sudo kill -9 7058

Now everything seems to be back to normal. But see the output of df and ll now.

[jcollin@stratocaster build]$ df
df: /mnt/cephfs: Transport endpoint is not connected
Filesystem              1K-blocks     Used Available Use% Mounted on
devtmpfs                  9964244        0   9964244   0% /dev
tmpfs                     9978944        0   9978944   0% /dev/shm
tmpfs                     9978944     1524   9977420   1% /run
tmpfs                     9978944        0   9978944   0% /sys/fs/cgroup
/dev/mapper/fedora-root  52403200 19414508  32988692  38% /
tmpfs                     9978944     4952   9973992   1% /tmp
/dev/sda1                  999320   201264    729244  22% /boot
/dev/mapper/fedora-home 436358992 87335312 349023680  21% /home
tmpfs                     1995788       16   1995772   1% /run/user/1000

[jcollin@stratocaster build]$ ll /mnt/
ls: cannot access '/mnt/cephfs': Transport endpoint is not connected
total 0
d?????????? ? ?       ?       ?            ? cephfs
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 08:52 myfs

[root@stratocaster build]# rmdir /mnt/cephfs
rmdir: failed to remove '/mnt/cephfs': Device or resource busy
[root@stratocaster build]# rm -Rf /mnt/cephfs 
rm: cannot remove '/mnt/cephfs': Is a directory

A reboot is required to get back to normal. This problem doesn't happen if we mount with 'sudo' in the step above.
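
The proposal above could be sketched as a small wrapper that refuses to start ceph-fuse for a non-root user. This is only an illustration under assumptions: preflight_uid, safe_ceph_fuse and the CEPH_FUSE variable are hypothetical names and not part of Ceph, and later comments on this ticket suggest the root cause may not be sudo permissions as such.

```shell
#!/bin/sh
# Hypothetical pre-flight check; NOT part of ceph-fuse.

# preflight_uid UID: succeeds only when UID is 0 (root).
preflight_uid() {
    if [ "$1" -ne 0 ]; then
        echo "refusing to start ceph-fuse: uid $1 is not root" >&2
        return 1
    fi
    return 0
}

# safe_ceph_fuse ARGS...: check the current user, then hand over to ceph-fuse.
# CEPH_FUSE is an assumed override for the binary path (e.g. ./bin/ceph-fuse).
safe_ceph_fuse() {
    preflight_uid "$(id -u)" || return 1
    "${CEPH_FUSE:-ceph-fuse}" "$@"
}

# Example (not executed here):
#   CEPH_FUSE=./bin/ceph-fuse safe_ceph_fuse -m localhost:40548 /mnt/cephfs/
```

A check like this only covers the "no sudo" case from the steps above; it would not detect other reasons for a hang.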


Files

client.admin.269603.log (31 KB) crash Jos Collin, 06/19/2020 09:26 AM
Actions #1

Updated by Patrick Donnelly about 6 years ago

  • Subject changed from Stop ceph-fuse if no root permissions? to ceph-fuse: stop ceph-fuse if no root permissions?
  • Status changed from New to Need More Info

Jos, can you get more detailed debug logs when this happens? It is probably not related to sudo permissions.

Actions #2

Updated by Jos Collin about 6 years ago

Patrick Donnelly wrote:

Jos, can you get more detailed debug logs when this happens? It is probably not related to sudo permissions.

d?????????? ? ? ? ? ? cephfs

We can ignore this; it happened because ceph-fuse was killed and is not related to the hang.

Actions #3

Updated by Patrick Donnelly about 6 years ago

  • Status changed from Need More Info to Closed
Actions #4

Updated by Patrick Donnelly about 6 years ago

  • Category set to Correctness/Safety
  • Status changed from Closed to Need More Info
  • Assignee set to Jos Collin
  • Target version set to v13.0.0
  • Source set to Development
  • Component(FS) ceph-fuse added

Jos, please get the client logs so we can diagnose.

Actions #5

Updated by Jos Collin almost 6 years ago

The hang doesn't exist in the latest code.

The following is my latest finding:

jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
total 8
drwxr-xr-x 2 jcollin jcollin 4096 Apr 17 09:44 cephfs
drwxr-xr-x 2 root    root    4096 Apr 17 13:48 myfs
jcollin@stratocaster:~/workspace/cephtest/build$ ./bin/ceph-fuse -m localhost:40945 /mnt/cephfs/
2018-04-26 15:27:53.044 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.052 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.056 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.056 7f0a37d38c40 -1 init, newargv = 0x555d93198e20 newargc=7
ceph-fuse[4224]: starting ceph client
ceph-fuse[4224]: starting fuse

Neither ceph-fuse nor ll nor df hangs now. But the ? appears again.

jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
ls: cannot access '/mnt/cephfs': Transport endpoint is not connected
total 4
d????????? ? ?    ?       ?            ? cephfs
drwxr-xr-x 2 root root 4096 Apr 17 13:48 myfs

Unmounting brings everything back to normal.

jcollin@stratocaster:~/workspace/cephtest/build$ sudo fusermount -u /mnt/cephfs 
jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
total 8
drwxr-xr-x 2 jcollin jcollin 4096 Apr 17 09:44 cephfs
drwxr-xr-x 2 root    root    4096 Apr 17 13:48 myfs
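
The state shown above (mountpoint still present, but "Transport endpoint is not connected") could be detected with a sketch like the following before unmounting. The function names mount_listed and is_stale_mount are hypothetical, and the check assumes a Linux system where /proc/mounts lists mountpoints without a trailing slash.

```shell
#!/bin/sh
# Hypothetical helpers; not part of Ceph or FUSE.

# mount_listed PATH [MOUNTS_FILE]: succeeds if PATH appears as a mountpoint
# (second column) in MOUNTS_FILE, which defaults to /proc/mounts.
mount_listed() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# is_stale_mount PATH [MOUNTS_FILE]: succeeds if PATH is still listed as
# mounted but stat on it fails, as it does once the FUSE daemon is gone.
is_stale_mount() {
    mount_listed "$1" "${2:-/proc/mounts}" && ! stat "$1" >/dev/null 2>&1
}

# Example (not executed here), using the unmount command from this comment:
#   if is_stale_mount /mnt/cephfs; then sudo fusermount -u /mnt/cephfs; fi
```

Pass the path without a trailing slash (/mnt/cephfs, not /mnt/cephfs/), since the comparison against the mounts table is exact.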
Actions #6

Updated by Patrick Donnelly almost 6 years ago

Jos Collin wrote:

The hang doesn't exist in the latest code.

The following is my latest finding:

[...]

[...]

Neither ceph-fuse nor ll nor df hangs now. But the ? appears again.

[...]

That would indicate ceph-fuse may have crashed. Please check the logs.

Actions #7

Updated by Patrick Donnelly almost 6 years ago

  • Target version changed from v13.0.0 to v14.0.0
Actions #8

Updated by Patrick Donnelly about 5 years ago

  • Target version changed from v14.0.0 to v15.0.0
Actions #9

Updated by Patrick Donnelly about 4 years ago

  • Target version deleted (v15.0.0)
Actions #10

Updated by Jos Collin almost 4 years ago

Patrick Donnelly wrote:

That would indicate ceph-fuse may have crashed. Please check the logs.

2020-06-19T14:49:24.234+0530 7fba997fa700 -1 /home/jcollin/workspace/ceph/src/client/Client.cc: In function 'int Client::_do_remount(bool)' thread 7fba997fa700 time 2020-06-19T14:49:24.230486+0530
/home/jcollin/workspace/ceph/src/client/Client.cc: 4245: ceph_abort_msg("abort() called")

ceph version 16.0.0-2623-gfe183904a2c (fe183904a2caece314d8e167b4a20a455535197e) pacific (dev)                                                                                               
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf6) [0x7fbad790b646]
2: (Client::_do_remount(bool)+0x564) [0x55fb8e890fd2]
3: (Client::test_dentry_handling(bool)+0x28f) [0x55fb8e8c8803]
4: (()+0x289877) [0x55fb8e83a877]
5: (Thread::entry_wrapper()+0x78) [0x7fbad78a6f1e]
6: (Thread::_entry_func(void*)+0x18) [0x7fbad78a6e9c]
7: (()+0x94e2) [0x7fbad5f164e2]
8: (clone()+0x43) [0x7fbad5a6a643]
Actions #11

Updated by Jos Collin almost 4 years ago

  • Status changed from Need More Info to New
Actions #12

Updated by Patrick Donnelly almost 4 years ago

The issue is right there in the log?

2020-06-19T14:49:24.230+0530 7fba997fa700 -1 client.4311 failed to remount for kernel dentry trimming; quitting!

If ceph-fuse cannot remount to force dentry trimming, then it may be unable to release caps. See also config client_die_on_failed_dentry_invalidate.
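
To check a client log for this failure, something like the sketch below could be used. The function names are hypothetical; the search string is the message quoted above, and the log path in the example is the attachment on this ticket.

```shell
#!/bin/sh
# Hypothetical helpers; not part of Ceph.

# remount_failed LOGFILE: succeeds if the log contains the remount failure
# message that precedes the abort in Client::_do_remount.
remount_failed() {
    grep -q 'failed to remount for kernel dentry trimming' "$1"
}

# show_remount_failure LOGFILE: print the matching lines with line numbers.
show_remount_failure() {
    grep -n 'failed to remount for kernel dentry trimming' "$1"
}

# Example (not executed here):
#   remount_failed client.admin.269603.log && show_remount_failure client.admin.269603.log
```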

Actions #13

Updated by Jos Collin almost 4 years ago

  • Status changed from New to Closed