Bug #23421
closed
ceph-fuse: stop ceph-fuse if no root permissions?
0%
Description
I think it would be a good idea to prevent ceph-fuse from proceeding if there are no appropriate permissions.
Could someone validate the following steps:
Create mountpoint with non-root ownership
[jcollin@stratocaster build]$ ll /mnt/
total 0
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 09:26 cephfs
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 08:52 myfs
Mount without sudo
[jcollin@stratocaster build]$ ./bin/ceph-fuse -m localhost:40548 /mnt/cephfs/
2018-03-20 09:40:17.649 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.674 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.696 7f4d07710cc0 -1 WARNING: all dangerous and experimental features are enabled.
2018-03-20 09:40:17.697 7f4d07710cc0 -1 init, newargv = 0x560b647003c0 newargc=7
ceph-fuse[7058]: starting ceph client
ceph-fuse[7058]: starting fuse
df and ls /mnt/ hang for around 15 minutes
[jcollin@stratocaster build]$ df
^C
[jcollin@stratocaster build]$ ls /mnt/
^C
Later on, I found that ceph-fuse itself was hung.
[jcollin@stratocaster build]$ ps -A | grep ceph-fuse
7058 pts/0 00:00:00 ceph-fuse
So kill ceph-fuse
[jcollin@stratocaster build]$ sudo kill -9 7058
Now everything seems back to normal, but look at the output of df and ll:
[jcollin@stratocaster build]$ df
df: /mnt/cephfs: Transport endpoint is not connected
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 9964244 0 9964244 0% /dev
tmpfs 9978944 0 9978944 0% /dev/shm
tmpfs 9978944 1524 9977420 1% /run
tmpfs 9978944 0 9978944 0% /sys/fs/cgroup
/dev/mapper/fedora-root 52403200 19414508 32988692 38% /
tmpfs 9978944 4952 9973992 1% /tmp
/dev/sda1 999320 201264 729244 22% /boot
/dev/mapper/fedora-home 436358992 87335312 349023680 21% /home
tmpfs 1995788 16 1995772 1% /run/user/1000
[jcollin@stratocaster build]$ ll /mnt/
ls: cannot access '/mnt/cephfs': Transport endpoint is not connected
total 0
d?????????? ? ? ? ? ? cephfs
drwxr-xr-x. 2 jcollin jcollin 6 Mar 20 08:52 myfs
[root@stratocaster build]# rmdir /mnt/cephfs
rmdir: failed to remove '/mnt/cephfs': Device or resource busy
[root@stratocaster build]# rm -Rf /mnt/cephfs
rm: cannot remove '/mnt/cephfs': Is a directory
A reboot is required to return to normal. The problem does not happen if ceph-fuse is mounted with 'sudo' in the step above.
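The "Transport endpoint is not connected" state shown above can be detected from a script. The following is a minimal sketch, assuming a Linux host with /proc/mounts; the function name `is_stale_fuse_mount` is hypothetical and not part of Ceph. It only detects the state. Whether unmounting avoids the reboot in the kill -9 case was not verified in this ticket.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): succeed if DIR is still listed as a
# FUSE mount in /proc/mounts but can no longer be stat'ed, which is the
# "Transport endpoint is not connected" state.
is_stale_fuse_mount() {
    dir=${1%/}   # drop one trailing slash so the path matches /proc/mounts
    # Field 2 of /proc/mounts is the mount point, field 3 the filesystem type
    # (ceph-fuse mounts show up with a type starting with "fuse").
    awk -v d="$dir" '$2 == d && $3 ~ /^fuse/ { found = 1 } END { exit !found }' \
        /proc/mounts 2>/dev/null || return 1
    # Listed as mounted; it is stale only if stat now fails.
    ! stat "$dir" >/dev/null 2>&1
}
```

A possible use, with the mountpoint from this report: `is_stale_fuse_mount /mnt/cephfs && sudo fusermount -u /mnt/cephfs`.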
Updated by Patrick Donnelly about 6 years ago
- Subject changed from Stop ceph-fuse if no root permissions? to ceph-fuse: stop ceph-fuse if no root permissions?
- Status changed from New to Need More Info
Jos, can you get more detailed debug logs when this happens? It is probably not related to sudo permissions.
Updated by Jos Collin about 6 years ago
Patrick Donnelly wrote:
Jos, can you get more detailed debug logs when this happens? It is probably not related to sudo permissions.
d?????????? ? ? ? ? ? cephfs
We can ignore this, as this happened because ceph-fuse was killed. Not related to the hanging issue.
Updated by Patrick Donnelly about 6 years ago
- Status changed from Need More Info to Closed
Updated by Patrick Donnelly about 6 years ago
- Category set to Correctness/Safety
- Status changed from Closed to Need More Info
- Assignee set to Jos Collin
- Target version set to v13.0.0
- Source set to Development
- Component(FS) ceph-fuse added
Jos, please get the client logs so we can diagnose.
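One way to collect such client logs is to run ceph-fuse in the foreground with client and messenger debugging raised. The sketch below only prints the command line so it can be reviewed before running. The helper name `debug_mount_cmd` is hypothetical, and the debug levels (20 and 1) are commonly used values, not ones requested in this ticket.

```shell
#!/bin/sh
# Hypothetical helper: print a ceph-fuse command line with verbose logging.
#   $1: monitor address, $2: mountpoint
# -d keeps ceph-fuse in the foreground and logs to the console.
debug_mount_cmd() {
    printf 'ceph-fuse -d --debug-client=20 --debug-ms=1 -m %s %s\n' "$1" "$2"
}

# Example with the monitor address and mountpoint from the original report.
debug_mount_cmd localhost:40548 /mnt/cephfs/
```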
Updated by Jos Collin almost 6 years ago
The hang doesn't exist in the latest code.
The following is my latest finding:
jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
total 8
drwxr-xr-x 2 jcollin jcollin 4096 Apr 17 09:44 cephfs
drwxr-xr-x 2 root root 4096 Apr 17 13:48 myfs
jcollin@stratocaster:~/workspace/cephtest/build$ ./bin/ceph-fuse -m localhost:40945 /mnt/cephfs/
2018-04-26 15:27:53.044 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.052 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.056 7f0a37d38c40 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-26 15:27:53.056 7f0a37d38c40 -1 init, newargv = 0x555d93198e20 newargc=7
ceph-fuse[4224]: starting ceph client
ceph-fuse[4224]: starting fuse
Neither ceph-fuse nor ll nor df hangs now. But the ? appears again.
jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
ls: cannot access '/mnt/cephfs': Transport endpoint is not connected
total 4
d????????? ? ? ? ? ? cephfs
drwxr-xr-x 2 root root 4096 Apr 17 13:48 myfs
After unmounting, everything is back to normal.
jcollin@stratocaster:~/workspace/cephtest/build$ sudo fusermount -u /mnt/cephfs
jcollin@stratocaster:~/workspace/cephtest/build$ ll /mnt/
total 8
drwxr-xr-x 2 jcollin jcollin 4096 Apr 17 09:44 cephfs
drwxr-xr-x 2 root root 4096 Apr 17 13:48 myfs
Updated by Patrick Donnelly almost 6 years ago
Jos Collin wrote:
The hang doesn't exist in the latest code.
The following is my latest finding:
[...]
[...]
Neither ceph-fuse nor ll nor df hangs now. But the ? appears again.
[...]
That would indicate ceph-fuse may have crashed. Please check the logs.
Updated by Patrick Donnelly almost 6 years ago
- Target version changed from v13.0.0 to v14.0.0
Updated by Patrick Donnelly about 5 years ago
- Target version changed from v14.0.0 to v15.0.0
Updated by Jos Collin almost 4 years ago
- File client.admin.269603.log added
Patrick Donnelly wrote:
That would indicate ceph-fuse may have crashed. Please check the logs.
2020-06-19T14:49:24.234+0530 7fba997fa700 -1 /home/jcollin/workspace/ceph/src/client/Client.cc: In function 'int Client::_do_remount(bool)' thread 7fba997fa700 time 2020-06-19T14:49:24.230486+0530
/home/jcollin/workspace/ceph/src/client/Client.cc: 4245: ceph_abort_msg("abort() called")
ceph version 16.0.0-2623-gfe183904a2c (fe183904a2caece314d8e167b4a20a455535197e) pacific (dev)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf6) [0x7fbad790b646]
2: (Client::_do_remount(bool)+0x564) [0x55fb8e890fd2]
3: (Client::test_dentry_handling(bool)+0x28f) [0x55fb8e8c8803]
4: (()+0x289877) [0x55fb8e83a877]
5: (Thread::entry_wrapper()+0x78) [0x7fbad78a6f1e]
6: (Thread::_entry_func(void*)+0x18) [0x7fbad78a6e9c]
7: (()+0x94e2) [0x7fbad5f164e2]
8: (clone()+0x43) [0x7fbad5a6a643]
Updated by Jos Collin almost 4 years ago
- Status changed from Need More Info to New
Updated by Patrick Donnelly almost 4 years ago
The issue is right there in the log?
2020-06-19T14:49:24.230+0530 7fba997fa700 -1 client.4311 failed to remount for kernel dentry trimming; quitting!
If ceph-fuse cannot remount to force dentry trimming, then it may be unable to release caps. See also config client_die_on_failed_dentry_invalidate.
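To check whether a given ceph-fuse client hit this path, its log can be searched for the message quoted above. A minimal sketch; the function name `remount_failed` and the log path in the usage comment are illustrative assumptions, not taken from this ticket.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given client log contains the remount
# failure message quoted in this ticket.
remount_failed() {
    # $1: path to a ceph-fuse client log file
    # -F matches the text literally, -q suppresses output.
    grep -qF 'failed to remount for kernel dentry trimming' "$1"
}

# Usage (the log path is an assumption; adjust to your log location):
#   remount_failed /var/log/ceph/ceph-client.admin.log && echo "remount failure found"
```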