Bug #47033
Bug #46882 (closed): client: mount abort hangs: [volumes INFO mgr_util] aborting connection from cephfs 'cephfs'
client: inode ref leak
Description
It can be easily reproduced with the following program.
#define _FILE_OFFSET_BITS 64
#include <features.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <unistd.h>
#include <errno.h>
#include <cephfs/libcephfs.h>

int main(int argc, char *argv[])
{
    struct ceph_mount_info *cmount = NULL;
    int n = 64;
    bool parent = true;

    if (argc > 2)
        n = atoi(argv[2]);

    /* Fork n children; each child leaves the loop with its own value of n. */
    while (--n >= 0) {
        pid_t pid = fork();
        if (pid < 0) {
            printf("fork fail %d\n", pid);
            exit(-1);
        }
        if (pid == 0) {
            parent = false;
            break;
        }
    }

    /* The parent only waits for all children to exit. */
    if (parent) {
        pid_t pid;
        int status;
        while ((pid = wait(&status)) > 0);
        return 0;
    }

    /* Each child mounts cephfs on its own and works in its own directory. */
    ceph_create(&cmount, "admin");
    ceph_conf_read_file(cmount, "./ceph.conf");
    ceph_mount(cmount, NULL);

    ceph_chdir(cmount, argv[1]);

    char buf[4096];
    sprintf(buf, "dir%d", n);
    int ret = ceph_mkdir(cmount, buf, 0755);
    if (ret < 0 && ret != -EEXIST) {
        printf("ceph_mkdir fail %d\n", ret);
        return 0;
    }
    ceph_chdir(cmount, buf);

    /*
    struct ceph_dir_result *dirp;
    ret = ceph_opendir(cmount, ".", &dirp);
    if (ret < 0) {
        printf("ceph_opendir fail %d\n", ret);
        return 0;
    }
    while (ceph_readdir(cmount, dirp))
        ;
    ceph_closedir(cmount, dirp);
    */

    /* Create (or reopen) 20000 files and print the number of opens per second. */
    int count = 0;
    time_t start = time(NULL);
    for (int i = 0; i < 20000; ++i) {
        sprintf(buf, "file%d", i);
        int fd = ceph_open(cmount, buf, O_CREAT|O_RDONLY, 0644);
        if (fd < 0) {
            printf("ceph_open fail %d\n", fd);
            exit(-1);
        }
        /*
        ret = ceph_fchmod(cmount, fd, 0666);
        if (ret < 0) {
            printf("ceph_fchmod fail %d\n", ret);
            exit(-1);
        }
        */
        ceph_close(cmount, fd);
        count++;
        if (time(NULL) > start) {
            printf("%d\n", count);
            count = 0;
            start = time(NULL);
        }
    }

    ceph_unmount(cmount);
    return 0;
}
Pre-create testdir at the root of cephfs and change the mode of testdir to 0777.
Repeatedly run './test_create testdir 1' without removing the data left by previous runs.
The last good commit is aef8569b807dc946f7dabc44b20c5d986c44e364. Taking client_lock in Client::put_inode does not work.
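The "repeatedly run" step can be automated. The following is a minimal sketch and not part of Ceph: run_until_failure is a hypothetical helper that re-executes a command until one run fails or a run limit is reached. It uses only POSIX fork, execvp and waitpid.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Ceph): run the command in argv up to
 * max_runs times and stop at the first run that does not exit cleanly.
 * Returns 0 if every run exited with status 0, the 1-based index of the
 * first failing run otherwise, or -1 if fork() or waitpid() failed.
 */
int run_until_failure(char *const argv[], int max_runs)
{
    for (int run = 1; run <= max_runs; ++run) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            execvp(argv[0], argv);
            perror("execvp");   /* only reached if exec failed */
            _exit(127);
        }

        int status = 0;
        if (waitpid(pid, &status, 0) < 0) {
            perror("waitpid");
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            continue;           /* clean run, try again */

        if (WIFSIGNALED(status))
            fprintf(stderr, "run %d killed by signal %d\n", run, WTERMSIG(status));
        else if (WIFEXITED(status))
            fprintf(stderr, "run %d exited with status %d\n", run, WEXITSTATUS(status));
        return run;
    }
    return 0;
}
```

For the steps above it could be called with the argument vector {"./test_create", "testdir", "1", NULL} and a run limit of one's choosing. Note that the parent process of the reproducer returns 0 without inspecting its children, so a crash in a child shows up only in the program output unless the parent is changed to propagate child status.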
Updated by Xiubo Li over 3 years ago
- Status changed from New to In Progress
I will take a look at this. Thanks :-)
Updated by Xiubo Li over 3 years ago
Zheng Yan wrote:
It can be easily reproduced by following program.
[...]
pre-create testdir at root of cephfs, change mode of testdir to 0777.
repeatedly run './test_create testdir 1' (without removing cleanup data)
last good commit is aef8569b807dc946f7dabc44b20c5d986c44e364. taking client_lock in Client::put_inode does not work
BTW, the above commit is invalid, and I also couldn't get any information from "taking client_lock in Client::put_inode does not work".
Updated by Zheng Yan over 3 years ago
The good commit is c8b5f84f49ef74609ba3ea69dea0764ef925ae85.
Updated by Xiubo Li over 3 years ago
With [1] and [2] applied, I ran the test for a very long time and didn't see any errors.
[1] https://github.com/ceph/ceph/pull/36580
[2] https://github.com/ceph/ceph/pull/36553
Updated by Patrick Donnelly over 3 years ago
- Priority changed from Normal to High
- Target version set to v16.0.0
- Source set to Development
- Backport set to octopus,nautilus
- Component(FS) Client added
Updated by Xiubo Li over 3 years ago
- Status changed from In Progress to Duplicate
- Parent task set to #46882
Updated by Xiubo Li over 3 years ago
Xiubo Li wrote:
With [1] and [2] I have run the test for very long time and didn't see any errors.
[1] https://github.com/ceph/ceph/pull/36580
[2] https://github.com/ceph/ceph/pull/36553
I ran this for a whole night and couldn't reproduce it with [1] above applied.
Updated by Zheng Yan over 3 years ago
- Status changed from Duplicate to New
It fails immediately with the following trace.
/home/zhyan/Ceph/ceph/src/client/Client.cc: In function 'void Client::delay_put_requests(bool)' thread 7fffee7c0080 time 2020-08-20T14:05:43.636450+0800
/home/zhyan/Ceph/ceph/src/client/Client.cc: 1922: FAILED ceph_assert(!true)
ceph version 16.0.0-4491-gc7857aef5a (c7857aef5a46841cd201faeb3f6e6589bb1a33dc) pacific (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x124) [0x7fffeef1f76e]
2: (()+0x2518f9) [0x7fffeef1f8f9]
3: (()+0x45ef0) [0x7ffff7e89ef0]
4: (()+0xac143) [0x7ffff7ef0143]
5: (()+0xacb62) [0x7ffff7ef0b62]
6: (ceph_mount()+0x88) [0x7ffff7e7a0b8]
7: ./test_create() [0x4012fb]
8: (__libc_start_main()+0xf2) [0x7ffff7950042]
9: ./test_create() [0x40114e]
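The trace goes through ceph_mount(), which the reproducer calls only in forked children, while its parent discards the status returned by wait(). A minimal sketch of how the parent could report such crashes follows. count_failed_children is a hypothetical helper, not part of the original program or of Ceph.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical replacement for a bare wait() loop.
 * Reaps every child of the calling process and returns how many of them
 * were killed by a signal or exited with a non-zero status.
 */
int count_failed_children(void)
{
    int failed = 0;
    int status;
    pid_t pid;

    for (;;) {
        pid = wait(&status);
        if (pid < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            break;          /* no children left (ECHILD) or other error */
        }
        if (WIFSIGNALED(status)) {
            fprintf(stderr, "child %d killed by signal %d\n",
                    (int)pid, WTERMSIG(status));
            failed++;
        } else if (WIFEXITED(status) && WEXITSTATUS(status) != 0) {
            fprintf(stderr, "child %d exited with status %d\n",
                    (int)pid, WEXITSTATUS(status));
            failed++;
        }
    }
    return failed;
}
```

With this, the parent branch of the reproducer could end with `return count_failed_children() ? 1 : 0;` so that a failing child changes the exit status of the whole run.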