Bug #46374

open

ceph-fuse blocks forever, fails to start, emits no errors

Added by John Mulligan almost 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

For testing purposes we run ceph-fuse in a container: a script (code here [1]) runs the ceph-fuse command to mount CephFS. Up to and including Ceph v15.2.3 this worked without issue.
However, with v15.2.4 the ceph-fuse command blocks forever, with no output or other indication of a problem.

I've tried passing '-o debug' as well as '-d' to the command line to enable debugging, but nothing is printed to stdout/stderr.

This strikes me as a regression in behavior, because the script has functioned correctly on Ceph Luminous through Octopus up until v15.2.4. However, if the script is doing something wrong, I'm happy to work with the ceph-fuse experts to improve it, as long as the script can function on all of the Ceph versions listed above.

Since I get no output, I did try running strace on the command. Here's a snippet of output:

994150 openat(AT_FDCWD, "/usr/lib64/ceph/liburcu-common.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1994150 openat(AT_FDCWD, "/lib64/liburcu-common.so.6", O_RDONLY|O_CLOEXEC) = 3
1994150 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\23\0\0\0\0\0\0"..., 832) = 832
1994150 lseek(3, 13232, SEEK_SET)       = 13232
1994150 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
1994150 fstat(3, {st_mode=S_IFREG|0755, st_size=21592, ...}) = 0
1994150 lseek(3, 13232, SEEK_SET)       = 13232
1994150 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
1994150 mmap(NULL, 2113672, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4ec2766000
1994150 mprotect(0x7f4ec276a000, 2093056, PROT_NONE) = 0
1994150 mmap(0x7f4ec2969000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f4ec2969000
1994150 mmap(0x7f4ec296a000, 136, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4ec296a000
1994150 close(3)                        = 0
1994150 mprotect(0x7f4ec2969000, 4096, PROT_READ) = 0
1994150 mprotect(0x7f4ec2b73000, 4096, PROT_READ) = 0
1994150 mprotect(0x7f4ec2d7b000, 4096, PROT_READ) = 0
1994150 mprotect(0x7f4ec2f86000, 4096, PROT_READ) = 0
1994150 membarrier(MEMBARRIER_CMD_QUERY, 0) = -1 EPERM (Operation not permitted)
1994150 munmap(0x7f4ecf7b0000, 26382)   = 0
1994150 futex(0x7f4ec432b040, FUTEX_WAKE_PRIVATE, 2147483647) = 0
1994150 uname({sysname="Linux", nodename="popcorn", ...}) = 0
1994150 mmap(NULL, 10883072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4ec1d05000
1994150 brk(NULL)                       = 0x56549e154000
1994150 brk(0x56549e176000)             = 0x56549e176000
1994150 getrandom("\xda", 1, 0)         = 1
1994150 getpid()                        = 778
1994150 prctl(PR_GET_NAME, "ceph-fuse") = 0
1994150 futex(0x56549e0f26e8, FUTEX_WAIT_PRIVATE, 2, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
1994150 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
1994150 +++ killed by SIGINT +++
1993556 <... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGINT}], WSTOPPED|WCONTINUED, NULL) = 778
1993556 rt_sigprocma

Since it was not obviously related to networking or file access issues, I did not glean much from the above but perhaps it can serve as a clue for others.
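
One way to summarize a long trace like this is to look at the last system call each PID entered before it stopped. The following is a minimal sketch, assuming the trace uses PID-prefixed lines as in the snippet above; the function name `last_syscall_per_pid` and the shortened `sample` text are illustrative only. For the excerpt shown, it points at the `futex(..., FUTEX_WAIT_PRIVATE, ...)` call that was pending when SIGINT arrived, which hints at a lock or condition wait rather than network or file I/O, though it does not say which lock.

```python
import re

# Matches "PID syscall(" at the start of a PID-prefixed strace line.
# Signal lines ("--- SIGINT ...") and exit lines ("+++ killed ...") do not match.
LINE_RE = re.compile(r"^(\d+)\s+([a-z_0-9]+)\(")


def last_syscall_per_pid(trace_text):
    """Return {pid: (syscall_name, full_line)} for the final syscall seen per PID."""
    last = {}
    for raw in trace_text.splitlines():
        line = raw.strip()
        match = LINE_RE.match(line)
        if match:
            last[match.group(1)] = (match.group(2), line)
    return last


# A few lines copied from the end of the trace in this report.
sample = """\
1994150 getpid()                        = 778
1994150 prctl(PR_GET_NAME, "ceph-fuse") = 0
1994150 futex(0x56549e0f26e8, FUTEX_WAIT_PRIVATE, 2, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
1994150 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
1994150 +++ killed by SIGINT +++
"""

if __name__ == "__main__":
    for pid, (name, line) in last_syscall_per_pid(sample).items():
        print(pid, name)
```

Feeding it the complete trace file instead of `sample` would give the same kind of summary for every traced process.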

This issue occurs on my laptop running Fedora 31 + podman, as well as on our CI running Ubuntu with Docker.

Any guidance on getting additional debugging info/state would be appreciated. This is currently blocking our CI.
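
As a stopgap while the cause is unknown, a CI job can at least fail fast instead of hanging. This is a minimal sketch, assuming the mount command is expected to return on its own once the mount is established; the helper name `run_with_timeout`, the 60-second limit, and the ceph-fuse arguments in the usage comment are placeholders, not values taken from the script in [1].

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, timed_out).

    subprocess.run kills the child and raises TimeoutExpired if the
    command has not finished within timeout_s seconds.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        return None, True


# Hypothetical usage in a CI step (arguments are placeholders):
#   code, timed_out = run_with_timeout(["ceph-fuse", "-d", "/mnt/cephfs"], 60)
#   if timed_out:
#       raise SystemExit("ceph-fuse did not return within 60s")
```

This does not fix the hang; it only turns a silent block into a visible failure that can be reported with whatever logs were captured.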

[1] - https://github.com/ceph/go-ceph/blob/d4440eb8c2966508ca4bf41240e6d3aefd144e97/micro-osd.sh


Related issues 1 (1 open, 0 closed)

Related to CephFS - Bug #47964: ceph-fuse RPM package must same-version ceph rpm (New)
