Bug #46269

ceph-fuse: ceph-fuse process is terminated by the logrotate task and, more seriously, an Uninterruptible Sleep process is left behind

Added by hongsong wu 3 months ago. Updated 1 day ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: nautilus, octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature:

Description

1. Steps to reproduce:

(1) Step 1:
Open terminal_1 and prepare the command "killall -q -1 ceph-fuse", taken from /etc/logrotate.d/ceph-common.

(2) Step 2:
Open terminal_2 and run the command "ceph-fuse -m monip:port mountpoint".

(3) Step 3:
Switch to terminal_1 immediately and run "killall -q -1 ceph-fuse" repeatedly.
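
The "keep running" part of step 3 can be scripted. The following is a minimal Python sketch, not part of the original report: the helper name send_repeatedly and its parameters are hypothetical, and only the killall command line itself comes from step 1.

```python
import subprocess
import time


def send_repeatedly(cmd, duration_s, interval_s=0.01):
    """Run cmd over and over for duration_s seconds; return how many times it ran.

    Hypothetical helper for step 3. A non-zero exit status (for example,
    killall finding no matching process yet) is ignored on purpose.
    """
    deadline = time.monotonic() + duration_s
    runs = 0
    while time.monotonic() < deadline:
        subprocess.run(cmd, check=False)
        runs += 1
        time.sleep(interval_s)
    return runs
```

In terminal_1, calling send_repeatedly(["killall", "-q", "-1", "ceph-fuse"], duration_s=10.0) right after starting ceph-fuse in terminal_2 would approximate step 3; the 10-second duration is an arbitrary choice.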

2. Observed result:
The ceph-fuse process exits abnormally and a mount process is left in Uninterruptible Sleep (state D), as shown below:

root@***:~#  ps -aux  | grep mount
root      271493  0.0  0.0  26484  1252 ?        D    Jun28   0:00 mount -i -o remount mountpoint

The ceph-fuse log from the abnormal exit is as follows:

7fe99f769680  0 pidfile_write: ignore empty --pid-file
7fe99f769680 -1 init, newargv = 0x5583ca2243a0 newargc=9
7fe9952d3700  1 client.63156060 handle_mds_map epoch 29934
7fe999adc700 -1 received  signal: Hangup pid: 1062451 from  PID: 1062486 task name: killall -q -1 ceph-fuse  UID: 0

7fe9912cb700  1 client.63156060 using remount_cb
7fe990aca700 -1 fuse_ll: do_init: safe_write failed with error (32) Broken pipe
7fe990aca700 -1 fuse_ll: do_init: safe_write failed with error (32) Broken pipe
7fe990aca700 -1 *** Caught signal (Aborted) **
  in thread 7fe990aca700 thread_name:ceph-fuse

 1: (()+0x6ddf14) [0x5583c1086f14]
 2: (()+0x110e0) [0x7fe99d7fb0e0]
 3: (gsignal()+0xcf) [0x7fe99c5affff]
 4: (abort()+0x16a) [0x7fe99c5b142a]
 5: (()+0x1ed03b) [0x5583c0b9603b]
 6: (()+0x1536c) [0x7fe99f0c736c]
 7: (()+0x165a1) [0x7fe99f0c85a1]
 8: (()+0x12d48) [0x7fe99f0c4d48]
 9: (()+0x74a4) [0x7fe99d7f14a4]
 10: (clone()+0x3f) [0x7fe99c665d0f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

3. Root cause
The parent ceph-fuse process does not register a SIGHUP signal handler.

In more detail:
If logrotate runs "killall -q -1 ceph-fuse .." while ceph-fuse is starting, before safe_read_exact in the parent ceph-fuse process has completed, the following chain of failures occurs:
(1) The parent ceph-fuse process exits abnormally because it does not handle SIGHUP.
(2) The child ceph-fuse process then aborts because its safe_write call in do_init fails (the "Broken pipe" error in the log above).
(3) The "mount -i -o remount" command issued from remount_cb is left as an Uninterruptible Sleep process.

4. Solution

Register a SIGHUP signal handler in the parent ceph-fuse process.
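
The failure chain and the effect of the fix can be illustrated outside Ceph. The following is a minimal Python sketch under stated assumptions, not the actual ceph-fuse code or the actual patch. A forked process stands in for the parent ceph-fuse process blocked in safe_read_exact, and the calling process stands in for the child ceph-fuse process that later calls safe_write. The name simulate_startup and the extra "ready" pipe are hypothetical and exist only to make the demonstration deterministic. The sketch relies on Python's default of ignoring SIGPIPE, so a write to a pipe with no reader raises BrokenPipeError (errno 32) instead of killing the writer.

```python
import os
import signal


def simulate_startup(parent_handles_sighup):
    """Return (fate of the reader, result of the status write)."""
    status_r, status_w = os.pipe()  # status pipe: writer reports to reader
    ready_r, ready_w = os.pipe()    # demo-only pipe used for sequencing

    pid = os.fork()
    if pid == 0:
        # Stand-in for the parent ceph-fuse process (the reader).
        try:
            os.close(status_w)
            os.close(ready_r)
            if parent_handles_sighup:
                # The fix: a handler is registered, so SIGHUP is not fatal.
                signal.signal(signal.SIGHUP, lambda signum, frame: None)
            else:
                # The bug: default disposition, so SIGHUP terminates us.
                signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.write(ready_w, b"R")      # signal setup is finished
            data = os.read(status_r, 1)  # blocks, like safe_read_exact
            os._exit(0 if data == b"\0" else 1)
        finally:
            os._exit(2)                  # never fall back into the caller

    # Stand-in for the child ceph-fuse process, plus the logrotate signal.
    os.close(status_r)
    os.close(ready_w)
    os.read(ready_r, 1)                  # wait for the reader's signal setup
    os.close(ready_r)
    os.kill(pid, signal.SIGHUP)          # what "killall -q -1 ceph-fuse" sends

    wait_status = None
    if not parent_handles_sighup:
        # Make sure the reader is gone before the status write is attempted.
        _, wait_status = os.waitpid(pid, 0)

    try:
        os.write(status_w, b"\0")        # like safe_write in do_init
        write_result = "status delivered"
    except BrokenPipeError:
        write_result = "Broken pipe"
    finally:
        os.close(status_w)

    if wait_status is None:
        _, wait_status = os.waitpid(pid, 0)

    if os.WIFSIGNALED(wait_status):
        fate = "killed by signal %d" % os.WTERMSIG(wait_status)
    else:
        fate = "exited with %d" % os.WEXITSTATUS(wait_status)
    return fate, write_result
```

Calling simulate_startup(False) models the reported bug: the reader is killed by SIGHUP and the later status write fails with a broken pipe. Calling simulate_startup(True) models the proposed solution: the reader survives the signal and receives the status byte.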


Related issues

Copied to fs - Backport #46591: octopus: ceph-fuse: ceph-fuse process is terminated by the logrotate task and what is more serious is that one Uninterruptible Sleep process will be produced (Resolved)
Copied to fs - Backport #46592: nautilus: ceph-fuse: ceph-fuse process is terminated by the logrotate task and what is more serious is that one Uninterruptible Sleep process will be produced (Resolved)

History

#1 Updated by Xiubo Li 3 months ago

  • Assignee set to Xiubo Li

#2 Updated by Patrick Donnelly 3 months ago

  • Status changed from New to Fix Under Review
  • Assignee changed from Xiubo Li to hongsong wu
  • Pull request ID set to 35844
  • Affected Versions deleted (v16.0.0)

#3 Updated by Kefu Chai 2 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to nautilus, octopus

#4 Updated by Nathan Cutler 2 months ago

  • Copied to Backport #46591: octopus: ceph-fuse: ceph-fuse process is terminated by the logrotate task and what is more serious is that one Uninterruptible Sleep process will be produced added

#5 Updated by Nathan Cutler 2 months ago

  • Copied to Backport #46592: nautilus: ceph-fuse: ceph-fuse process is terminated by the logrotate task and what is more serious is that one Uninterruptible Sleep process will be produced added

#6 Updated by Nathan Cutler 1 day ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".