ceph-fuse: the ceph-fuse process is terminated by the logrotate task and, more seriously, a process is left in uninterruptible sleep
1. Steps to reproduce:
(1) step 1:
Open terminal_1 and prepare the command "killall -q -1 ceph-fuse", taken from /etc/logrotate.d/ceph-common.
(2) step 2:
Open terminal_2 and run the command "ceph-fuse -m monip:port mountpoint".
(3) step 3:
Switch to terminal_1 immediately and run "killall -q -1 ceph-fuse" repeatedly.
2. You will find the following anomalies:
The ceph-fuse process exits abnormally, and a mount process is left in uninterruptible sleep (state D), as shown below:
root@***:~# ps -aux | grep mount
root 271493 0.0 0.0 26484 1252 ? D Jun28 0:00 mount -i -o remount mountpoint
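The check above can also be scripted. The following is a minimal sketch, assuming the standard 11-column "ps aux" layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the helper name find_uninterruptible is hypothetical and is not part of Ceph.

```python
from typing import List, Tuple


def find_uninterruptible(ps_aux_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    stuck = []
    for line in ps_aux_output.splitlines():
        # Split into at most 11 fields so COMMAND keeps its internal spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        if fields[7].startswith("D"):  # STAT column: D = uninterruptible sleep
            stuck.append((int(fields[1]), fields[10]))
    return stuck


# The process line from the report above, without the shell prompt.
sample = "root 271493 0.0 0.0 26484 1252 ? D Jun28 0:00 mount -i -o remount mountpoint"
print(find_uninterruptible(sample))
```

For the sample line this reports PID 271493 with the command "mount -i -o remount mountpoint".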
The abnormal exit of ceph-fuse is logged as below:
7fe99f769680 0 pidfile_write: ignore empty --pid-file
7fe99f769680 -1 init, newargv = 0x5583ca2243a0 newargc=9
7fe9952d3700 1 client.63156060 handle_mds_map epoch 29934
7fe999adc700 -1 received signal: Hangup pid: 1062451 from PID: 1062486 task name: killall -q -1 ceph-fuse UID: 0
7fe9912cb700 1 client.63156060 using remount_cb
7fe990aca700 -1 fuse_ll: do_init: safe_write failed with error (32) Broken pipe
7fe990aca700 -1 fuse_ll: do_init: safe_write failed with error (32) Broken pipe
7fe990aca700 -1 *** Caught signal (Aborted) **
 in thread 7fe990aca700 thread_name:ceph-fuse
 1: (()+0x6ddf14) [0x5583c1086f14]
 2: (()+0x110e0) [0x7fe99d7fb0e0]
 3: (gsignal()+0xcf) [0x7fe99c5affff]
 4: (abort()+0x16a) [0x7fe99c5b142a]
 5: (()+0x1ed03b) [0x5583c0b9603b]
 6: (()+0x1536c) [0x7fe99f0c736c]
 7: (()+0x165a1) [0x7fe99f0c85a1]
 8: (()+0x12d48) [0x7fe99f0c4d48]
 9: (()+0x74a4) [0x7fe99d7f14a4]
 10: (clone()+0x3f) [0x7fe99c665d0f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
3. The reason
The SIGHUP signal handlers are not registered in the parent ceph-fuse process.
A more detailed explanation follows:
If the system runs the logrotate task ("killall -q -1 ceph-fuse ..") while ceph-fuse is starting, before safe_read_exact in the parent ceph-fuse process has completed, the parent process exits abnormally because it does not handle the SIGHUP signal.
The child ceph-fuse process then fails an assertion and aborts, because its safe_write call in do_init fails with "Broken pipe". As a result, the "mount -i -o remount" command issued from remount_cb becomes an uninterruptible sleep process.
4. The fix
Register the SIGHUP signal handlers in the parent ceph-fuse process.
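The effect of the signal disposition can be seen in isolation with a small stand-alone model. This is only a sketch in Python, not the Ceph C++ code: the function run_waiting_parent and both pipes are hypothetical, the forked process stands in for the parent ceph-fuse process blocked in safe_read_exact, and the caller plays both the logrotate task and the child's safe_write of the init result.

```python
import os
import signal


def run_waiting_parent(handle_sighup):
    """Fork a stand-in for the parent ceph-fuse process, send it SIGHUP while
    it blocks on a pipe read, and return its raw wait status."""
    ready_r, ready_w = os.pipe()    # stand-in -> caller: "disposition is set"
    result_r, result_w = os.pipe()  # caller -> stand-in: init result byte

    pid = os.fork()
    if pid == 0:
        # Stand-in for the parent ceph-fuse process.
        try:
            os.close(ready_r)
            os.close(result_w)
            if handle_sighup:
                # The fix: a registered handler, so SIGHUP no longer kills us.
                signal.signal(signal.SIGHUP, lambda signum, frame: None)
            else:
                # The bug: default disposition, SIGHUP terminates the process.
                signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.write(ready_w, b"r")
            # Blocks like safe_read_exact; Python retries the read if a
            # handled signal interrupts it (PEP 475).
            data = os.read(result_r, 1)
            os._exit(0 if data == b"k" else 1)
        except BaseException:
            os._exit(2)

    # Caller: plays the logrotate task and the child's result write.
    os.close(ready_w)
    os.close(result_r)
    os.read(ready_r, 1)            # wait until the disposition is in place
    os.kill(pid, signal.SIGHUP)    # what "killall -q -1 ceph-fuse" delivers
    try:
        os.write(result_w, b"k")
    except BrokenPipeError:
        pass                       # reader already died: errno 32, as in the log
    _, status = os.waitpid(pid, 0)
    os.close(ready_r)
    os.close(result_w)
    return status


for handled in (False, True):
    status = run_waiting_parent(handled)
    if os.WIFSIGNALED(status):
        print(handled, "killed by signal", os.WTERMSIG(status))
    else:
        print(handled, "exited with", os.WEXITSTATUS(status))
```

Without a handler the stand-in is expected to be killed by SIGHUP, and the write to its pipe may fail with "Broken pipe", which matches the safe_write error in the log. With a handler registered, the read completes and the stand-in exits normally.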