Bug #46428

open

mon: all the 3 mon daemons crashed when running the fs aio test

Added by Xiubo Li almost 4 years ago. Updated over 3 years ago.

Status: In Progress
Priority: High
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The logs:

[root@lxbceph3 build]# journalctl -r
-- Logs begin at Wed 2020-07-08 21:17:20 EDT, end at Wed 2020-07-08 22:17:27 EDT. --
Jul 08 22:17:27 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:17:23 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:17:21 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:17:20 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:17:19 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:17:04 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:56 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:52 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:50 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:49 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:48 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:16:34 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:26 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:22 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:20 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:19 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:18 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:16:03 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:55 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:51 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:49 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:48 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:47 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:15:33 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:25 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:21 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:19 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:18 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:17 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:15:03 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:55 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:51 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:49 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:48 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:47 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:14:32 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:24 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:20 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:18 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:17 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:16 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:14:03 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:54 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:50 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:48 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:47 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:46 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:13:31 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:23 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:19 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:17 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:16 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:15 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:13:01 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:53 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:49 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:47 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:46 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:45 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:12:30 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:22 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:18 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:16 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:15 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:15 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:12:00 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:52 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:48 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:46 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:45 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:44 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:11:30 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:11:22 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:11:18 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:11:16 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:11:15 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:11:14 lxbceph3 kernel: libceph: mon2 (1)10.72.37.136:40684 socket error on write
Jul 08 22:10:59 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:51 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:47 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:45 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:44 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:43 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:10:30 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:10:21 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:10:17 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:10:15 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:10:14 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:10:13 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 socket error on write
Jul 08 22:09:59 lxbceph3 kernel: ceph: mds0 hung
Jul 08 22:09:58 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:09:58 lxbceph3 systemd[1]: systemd-coredump@1-8612-0.service: Failed with result 'timeout'.
Jul 08 22:09:58 lxbceph3 systemd[1]: systemd-coredump@2-8613-0.service: Failed with result 'timeout'.
Jul 08 22:09:57 lxbceph3 systemd[1]: systemd-coredump@1-8612-0.service: Service reached runtime time limit. Stopping.
Jul 08 22:09:57 lxbceph3 systemd[1]: systemd-coredump@2-8613-0.service: Service reached runtime time limit. Stopping.
Jul 08 22:09:50 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:09:46 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:09:44 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:09:43 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:09:42 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:26 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:18 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:14 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:12 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:11 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:07:10 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:25 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:17 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:13 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:11 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:10 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:09 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:06:02 lxbceph3 kernel: ceph: mds0 caps stale
Jul 08 22:05:28 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:05:26 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:05:25 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:05:24 lxbceph3 kernel: libceph: mon0 (1)10.72.37.136:40680 socket error on write
Jul 08 22:05:24 lxbceph3 kernel: libceph: mon1 (1)10.72.37.136:40682 session lost, hunting for new mon
Jul 08 22:04:57 lxbceph3 systemd[1]: Started Process Core Dump (PID 8612/UID 0).
Jul 08 22:04:57 lxbceph3 systemd[1]: Started Process Core Dump (PID 8613/UID 0).
Jul 08 22:04:52 lxbceph3 systemd-coredump[8605]: Process 2797 (ceph-mon) of user 0 dumped core.

                                                 Stack trace of thread 2831:
                                                 #0  0x00007f8f0efebc5f raise (libpthread.so.0)
                                                 #1  0x0000562109c42163 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #2  0x0000562109c4359d n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #3  0x00007f8f0efebdc0 __restore_rt (libpthread.so.0)
                                                 #4  0x00007f8f0dc538df raise (libc.so.6)
                                                 #5  0x00007f8f0dc3dcf5 abort (libc.so.6)
                                                 #6  0x00007f8f12683f6d n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #7  0x0000562109762845 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #8  0x0000562109a2bfef n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #9  0x0000562109a35d09 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #10 0x0000562109a363a4 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #11 0x0000562109a43017 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #12 0x0000562109a409f9 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #13 0x0000562109a465b4 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #14 0x0000562109a4658a n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #15 0x00005621097e1df1 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #16 0x00007f8f1262be11 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #17 0x00007f8f1262d89e n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #18 0x00007f8f1261f278 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #19 0x00007f8f1261f1f6 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #20 0x00007f8f0efe12de start_thread (libpthread.so.0)
                                                 #21 0x00007f8f0dd18133 __clone (libc.so.6)

                                                 Stack trace of thread 2825:
                                                 #0  0x00007f8f0dd18467 epoll_wait (libc.so.6)
                                                 #1  0x00007f8f12a41a13 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #2  0x00007f8f12a258da n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #3  0x00007f8f12a33604 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #4  0x00007f8f12a34b13 n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #5  0x0000562109cd9f4c n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #6  0x000056210a021237 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #7  0x000056210a0204be n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #8  0x000056210a022122 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #9  0x000056210a0220f8 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #10 0x000056210a0220dc n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #11 0x00007f8f0e63bb73 execute_native_thread_routine (libstdc++.so.6)
                                                 #12 0x00007f8f0efe12de start_thread (libpthread.so.0)
                                                 #13 0x00007f8f0dd18133 __clone (libc.so.6)

                                                 Stack trace of thread 2797:
                                                 #0  0x00007f8f0efe747c pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                                                 #1  0x00007f8f126f88de n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #2  0x00007f8f129b301b n/a (/home/data/ceph/build/lib/libceph-common.so.2)
                                                 #3  0x0000562109758ee8 n/a (/home/data/ceph/build/bin/ceph-mon)
                                                 #4  0x00007f8f0dc3f873 __libc_start_main (libc.so.6)
                                                 #5  0x000056210974e9fe n/a (/home/data/ceph/build/bin/ceph-mon)
Jul 08 22:04:48 lxbceph3 systemd[1]: Started Process Core Dump (PID 8604/UID 0).
Jul 08 22:04:48 lxbceph3 systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Jul 08 22:01:01 lxbceph3 run-parts[8570]: (/etc/cron.hourly) finished 0anacron
Jul 08 22:01:01 lxbceph3 run-parts[8564]: (/etc/cron.hourly) starting 0anacron
Jul 08 22:01:01 lxbceph3 CROND[8561]: (root) CMD (run-parts /etc/cron.hourly)

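The core dumps failed to generate (the systemd-coredump services timed out), and the call trace has no resolved symbols for the Ceph frames, so it doesn't carry enough information.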
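
The kernel messages above show the client cycling through mon0, mon1 and mon2 with the same write error on each. As a rough sketch for summarizing such a journal, the following tallies the "socket error on write" lines per monitor. The function name count_mon_errors is made up for illustration, and it assumes journal-style lines on stdin in the same format as the log above.

```shell
#!/usr/bin/env bash
# Tally libceph "socket error on write" messages per monitor.
# Reads journal-style lines on stdin; prints "<mon> <count>", sorted by name.

count_mon_errors() {
  awk '/libceph: mon[0-9]+ .*socket error on write/ {
         # Find the monN token on the line and count it once.
         for (i = 1; i <= NF; i++)
           if ($i ~ /^mon[0-9]+$/) { n[$i]++; break }
       }
       END { for (m in n) print m, n[m] }' | sort
}

# Possible usage (kernel messages only):
#   journalctl -k | count_mon_errors
```

If all three monitors show up with comparable counts, the client is failing against every mon rather than against a single one, which matches the report that all three daemons went down.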

Actions #1

Updated by Xiubo Li almost 4 years ago

  • Assignee set to Xiubo Li

Actions #2

Updated by Xiubo Li almost 4 years ago

  • Status changed from New to In Progress

Actions #3

Updated by Xiubo Li almost 4 years ago

  • Assignee deleted (Xiubo Li)

I couldn't reproduce it locally. Could the core team check the core dump output above and see whether they have any idea about it?

Actions #4

Updated by Xiubo Li almost 4 years ago

The steps:

1. Mount a CephFS kernel client at /mnt/cephfs/.
2. Run the following command:

   # fio --name=randread --ioengine=libaio --iodepth=100 -rw=rw --bs=100K --direct=1 --size=1G --numjobs=20 --runtime=10000000 --group_reporting --filename=/mnt/cephfs/a.txt

Actions #5

Updated by Joao Eduardo Luis over 3 years ago

  • Category set to Correctness/Safety

Are you co-locating the test and the monitors? Can this be fd depletion?
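
One way to test the fd-depletion theory, as a minimal sketch: for each running ceph-mon, compare the number of entries under /proc/<pid>/fd with the "Max open files" soft limit in /proc/<pid>/limits. The helper name fd_usage is made up for illustration. It relies only on the standard Linux /proc files, awk and pgrep, and it has to run while the monitors are still alive and with permission to read their /proc entries.

```shell
#!/usr/bin/env bash
# Report open file descriptors against the soft limit for a process.

fd_usage() {
  local pid=$1
  local open limit
  # Each entry in /proc/<pid>/fd is one open file descriptor.
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  # "Max open files" line: fields 4 and 5 are the soft and hard limits.
  limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
  echo "pid=$pid open_fds=$open soft_limit=$limit"
}

# Check every running ceph-mon (the loop body is skipped if none is running).
for pid in $(pgrep -x ceph-mon); do
  fd_usage "$pid"
done
```

Running this periodically during the fio job would show whether open_fds climbs toward soft_limit before the monitors abort.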
