Bug #36299
open
Kernel panic: kernel BUG at fs/ceph/mds_client.c:1279! on CentOS 7.5.1804
Added by Dmitry Isakov over 5 years ago.
Updated over 3 years ago.
Description
Hello! We had two kernel panics on different nodes while using the CephFS client on CentOS 7.5. Apparently, both panics occurred while unmounting the file system...
Files
backtrace (1.42 KB) - kdump backtrace - Dmitry Isakov, 10/03/2018 11:44 AM
dmesg (3.92 KB) - kdump dmesg - Dmitry Isakov, 10/03/2018 11:44 AM
Related issues
1 (1 open, 0 closed)
Kernel version: 3.10.0-862.11.6.el7.x86_64 and 3.10.0-862.el7.x86_64
libcephfs2-12.2.8-0.el7.x86_64
ceph-common-12.2.8-0.el7.x86_64
- Category set to fs/ceph
- Assignee set to Zheng Yan
kernel BUG at fs/ceph/mds_client.c:1279!
invalid opcode: 0000 [#1] SMP
CPU: 3 PID: 38552 Comm: kworker/3:0 Kdump: loaded Tainted: G ------------ T 3.10.0-862.11.6.el7.x86_64 #1
Hardware name: HP ProLiant DL380p Gen8, BIOS P70 11/14/2013
Workqueue: ceph-msgr ceph_con_workfn [libceph]
task: ffff885e37bd8fd0 ti: ffff885609eb0000 task.ti: ffff885609eb0000
RIP: 0010:[<ffffffffc09f12ed>] [<ffffffffc09f12ed>] remove_session_caps+0x16d/0x170 [ceph]
RSP: 0018:ffff885609eb3c48 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff88698c7b0d40 RCX: 0000000000000400
RDX: 000000000000001b RSI: ffff889126b48618 RDI: ffff885609eb3c08
RBP: ffff885609eb3c88 R08: ffff88a042bdd770 R09: 0000000000000001
R10: 00000000000003e2 R11: 0000000000000000 R12: ffff88698c7b0800
R13: ffff8870b32289d8 R14: ffff88698c7b0d48 R15: ffff889845322800
FS: 0000000000000000(0000) GS:ffff8871bf6c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f761ae1c808 CR3: 000000286be0e000 CR4: 00000000000607e0
Call Trace:
[<ffffffffc09f8500>] dispatch+0x5e0/0xb90 [ceph]
[<ffffffff987d155a>] ? kernel_recvmsg+0x3a/0x50
[<ffffffffc0972ff4>] try_read+0x4e4/0x1210 [libceph]
[<ffffffff98234909>] ? sched_clock+0x9/0x10
[<ffffffff982d50d5>] ? sched_clock_cpu+0x85/0xc0
[<ffffffff9822a59e>] ? __switch_to+0xce/0x580
[<ffffffffc0973dd9>] ceph_con_workfn+0xb9/0x670 [libceph]
[<ffffffff982b613f>] process_one_work+0x17f/0x440
[<ffffffff982b71d6>] worker_thread+0x126/0x3c0
[<ffffffff982b70b0>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffff982bdf21>] kthread+0xd1/0xe0
[<ffffffff982bde50>] ? insert_kthread_work+0x40/0x40
[<ffffffff989255f7>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffff982bde50>] ? insert_kthread_work+0x40/0x40
Code: 5d 41 5e 41 5f 5d c3 48 89 fa 48 c7 c6 b0 7a a0 c0 48 c7 c7 18 4c a1 c0 31 c0 e8 cf 8b b8 d7 e9 d8 fe ff ff e8 45 30 8a d7 0f 0b <0f> 0b 90 66 66 66 66 90 48 8b 07 55 48 89 e5 48 89 02 44 8b 80
RIP [<ffffffffc09f12ed>] remove_session_caps+0x16d/0x170 [ceph]
BUG_ON(session->s_nr_caps > 0);
BUG_ON(!list_empty(&session->s_cap_flushing));
We also hit this same kernel panic on another node. Before the kernel BUG, we see the following messages:
[5478081.176868] libceph: mds0 xx.xx.xx.xx:6800 socket closed (con state OPEN)
[5478085.807602] libceph: mds0 xx.xx.xx.xx:6800 connection reset
[5478085.807632] libceph: reset on mds0
[5478085.807633] ceph: mds0 closed our session
[5478085.807634] ceph: mds0 reconnect start
[5478085.807660] ceph: ffff8803404a2a30 auth cap (null) not mds0 ???
[5478085.812112] ceph: mds0 reconnect denied
[5478085.812177] kernel BUG at fs/ceph/mds_client.c:1230!
[5478085.813028] task: ffff880f87e2bf40 ti: ffff880f35394000 task.ti: ffff880f35394000
[5478085.813053] RIP: 0010:[<ffffffffc07401e0>] [<ffffffffc07401e0>] remove_session_caps+0x160/0x170 [ceph]
......
kernel 3.10.0-693.5.2.el7.x86_64
ceph version 12.2.5
- Related to Bug #37769: __ceph_remove_cap caused kernel crash added
This one and #37769 could be the same issue.
- Assignee deleted (Zheng Yan)