
Bug #23706

NULL sock gets passed to ceph_tcp_sendmsg()

Added by Bertrand Gouny 7 months ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
libceph
Target version:
-
Start date:
04/13/2018
Due date:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Hello,

we noticed some server crashes with this kind of log:

<1>[48385.115715] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
<1>[48385.124186] IP: selinux_socket_sendmsg+0x5/0x20
<6>[48385.129060] PGD 0 P4D 0 
<4>[48385.131921] Oops: 0000 [#1] SMP PTI
<4>[48385.135745] Modules linked in: xfs cbc xt_statistic xt_physdev xt_nat ipt_REJECT nf_reject_ipv4 xt_addrtype xt_comment xt_mark br_netfilter veth tun bridge stp llc nf_conntrack_netlink xfrm_user xfrm_algo ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat overlay nf_conntrack_ipv4 nf_defrag_ipv4 xt_recent xt_conntrack iptable_filter nls_ascii nls_cp437 vfat fat ipmi_ssif sb_edac edac_core coretemp x86_pkg_temp_thermal kvm_intel ipmi_si mei_me ipmi_devintf kvm irqbypass mousedev evdev mei i2c_i801 ipmi_msghandler sch_fq_codel button xt_set ip_set nfnetlink ip6_tables ip_vs nf_conntrack ceph libceph libcrc32c crc32c_generic fscache hid_generic usbhid hid ext4 crc16 mbcache jbd2 fscrypto dm_verity dm_bufio sd_mod crc32c_intel igb ahci libahci ehci_pci i2c_algo_bit aesni_intel ehci_hcd
<4>[48385.209746]  aes_x86_64 libata i2c_core crypto_simd hwmon cryptd usbcore ptp scsi_mod glue_helper pps_core usb_common dm_mirror dm_region_hash dm_log dm_mod dax
<4>[48385.225000] CPU: 0 PID: 22717 Comm: kworker/0:0 Not tainted 4.14.19-coreos #1
<4>[48385.232461] Hardware name: Supermicro SYS-5038MD-H24TRF-OS012/X10SDE-DF, BIOS 1.3 01/05/2018
<4>[48385.241541] Workqueue: ceph-msgr ceph_msg_new [libceph]
<4>[48385.247107] task: ffff9db493603c80 task.stack: ffffbc96ccbf0000
<4>[48385.253366] RIP: 0010:selinux_socket_sendmsg+0x5/0x20
<4>[48385.258770] RSP: 0018:ffffbc96ccbf3d48 EFLAGS: 00010206
<4>[48385.264319] RAX: 000000000000c040 RBX: ffffffff96e66388 RCX: 0000000000000003
<4>[48385.271780] RDX: 0000000000000113 RSI: ffffbc96ccbf3de8 RDI: 0000000000000000
<4>[48385.279242] RBP: 0000000000000113 R08: 0000000000000113 R09: 0000000000000003
<4>[48385.286737] R10: ffffde87dc369900 R11: ffff9db49ec03200 R12: ffffbc96ccbf3de8
<4>[48385.294191] R13: 0000000000000000 R14: ffff9db22de63830 R15: ffff9db22de63830
<4>[48385.301670] FS:  0000000000000000(0000) GS:ffff9db49f200000(0000) knlGS:0000000000000000
<4>[48385.310390] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[48385.316490] CR2: 0000000000000020 CR3: 000000048200a004 CR4: 00000000003606f0
<4>[48385.323962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[48385.331424] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[48385.338911] Call Trace:
<4>[48385.341680]  security_socket_sendmsg+0x3c/0x50
<4>[48385.346541]  sock_sendmsg+0x15/0x40
<4>[48385.350388]  ceph_msg_new+0x108e/0x2470 [libceph]
<4>[48385.355451]  ? pick_next_task_fair+0x469/0x580
<4>[48385.360253]  ? __switch_to+0xa8/0x460
<4>[48385.364255]  process_one_work+0x144/0x350
<4>[48385.368605]  worker_thread+0x4d/0x3e0
<4>[48385.372608]  kthread+0xfc/0x130
<4>[48385.376108]  ? rescuer_thread+0x310/0x310
<4>[48385.380459]  ? kthread_park+0x60/0x60
<4>[48385.384472]  ? do_syscall_64+0x66/0x1d0
<4>[48385.388650]  ? SyS_exit_group+0x10/0x10
<4>[48385.392843]  ret_from_fork+0x35/0x40
<4>[48385.396761] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 7f 20 be 02 00 00 00 e9 bd fe ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 7f 20 be 04 00 00 00 e9 9d fe ff ff 0f 1f 00 66 2e 0f 1f 
<1>[48385.416505] RIP: selinux_socket_sendmsg+0x5/0x20 RSP: ffffbc96ccbf3d48
<4>[48385.423374] CR2: 0000000000000020
<4>[48385.427044] ---[ end trace f37cf512d0bc4b2f ]---

not sure if this is related, but we also see a lot of:

<6>[33859.884045] libceph: osd2 up
<6>[34200.599876] libceph: osd1 down
<6>[34205.260202] libceph: osd0 down
<6>[34205.263644] libceph: osd1 up
<6>[34211.300661] libceph: osd0 up
<6>[34413.401422] libceph: osd1 down
<6>[34413.404841] libceph: osd0 down
<6>[34413.408291] libceph: osd1 up
<6>[34413.411539] libceph: osd0 up
<6>[34822.730351] libceph: osd2 down
<6>[34822.733804] libceph: osd2 up
<6>[35058.407999] libceph: osd2 down
<6>[35058.411415] libceph: osd2 up
<6>[35705.727712] libceph: osd1 down
<6>[35709.489983] libceph: osd1 up
<6>[35712.425469] libceph: osd0 down
<6>[35712.428871] libceph: osd0 up
<6>[35724.584535] libceph: osd1 down
...

and

Apr 13 08:53:07 black-mirror-osixia-cluster kernel: libceph: mon2 10.244.73.3:6789 session lost, hunting for new mon
Apr 13 08:53:07 black-mirror-osixia-cluster kernel: libceph: mon1 10.244.46.3:6789 session established
Apr 13 08:53:16 black-mirror-osixia-cluster kernel: libceph: mon2 10.244.73.3:6789 session lost, hunting for new mon
Apr 13 08:53:16 black-mirror-osixia-cluster kernel: libceph: mon0 10.244.43.12:6789 session established
Apr 13 08:53:37 black-mirror-osixia-cluster kernel: libceph: mon1 10.244.46.3:6789 session lost, hunting for new mon
Apr 13 08:53:37 black-mirror-osixia-cluster kernel: libceph: mon2 10.244.73.3:6789 session established
Apr 13 08:53:47 black-mirror-osixia-cluster kernel: libceph: mon0 10.244.43.12:6789 session lost, hunting for new mon
Apr 13 08:53:47 black-mirror-osixia-cluster kernel: libceph: mon1 10.244.46.3:6789 session established

Module infos

filename:       /lib/modules/4.14.32-coreos/kernel/net/ceph/libceph.ko
license:        GPL
description:    Ceph core library
author:         Patience Warnick <patience@newdream.net>
author:         Yehuda Sadeh <yehuda@hq.newdream.net>
author:         Sage Weil <sage@newdream.net>
depends:        libcrc32c
retpoline:      Y
intree:         Y
name:           libceph
vermagic:       4.14.32-coreos SMP mod_unload

vmcore-dmesg.txt (141 KB) Yong Wang, 04/25/2018 06:13 AM

History

#1 Updated by Ilya Dryomov 7 months ago

<4>[48385.225000] CPU: 0 PID: 22717 Comm: kworker/0:0 Not tainted 4.14.19-coreos #1
filename:       /lib/modules/4.14.32-coreos/kernel/net/ceph/libceph.ko

The kernel versions don't match? Are these from the same machine?

#2 Updated by Bertrand Gouny 7 months ago

maybe not, I had these errors on 3 machines,
but I only noticed them while checking the logs after an update. Will let you know if this happens again with kernel 4.14.32.

#3 Updated by Ilya Dryomov 7 months ago

Can you describe what the workload is -- it looks like this is cephfs? Are those "osd up/down" and "session lost" messages still present? If so, do you see the corresponding "osd up/down" messages in the cluster log?

#4 Updated by Ilya Dryomov 7 months ago

It looks like CoreOS buildbot compiler is doing some really weird inlining. The backtrace doesn't make any sense: ceph_msg_new() is not what gets run on ceph-msgr workqueue. The only other instance of such a crappy backtrace in the tracker is http://tracker.ceph.com/issues/14360, so it's definitely a CoreOS kernel "feature".

That said, I managed to find what I hope is the correct kernel module and "ceph_msg_new+0x108e/0x2470" appears to be the call to kernel_sendmsg() in ceph_tcp_sendmsg(). What we are looking at is probably

  con_work
    try_write
      write_partial_kvec
        ceph_tcp_sendmsg
          kernel_sendmsg
where ceph_tcp_sendmsg() got called with a NULL sock, meaning that con->sock was NULL all along, but that is just a guess.

#5 Updated by Jos Collin 7 months ago

Ilya Dryomov wrote:

That said, I managed to find what I hope is the correct kernel module and "ceph_msg_new+0x108e/0x2470" appears to be the call to kernel_sendmsg() in ceph_tcp_sendmsg(). What we are looking at is probably
[...]where ceph_tcp_sendmsg() got called with a NULL sock, meaning that con->sock was NULL all along, but that is just a guess.

That's my finding too. It came in via con->sock.

#6 Updated by Bertrand Gouny 7 months ago

@Ilya Dryomov
the workload is very low. OSD up/down seems to happen from time to time on reads and writes, but the "session lost" message occurs every 30s or less.
In the cluster logs there is only "cluster [INF] overall HEALTH_OK" during the same time periods.

#7 Updated by Ilya Dryomov 7 months ago

Do you see "session lost" on every machine you mount cephfs on or on just some of them?

#8 Updated by Bertrand Gouny 7 months ago

The "session lost" is logged on all 3 machines.
It seems to start when a monitor goes down and comes back up, or is replaced by another one.

I mount cephfs using a DNS record whose A entries are automatically populated with the monitors' IPs.
Cephfs is mounted about 50 times with several paths by kubernetes, for about 50 containers with very low activity.

The weird thing is that the session is lost on IPs corresponding to monitors that are totally fine (not down/up or replaced).

#9 Updated by Jeff Layton 7 months ago

Ilya Dryomov wrote:

[...]where ceph_tcp_sendmsg() got called with a NULL sock, meaning that con->sock was NULL all along, but that is just a guess.

Maybe. I'll note that con_close_socket() also sets con->sock to NULL. Is it possible that a call to that raced in somehow?

#10 Updated by Ilya Dryomov 7 months ago

  • Category set to libceph
  • Assignee set to Ilya Dryomov

That is supposed to be protected by con->mutex. It is probably a bug in the connection state handling code; I'll take a look at that.

@Bertrand
Why are the MONs coming down and/or getting replaced?

#11 Updated by Jeff Layton 7 months ago

Yeah:

cancel_con calls cancel_delayed_work, but that can return while the work is still running. So, suppose a call to ceph_con_close races in just before ceph_con_workfn runs, takes the mutex and closes the socket, setting con->sock to NULL and clearing CON_FLAG_SOCK_CLOSED.

Eventually the mutex is dropped, and ceph_con_workfn runs, and hits the null ptr deref here. There is a call to con_sock_closed but that flag gets cleared in con_close. I think that check needs to be re-thought.

#12 Updated by Bertrand Gouny 7 months ago

@Ilya
I guess this is a bad thing, but the monitors run in containers, one on each machine.
From time to time we stop the containers to perform a CoreOS update.

When a container is restarted, the mon comes up again;
but when it fails, we just delete the container and create a new one to replace it, and that becomes a new mon for ceph.

#13 Updated by Yong Wang 7 months ago

We hit the same panic:

=================

[11093.424272] Call Trace:
[11093.424438] [<ffffffffa050dea5>] ? ceph_tcp_recvmsg+0x55/0x80 [libceph]
[11093.424614] [<ffffffffa0520a4f>] ? dispatch+0x5f/0x470 [libceph]
[11093.424782] [<ffffffff81555147>] kernel_sendmsg+0x37/0x50
[11093.424951] [<ffffffffa050df29>] ceph_tcp_sendmsg+0x59/0x80 [libceph]
[11093.425124] [<ffffffffa050ffa9>] try_write+0x279/0xe60 [libceph]
[11093.425293] [<ffffffff810cbcdc>] ? set_next_entity+0x3c/0xe0
[11093.425464] [<ffffffffa0512d29>] ceph_con_workfn+0xc9/0x650 [libceph]
[11093.425634] [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
[11093.425800] [<ffffffff810a8d76>] worker_thread+0x126/0x410
[11093.425966] [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
[11093.426133] [<ffffffff810b052f>] kthread+0xcf/0xe0
[11093.426298] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[11093.426467] [<ffffffff81696418>] ret_from_fork+0x58/0x90
[11093.426633] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[11093.426800] Code: c0 75 33 48 8b 84 24 c0 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 <49> 8b 44 24 28 ff 90 88 00 00 00 3d ef fd ff ff 74 29 48 8b 8c
[11093.427517] RIP [<ffffffff815550c5>] sock_sendmsg+0xa5/0xf0

[11093.418995] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[11093.419024] IP: [<ffffffff815550c5>] sock_sendmsg+0xa5/0xf0
[11093.419046] PGD 0

[11093.420165] Hardware name: OEM PR4036TG/SC612DI-8F, BIOS 2.0b 06/09/2017
[11093.420343] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[11093.420511] task: ffff880fbac2de20 ti: ffff880ee4b38000 task.ti: ffff880ee4b38000
[11093.420826] RIP: 0010:[<ffffffff815550c5>] [<ffffffff815550c5>] sock_sendmsg+0xa5/0xf0

==============================

ceph_con_close:
1) sets the connection flag CON_STATE_CLOSED
2) queues the connection work asynchronously onto ceph_con_workfn's workqueue
3) con_close_socket() closes con->sock and clears the flag

Step 2 is async; if step 3 runs before the queued work, ceph_con_workfn finds the flag already cleared and goes down the try_write() path using con->sock, which has already been set to NULL.

Is that what happened here?

#14 Updated by Yong Wang 7 months ago

ceph_con_workfn will continue into the path below if no flag is set on the connection?

write_partial_kvec
ceph_tcp_sendmsg

#15 Updated by Ilya Dryomov 7 months ago

I'm working on a fix. What I'm wondering is why it has popped up now and not in the past.

Yong, which kernel is that? Do you also see "session lost, hunting for new mon" or something similar in dmesg?

#16 Updated by Yong Wang 7 months ago

OS: CentOS 7.3, kernel version 3.10.514.
No "session lost" found near the panic timestamp, but libceph printed a lot of OSD up/down messages.

This may have happened while we were running a performance test:
in that environment, if some OSDs go down, our monitoring shell script detects it and restarts them via systemctl restart.

Those restarts may have happened many times.

#17 Updated by Ilya Dryomov 7 months ago

Bertrand, Yong, what is the output of

ceph daemon mon.a config get ms_type
ceph daemon osd.0 config get ms_type

Bertrand, the "session lost, hunting for new mon" issue is tracked in #23537.

#18 Updated by Bertrand Gouny 7 months ago

Thanks, I will follow #23537 :)

I must have some serious config error; when I run

ceph daemon mon.a config get ms_type
or
ceph daemon osd.0 config get ms_type

I get this error:

admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

I will try to solve this later.

So I just ran, on one mon and one osd, a command like:

ceph --admin-daemon /var/run/ceph/[...].asok config get ms_type

not sure if this is the same as what you want?

result on mon:

{
    "ms_type": "async+posix" 
}

result on osd:

{
    "ms_type": "async+posix" 
}

#19 Updated by Ilya Dryomov 7 months ago

Yes, I wanted to make sure you were using async messenger.

I'm still looking into the crash. Did it happen just once or did you see it more than once?

Can you share what preceded "<1>[48385.115715] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020" in the kernel log? 100-200 lines at least.

#20 Updated by Bertrand Gouny 7 months ago

It happened at least 9 times (from Jan 8 to Mar 31); I only have the 3 most recent crash logs from the 3 servers.

<6>[33859.830433] libceph: osd1 up
<6>[33859.833683] libceph: osd2 down
<6>[33859.837099] libceph: osd2 up
<6>[33859.840330] libceph: osd0 down
<6>[33859.843754] libceph: osd0 up
<6>[33859.847145] libceph: osd1 down
<6>[33859.850626] libceph: osd1 up
<6>[33859.853872] libceph: osd0 down
<6>[33859.857284] libceph: osd0 up
<6>[33859.860532] libceph: osd2 down
<6>[33859.863990] libceph: osd2 up
<6>[33859.867230] libceph: osd1 down
<6>[33859.870656] libceph: osd1 up
<6>[33859.873896] libceph: osd0 down
<6>[33859.877327] libceph: osd0 up
<6>[33859.880551] libceph: osd2 down
<6>[33859.884045] libceph: osd2 up
<6>[34200.599876] libceph: osd1 down
<6>[34205.260202] libceph: osd0 down
<6>[34205.263644] libceph: osd1 up
<6>[34211.300661] libceph: osd0 up
<6>[34413.401422] libceph: osd1 down
<6>[34413.404841] libceph: osd0 down
<6>[34413.408291] libceph: osd1 up
<6>[34413.411539] libceph: osd0 up
<6>[34822.730351] libceph: osd2 down
<6>[34822.733804] libceph: osd2 up
<6>[35058.407999] libceph: osd2 down
<6>[35058.411415] libceph: osd2 up
<6>[35705.727712] libceph: osd1 down
<6>[35709.489983] libceph: osd1 up
<6>[35712.425469] libceph: osd0 down
<6>[35712.428871] libceph: osd0 up
<6>[35724.584535] libceph: osd1 down
<6>[35724.588091] libceph: osd1 up
<6>[35724.591372] libceph: osd0 down
<6>[35724.594845] libceph: osd0 up
<6>[36317.309428] libceph: osd2 down
<6>[36324.126199] libceph: osd2 down
<6>[36324.129677] libceph: osd2 up
<6>[36349.212749] libceph: osd2 up
<6>[36915.934773] libceph: osd0 down
<6>[36918.521912] libceph: osd0 up
<6>[36938.439294] libceph: osd0 down
<6>[36938.442704] libceph: osd0 up
<6>[37210.876180] libceph: osd1 down
<6>[37214.554897] libceph: osd1 up
<6>[37263.439478] libceph: osd1 down
<6>[37263.442974] libceph: osd1 up
<6>[37834.500143] libceph: osd2 down
<6>[37834.503591] libceph: osd2 up
<6>[37888.449206] libceph: osd2 down
<6>[37888.452677] libceph: osd2 up
<6>[38421.706242] libceph: osd0 down
<6>[38424.732614] libceph: osd0 up
<6>[38523.456237] libceph: osd0 down
<6>[38523.459702] libceph: osd0 up
<6>[38716.146609] libceph: osd1 down
<6>[38720.889830] libceph: osd1 up
<6>[38843.462045] libceph: osd1 down
<6>[38843.465463] libceph: osd1 up
<6>[39334.635184] libceph: osd2 down
<6>[39334.638629] libceph: osd2 up
<6>[39468.470135] libceph: osd2 down
<6>[39468.473568] libceph: osd2 up
<6>[39926.861035] libceph: osd0 down
<6>[39931.974642] libceph: osd0 up
<6>[40103.477392] libceph: osd0 down
<6>[40103.480863] libceph: osd0 up
<6>[40221.345070] libceph: osd1 down
<6>[40227.073390] libceph: osd1 up
<6>[40413.486392] libceph: osd1 down
<6>[40413.489901] libceph: osd1 up
<6>[40839.636382] libceph: osd2 down
<6>[40839.639873] libceph: osd2 up
<6>[41048.494340] libceph: osd2 down
<6>[41048.497811] libceph: osd2 up
<6>[41433.012282] libceph: osd0 down
<6>[41433.015711] libceph: osd0 up
<6>[41683.502513] libceph: osd0 down
<6>[41683.505930] libceph: osd0 up
<6>[41726.393681] libceph: osd1 down
<6>[41729.349261] libceph: osd1 up
<6>[41993.503127] libceph: osd1 down
<6>[41993.506567] libceph: osd1 up
<6>[42345.145224] libceph: osd2 down
<6>[42345.148645] libceph: osd2 up
<6>[42359.480810] libceph: osd2 down
<6>[42359.484284] libceph: osd2 up
<6>[42632.125267] libceph: osd1 down
<6>[42637.144798] libceph: osd0 down
<6>[42637.148210] libceph: osd1 up
<6>[42640.181789] libceph: osd0 up
<6>[42646.576589] libceph: osd1 down
<6>[42646.580038] libceph: osd0 down
<6>[42646.583465] libceph: osd1 up
<6>[42646.586719] libceph: osd0 up
<6>[43251.370149] libceph: osd2 down
<6>[43251.373595] libceph: osd2 up
<6>[43268.530098] libceph: osd2 down
<6>[43268.533527] libceph: osd2 up
<6>[44136.495407] libceph: osd1 down
<6>[44141.339506] libceph: osd0 down
<6>[44141.342912] libceph: osd1 up
<6>[44145.306534] libceph: osd0 up
<6>[44218.536885] libceph: osd1 down
<6>[44218.540344] libceph: osd0 down
<6>[44218.543798] libceph: osd1 up
<6>[44218.547073] libceph: osd0 up
<6>[44751.047162] libceph: osd2 down
<6>[44751.050580] libceph: osd2 up
<6>[44848.544635] libceph: osd2 down
<6>[44848.548042] libceph: osd2 up
<6>[45341.653282] libceph: osd1 down
<6>[45345.619921] libceph: osd1 up
<6>[45483.555981] libceph: osd1 down
<6>[45483.559422] libceph: osd1 up
<6>[45647.459034] libceph: osd0 down
<6>[45651.835601] libceph: osd0 up
<6>[45803.557147] libceph: osd0 down
<6>[45803.560582] libceph: osd0 up
<6>[45959.552620] libceph: osd2 down
<6>[45959.556080] libceph: osd2 up
<6>[46118.566233] libceph: osd2 down
<6>[46118.569655] libceph: osd2 up
<6>[46846.651413] libceph: osd1 down
<6>[46852.309630] libceph: osd1 up
<6>[47063.576190] libceph: osd1 down
<6>[47063.579620] libceph: osd1 up
<6>[47152.066313] libceph: osd0 down
<6>[47157.408840] libceph: osd0 up
<6>[47383.584001] libceph: osd0 down
<6>[47383.587431] libceph: osd0 up
<6>[47464.808598] libceph: osd2 down
<6>[47464.812064] libceph: osd2 up
<6>[47708.042212] libceph: osd2 down
<6>[47708.045640] libceph: osd2 up
<6>[48352.227750] libceph: osd1 down
<6>[48354.774997] libceph: osd1 up
<6>[48385.108799] libceph: osd1 down
<6>[48385.112342] libceph: osd1 up
<1>[48385.115715] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020

Another example:

<6>[107816.810500] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[107816.819016] libceph: mon0 10.244.43.19:6789 session established
<6>[107847.530010] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[107847.540894] libceph: mon1 10.244.46.5:6789 session established
<6>[107877.737478] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[107877.745951] libceph: mon0 10.244.43.19:6789 session established
<6>[107907.945902] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[107907.961754] libceph: mon2 10.244.73.5:6789 session established
<6>[107922.925352] libceph: osd1 down
<6>[107922.928898] libceph: osd1 up
<6>[107938.664482] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[107938.673853] libceph: mon1 10.244.46.5:6789 session established
<6>[107968.871892] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[107968.880434] libceph: mon0 10.244.43.19:6789 session established
<6>[107978.088073] libceph: osd2 down
<6>[107978.091571] libceph: osd2 up
<6>[107999.591411] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[107999.601418] libceph: mon1 10.244.46.5:6789 session established
<6>[108029.798914] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108029.807352] libceph: mon0 10.244.43.19:6789 session established
<6>[108060.518451] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108060.528611] libceph: mon1 10.244.46.5:6789 session established
<6>[108090.725949] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108090.741469] libceph: mon2 10.244.73.5:6789 session established
<6>[108120.933272] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108120.946075] libceph: mon1 10.244.46.5:6789 session established
<6>[108151.652839] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108151.661568] libceph: mon0 10.244.43.19:6789 session established
<6>[108181.860214] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108181.870050] libceph: mon1 10.244.46.5:6789 session established
<6>[108212.579774] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108212.595162] libceph: mon2 10.244.73.5:6789 session established
<6>[108237.929683] libceph: osd2 down
<6>[108237.933191] libceph: osd2 up
<6>[108242.787232] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108242.795585] libceph: mon0 10.244.43.19:6789 session established
<6>[108274.018737] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108274.029022] libceph: mon1 10.244.46.5:6789 session established
<6>[108304.738176] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108304.753807] libceph: mon2 10.244.73.5:6789 session established
<6>[108307.120205] libceph: osd0 down
<6>[108309.221068] libceph: osd0 up
<6>[108334.945685] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108334.955824] libceph: mon1 10.244.46.5:6789 session established
<6>[108365.665155] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108365.673680] libceph: mon0 10.244.43.19:6789 session established
<6>[108395.872716] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108395.888236] libceph: mon2 10.244.73.5:6789 session established
<6>[108426.592138] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108426.601693] libceph: mon1 10.244.46.5:6789 session established
<6>[108456.799599] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108456.808223] libceph: mon0 10.244.43.19:6789 session established
<6>[108487.519117] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108487.529368] libceph: mon1 10.244.46.5:6789 session established
<6>[108517.726546] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108517.763518] libceph: mon2 10.244.73.5:6789 session established
<6>[108547.934045] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108547.943733] libceph: mon1 10.244.46.5:6789 session established
<6>[108557.935462] libceph: osd0 down
<6>[108557.938973] libceph: osd0 up
<6>[108578.653569] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108578.669309] libceph: mon2 10.244.73.5:6789 session established
<6>[108608.861070] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108608.869755] libceph: mon0 10.244.43.19:6789 session established
<6>[108639.580564] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108639.596916] libceph: mon2 10.244.73.5:6789 session established
<6>[108669.788029] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108669.798025] libceph: mon1 10.244.46.5:6789 session established
<6>[108700.507460] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108700.516060] libceph: mon0 10.244.43.19:6789 session established
<6>[108730.714927] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108730.752570] libceph: mon2 10.244.73.5:6789 session established
<6>[108760.922494] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108760.932407] libceph: mon1 10.244.46.5:6789 session established
<6>[108791.641993] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108791.650457] libceph: mon0 10.244.43.19:6789 session established
<6>[108821.849454] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108821.859946] libceph: mon1 10.244.46.5:6789 session established
<6>[108852.568957] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[108852.584115] libceph: mon2 10.244.73.5:6789 session established
<6>[108882.776430] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108882.784883] libceph: mon0 10.244.43.19:6789 session established
<6>[108914.007844] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108914.023208] libceph: mon2 10.244.73.5:6789 session established
<6>[108944.727405] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[108944.735744] libceph: mon0 10.244.43.19:6789 session established
<6>[108974.934869] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[108974.945659] libceph: mon1 10.244.46.5:6789 session established
<6>[109005.654332] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[109005.662772] libceph: mon0 10.244.43.19:6789 session established
<6>[109035.861854] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[109035.876942] libceph: mon2 10.244.73.5:6789 session established
<6>[109066.581352] libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
<6>[109066.589879] libceph: mon0 10.244.43.19:6789 session established
<6>[109096.788856] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[109096.798779] libceph: mon1 10.244.46.5:6789 session established
<6>[109127.508368] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[109127.516839] libceph: mon0 10.244.43.19:6789 session established
<6>[109157.715831] libceph: mon0 10.244.43.19:6789 session lost, hunting for new mon
<6>[109157.726082] libceph: mon1 10.244.46.5:6789 session established
<6>[109187.923236] libceph: mon1 10.244.46.5:6789 session lost, hunting for new mon
<6>[109187.931814] libceph: mon0 10.244.43.19:6789 session established
<6>[109208.247685] libceph: osd1 down
<6>[109208.251194] libceph: osd1 up
<6>[109208.404093] libceph: osd1 down
<6>[109208.407726] libceph: osd1 up
<1>[109208.411194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020

#21 Updated by Yong Wang 7 months ago

We use jewel 10.2.10 and ms_type is async, yes.

vmcore-dmesg has been attached.

#22 Updated by Yong Wang 7 months ago

We hit it twice in total.

#23 Updated by Yong Wang 7 months ago

http://tracker.ceph.com/issues/17664
https://github.com/ceph/ceph/pull/11601

Do you mean that the async server doesn't respond with an ack message, so the client keeps destroying and re-creating the mon session?

And that one of those sessions is still being used but its pointer is NULL?

jewel 10.2.10 doesn't contain these patches.

Is patch 11601 the fix for the kernel panic?

Thanks for your work.

#24 Updated by Yong Wang 7 months ago

libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon

The log line above shows up regularly in the kernel log, but we hit the libceph panic only twice in total.

#25 Updated by Ilya Dryomov 7 months ago

No, I'll post the patch for the kernel panic later today. The monitor session issue is separate.

Do you see "session lost, hunting for new mon" repeating every ~30 seconds? If so, does it happen right after you map an rbd image or mount cephfs? Note that in Bertrand's case it doesn't happen until a monitor goes down due to an upgrade.

#26 Updated by Ilya Dryomov 7 months ago

async messenger was still experimental in jewel. If you are on jewel, you should be using simple messenger.
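For reference, the messenger type is a ceph.conf option; a sketch of forcing the simple messenger cluster-wide (daemons need a restart to pick it up, and the exact rollout depends on your deployment):

```ini
# ceph.conf -- on jewel, fall back to the stable simple messenger
[global]
ms_type = simple
```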

#27 Updated by Ilya Dryomov 7 months ago

  • Status changed from New to Need Review

#28 Updated by Yong Wang 7 months ago

yes.

2018-04-26T11:18:36.435400+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:19:15.996377+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:19:16.337370+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session established
2018-04-26T11:19:46.044371+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session lost, hunting for new mon
2018-04-26T11:19:46.044422+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:20:16.060373+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:20:46.012366+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:21:16.092377+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:21:16.105367+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session established
2018-04-26T11:21:55.996224+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session lost, hunting for new mon
2018-04-26T11:22:26.459377+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:23:06.012391+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:23:36.001400+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session established
2018-04-26T11:24:06.076410+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session lost, hunting for new mon
2018-04-26T11:24:36.028375+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session established
2018-04-26T11:25:06.044377+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session lost, hunting for new mon
2018-04-26T11:25:06.189369+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:25:46.012374+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:25:46.255368+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session established
2018-04-26T11:26:26.012373+08:00 node53 kernel: libceph: mon1 10.0.30.52:6789 session lost, hunting for new mon
2018-04-26T11:26:26.012434+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:26:56.060378+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon
2018-04-26T11:27:26.012382+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26T11:27:56.092397+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session lost, hunting for new mon

#29 Updated by Ilya Dryomov 7 months ago

Does it happen right after you map an rbd image or mount cephfs? If so, and all your monitors are up, you are hitting #17664. Regardless, because you are on jewel, my recommendation is to switch to simple messenger.

#30 Updated by geng jichao 7 months ago

We encountered a similar problem; I don't know whether it has the same cause.

Task dump for CPU 14:

Call Trace:
[<ffffffff810a4741>] ? deactivate_task+0x81/0xa0
[<ffffffff81095611>] wq_worker_sleeping+0x11/0x90
[<ffffffff817fbc50>] __schedule+0x650/0xa30
[<ffffffff817fc065>] schedule+0x35/0x80
[<ffffffff8107f166>] do_exit+0x766/0xaf0
[<ffffffff8102f991>] oops_end+0xa1/0xd0
[<ffffffff8102fe2b>] die+0x4b/0x70
[<ffffffff8102d479>] do_general_protection+0x28/0x30
[<ffffffff81801e98>] general_protection+0x28/0x30
[<ffffffff813e6a66>] ? memcpy_erms+0x6/0x10
[<ffffffff813ebf6e>] ? copy_to_iter+0x1ce/0x330
[<ffffffff813ed3ea>] copy_page_to_iter+0x7a/0x1c0
[<ffffffff816ec82d>] skb_copy_datagram_iter+0x14d/0x280
[<ffffffff81744989>] tcp_recvmsg+0x5e9/0xbc0
[<ffffffff81381230>] ? aa_sk_perm+0x70/0x210
[<ffffffff8177212f>] inet_recvmsg+0x6f/0x80
[<ffffffff816dd54b>] sock_recvmsg+0x3b/0x50
[<ffffffff816dd691>] kernel_recvmsg+0x61/0x80
[<ffffffffc05398ab>] ceph_tcp_recvmsg+0x4b/0x60 [libceph]
[<ffffffffc053cfa8>] read_partial_message+0x3a8/0x860 [libceph]
[<ffffffffc053d9cb>] ceph_mon_workfn+0x56b/0x18b0 [libceph]
[<ffffffff810b0e6d>] ? dequeue_task_fair+0x51d/0x8b0
[<ffffffff810b2915>] ? __switch_to+0x1cd/0x590
[<ffffffff81094315>] process_one_work+0x165/0x480
[<ffffffff8109467b>] worker_thread+0x4b/0x4c0
[<ffffffff81094630>] ? process_one_work+0x490/0x480
[<ffffffff8109a5d9>] kthread+0xc9/0xe0
[<ffffffff8109a510>] ? kthread_create_on_node+0x1c0/0x1c0
[<ffffffff8100010f>] ret_from_fork+0x3f/0x70
[<ffffffff8109a510>] ? kthread_create_on_node+0x1c0/0x1c0

#31 Updated by Ilya Dryomov 7 months ago

No, this doesn't look related at first sight. Can you paste more so I can see which kernel, what happened before the crash, etc?

#32 Updated by geng jichao 7 months ago

Ilya Dryomov wrote:

No, this doesn't look related at first sight. Can you paste more so I can see which kernel, what happened before the crash, etc?

The kernel version we use is 4.4.0, but we regularly sync the ceph.ko and libceph code from GitHub. The Ceph version we are using is "12.2.1.6 luminous (stable)". No monitor or OSD went down when the system hung. Strangely, two hosts hit this problem at almost the same time. I suspect some unusual message was received. Unfortunately, I didn't have libceph.ko logging enabled.

#33 Updated by Ilya Dryomov 7 months ago

  • Subject changed from Kernel panic: unable to handle kernel NULL pointer dereference to NULL sock gets passed to ceph_tcp_sendmsg()
  • Status changed from Need Review to Pending Upstream

#35 Updated by Ilya Dryomov 7 months ago

Bertrand, this patch fixes a crash that is very similar to yours. I wasn't able to trigger it under "normal" conditions though, so feel free to reopen if you continue to see crashes or "session lost" messages repeating every 30 seconds.

#36 Updated by Ilya Dryomov 7 months ago

  • Status changed from Pending Backport to Resolved

In 3.18.108, 4.4.131, 4.9.98, 4.14.39, 4.16.7.

#37 Updated by Bertrand Gouny 7 months ago

I'm monitoring this more closely now; I will let you know if it occurs again.
Thanks a lot for your work and the prompt responses :)
