Activity
From 04/01/2018 to 04/30/2018
04/30/2018
- 07:52 AM Bug #23537 (Pending Backport): libceph: monX xxxxxx session lost, hunting for new mon
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b4c443d139f1d2b5570da475f7a9cbcef86740...
- 07:52 AM Bug #23706 (Pending Backport): NULL sock gets passed to ceph_tcp_sendmsg()
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c55ad1c214d9f8c4594ac2c3fa392c1c32431a...
04/27/2018
- 05:18 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- 12:20 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Ilya Dryomov wrote:
> No, this doesn't look related at first sight. Can you paste more so I can see which kernel, w...
- 10:01 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- No, this doesn't look related at first sight. Can you paste more so I can see which kernel, what happened before the...
- 09:55 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- We encountered a similar problem, but I don't know whether it has the same cause.
Task dump for CPU 14:
Call Tr...
04/26/2018
- 07:35 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Does it happen right after you map an rbd image or mount cephfs? If so, and all your monitors are up, you are hittin...
- 03:29 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- yes.
2018-04-26T11:18:36.435400+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established
2018-04-26...
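To answer the question above about when the "session lost" messages appear, it can help to tally libceph monitor-session events from the kernel log. The following is a minimal sketch, not part of the tracker discussion: the regular expression is an assumption derived only from the two message formats quoted on this page, and the function name count_session_events is hypothetical.

```python
import re
from collections import Counter

# Assumed pattern, based only on the two libceph messages quoted on this page:
#   "libceph: monN <addr> session established"
#   "libceph: monN <addr> session lost, hunting for new mon"
SESSION_RE = re.compile(
    r"libceph: (?P<mon>mon\d+) (?P<addr>\S+) session (?P<event>established|lost)"
)


def count_session_events(lines):
    """Return a Counter keyed by (mon, event) for libceph session messages."""
    counts = Counter()
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            counts[(match.group("mon"), match.group("event"))] += 1
    return counts


# Sample lines copied from the entries on this page, plus one unrelated line.
sample = [
    "2018-04-26T11:18:36.435400+08:00 node53 kernel: libceph: mon2 10.0.30.53:6789 session established",
    "libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon",
    "unrelated kernel message",
]
counts = count_session_events(sample)
```

In practice the lines would come from the kernel log (for example, a saved dmesg or syslog file) rather than a hard-coded list.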
04/25/2018
- 10:32 AM Bug #23706 (Fix Under Review): NULL sock gets passed to ceph_tcp_sendmsg()
- 09:28 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- async messenger was still experimental in jewel. If you are on jewel, you should be using simple messenger.
- 09:16 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- No, I'll post the patch for the kernel panic later today. The monitor session issue is separate.
Do you see "sess...
- 07:03 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- libceph: mon2 10.244.73.5:6789 session lost, hunting for new mon
Logs like the above are usually seen in the kernel log...
- 06:58 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- http://tracker.ceph.com/issues/17664
https://github.com/ceph/ceph/pull/11601
Do you mean that async server not...
- 06:25 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- We have hit this twice in total.
- 06:13 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Yes, we use jewel 10.2.10 and ms_type is async.
vmcore-dmesg has been attached.
04/24/2018
- 02:53 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- It happened at least 9 times (from Jan 8 to Mar 31); I only have the last 3 crash logs from the 3 servers....
- 12:58 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Yes, I wanted to make sure you were using async messenger.
I'm still looking into the crash. Did it happen just o...
- 10:09 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Thanks, I will follow #23537 :)
I must have some serious config error when I run ...
- 08:38 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Bertrand, Yong, what is the output of...
- 12:55 PM Bug #23537: libceph: monX xxxxxx session lost, hunting for new mon
04/23/2018
- 05:02 PM Bug #23537: libceph: monX xxxxxx session lost, hunting for new mon
- I think I found the issue. The fix should be in soon and will be backported to stable kernels.
- 05:00 PM Bug #23537 (Fix Under Review): libceph: monX xxxxxx session lost, hunting for new mon
04/20/2018
- 12:40 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- OS: CentOS 7.3, kernel version 3.10.514
No "session lost" messages found near the panic timestamp.
libceph prints a lot of osds u...
- 06:50 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- I'm working on a fix. What I'm wondering is why has it popped up now and not in the past.
Yong, which kernel is t...
- 03:05 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Will ceph_con_workfn continue to do the below if no flag is set on the connection?
write_partial_kvec
ceph_tcp_sen...
- 03:02 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- We hit the same panic:
=================
2129 [11093.424272] Call Trace:
2130 [11093.424438] [<ffffffffa05...
04/18/2018
- 03:38 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- @Ilya
I guess this is a bad thing but the monitors are run in containers, one on each machine.
From time to time we st...
- 03:07 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Yeah:
cancel_con calls cancel_delayed_work, but that can return while the work is still running. So, suppose a cal...
- 03:02 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- That is supposed to be protected by con->mutex. It is probably a bug in connection state handling code, I'll take a ...
- 02:00 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Ilya Dryomov wrote:
> [...]where ceph_tcp_sendmsg() got called with a NULL sock, meaning that con->sock was NULL a...
- 01:16 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- The "session lost" message is logged on all 3 machines.
It seems to start when a monitor goes down then up or is replace...
- 12:40 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Do you see "session lost" on every machine you mount cephfs on or on just some of them?
- 12:32 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- @Ilya Dryomov
The workload is very low; osd up/down seems to happen from time to time on read and write, but t...
- 12:19 PM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Ilya Dryomov wrote:
> That said, I managed to find what I hope is the correct kernel module and "ceph_msg_new+0x108e...
- 11:03 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- It looks like CoreOS buildbot compiler is doing some really weird inlining. The backtrace doesn't make any sense: ce...
- 09:27 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Can you describe what the workload is -- it looks like this is cephfs? Are those "osd up/down" and "session lost" me...
- 08:48 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- Maybe not, I had these errors on 3 machines,
but I noticed the error while checking the logs after an update, will let you ...
- 08:00 AM Bug #23706: NULL sock gets passed to ceph_tcp_sendmsg()
- ...
- 09:21 AM Bug #23272: switch port down ,cephfs kernel client lost session, blocked not recover ok util port...
- ...
- 08:32 AM Bug #23097 (Closed): Stale directories and files in CentOS (release <= 7.3 or kernel version < 3....
- There is no place to commit it because recent RHEL kernels also include a backport of the d_invalidate change.
04/13/2018
- 08:58 AM Bug #23706 (Resolved): NULL sock gets passed to ceph_tcp_sendmsg()
- Hello,
we noticed some server crash with this kind of logs:...
04/12/2018
- 04:13 PM Feature #12902 (In Progress): krbd: support object-map and fast-diff
- 04:13 PM Feature #23073 (Resolved): Allow set CEPH_OSD_REQUEST_TIMEOUT_DEFAULT on rbd map
- 04:00 PM Feature #23688 (Resolved): get_features with readonly=true for parent images
- Pass the optional read-only flag down to the 'get_features' rbd class method. For the parent image case, it would be...
04/11/2018
- 08:47 AM Bug #22702 (Resolved): cephfs crashed under high memory pressure due to reserved caps number mism...
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e30ee58121e34831b9665934d70dbc72ab0fe2f...
- 08:40 AM Feature #4770 (Resolved): krbd: consider including write data with layered existence check
- Done in 4.17.
- 08:39 AM Feature #3837 (Resolved): krbd: support format 2 striping
- Done in 4.17.
04/10/2018
- 12:40 PM Bug #23112: rbd kernel client might hang when write to a quota-full pool
- This isn't specific to the kernel client, I believe other ceph clients behave the same way.
04/09/2018
- 05:38 PM Bug #23537: libceph: monX xxxxxx session lost, hunting for new mon
- v12.2.2 includes the fix for #17664.
Do these messages appear right after you mount or later? Do they go away if ...
- 05:29 PM Bug #23537: libceph: monX xxxxxx session lost, hunting for new mon
- Марк Коренберг wrote:
> Important: on another machine with same OS everything is fine.
Another client machine whe...
04/03/2018
- 01:58 PM Bug #18130: soft lockups in ceph.ko
- Reassigning to Ilya since he's working on this.
04/01/2018
- 06:48 PM Bug #23537: libceph: monX xxxxxx session lost, hunting for new mon
- Important: on another machine with same OS everything is fine.
- 06:46 PM Bug #23537 (Resolved): libceph: monX xxxxxx session lost, hunting for new mon
- Maybe connected with #17664.
I use Luminous 12.2.2 on both client and cluster. Kernel at cephfs client: Linux mmwor...