https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2019-10-12T03:00:50ZCeph Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1484902019-10-12T03:00:50ZYanhu Caogmayyyha@gmail.com
<ul></ul><p>fs/ceph/mds_client.c<br /><pre><code class="c syntaxhl"><span class="CodeRay"><span class="directive">static</span> <span class="directive">void</span> dispatch(<span class="keyword">struct</span> ceph_connection *con, <span class="keyword">struct</span> ceph_msg *msg)
{
<span class="keyword">struct</span> ceph_mds_session *s = con->private;
<span class="keyword">struct</span> ceph_mds_client *mdsc = s->s_mdsc;
<span class="predefined-type">int</span> type = le16_to_cpu(msg->hdr.type);
mutex_lock(&mdsc->mutex);
<span class="keyword">if</span> (__verify_registered_session(mdsc, s) < <span class="integer">0</span>) {
mutex_unlock(&mdsc->mutex);
<span class="keyword">goto</span> out; <== should <span class="keyword">return</span>? <span class="keyword">if</span> mds session does not exist. e.g. https:<span class="comment">//github.com/gmayyyha/ceph-client/blob/1c789249578895bb14ab62b4327306439b754857/net/ceph/mon_client.c#L1208</span>
}
mutex_unlock(&mdsc->mutex);
<span class="keyword">switch</span> (type) {
...
<span class="keyword">default</span>:
pr_err(<span class="string"><span class="delimiter">"</span><span class="content">received unknown message type %d %s</span><span class="char">\n</span><span class="delimiter">"</span></span>, type,
ceph_msg_type_name(type));
}
<span class="label">out:</span>
ceph_msg_put(msg);
}
</span></code></pre></p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1484912019-10-12T03:17:42ZYanhu Caogmayyyha@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/download/4481/libceph.patch">libceph.patch</a> <a class="icon-only icon-magnifier" title="View" href="/attachments/4481/libceph.patch">View</a> added</li></ul> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485012019-10-12T21:40:53ZPatrick Donnellypdonnell@redhat.com
<ul><li><strong>Project</strong> changed from <i>CephFS</i> to <i>Linux kernel client</i></li><li><strong>Subject</strong> changed from <i>cephfs kernel: kernel BUG at mm/slub.c:3901!</i> to <i>ceph:: kernel BUG at mm/slub.c:3901!</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>Fix Under Review</i></li><li><strong>Start date</strong> deleted (<del><i>10/12/2019</i></del>)</li></ul> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485022019-10-12T21:56:54ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Looks like slabcache corruption at first glance, or maybe a bogus pointer got passed to kfree. 4.19.15 is pretty old at this point, do you see this on newer v4.19 kernels too?</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485232019-10-14T01:34:44ZYanhu Caogmayyyha@gmail.com
<ul></ul><p>Jeff Layton wrote:</p>
<blockquote>
<p>Looks like slabcache corruption at first glance, or maybe a bogus pointer got passed to kfree. 4.19.15 is pretty old at this point, do you see this on newer v4.19 kernels too?</p>
</blockquote>
<p>The newer v4.19.79 kernel, same as the old v4.19.15.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485382019-10-14T10:07:38ZJeff Laytonjlayton@redhat.com
<ul></ul><p>(cc'ing Ilya since he knows this code better)</p>
<p>I don't know the mon code as well, but the monitor dispatch function looks leaky to me too. process_message takes in_msg from the con, and in the case where monc->private is NULL, it just drops that msg onto the floor without releasing the reference. Maybe there is something that mitigates that but I'm not seeing it right offhand.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485392019-10-14T10:31:55ZIlya Dryomov
<ul></ul><p>I think you meant con->private. That check is bogus, con->private is set when ceph_connection is initialized and is never cleared. I'll remove it.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485412019-10-14T11:03:57ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Ilya Dryomov wrote:</p>
<blockquote>
<p>I think you meant con->private. That check is bogus, con->private is set when ceph_connection is initialized and is never cleared. I'll remove it.</p>
</blockquote>
<p>Yes, I meant con->private. Great, that was my take too after going over the code.</p>
<p>So to summarize, I think the patch proposed to fix this is not right. We need to put the reference to the message so it doesn't leak. If there's a bogus pointer in the message that's causing this, then we probably need to deal with it another way.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485422019-10-14T11:31:31ZYanhu Caogmayyyha@gmail.com
<ul></ul><pre><code class="c syntaxhl"><span class="CodeRay">ceph_msg_release
-> ceph_msg_data_destroy <= foreach
-> ceph_pagelist_release <= data->type == CEPH_MSG_DATA_PAGELIST
-> kfree <== crash here ?
-> kvfree <= !m->pool
-> kfree <== crash here ?
</span></code></pre>
<p>I only have vmcore-dmesg.txt, so I'm not pretty sure which <strong>kfree</strong> caused the crash.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485642019-10-14T12:47:02ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Do you have a way to reproduce this at will?</p>
<p>If you have the debug symbols, you may be able to open the module with gdb and determine where it was:</p>
<pre>
$ gdb path/to/libceph.ko
...
gdb> list *(ceph_msg_release+0x15d)
</pre>
<p>That should show you the next instruction after the kfree returns, and you can use that to infer which one it was. That said this relies on having debug symbols available and the same sources, so ymmv.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485652019-10-14T12:47:19ZJeff Laytonjlayton@redhat.com
<ul><li><strong>Assignee</strong> set to <i>Jeff Layton</i></li></ul> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485812019-10-14T13:46:23ZYanhu Caogmayyyha@gmail.com
<ul></ul><p>Only wait for the production environment to reappear.</p>
<pre><code class="c syntaxhl"><span class="CodeRay">Reading symbols from /export/linux-<span class="float">4</span><span class="float">.19</span><span class="float">.15</span>/net/ceph/libceph.ko...done.
(gdb) list *(ceph_msg_release+<span class="hex">0x15d</span>)
<span class="hex">0x348d</span> is in ceph_msg_release (net/ceph/messenger.c:<span class="integer">3534</span>).
<span class="integer">3529</span>
<span class="integer">3530</span> <span class="keyword">if</span> (m->pool)
<span class="integer">3531</span> ceph_msgpool_put(m->pool, m);
<span class="integer">3532</span> <span class="keyword">else</span>
<span class="integer">3533</span> ceph_msg_free(m); <== here
<span class="integer">3534</span> }
<span class="integer">3535</span>
</span></code></pre> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1485972019-10-14T14:51:35ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Do you have a coredump and a way to analyze it?</p>
<p>If so, then it'd be good to track down the msg and see what you can determine about the pointer that's being freed.</p>
<p>This may be unrelated to MDS traffic as well. It'd be good to figure out what sort of socket the thread was servicing. If you can identify the ceph_connection in some of the earlier stack frames, then you may be able to peek at it and determine the socket address of the peer.</p>
<p>It also wouldn't hurt to try a newer kernel on the off chance that this may already be fixed. I'm not aware of anything specific that might address this though.</p>
<p>So in ceph_msg_free:</p>
<pre>
static void ceph_msg_free(struct ceph_msg *m)
{
dout("%s %p\n", __func__, m);
kvfree(m->front.iov_base);
kfree(m->data);
kmem_cache_free(ceph_msg_cache, m);
}
</pre>
<p>"m" comes from a fixed-size slabcache, so the crash probably happened before it got freed. It'd be interesting to see what "m" looks like at this point.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1486372019-10-15T03:04:58ZYanhu Caogmayyyha@gmail.com
<ul></ul><p>No coredump file.</p>
<pre><code class="c syntaxhl"><span class="CodeRay">Reading symbols from /export/linux-<span class="float">4</span><span class="float">.19</span><span class="float">.15</span>/fs/ceph/ceph.ko...done.
(gdb) list *(dispatch+<span class="hex">0x66</span>)
<span class="hex">0x29266</span> is in dispatch (fs/ceph/mds_client.c:<span class="integer">4149</span>).
<span class="integer">4144</span> pr_err(<span class="string"><span class="delimiter">"</span><span class="content">received unknown message type %d %s</span><span class="char">\n</span><span class="delimiter">"</span></span>, type,
<span class="integer">4145</span> ceph_msg_type_name(type));
<span class="integer">4146</span> }
<span class="integer">4147</span> out:
<span class="integer">4148</span> ceph_msg_put(msg);
<span class="integer">4149</span> }
</span></code></pre>
<p>It should be related to MDS according to the next instruction returned by ceph_msg_free, and there is only mds_client.c: dispatch in the CEPH module.</p>
<pre><code class="c syntaxhl"><span class="CodeRay"><span class="keyword">struct</span> ceph_msg *ceph_msg_new(<span class="predefined-type">int</span> type, <span class="predefined-type">int</span> front_len, gfp_t flags,
<span class="predefined-type">bool</span> can_fail)
{
<span class="keyword">struct</span> ceph_msg *m;
m = kmem_cache_zalloc(ceph_msg_cache, flags);
...
<span class="comment">/* front */</span>
<span class="keyword">if</span> (front_len) {
m->front.iov_base = ceph_kvmalloc(front_len, flags);
<span class="keyword">if</span> (m->front.iov_base == <span class="predefined-constant">NULL</span>) {
dout(<span class="string"><span class="delimiter">"</span><span class="content">ceph_msg_new can't allocate %d bytes</span><span class="char">\n</span><span class="delimiter">"</span></span>,
front_len);
<span class="keyword">goto</span> out2;
}
} <span class="keyword">else</span> {
m->front.iov_base = <span class="predefined-constant">NULL</span>; <== Can be judged whether it is <span class="predefined-constant">NULL</span> when kvfree(m->front.iov_base)?
}
...
}
</span></code></pre> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1486622019-10-15T08:17:52ZYanhu Caogmayyyha@gmail.com
<ul></ul><pre><code class="c syntaxhl"><span class="CodeRay"><span class="directive">void</span> *ceph_kvmalloc(size_t size, gfp_t flags)
{
<span class="keyword">if</span> (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
<span class="directive">void</span> *ptr = kmalloc(size, flags | __GFP_NOWARN);
<span class="keyword">if</span> (ptr)
<span class="keyword">return</span> ptr;
}
<span class="keyword">return</span> __vmalloc(size, flags | __GFP_HIGHMEM, PAGE_KERNEL); <== Why use __GFP_HIGHMEM?
}
</span></code></pre>
<p>The <strong>__GFP_HIGHMEM</strong> flag indicates that "the allocated memory may be located in high memory". High Memory is the part of physical memory in a computer which is not directly mapped by the page tables of its operating system kernel.</p>
<p>This zone of memory is not mapped in the kernel's virtual address space, and this prevents the kernel from being capable of directly referring it.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1486632019-10-15T08:27:09ZYanhu Caogmayyyha@gmail.com
<ul></ul><p>Yanhu Cao wrote:</p>
<blockquote>
<p>[...]</p>
<p>The <strong>__GFP_HIGHMEM</strong> flag indicates that "the allocated memory may be located in high memory". High Memory is the part of physical memory in a computer which is not directly mapped by the page tables of its operating system kernel.</p>
<p>This zone of memory is not mapped in the kernel's virtual address space, and this prevents the kernel from being capable of directly referring it.</p>
</blockquote>
<p>Does not match the kernel version used. Ignore this.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1486882019-10-15T10:36:05ZJeff Laytonjlayton@redhat.com
<ul></ul><p>Yanhu Cao wrote:</p>
<blockquote>
<p>No coredump file.</p>
</blockquote>
<p>Too bad. That would help expedite this.</p>
<blockquote>
<p>It should be related to MDS according to the next instruction returned by ceph_msg_free, and there is only mds_client.c: dispatch in the CEPH module.</p>
</blockquote>
<p>Quite right. I missed that that address was in ceph.ko...and yes, x86_64 doesn't really have high memory since it's a 64-bit arch.</p>
<p>I suspect there are two possibilities here:</p>
<ol>
<li>some pointer in the msg structure has been double-freed</li>
<li>some pointer in the msg structure has been scribbled over</li>
</ol>
<p>Determining that without a core will be difficult. Are you building your own kernels? One possibility is to turn on KASAN. It will impact performance, but it may help detect what's going wrong earlie, and provide some clues about what it is.</p> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1492632019-10-22T13:13:33ZIlya Dryomov
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Need More Info</i></li></ul> Linux kernel client - Bug #42288: ceph:: kernel BUG at mm/slub.c:3901!https://tracker.ceph.com/issues/42288?journal_id=1631122020-04-13T13:30:47ZJeff Laytonjlayton@redhat.com
<ul><li><strong>Status</strong> changed from <i>Need More Info</i> to <i>Rejected</i></li></ul><p>No response in 6 months. Please reopen if this is still an issue.</p>