Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread - CephFS - Ceph

Actions

Copy link

Bug #16610

closed

Jewel: segfault in ObjectCacher::FlusherThread

Added by Patrick Donnelly almost 8 years ago. Updated over 7 years ago.

Status:

Resolved

Priority:

High

Assignee:

Category:

Correctness/Safety

Target version:

% Done:

Source:

Community (user)

Tags:

Backport:

jewel

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

Ceph - v10.2.2

ceph-qa-suite:

Component(FS):

Client, ceph-fuse, osdc

Labels (FS):

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

2016-07-07 04:51:40.254481 7fd8debfd700  3 client.466620 ll_write 0x7fd8d8015720 27637744~48608 = 48608
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x297fe2) [0x7fd90c967fe2]
 2: (()+0xf7e0) [0x7fd90b3a77e0]
 3: (ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&)+0x36) [0x7fd90c918086]
 4: (ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*)+0x22c) [0x7fd90c918a3c]
 5: (ObjectCacher::flush(long)+0x1ef) [0x7fd90c918cef]
 6: (ObjectCacher::flusher_entry()+0xac4) [0x7fd90c919a34]
 7: (ObjectCacher::FlusherThread::entry()+0xd) [0x7fd90c925c6d]
 8: (()+0x7aa1) [0x7fd90b39faa1]
 9: (clone()+0x6d) [0x7fd90a61893d]
2016-07-07 04:51:40.261479 7fd8fabfd700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fd8fabfd700 thread_name:flusher

See also: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-July/011194.html
Log: https://dl.dropboxusercontent.com/u/2946024/nohup.out.2

Files

Download all files

gdb.txt (204 KB) gdb.txt		Goncalo Borges, 07/12/2016 11:19 AM
debug.out.gz (656 KB) debug.out.gz		Goncalo Borges, 07/12/2016 11:25 AM

Related issues 1 (0 open — 1 closed)

Actions

Copy link

Updated by Patrick Donnelly almost 8 years ago

Log is now here: /ceph/post/i16610/client.log

Actions

Copy link

Updated by Patrick Donnelly almost 8 years ago

Just as a quick update, we're waiting on some more information from Goncalo concerning the possibility of nodes running out of memory (causing the segfault).

Actions

Copy link Download all files

Updated by Goncalo Borges almost 8 years ago

File gdb.txt gdb.txt added
File debug.out.gz debug.out.gz added

Just another update after further investigation and discussion in the mailing list.

1. I have tried to run the application giving origin to this segfault in different kinds of resources.
a./ The segfault is always appearing in a set of VMs with 16 GB of RAM and 4 cores.
b./ The segfault is not appearing in a set of VMs (in principle identical to the 16 GB ones) but with 16 cores and 64 GB of RAM.
c./ the segfault is not appearing in a physicall cluster with machines with 96 GB of RAM and 12 cores (24 because of hyperthreading)

2./ When I was running the application in situation 1c) I was seing ceph-fuse consuming more than 10gb of virtual memory. That made me think that the problem could be machines 1a) running out of memory. However, further discussion demonstrated that the virtual memory footprint is a consequence of the arenas mechanism enabled by glibc > 1.10. It is possible to tune the maximum number of arenas per cpu by setting MALLOC_ARENA_MAX env variables. If I set MALLOC_ARENA_MAX to 4 (the default value is 8), the virtual memory footprint decreases tremendously.

3./ After understanding point 2 it became obvious that the virtual memory footprint of ceph-fuse in scenario 1a) should be much less that the 10gb seen in situation 1c) because it scales with the number of cpus. Nevertheless, I did try to run ceph-fuse with MALLOC_ARENA_MAX = 4 and 2 but the segfaults continues to appear.

4./ At this point I do not have a clear idea of what is triggering the segfault. To allow further debugging, I am appending:

- The gdb.txt I get after running
$ gdb /path/to/ceph-fuse core.XXXX
(gdb) set pag off
(gdb) set log on
(gdb) thread apply all bt
(gdb) thread apply all bt full

- The debug.out (gzipped) I get after running ceph-fuse in debug mode with 'debug client 20' and 'debug objectcacher = 20'

Actions

Copy link

Updated by Greg Farnum almost 8 years ago

Category changed from 45 to Correctness/Safety
Component(FS) Client, ceph-fuse, osdc added

Actions

Copy link

Updated by Brad Hubbard almost 8 years ago

Alas, the gdb log does not give us much more to go on.

Thread 1 (Thread 0x7f891cdfa700 (LWP 5467)):
#0 0x00007f8928c3f6ab in raise () from /lib64/libpthread.so.0
#1 0x00007f892a2000a5 in handle_fatal_signal(int) () at global/signal_handler.cc:71
#2 <signal handler called>
#3 ObjectCacher::Object::get() () at osdc/ObjectCacher.h:359
#4 0x00007f892a1b0097 in ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&) () at osdc/ObjectCacher.cc:929
#5 0x00007f892a1b0a3c in ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*) () at osdc/ObjectCacher.cc:921
#6 0x00007f892a1b0cef in ObjectCacher::flush(long) () at osdc/ObjectCacher.cc:1142
#7 0x00007f892a1b1a34 in ObjectCacher::flusher_entry() () at osdc/ObjectCacher.cc:1741
#8 0x00007f892a1bdc6d in ObjectCacher::FlusherThread::entry() () at osdc/ObjectCacher.h:427
#9 0x00007f8928c37aa1 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f8927eb093d in clone () from /lib64/libc.so.6

The additional information you provided is interesting though. I wonder if this is indicative of some sort of memory allocation failure?

Actions

Copy link

Updated by Zheng Yan almost 8 years ago

looks like ObjectCacher::bh_write_adjacencies() passed an empty list to ObjectCacher::bh_write_scattered(). Maybe the precision of VM's clock is low. Could you please try below patch

diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc
index 90c8329..67d16de 100644
--- a/src/osdc/ObjectCacher.cc
+++ b/src/osdc/ObjectCacher.cc
@@ -1156,7 +1156,7 @@ void ObjectCacher::flush(loff_t amount)
     BufferHead *bh = static_cast<BufferHead*>(
       bh_lru_dirty.lru_get_next_expire());
     if (!bh) break;
-    if (bh->last_write > cutoff) break;
+    if (bh->last_write >= cutoff) break;

     if (scattered_write) {
       bh_write_adjacencies(bh, cutoff, amount > 0 ? &left : NULL, NULL);

Actions

Copy link

Updated by Goncalo Borges almost 8 years ago

Just to keep the full history in this issue, we have understood that the segfault only appears in VM with AMD62xx processors and does not appear in VMs with AMD63XX or Intel processor (details how we found this correlation are provided in the mailing list thread).

It also turns out that Zheng's patch is the solution.

To convince ourselves that the problem was really solved, we executed 40 jobs (with the user application where the ObjectCacher segfault was seen for the first time) in a dozen of AMD 62XX VMs, and none failed. Before, ceph-fuse was always segfaulting a couple of minutes after job startup.

Actions

Copy link