Bug #16610
closedJewel: segfault in ObjectCacher::FlusherThread
0%
Description
2016-07-07 04:51:40.254481 7fd8debfd700 3 client.466620 ll_write 0x7fd8d8015720 27637744~48608 = 48608 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) 1: (()+0x297fe2) [0x7fd90c967fe2] 2: (()+0xf7e0) [0x7fd90b3a77e0] 3: (ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&)+0x36) [0x7fd90c918086] 4: (ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*)+0x22c) [0x7fd90c918a3c] 5: (ObjectCacher::flush(long)+0x1ef) [0x7fd90c918cef] 6: (ObjectCacher::flusher_entry()+0xac4) [0x7fd90c919a34] 7: (ObjectCacher::FlusherThread::entry()+0xd) [0x7fd90c925c6d] 8: (()+0x7aa1) [0x7fd90b39faa1] 9: (clone()+0x6d) [0x7fd90a61893d] 2016-07-07 04:51:40.261479 7fd8fabfd700 -1 *** Caught signal (Segmentation fault) ** in thread 7fd8fabfd700 thread_name:flusher
See also: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-July/011194.html
Log: https://dl.dropboxusercontent.com/u/2946024/nohup.out.2
Files
Updated by Patrick Donnelly almost 8 years ago
Log is now here: /ceph/post/i16610/client.log
Updated by Patrick Donnelly almost 8 years ago
Just as a quick update, we're waiting on some more information from Goncalo concerning the possibility of nodes running out of memory (causing the segfault).
Updated by Goncalo Borges almost 8 years ago
- File gdb.txt gdb.txt added
- File debug.out.gz debug.out.gz added
Just another update after further investigation and discussion in the mailing list.
1. I have tried to run the application giving origin to this segfault in different kinds of resources.
a./ The segfault is always appearing in a set of VMs with 16 GB of RAM and 4 cores.
b./ The segfault is not appearing in a set of VMs (in principle identical to the 16 GB ones) but with 16 cores and 64 GB of RAM.
c./ the segfault is not appearing in a physicall cluster with machines with 96 GB of RAM and 12 cores (24 because of hyperthreading)
2./ When I was running the application in situation 1c) I was seing ceph-fuse consuming more than 10gb of virtual memory. That made me think that the problem could be machines 1a) running out of memory. However, further discussion demonstrated that the virtual memory footprint is a consequence of the arenas mechanism enabled by glibc > 1.10. It is possible to tune the maximum number of arenas per cpu by setting MALLOC_ARENA_MAX env variables. If I set MALLOC_ARENA_MAX to 4 (the default value is 8), the virtual memory footprint decreases tremendously.
3./ After understanding point 2 it became obvious that the virtual memory footprint of ceph-fuse in scenario 1a) should be much less that the 10gb seen in situation 1c) because it scales with the number of cpus. Nevertheless, I did try to run ceph-fuse with MALLOC_ARENA_MAX = 4 and 2 but the segfaults continues to appear.
4./ At this point I do not have a clear idea of what is triggering the segfault. To allow further debugging, I am appending:
- The gdb.txt I get after running
$ gdb /path/to/ceph-fuse core.XXXX
(gdb) set pag off
(gdb) set log on
(gdb) thread apply all bt
(gdb) thread apply all bt full
- The debug.out (gzipped) I get after running ceph-fuse in debug mode with 'debug client 20' and 'debug objectcacher = 20'
Updated by Greg Farnum almost 8 years ago
- Category changed from 45 to Correctness/Safety
- Component(FS) Client, ceph-fuse, osdc added
Updated by Brad Hubbard almost 8 years ago
Alas, the gdb log does not give us much more to go on.
Thread 1 (Thread 0x7f891cdfa700 (LWP 5467)):
#0 0x00007f8928c3f6ab in raise () from /lib64/libpthread.so.0
#1 0x00007f892a2000a5 in handle_fatal_signal(int) () at global/signal_handler.cc:71
#2 <signal handler called>
#3 ObjectCacher::Object::get() () at osdc/ObjectCacher.h:359
#4 0x00007f892a1b0097 in ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&) () at osdc/ObjectCacher.cc:929
#5 0x00007f892a1b0a3c in ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*) () at osdc/ObjectCacher.cc:921
#6 0x00007f892a1b0cef in ObjectCacher::flush(long) () at osdc/ObjectCacher.cc:1142
#7 0x00007f892a1b1a34 in ObjectCacher::flusher_entry() () at osdc/ObjectCacher.cc:1741
#8 0x00007f892a1bdc6d in ObjectCacher::FlusherThread::entry() () at osdc/ObjectCacher.h:427
#9 0x00007f8928c37aa1 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f8927eb093d in clone () from /lib64/libc.so.6
The additional information you provided is interesting though. I wonder if this is indicative of some sort of memory allocation failure?
Updated by Zheng Yan almost 8 years ago
looks like ObjectCacher::bh_write_adjacencies() passed an empty list to ObjectCacher::bh_write_scattered(). Maybe the precision of VM's clock is low. Could you please try below patch
diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc index 90c8329..67d16de 100644 --- a/src/osdc/ObjectCacher.cc +++ b/src/osdc/ObjectCacher.cc @@ -1156,7 +1156,7 @@ void ObjectCacher::flush(loff_t amount) BufferHead *bh = static_cast<BufferHead*>( bh_lru_dirty.lru_get_next_expire()); if (!bh) break; - if (bh->last_write > cutoff) break; + if (bh->last_write >= cutoff) break; if (scattered_write) { bh_write_adjacencies(bh, cutoff, amount > 0 ? &left : NULL, NULL);
Updated by Goncalo Borges almost 8 years ago
Just to keep the full history in this issue, we have understood that the segfault only appears in VM with AMD62xx processors and does not appear in VMs with AMD63XX or Intel processor (details how we found this correlation are provided in the mailing list thread).
It also turns out that Zheng's patch is the solution.
To convince ourselves that the problem was really solved, we executed 40 jobs (with the user application where the ObjectCacher segfault was seen for the first time) in a dozen of AMD 62XX VMs, and none failed. Before, ceph-fuse was always segfaulting a couple of minutes after job startup.
Updated by Zheng Yan almost 8 years ago
- Status changed from New to Fix Under Review
Updated by Greg Farnum over 7 years ago
- Status changed from Fix Under Review to Pending Backport
This got merged to master forever ago. Guess it should get backported too.
Updated by Loïc Dachary over 7 years ago
- Copied to Backport #17131: jewel: Jewel: segfault in ObjectCacher::FlusherThread added
Updated by Nathan Cutler over 7 years ago
- Status changed from Pending Backport to Resolved