Project

General

Profile

Bug #16610

Jewel: segfault in ObjectCacher::FlusherThread

Added by Patrick Donnelly over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
Correctness/Safety
Target version:
-
Start date:
07/07/2016
Due date:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client, ceph-fuse, osdc
Labels (FS):
Pull request ID:

Description

2016-07-07 04:51:40.254481 7fd8debfd700  3 client.466620 ll_write 0x7fd8d8015720 27637744~48608 = 48608
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x297fe2) [0x7fd90c967fe2]
 2: (()+0xf7e0) [0x7fd90b3a77e0]
 3: (ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&)+0x36) [0x7fd90c918086]
 4: (ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*)+0x22c) [0x7fd90c918a3c]
 5: (ObjectCacher::flush(long)+0x1ef) [0x7fd90c918cef]
 6: (ObjectCacher::flusher_entry()+0xac4) [0x7fd90c919a34]
 7: (ObjectCacher::FlusherThread::entry()+0xd) [0x7fd90c925c6d]
 8: (()+0x7aa1) [0x7fd90b39faa1]
 9: (clone()+0x6d) [0x7fd90a61893d]
2016-07-07 04:51:40.261479 7fd8fabfd700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fd8fabfd700 thread_name:flusher

See also: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-July/011194.html
Log: https://dl.dropboxusercontent.com/u/2946024/nohup.out.2

gdb.txt View (204 KB) Goncalo Borges, 07/12/2016 11:19 AM

debug.out.gz (656 KB) Goncalo Borges, 07/12/2016 11:25 AM


Related issues

Copied to fs - Backport #17131: jewel: Jewel: segfault in ObjectCacher::FlusherThread Resolved

History

#1 Updated by Patrick Donnelly over 2 years ago

Log is now here: /ceph/post/i16610/client.log

#2 Updated by Patrick Donnelly over 2 years ago

Just as a quick update, we're waiting on some more information from Goncalo concerning the possibility of nodes running out of memory (causing the segfault).

#3 Updated by Goncalo Borges over 2 years ago

Just another update after further investigation and discussion in the mailing list.

1. I have tried to run the application giving origin to this segfault in different kinds of resources.
a./ The segfault is always appearing in a set of VMs with 16 GB of RAM and 4 cores.
b./ The segfault is not appearing in a set of VMs (in principle identical to the 16 GB ones) but with 16 cores and 64 GB of RAM.
c./ the segfault is not appearing in a physicall cluster with machines with 96 GB of RAM and 12 cores (24 because of hyperthreading)

2./ When I was running the application in situation 1c) I was seing ceph-fuse consuming more than 10gb of virtual memory. That made me think that the problem could be machines 1a) running out of memory. However, further discussion demonstrated that the virtual memory footprint is a consequence of the arenas mechanism enabled by glibc > 1.10. It is possible to tune the maximum number of arenas per cpu by setting MALLOC_ARENA_MAX env variables. If I set MALLOC_ARENA_MAX to 4 (the default value is 8), the virtual memory footprint decreases tremendously.

3./ After understanding point 2 it became obvious that the virtual memory footprint of ceph-fuse in scenario 1a) should be much less that the 10gb seen in situation 1c) because it scales with the number of cpus. Nevertheless, I did try to run ceph-fuse with MALLOC_ARENA_MAX = 4 and 2 but the segfaults continues to appear.

4./ At this point I do not have a clear idea of what is triggering the segfault. To allow further debugging, I am appending:

- The gdb.txt I get after running
$ gdb /path/to/ceph-fuse core.XXXX
(gdb) set pag off
(gdb) set log on
(gdb) thread apply all bt
(gdb) thread apply all bt full

- The debug.out (gzipped) I get after running ceph-fuse in debug mode with 'debug client 20' and 'debug objectcacher = 20'

#4 Updated by Greg Farnum over 2 years ago

  • Category changed from 45 to Correctness/Safety
  • Component(FS) Client, ceph-fuse, osdc added

#5 Updated by Brad Hubbard over 2 years ago

Alas, the gdb log does not give us much more to go on.

Thread 1 (Thread 0x7f891cdfa700 (LWP 5467)):
#0 0x00007f8928c3f6ab in raise () from /lib64/libpthread.so.0
#1 0x00007f892a2000a5 in handle_fatal_signal(int) () at global/signal_handler.cc:71
#2 <signal handler called>
#3 ObjectCacher::Object::get() () at osdc/ObjectCacher.h:359
#4 0x00007f892a1b0097 in ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*, std::allocator<ObjectCacher::BufferHead*> >&) () at osdc/ObjectCacher.cc:929
#5 0x00007f892a1b0a3c in ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*, int*) () at osdc/ObjectCacher.cc:921
#6 0x00007f892a1b0cef in ObjectCacher::flush(long) () at osdc/ObjectCacher.cc:1142
#7 0x00007f892a1b1a34 in ObjectCacher::flusher_entry() () at osdc/ObjectCacher.cc:1741
#8 0x00007f892a1bdc6d in ObjectCacher::FlusherThread::entry() () at osdc/ObjectCacher.h:427
#9 0x00007f8928c37aa1 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f8927eb093d in clone () from /lib64/libc.so.6

The additional information you provided is interesting though. I wonder if this is indicative of some sort of memory allocation failure?

#6 Updated by Zheng Yan over 2 years ago

looks like ObjectCacher::bh_write_adjacencies() passed an empty list to ObjectCacher::bh_write_scattered(). Maybe the precision of VM's clock is low. Could you please try below patch

diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc
index 90c8329..67d16de 100644
--- a/src/osdc/ObjectCacher.cc
+++ b/src/osdc/ObjectCacher.cc
@@ -1156,7 +1156,7 @@ void ObjectCacher::flush(loff_t amount)
     BufferHead *bh = static_cast<BufferHead*>(
       bh_lru_dirty.lru_get_next_expire());
     if (!bh) break;
-    if (bh->last_write > cutoff) break;
+    if (bh->last_write >= cutoff) break;

     if (scattered_write) {
       bh_write_adjacencies(bh, cutoff, amount > 0 ? &left : NULL, NULL);

#7 Updated by Goncalo Borges over 2 years ago

Just to keep the full history in this issue, we have understood that the segfault only appears in VM with AMD62xx processors and does not appear in VMs with AMD63XX or Intel processor (details how we found this correlation are provided in the mailing list thread).

It also turns out that Zheng's patch is the solution.

To convince ourselves that the problem was really solved, we executed 40 jobs (with the user application where the ObjectCacher segfault was seen for the first time) in a dozen of AMD 62XX VMs, and none failed. Before, ceph-fuse was always segfaulting a couple of minutes after job startup.

#8 Updated by Zheng Yan over 2 years ago

  • Status changed from New to Need Review

#10 Updated by Greg Farnum over 2 years ago

  • Status changed from Need Review to Pending Backport

This got merged to master forever ago. Guess it should get backported too.

#11 Updated by Loic Dachary over 2 years ago

  • Backport set to jewel

#12 Updated by Loic Dachary over 2 years ago

  • Copied to Backport #17131: jewel: Jewel: segfault in ObjectCacher::FlusherThread added

#13 Updated by Nathan Cutler over 2 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF