Bug #5072

mon: segfault on leveldb::Table::Open() during monitor start

Added by Joao Eduardo Luis almost 11 years ago. Updated almost 11 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Jim Schutt is hitting this segfault on monitor start after triggering #4999:

2013-05-10 11:04:02.077643 7ffff7fbe780  0 ceph version 0.61.1-2-gc33273a (c33273ab4c8602f1e1e9b1b14ac84bf5ccffbd32), process ceph-mon, pid 222926
2013-05-10 11:04:02.084842 7ffff7fbe780 10 needs_conversion
2013-05-10 11:04:02.465975 7ffff428f700 -1 *** Caught signal (Segmentation fault) **
 in thread 7ffff428f700

 ceph version 0.61.1-2-gc33273a (c33273ab4c8602f1e1e9b1b14ac84bf5ccffbd32)
 1: /usr/bin/ceph-mon() [0x5892d9]
 2: (()+0xf4a0) [0x7ffff76584a0]
 3: (leveldb::Table::Open(leveldb::Options const&, leveldb::RandomAccessFile*, unsigned long, leveldb::Table**)+0x35d) [0x7ffff6fb79ad]
 4: (leveldb::TableCache::FindTable(unsigned long, unsigned long, leveldb::Cache::Handle**)+0x1db) [0x7ffff6fa7d8b]
 5: (leveldb::TableCache::NewIterator(leveldb::ReadOptions const&, unsigned long, unsigned long, leveldb::Table**)+0x46) [0x7ffff6fa7ee6]
 6: (leveldb::DBImpl::FinishCompactionOutputFile(leveldb::DBImpl::CompactionState*, leveldb::Iterator*)+0x1d9) [0x7ffff6f98c09]
 7: (leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*)+0x79b) [0x7ffff6f9cf2b]
 8: (leveldb::DBImpl::BackgroundCompaction()+0x251) [0x7ffff6f9d531]
 9: (leveldb::DBImpl::BackgroundCall()+0x90) [0x7ffff6f9dc50]
 10: (()+0x3dc6f) [0x7ffff6fbcc6f]
 11: (()+0x77f1) [0x7ffff76507f1]
 12: (clone()+0x6d) [0x7ffff6a50ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
   -21> 2013-05-10 11:04:02.070118 7ffff7fbe780  5 asok(0x1240000) register_command perfcounters_dump hook 0x1200010
   -20> 2013-05-10 11:04:02.070202 7ffff7fbe780  5 asok(0x1240000) register_command 1 hook 0x1200010
   -19> 2013-05-10 11:04:02.070206 7ffff7fbe780  5 asok(0x1240000) register_command perf dump hook 0x1200010
   -18> 2013-05-10 11:04:02.070216 7ffff7fbe780  5 asok(0x1240000) register_command perfcounters_schema hook 0x1200010
   -17> 2013-05-10 11:04:02.070226 7ffff7fbe780  5 asok(0x1240000) register_command 2 hook 0x1200010
   -16> 2013-05-10 11:04:02.070228 7ffff7fbe780  5 asok(0x1240000) register_command perf schema hook 0x1200010
   -15> 2013-05-10 11:04:02.070238 7ffff7fbe780  5 asok(0x1240000) register_command config show hook 0x1200010
   -14> 2013-05-10 11:04:02.070242 7ffff7fbe780  5 asok(0x1240000) register_command config set hook 0x1200010
   -13> 2013-05-10 11:04:02.070296 7ffff7fbe780  5 asok(0x1240000) register_command log flush hook 0x1200010
   -12> 2013-05-10 11:04:02.070299 7ffff7fbe780  5 asok(0x1240000) register_command log dump hook 0x1200010
   -11> 2013-05-10 11:04:02.070312 7ffff7fbe780  5 asok(0x1240000) register_command log reopen hook 0x1200010
   -10> 2013-05-10 11:04:02.077643 7ffff7fbe780  0 ceph version 0.61.1-2-gc33273a (c33273ab4c8602f1e1e9b1b14ac84bf5ccffbd32), process ceph-mon, pid 222926
    -9> 2013-05-10 11:04:02.078759 7ffff7fbe780  1 finished global_init_daemonize
    -8> 2013-05-10 11:04:02.084675 7ffff7fbe780  5 asok(0x1240000) init /var/run/ceph/ceph-mon.cs30.asok
    -7> 2013-05-10 11:04:02.084703 7ffff7fbe780  5 asok(0x1240000) bind_and_listen /var/run/ceph/ceph-mon.cs30.asok
    -6> 2013-05-10 11:04:02.084766 7ffff7fbe780  5 asok(0x1240000) register_command 0 hook 0x11f80b8
    -5> 2013-05-10 11:04:02.084775 7ffff7fbe780  5 asok(0x1240000) register_command version hook 0x11f80b8
    -4> 2013-05-10 11:04:02.084780 7ffff7fbe780  5 asok(0x1240000) register_command git_version hook 0x11f80b8
    -3> 2013-05-10 11:04:02.084783 7ffff7fbe780  5 asok(0x1240000) register_command help hook 0x12000b0
    -2> 2013-05-10 11:04:02.084815 7ffff4a90700  5 asok(0x1240000) entry start
    -1> 2013-05-10 11:04:02.084842 7ffff7fbe780 10 needs_conversion
     0> 2013-05-10 11:04:02.465975 7ffff428f700 -1 *** Caught signal (Segmentation fault) **
 in thread 7ffff428f700

 ceph version 0.61.1-2-gc33273a (c33273ab4c8602f1e1e9b1b14ac84bf5ccffbd32)
 1: /usr/bin/ceph-mon() [0x5892d9]
 2: (()+0xf4a0) [0x7ffff76584a0]
 3: (leveldb::Table::Open(leveldb::Options const&, leveldb::RandomAccessFile*, unsigned long, leveldb::Table**)+0x35d) [0x7ffff6fb79ad]
 4: (leveldb::TableCache::FindTable(unsigned long, unsigned long, leveldb::Cache::Handle**)+0x1db) [0x7ffff6fa7d8b]
 5: (leveldb::TableCache::NewIterator(leveldb::ReadOptions const&, unsigned long, unsigned long, leveldb::Table**)+0x46) [0x7ffff6fa7ee6]
 6: (leveldb::DBImpl::FinishCompactionOutputFile(leveldb::DBImpl::CompactionState*, leveldb::Iterator*)+0x1d9) [0x7ffff6f98c09]
 7: (leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*)+0x79b) [0x7ffff6f9cf2b]
 8: (leveldb::DBImpl::BackgroundCompaction()+0x251) [0x7ffff6f9d531]
 9: (leveldb::DBImpl::BackgroundCall()+0x90) [0x7ffff6f9dc50]
 10: (()+0x3dc6f) [0x7ffff6fbcc6f]
 11: (()+0x77f1) [0x7ffff76507f1]
 12: (clone()+0x6d) [0x7ffff6a50ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

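After being asked for gdb output, this is what Jim was able to provide. Debuginfo for leveldb was not installed, so the leveldb frames have no source information and the `l` listings show pt-raise.c instead: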

Core was generated by `/usr/bin/ceph-mon -i cs30 --pid-file /var/run/ceph/mon.cs30.pid -c /etc/ceph/ce'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ffff765836b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42

warning: Source file is more recent than executable.
42                 sig);
Missing separate debuginfos, use: debuginfo-install gperftools-libs-2.0-3.el6.2.x86_64 leveldb-1.7.0-2.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 libstdc++-4.4.6-3.el6.x86_64 libuuid-2.17.2-12.4.el6.x86_64 nspr-4.9-1.el6.x86_64 nss-3.13.3-8.el6.x86_64 nss-softokn-3.12.9-11.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.13.3-2.el6.x86_64 snappy-1.0.4-3.el6.x86_64 sqlite-3.6.20-1.el6.x86_64
(gdb) bt
#0  0x00007ffff765836b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x000000000058aea7 in reraise_fatal (signum=11) at global/signal_handler.cc:58
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:104
#3  <signal handler called>
#4  0x00007ffff6fb79ad in leveldb::Table::Open(leveldb::Options const&, leveldb::RandomAccessFile*, unsigned long, leveldb::Table**) () from /usr/lib64/libleveldb.so.1
#5  0x00007ffff6fa7d8b in leveldb::TableCache::FindTable(unsigned long, unsigned long, leveldb::Cache::Handle**) () from /usr/lib64/libleveldb.so.1
#6  0x00007ffff6fa7ee6 in leveldb::TableCache::NewIterator(leveldb::ReadOptions const&, unsigned long, unsigned long, leveldb::Table**) () from /usr/lib64/libleveldb.so.1
#7  0x00007ffff6f98c09 in leveldb::DBImpl::FinishCompactionOutputFile(leveldb::DBImpl::CompactionState*, leveldb::Iterator*) () from /usr/lib64/libleveldb.so.1
#8  0x00007ffff6f9cf2b in leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*) () from /usr/lib64/libleveldb.so.1
#9  0x00007ffff6f9d531 in leveldb::DBImpl::BackgroundCompaction() () from /usr/lib64/libleveldb.so.1
#10 0x00007ffff6f9dc50 in leveldb::DBImpl::BackgroundCall() () from /usr/lib64/libleveldb.so.1
#11 0x00007ffff6fbcc6f in ?? () from /usr/lib64/libleveldb.so.1
#12 0x00007ffff76507f1 in start_thread (arg=0x7ffff3ce1700) at pthread_create.c:301
#13 0x00007ffff6a50ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 4
#4  0x00007ffff6fb79ad in leveldb::Table::Open(leveldb::Options const&, leveldb::RandomAccessFile*, unsigned long, leveldb::Table**) () from /usr/lib64/libleveldb.so.1
(gdb) l
37        pid = -pid;
38    #endif
39    
40    #if __ASSUME_TGKILL
41      return INLINE_SYSCALL (tgkill, 3, pid, THREAD_GETMEM (THREAD_SELF, tid),
42                 sig);
43    #else
44    # ifdef __NR_tgkill
45      int res = INLINE_SYSCALL (tgkill, 3, pid, THREAD_GETMEM (THREAD_SELF, tid),
46                    sig);
(gdb) f 5
#5  0x00007ffff6fa7d8b in leveldb::TableCache::FindTable(unsigned long, unsigned long, leveldb::Cache::Handle**) () from /usr/lib64/libleveldb.so.1
(gdb) l
47      if (res != -1 || errno != ENOSYS)
48        return res;
49    # endif
50      return INLINE_SYSCALL (tkill, 2, THREAD_GETMEM (THREAD_SELF, tid), sig);
51    #endif
52    }
(gdb) f 6
#6  0x00007ffff6fa7ee6 in leveldb::TableCache::NewIterator(leveldb::ReadOptions const&, unsigned long, unsigned long, leveldb::Table**) () from /usr/lib64/libleveldb.so.1
(gdb) l
Line number 53 out of range; ../nptl/sysdeps/unix/sysv/linux/pt-raise.c has 52 lines.
(gdb) f 7
#7  0x00007ffff6f98c09 in leveldb::DBImpl::FinishCompactionOutputFile(leveldb::DBImpl::CompactionState*, leveldb::Iterator*) () from /usr/lib64/libleveldb.so.1
(gdb) l
Line number 53 out of range; ../nptl/sysdeps/unix/sysv/linux/pt-raise.c has 52 lines.
(gdb) quit
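
Because the leveldb frames in the traces above carry no source line information, the `symbol+offset [address]` pairs are the main usable data. The following is a minimal Python sketch, not part of Ceph, that extracts those pairs and derives each function's runtime start address (address minus offset). The helper name `parse_frames` and the regular expression are illustrative assumptions based only on the line layout shown in this report.

```python
import re

# Matches lines such as " 3: (symbol+0x35d) [0x7ffff6fb79ad]" from the trace above.
# Assumption: hex digits are lowercase, as they are in this report.
FRAME_RE = re.compile(r"^\s*(\d+): \((.*)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]\s*$")

def parse_frames(text):
    """Return (frame, symbol, offset, address, function_start) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # e.g. frame 1, which has no symbol+offset part
        num, symbol = int(m.group(1)), m.group(2)
        offset, addr = int(m.group(3), 16), int(m.group(4), 16)
        frames.append((num, symbol, offset, addr, addr - offset))
    return frames

# Three frames copied from the trace in this report.
sample = """\
 1: /usr/bin/ceph-mon() [0x5892d9]
 3: (leveldb::Table::Open(leveldb::Options const&, leveldb::RandomAccessFile*, unsigned long, leveldb::Table**)+0x35d) [0x7ffff6fb79ad]
 9: (leveldb::DBImpl::BackgroundCall()+0x90) [0x7ffff6f9dc50]
"""

for num, symbol, offset, addr, start in parse_frames(sample):
    # Strip the argument list so only the qualified function name is printed.
    print(f"#{num} {symbol.split('(')[0]} +{offset:#x} starts at {start:#x}")
```

The offset within the function (0x35d for `leveldb::Table::Open`) is the value to look for in a disassembly of the matching libleveldb build, as the NOTE line in the trace suggests.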

Related issues

Related to Ceph - Bug #4999: monitor sync failure (Can't reproduce, 05/10/2013)

History

#1 Updated by Joao Eduardo Luis almost 11 years ago

  • Description updated (diff)

#2 Updated by Sage Weil almost 11 years ago

  • Status changed from New to Can't reproduce
