Bug #38333

closed

mon crash in AuthMonitor::Incremental::encode buffer code

Added by Sage Weil about 5 years ago. Updated about 5 years ago.

Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

(gdb) bt
#0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x0000561b4dfec8f3 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:81
#2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:298
#3  <signal handler called>
#4  std::__atomic_base<unsigned int>::fetch_add (__m=std::memory_order_seq_cst, __i=1, this=0x635f454c47474f80) at /usr/include/c++/7/bits/atomic_base.h:514
#5  std::__atomic_base<unsigned int>::operator++ (this=0x635f454c47474f80) at /usr/include/c++/7/bits/atomic_base.h:280
#6  ceph::buffer::ptr::ptr (this=this@entry=0x561b54e9be08, p=...) at ./src/common/buffer.cc:462
#7  0x00007fbe7be3e1b2 in ceph::buffer::ptr_node::ptr_node (this=0x561b54e9be00) at ./src/include/buffer.h:435
#8  ceph::buffer::ptr_node::create<ceph::buffer::ptr_node const&> () at ./src/include/buffer.h:426
#9  ceph::buffer::list::append (this=this@entry=0x7fbe715b3aa0, bl=...) at ./src/common/buffer.cc:1531
#10 0x0000561b4de31a1a in ceph::encode (bl=..., s=...) at ./src/include/encoding.h:254
#11 AuthMonitor::Incremental::encode (features=<optimized out>, bl=..., this=0x561b51960f00) at ./src/mon/AuthMonitor.h:58
#12 AuthMonitor::encode_pending (this=0x561b50b43200, t=std::shared_ptr<MonitorDBStore::Transaction> (use count 3, weak count 0) = {...}) at ./src/mon/AuthMonitor.cc:361
#13 0x0000561b4dede84f in PaxosService::propose_pending (this=this@entry=0x561b50b43200) at ./src/mon/PaxosService.cc:213
#14 0x0000561b4de313e4 in AuthMonitor::tick (this=0x561b50b43200) at ./src/mon/AuthMonitor.cc:98
#15 0x0000561b4ddcbfd0 in Monitor::tick (this=0x561b51b9c000) at ./src/mon/Monitor.cc:5612
#16 0x0000561b4dda3a49 in boost::function1<void, int>::operator() (a0=<optimized out>, this=<optimized out>) at ./obj-x86_64-linux-gnu/boost/include/boost/function/function_template.hpp:768
#17 FunctionContext::finish (r=<optimized out>, this=<optimized out>) at ./src/include/Context.h:487
#18 C_MonContext::finish (this=<optimized out>, r=<optimized out>) at ./src/mon/Monitor.cc:128
#19 0x0000561b4dde5619 in Context::complete (this=0x561b533e2240, r=<optimized out>) at ./src/include/Context.h:77
#20 0x00007fbe7bb81b00 in SafeTimer::timer_thread (this=0x561b51b9c208) at ./src/common/Timer.cc:97
#21 0x00007fbe7bb8332d in SafeTimerThread::entry (this=<optimized out>) at ./src/common/Timer.cc:30
#22 0x00007fbe7a95c6db in start_thread (arg=0x7fbe715b7700) at pthread_create.c:463
#23 0x00007fbe79b3d88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

   -10> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).osd e61 try_prune_purged_snaps actually pruned 0
    -9> 2019-02-15 04:32:14.919 7fbe715b7700  5 mon.a@0(leader).paxos(paxos active c 503..1058) is_readable = 1 - now=2019-02-15 04:32:14.923252 lease_expire=2019-02-15 04:32:19.898845 has v0 lc 1058
    -8> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).osd e61  min_last_epoch_clean 0
    -7> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).log v509 log
    -6> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).paxosservice(logm 1..509) maybe_trim trim_to 9 would only trim 8 < paxos_service_trim_min 250
    -5> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).auth v2 auth
    -4> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).auth v2 increasing max_global_id to 14096
    -3> 2019-02-15 04:32:14.919 7fbe715b7700 10 cephx keyserver: _check_rotating_secrets
    -2> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).paxosservice(auth 1..2) propose_pending
    -1> 2019-02-15 04:32:14.919 7fbe715b7700 10 mon.a@0(leader).auth v2 encode_pending v 3
     0> 2019-02-15 04:32:14.923 7fbe715b7700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fbe715b7700 thread_name:safe_timer

 ceph version 14.0.1-3676-gf97aa4b (f97aa4b0d79bd8576be0d502211c76da33f22b15) nautilus (dev)
 1: (()+0x12890) [0x7fbe7a967890]
 2: (ceph::buffer::ptr::ptr(ceph::buffer::ptr const&)+0x17) [0x7fbe7be3bdf7]
 3: (ceph::buffer::list::append(ceph::buffer::list const&)+0x42) [0x7fbe7be3e1b2]
 4: (AuthMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)+0x14a) [0x561b4de31a1a]
 5: (PaxosService::propose_pending()+0x13f) [0x561b4dede84f]
 6: (AuthMonitor::tick()+0x154) [0x561b4de313e4]
 7: (Monitor::tick()+0xa0) [0x561b4ddcbfd0]
 8: (C_MonContext::finish(int)+0x39) [0x561b4dda3a49]
 9: (Context::complete(int)+0x9) [0x561b4dde5619]
 10: (SafeTimer::timer_thread()+0x190) [0x7fbe7bb81b00]
 11: (SafeTimerThread::entry()+0xd) [0x7fbe7bb8332d]
 12: (()+0x76db) [0x7fbe7a95c6db]
 13: (clone()+0x3f) [0x7fbe79b3d88f]

/a/sage-mgr-dashboard-gil-cleanup-2/3592648

The core and log are in the above directory.
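
A note for readers of the backtrace: frames #4 and #5 report `this=0x635f454c47474f80` for the atomic reference count being incremented. That value is not a canonical x86_64 user-space address, which is consistent with the segfault. The short Python sketch below (not part of the original report, and the variable names are only illustrative) decodes the value in little-endian byte order to see what the bytes look like in memory. It is a reading of the backtrace only, not a confirmed root cause.

```python
import struct

# `this` value reported in frames #4 and #5 of the gdb backtrace above.
this_ptr = 0x635f454c47474f80

# x86_64 is little-endian, so this is the byte order the value had in memory.
raw = struct.pack("<Q", this_ptr)

# Render printable ASCII bytes as characters and everything else as '.'.
text = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

print(raw.hex(" "))  # 80 4f 47 47 4c 45 5f 63
print(text)          # .OGGLE_c
```

Seven of the eight bytes are printable ASCII, so the pointer may have been overwritten with string data rather than ever holding a heap address. That would fit a memory-corruption explanation, but the core file in the directory above would be needed to confirm it.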


Related issues: 1 (0 open, 1 closed)

Is duplicate of RADOS - Bug #38372: segfault in "AuthMonitor::increase_max_global_id()" (Resolved, Sage Weil, 02/18/2019)

Actions #1

Updated by Greg Farnum about 5 years ago

Is it possible this is a result of some of the buffer list stuff we know was broken?

Actions #2

Updated by Sage Weil about 5 years ago

  • Status changed from 12 to Duplicate

Oh, I bet it was the same auth thing I just fixed: #38372

Actions #3

Updated by Sage Weil about 5 years ago

  • Is duplicate of Bug #38372: segfault in "AuthMonitor::increase_max_global_id()" added