Bug #20659
MDSMonitor: assertion failure if two mds report same health warning
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
(gdb) bt
#0 0x00007f782719323b in raise () from /lib64/libpthread.so.0
#1 0x0000003226a47d36 in reraise_fatal (signum=6) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/global/signal_handler.cc:74
#2 handle_fatal_signal (signum=6) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/global/signal_handler.cc:138
#3 <signal handler called>
#4 0x00007f78244631d7 in raise () from /lib64/libc.so.6
#5 0x00007f78244648c8 in abort () from /lib64/libc.so.6
#6 0x00000032267ba274 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x3226c2060a "checks.count(code) == 0", file=file@entry=0x3226c20dd0 "/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.0-990-gb36c57d/rpm/el7/BUILD/ceph-12.1.0-990"..., line=line@entry=97, func=func@entry=0x3226c44f80 <_ZZN18health_check_map_t3addERKSs15health_status_tS1_E19__PRETTY_FUNCTION__> "health_check_t& health_check_map_t::add(const string&, health_status_t, const string&)") at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/common/assert.cc:66
#7 0x000000322674e929 in add (summary="%num% MDSs report slow requests", severity=HEALTH_WARN, code="MDS_SLOW_REQUEST", this=0x7f7820bcb170) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/health_check.h:97
#8 MDSMonitor::encode_pending (this=0x32308ec800, t=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<MonitorDBStore::Transaction*, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<MonitorDBStore::Transaction*, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr (count 3, weak 0) 0x3230b70400) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/MDSMonitor.cc:209
#9 0x00000032266b52dd in PaxosService::propose_pending (this=0x32308ec800) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/PaxosService.cc:213
#10 0x000000322656b797 in operator() (a0=<optimized out>, this=<optimized out>) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/build/boost/include/boost/function/function_template.hpp:771
#11 finish (r=<optimized out>, this=<optimized out>) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/include/Context.h:493
#12 C_MonContext::finish (this=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/Monitor.cc:132
#13 0x00000032265a7129 in Context::complete (this=0x32313a44f0, r=<optimized out>) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/include/Context.h:70
#14 0x00000032267b69a4 in SafeTimer::timer_thread (this=0x32308e6490) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/common/Timer.cc:97
#15 0x00000032267b83cd in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/common/Timer.cc:30
#16 0x00007f782718bdc5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f782452573d in clone () from /lib64/libc.so.6
(gdb) frame 7
#7 0x000000322674e929 in add (summary="%num% MDSs report slow requests", severity=HEALTH_WARN, code="MDS_SLOW_REQUEST", this=0x7f7820bcb170) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/health_check.h:97
97 assert(checks.count(code) == 0);
(gdb) print checks
$1 = std::map with 1 elements = {["MDS_SLOW_REQUEST"] = {severity = HEALTH_WARN, summary = "%num% MDSs report slow requests", detail = std::list = {[0] = "mdsh(mds.8): 2 slow requests are blocked > 30 sec"}}}
(gdb) frame 8
#8 MDSMonitor::encode_pending (this=0x32308ec800, t=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<MonitorDBStore::Transaction*, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<MonitorDBStore::Transaction*, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr (count 3, weak 0) 0x3230b70400) at /usr/src/debug/ceph-12.1.0-990-gb36c57d/src/mon/MDSMonitor.cc:209
209 mds_metric_summary(metric.type));
(gdb) print rank
$2 = 5
From: /ceph/teuthology-archive/pdonnell-2017-07-14_04:41:06-multimds-wip-pdonnell-20170713-testing-basic-smithi/1399292/remote/smithi008/coredump/1500017698.78430.core
and: /ceph/teuthology-archive/pdonnell-2017-07-14_04:41:06-multimds-wip-pdonnell-20170713-testing-basic-smithi/1399292/remote/smithi008/coredump/1500017698.78430.core
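The gdb session shows the map already holding an MDS_SLOW_REQUEST entry whose detail came from mds.8 when encode_pending called add() again for rank 5, so the assert at health_check.h:97 fired. The following is a simplified sketch of that pattern, not the Ceph source: the types are cut-down stand-ins for the ones named in the backtrace, and get_or_add() and collect_slow_requests() are hypothetical helpers written for this illustration. They show one possible way to fold a second report into the existing entry and are not necessarily what the actual fix does.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Cut-down stand-ins for the types seen in the backtrace (assumption:
// only the fields printed by gdb are modelled here).
enum health_status_t { HEALTH_OK, HEALTH_WARN, HEALTH_ERR };

struct health_check_t {
  health_status_t severity = HEALTH_OK;
  std::string summary;
  std::list<std::string> detail;
};

struct health_check_map_t {
  std::map<std::string, health_check_t> checks;

  // Same precondition as the failing assert in frame 7: calling add()
  // twice with the same code aborts.
  health_check_t& add(const std::string& code, health_status_t severity,
                      const std::string& summary) {
    assert(checks.count(code) == 0);
    health_check_t& c = checks[code];
    c.severity = severity;
    c.summary = summary;
    return c;
  }

  // Hypothetical helper for this sketch: return the existing entry if
  // the code is already present, otherwise create it via add().
  health_check_t& get_or_add(const std::string& code,
                             health_status_t severity,
                             const std::string& summary) {
    auto it = checks.find(code);
    if (it != checks.end()) {
      return it->second;
    }
    return add(code, severity, summary);
  }
};

// Hypothetical driver: several MDS daemons report the same warning code,
// and each report is appended as a detail line under one shared entry.
inline health_check_map_t collect_slow_requests(
    const std::list<std::string>& per_mds_messages) {
  health_check_map_t m;
  for (const std::string& msg : per_mds_messages) {
    health_check_t& c = m.get_or_add("MDS_SLOW_REQUEST", HEALTH_WARN,
                                     "%num% MDSs report slow requests");
    c.detail.push_back(msg);
  }
  return m;
}
```

Calling collect_slow_requests() with two messages yields a single MDS_SLOW_REQUEST entry with two detail lines, whereas replacing get_or_add() with add() in the loop reproduces the abort on the second message.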
Updated by John Spray almost 7 years ago
- Status changed from New to Resolved
Unless this test run was more recent than the fix, I think this is https://github.com/ceph/ceph/pull/16302