Bug #19639
mon crash on shutdown
Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Mon crash happening during shutdown in a cephfs test run.
Assertion:
/mnt/jenkins/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.0.0-2683-g1f1f8e9/rpm/el7/BUILD/ceph-12.0.0-2683-g1f1f8e9/src/mon/Monitor.cc: 1610: FAILED assert(is_probing() || is_synchronizing())
ceph version 12.0.0-2683-g1f1f8e9 (1f1f8e953e708883a2551bf6fb19ca524d2946af)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x64ac80]
2: (Monitor::probe_timeout(int)+0x96) [0x447476]
3: (Context::complete(int)+0x9) [0x45a109]
4: (SafeTimer::timer_thread()+0x104) [0x6451c4]
5: (SafeTimerThread::entry()+0xd) [0x646bed]
6: (()+0x7dc5) [0xa66cdc5]
7: (clone()+0x6d) [0xd09a73d]
/a/jspray-2017-04-16_16:07:43-multimds-wip-jcsp-testing-20170415b-multimds-testing-basic-smithi/1032808
History
#1 Updated by Sage Weil almost 7 years ago
- Priority changed from Normal to Urgent
#2 Updated by John Spray almost 7 years ago
- Subject changed from mon crashes on shutdown to mon crashe on shutdown
- Description updated (diff)
Split out the propose_pending one into http://tracker.ceph.com/issues/19738 with a candidate fix; the other one is still a mystery to me.
#3 Updated by John Spray almost 7 years ago
- Subject changed from mon crashe on shutdown to mon crash on shutdown
#4 Updated by Sage Weil almost 7 years ago
- Status changed from New to Need More Info
What is the "other one" (besides probe_timeout #19738)?
#5 Updated by John Spray almost 7 years ago
Sorry, I made the history confusing by editing the description. The "other one" is the one that is now the only one in the description of this ticket (i.e. the probe_timeout backtrace).
#6 Updated by Greg Farnum almost 7 years ago
Is it reproducing? Wouldn't surprise me if these were linked.
#7 Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Category deleted (Monitor)
- Priority changed from Urgent to High
- Component(RADOS) Monitor added
Turning this down; should close if we don't get it happening again.
#8 Updated by John Spray almost 7 years ago
I haven't seen this happen again in recent memory.
#9 Updated by Greg Farnum almost 7 years ago
- Status changed from Need More Info to Can't reproduce