Bug #20444

closed

mon: Hit assert in PaxosService::propose_pending after election

Added by Trygve Vea almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: MgrMonitor
Target version: -
% Done: 0%
Source:
Tags:
Backport: kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Complete logs attached.

2017-06-28 05:43:30.757612 7f459c309700 0 mon.white01-osl2@0(leader).data_health(348) update_stats avail 98% total 5110 MB, used 79356 kB, avail 5032 MB
2017-06-28 05:43:31.056966 7f459bb08700 0 log_channel(cluster) log [INF] : mon.white01-osl2 calling new monitor election
2017-06-28 05:43:31.057040 7f459bb08700 1 mon.white01-osl2@0(electing).elector(349) init, last seen epoch 349
2017-06-28 05:43:36.057982 7f459c309700 0 log_channel(cluster) log [INF] : mon.white01-osl2@0 won leader election with quorum 0,2
2017-06-28 05:43:36.065727 7f459c309700 0 log_channel(cluster) log [INF] : HEALTH_WARN; 1 mons down, quorum 0,2 white01-osl2,white01-osl3
2017-06-28 05:43:46.058523 7f459c309700 1 mon.white01-osl2@0(leader).paxos(paxos recovering c 49433949..49434613) collect timeout, calling fresh election
2017-06-28 05:43:50.008843 7f459cf2a700 0 -- [2a02:c0:200:10c::1]:6789/0 >> - conn(0x7f45b308a000 :6789 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0 l=0).fault with nothing to send and in the half accept state just closed
2017-06-28 05:43:51.941689 7f459c309700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/11.2.0/rpm/el7/BUILD/ceph-11.2.0/src/mon/PaxosService.cc: In function 'void PaxosService::propose_pending()' thread 7f459c309700 time 2017-06-28 05:43:51.898557
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/11.2.0/rpm/el7/BUILD/ceph-11.2.0/src/mon/PaxosService.cc: 181: FAILED assert(have_pending)

ceph version 11.2.0 (f223e27eeb35991352ebc1f67423d4ebc252adb7)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f45a535a265]
2: (PaxosService::propose_pending()+0x5c6) [0x7f45a51c8256]
3: (MgrMonitor::tick()+0x843) [0x7f45a5266cf3]
4: (Monitor::tick()+0x80) [0x7f45a517b200]
5: (Context::complete(int)+0x9) [0x7f45a51936b9]
6: (SafeTimer::timer_thread()+0x104) [0x7f45a5356d14]
7: (SafeTimerThread::entry()+0xd) [0x7f45a535874d]
8: (()+0x7dc5) [0x7f45a2289dc5]
9: (clone()+0x6d) [0x7f45a179773d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
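
One plausible reading of the log and backtrace above: a second election was called at 05:43:46, and the periodic MgrMonitor::tick() then reached PaxosService::propose_pending() while no pending state existed. The sketch below is a toy model of that pattern, not Ceph code. ToyService, tick_unguarded and tick_guarded are hypothetical names, the model throws an exception where the real code asserts, and the guard shown is only an illustration, not necessarily what the fix for this issue does.

```cpp
#include <stdexcept>

// Toy model only: these names are hypothetical and are not Ceph's classes.
class ToyService {
public:
  // A new election discards any pending state and marks the service inactive.
  void on_election_start() { active_ = false; have_pending_ = false; }

  // Once recovery completes on the leader, fresh pending state exists again.
  void on_active() { active_ = true; have_pending_ = true; }

  bool is_active() const { return active_; }
  int proposals() const { return proposals_; }

  // Stand-in for the failed assert(have_pending); throws instead of aborting.
  // The toy treats a proposal as committed at once, so pending state remains.
  void propose_pending() {
    if (!have_pending_) {
      throw std::logic_error("FAILED assert(have_pending)");
    }
    ++proposals_;
  }

private:
  bool active_ = false;
  bool have_pending_ = false;
  int proposals_ = 0;
};

// Periodic tick that proposes unconditionally; unsafe during an election.
inline void tick_unguarded(ToyService& svc) { svc.propose_pending(); }

// Periodic tick that skips the proposal while the service is not active.
// Returns true if a proposal was made.
inline bool tick_guarded(ToyService& svc) {
  if (!svc.is_active()) {
    return false;
  }
  svc.propose_pending();
  return true;
}
```

In this model, calling on_election_start() and then tick_unguarded() throws, while tick_guarded() simply returns false until on_active() has been called again.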

Files

ceph-mon.white01-osl2.log.gz (960 KB): Complete logfile of monitor instance. Trygve Vea, 06/28/2017 07:10 AM

Related issues 1 (0 open, 1 closed)

Copied to mgr - Backport #20640: kraken: mon: Hit assert in PaxosService::propose_pending after election (Rejected)
Actions #1

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to mgr
  • Category changed from Monitor to MgrMonitor
Actions #2

Updated by Sage Weil almost 7 years ago

  • Status changed from New to Pending Backport
  • Priority changed from Normal to High
  • Backport set to kraken

This is fixed for luminous by 79bf5547cc49b897e285c617e2921c773e625090

Actions #3

Updated by Nathan Cutler almost 7 years ago

  • Copied to Backport #20640: kraken: mon: Hit assert in PaxosService::propose_pending after election added
Actions #4

Updated by Nathan Cutler over 6 years ago

  • Status changed from Pending Backport to Resolved