Bug #19020

"unlikely race on master" OSDMapMapping.h: 31: FAILED assert(shards == 0)

Added by Yuri Weinstein about 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Per Josh's request

FAILED: 92 - test-erasure-code.sh (Timeout) (the log is too big to attach)

This failure was seen during PR testing: https://trello.com/c/xsJOKEgX

ctest -R test-erasure-code -V

2017-02-20 21:01:41.717556 7f16bff34700 20 -- 127.0.0.1:7101/0 >> 127.0.0.1:6828/5443 conn(0x555e66a6e800 :7101 s=STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE pgs=1 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_THROTTLE_BYTES
2017-02-20 21:01:41.717564 7f16bff34700 20 -- 127.0.0.1:7101/0 >> 127.0.0.1:6828/5443 conn(0x555e66a6e800 :7101 s=STATE_OPEN_MESSAGE_READ_FRONT pgs=1 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE
2017-02-20 21:01:41.717573 7f16bff34700 20 -- 127.0.0.1:7101/0 >> 127.0.0.1:6828/5443 conn(0x555e66a6e800 :7101 s=STATE_OPEN_MESSAGE_READ_FRONT pgs=1 cs=1 l=1).process got front 23
2017-02-20 21:01:41.717580 7f16bff34700 10 -- 127.0.0.1:7101/0 >> 127.0.0.1:6828/5443 conn(0x555e66a6e800 :7101 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1).process aborted = 0
2017-02-20 21:01:41.717582 7f16be731700 -1 /home/yuriw/wip/src/osd/OSDMapMapping.h: In function 'virtual ParallelPGMapper::Job::~Job()' thread 7f16be731700 time 2017-02-20 21:01:41.712058
/home/yuriw/wip/src/osd/OSDMapMapping.h: 31: FAILED assert(shards == 0)

 ceph version 12.0.0-600-g559e473 (559e473dca88cc325f8b6ecd6f70e5f09e9e35a5)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x555e5bfac8e2]
 2: (()+0x3b2d65) [0x555e5be51d65]
 3: (OSDMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)+0x203) [0x555e5be23b53]
 4: (PaxosService::propose_pending()+0x25c) [0x555e5bdf589c]
 5: (Context::complete(int)+0x9) [0x555e5bdbea39]
 6: (SafeTimer::timer_thread()+0x452) [0x555e5bfa72a2]
 7: (SafeTimerThread::entry()+0xd) [0x555e5bfa86ad]
 8: (()+0x76ba) [0x7f16c4e636ba]
 9: (clone()+0x6d) [0x7f16c3c1382d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2017-02-20 21:01:41.717587 7f16bff34700 20 -- 127.0.0.1:7101/0 >> 127.0.0.1:6828/5443 conn(0x555e66a6e800 :7101 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1).process got 23 + 0 + 0 byte message
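
For readers unfamiliar with the failed assertion, the following is a minimal sketch of the invariant it expresses: a sharded job must not be destroyed while any of its shards is still outstanding. This is not the Ceph implementation and does not claim to show the root cause of this report. The names SketchJob, finish_one, wait, outstanding and run_sketch are hypothetical, and the standard library mutex and condition variable are assumptions made only to keep the example self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a job split into shards; NOT the Ceph class.
class SketchJob {
public:
  explicit SketchJob(unsigned n) : shards_(n) {}

  ~SketchJob() {
    // Same invariant as the failed assert in the report:
    // every shard must have completed before the job is destroyed.
    assert(shards_ == 0);
  }

  // Called by a worker when it has finished one shard.
  void finish_one() {
    std::lock_guard<std::mutex> l(lock_);
    assert(shards_ > 0);
    if (--shards_ == 0)
      cond_.notify_all();
  }

  // Block until all shards have completed.
  void wait() {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [this] { return shards_ == 0; });
  }

  // Number of shards that have not completed yet.
  unsigned outstanding() {
    std::lock_guard<std::mutex> l(lock_);
    return shards_;
  }

private:
  std::mutex lock_;
  std::condition_variable cond_;
  unsigned shards_;
};

// Runs n trivial shards on n threads and returns the number of shards
// still outstanding after wait() has returned.
unsigned run_sketch(unsigned n) {
  SketchJob job(n);
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < n; ++i)
    workers.emplace_back([&job] { job.finish_one(); });
  job.wait();  // without this, ~SketchJob could run with shards_ > 0
  unsigned left = job.outstanding();
  for (auto& t : workers)
    t.join();  // join before job goes out of scope
  return left;
}
```

In this sketch, any code path that lets the job go out of scope without first calling wait() (or otherwise draining the shards) would trip the destructor's assert whenever a worker has not yet finished, which is the kind of timing-dependent failure the title describes as an unlikely race.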

History

#1 Updated by Yuri Weinstein about 7 years ago

Also confirmed on master (tests were run on xenial)

#2 Updated by Kefu Chai about 7 years ago

  • Status changed from New to Fix Under Review
  • Assignee changed from Josh Durgin to Sage Weil

#3 Updated by Sage Weil almost 7 years ago

  • Status changed from Fix Under Review to Resolved
