Bug #8646

closed

OSD: assert in share_map() when marked down by an OSDMap

Added by Greg Farnum almost 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

0> 2014-06-09 18:06:22.922629 7fcfab369700 -1 osd/OSD.cc: In function 'void OSDService::share_map(entity_name_t, Connection*, epoch_t, OSDMapRef&, epoch_t*)' thread 7fcfab369700 time 2014-06-09 18:06:22.921311
osd/OSD.cc: 4781: FAILED assert(osd->is_active() || osd->is_stopping())
ceph version andisk-sprint-2-drop-3-390-g2dbd85c (2dbd85c94cf27a1ff0419c5ea9359af7fe30e9b6)
1: (OSDService::share_map(entity_name_t, Connection*, unsigned int, std::tr1::shared_ptr<OSDMap const>&, unsigned int*)+0x58f) [0x6351df]
2: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x182) [0x635442]
3: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x346) [0x635ce6]
4: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ce) [0xa4a1ce]
5: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa4c420]
6: (()+0x8182) [0x7fcfc4a7d182]
7: (clone()+0x6d) [0x7fcfc2e1e30d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

This is from a custom build, but the issue also exists in master. We call share_map() in OSD::dequeue_op(), but an op can be dequeued after the OSD state has changed to STATE_WAITING_FOR_HEALTHY, which is neither active nor stopping and so trips the assert. I think the fix is simply to call share_map() only when the OSD is actually in STATE_ACTIVE.
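
A rough sketch of the guard I have in mind, using a stripped-down stand-in rather than the real OSD code. Everything here is hypothetical (MockOSD, OsdState, the counters); only the assert condition and the names is_active(), is_stopping(), share_map() and dequeue_op() come from the report above, and the real signatures take more arguments.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the real OSD state and service.
enum class OsdState { Active, WaitingForHealthy, Stopping };

struct MockOSD {
  OsdState state = OsdState::Active;
  int maps_shared = 0;   // how many times share_map() ran
  int ops_dequeued = 0;  // how many times dequeue_op() ran

  bool is_active() const { return state == OsdState::Active; }
  bool is_stopping() const { return state == OsdState::Stopping; }

  // Same precondition as the assert that failed in the backtrace.
  void share_map() {
    assert(is_active() || is_stopping());
    ++maps_shared;
  }

  // Proposed change: only try to share the map when actually active.
  // What should happen to the op itself in the other states is not
  // modelled here.
  void dequeue_op() {
    if (is_active()) {
      share_map();
    }
    ++ops_dequeued;
  }
};
```

With this guard, calling dequeue_op() on a MockOSD whose state is WaitingForHealthy skips share_map() instead of hitting the assert, while an active OSD shares the map as before.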
