Bug #16279
closed
assert(objiter->second->version > last_divergent_update) failed
Description
We built a cluster with 2 OSDs,
wrote 100G of files, and cut the power while the files were being deleted.
After the machine rebooted, one OSD crashed.
The OSD log shows:
-2717> 2016-06-14 11:13:17.222426 7f39273e7700 10 merge_log divergent 54'296 (54'33) delete 67bbeddb/10000000004.0000002c/head//1 by mds.0.1:12518 2016-06-13 17:08:42.484963
-2716> 2016-06-14 11:13:17.222627 7f39273e7700 10 _merge_object_divergent_entries: merging hoid 67bbeddb/10000000004.0000002c/head//1 entries: 54'296 (54'33) delete 67bbeddb/10000000004.0000002c/head//1 by mds.0.1:12518 2016-06-13 17:08:42.484963
-2715> 2016-06-14 11:13:17.222641 7f39273e7700 10 _merge_object_divergent_entries: hoid 67bbeddb/10000000004.0000002c/head//1 prior_version: 54'33 first_divergent_update: 54'296 last_divergent_update: 54'296
-2714> 2016-06-14 11:13:17.225928 7f39273e7700 -1 osd/PGLog.cc: In function 'static void PGLog::_merge_object_divergent_entries(const PGLog::IndexedLog&, const hobject_t&, const std::list<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, const pg_info_t&, eversion_t, pg_missing_t&, boost::optional<std::pair<eversion_t, hobject_t> >, PGLog::LogEntryHandler)' thread 7f39273e7700 time 2016-06-14 11:13:17.222650
osd/PGLog.cc: 366: FAILED assert(objiter->second->version > last_divergent_update)
2016-06-14 11:13:17.358733 7f39273e7700 -1 *** Caught signal (Aborted) *
Related code:
@
ceph::unordered_map<hobject_t, pg_log_entry_t*>::const_iterator objiter =
  log.objects.find(hoid);
if (objiter != log.objects.end() &&
    objiter->second->version >= first_divergent_update) {
  /// Case 1)
  assert(objiter->second->version > last_divergent_update);
  ldpp_dout(dpp, 10) << __func__ << ": more recent entry found: "
                     << *objiter->second << ", already merged" << dendl;
  // ensure missing has been updated appropriately
  if (objiter->second->is_update()) {
    assert(missing.is_missing(hoid) &&
           missing.missing[hoid].need == objiter->second->version);
  } else {
    assert(!missing.is_missing(hoid));
  }
  missing.revise_have(hoid, eversion_t());
  if (rollbacker && !object_not_in_store)
    rollbacker->remove(hoid);
  return;
}
@
According to the log there is only one divergent entry,
so "first_divergent_update" equals "last_divergent_update" (both are 54'296).
If "objiter->second->version" == "first_divergent_update", the enclosing condition (>=) is satisfied but the assert (>) fails.
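
The mismatch between the two comparisons can be reproduced in isolation. The following is only a minimal sketch: EVersion is a simplified, hypothetical stand-in for eversion_t (epoch'version ordering), and enters_case1 / case1_assert_holds are illustrative names, not functions from PGLog.cc. The log does not print objiter->second->version, so the value 54'296 for it is the assumption made in this report.

```cpp
#include <cstdint>

// Hypothetical stand-in for Ceph's eversion_t: ordered by epoch, then version.
struct EVersion {
  std::uint32_t epoch;
  std::uint64_t version;
};

constexpr bool operator<(const EVersion& a, const EVersion& b) {
  return a.epoch != b.epoch ? a.epoch < b.epoch : a.version < b.version;
}
constexpr bool operator>(const EVersion& a, const EVersion& b) { return b < a; }
constexpr bool operator>=(const EVersion& a, const EVersion& b) { return !(a < b); }

// Mirrors the `if` guard: newest log entry >= first_divergent_update.
constexpr bool enters_case1(EVersion newest, EVersion first_divergent) {
  return newest >= first_divergent;
}

// Mirrors the assert inside Case 1: newest log entry > last_divergent_update.
constexpr bool case1_assert_holds(EVersion newest, EVersion last_divergent) {
  return newest > last_divergent;
}

// From the log: first_divergent_update == last_divergent_update == 54'296.
constexpr EVersion kDivergent{54, 296};
// Assumption: the entry found in log.objects has that same version.
constexpr EVersion kNewestInLog{54, 296};

// The guard accepts the entry, but the strict comparison rejects it.
static_assert(enters_case1(kNewestInLog, kDivergent), "guard is satisfied");
static_assert(!case1_assert_holds(kNewestInLog, kDivergent), "assert would fail");
```

With any strictly newer entry, for example 54'297, both checks pass, so the failure only shows up when the versions are exactly equal.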