Bug #6896

osd/PG.cc: 1302: FAILED assert(active == acting.size())

Added by David Zafman over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Assignee: David Zafman
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

PG::activate() walks actingbackfill and counts the non-backfilling replicas in "active". That count should equal the number of entries in the "acting" set, which holds exactly the non-backfilling replicas. The assert was added as a sanity check on that assumption.


Related issues

Duplicates Ceph - Bug #6897: ceph osd crashed while running rados test Duplicate 11/25/2013

Associated revisions

Revision 66f51f82 (diff)
Added by David Zafman over 10 years ago

osd: Remove bogus assert(active == acting.size())

We saw this assert because active is not correctly computed.
Remove assert and incorrectly computed active count.
We already use acting.size() to determine whether to set PG_STATE_DEGRADED.

Fixes: #6896

Signed-off-by: David Zafman <>

History

#1 Updated by David Zafman over 10 years ago

calc_acting() uses this to find backfill targets:


    if (cur_info.is_incomplete() || cur_info.last_update < primary->second.log_tail) {

PG::activate() uses this to find non-backfill candidates, which is equivalent to !is_incomplete() and ignores the log_tail comparison:


        if (pi.last_backfill == hobject_t::get_max())

But all of this boils down to deciding whether DEGRADED should be set. I could use active for that test and remove the assert, which keeps the code identical to what it was before. That just leaves me without the sanity check I thought I could add.
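One way the two tests quoted above can disagree is sketched below. This is a simplified model, not the Ceph code: PeerInfo, kMaxObject, is_backfill_target(), counted_as_active() and count_active() are hypothetical stand-ins, and plain integers replace hobject_t and eversion_t. It assumes a peer that finished backfill earlier but whose log no longer overlaps the primary's log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the per-peer info both functions consult.
struct PeerInfo {
    std::uint64_t last_backfill;  // kMaxObject means backfill has completed
    std::uint64_t last_update;    // newest log entry this peer has
};

// Models hobject_t::get_max() in this sketch.
constexpr std::uint64_t kMaxObject = std::numeric_limits<std::uint64_t>::max();

// Models the calc_acting() test: a peer is a backfill target if it is
// incomplete OR its log does not reach the primary's log tail.
bool is_backfill_target(const PeerInfo& p, std::uint64_t primary_log_tail) {
    const bool incomplete = p.last_backfill != kMaxObject;
    return incomplete || p.last_update < primary_log_tail;
}

// Models the PG::activate() test: only last_backfill is examined.
bool counted_as_active(const PeerInfo& p) {
    return p.last_backfill == kMaxObject;
}

// Counts "active" the way the activate() test would, over actingbackfill.
std::size_t count_active(const std::vector<PeerInfo>& actingbackfill) {
    std::size_t active = 0;
    for (const PeerInfo& p : actingbackfill) {
        if (counted_as_active(p)) ++active;
    }
    return active;
}

// Two up-to-date peers form the acting set; a third peer with a stale log
// is a backfill target, yet still passes the activate() test.
void check_example() {
    const std::uint64_t primary_log_tail = 100;
    const std::size_t acting_size = 2;
    const std::vector<PeerInfo> actingbackfill = {
        {kMaxObject, 120}, {kMaxObject, 110}, {kMaxObject, 50}};
    assert(is_backfill_target(actingbackfill[2], primary_log_tail));
    assert(count_active(actingbackfill) != acting_size);
}
```

In this model the stale peer is selected for backfill by the first test but still counted by the second, so the count exceeds the size of the acting set.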

So we probably want to back out this part of the bug #5855 feature code:


+    assert(active == acting.size());
+
     // degraded?
-    if (get_osdmap()->get_pg_size(info.pgid) > active)
+    if (get_osdmap()->get_pg_size(info.pgid) > acting.size())
       state_set(PG_STATE_DEGRADED);

#2 Updated by David Zafman over 10 years ago

So the "active" calculation is a minor pre-existing bug in the way PG_STATE_DEGRADED is set, and the assert() is wrong. I'm going to remove the assert and the code that computes "active", and just use the correct value, acting.size(), to set the degraded state.

#3 Updated by David Zafman over 10 years ago

  • Status changed from In Progress to Fix Under Review

#4 Updated by David Zafman over 10 years ago

  • Status changed from Fix Under Review to Resolved

66f51f82d457f6d3170c47daed2ca3458b888df1
