Bug #10241

Incorrect OSD mapping with EC 6+2 setup in Giant

Added by Mark Nelson over 9 years ago. Updated almost 9 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hit this on the performance test cluster during nightly giant testing. Notice the incorrect mapping: the acting set for pg 12.244 contains 2147483647 (CRUSH_ITEM_NONE), i.e. CRUSH failed to map an OSD to one of the eight 6+2 shards.

perf@magna031:/tmp/cbt/ceph/log$ ceph --version
ceph version 0.87-44-g26e8cf1 (26e8cf174b8e76b4282ce9d9c1af6ff12f5565a9)
perf@magna031:/tmp/cbt/ceph/log$ ceph health detail
HEALTH_WARN 1 pgs degraded; 1 pgs stuck degraded; 1 pgs stuck unclean; 1 pgs stuck undersized; 1 pgs undersized
pg 12.244 is stuck unclean since forever, current state active+undersized+degraded, last acting [8,6,2,0,10,2147483647,1,11]
pg 12.244 is stuck undersized for 23610.100317, current state active+undersized+degraded, last acting [8,6,2,0,10,2147483647,1,11]
pg 12.244 is stuck degraded for 23610.100408, current state active+undersized+degraded, last acting [8,6,2,0,10,2147483647,1,11]
pg 12.244 is active+undersized+degraded, acting [8,6,2,0,10,2147483647,1,11]
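
A hedged sketch of commands one could use to gather the mapping and pool details on such a cluster; pool 12 comes from the pg id above, while the erasure-code-profile name is a placeholder, not taken from this report:

# Show the up/acting sets for the affected PG.
ceph pg map 12.244
# The pool 12 line in the osdmap lists its crush_ruleset and erasure-code profile.
ceph osd dump | grep '^pool 12 '
# Inspect the profile (k=6, m=2 here); the profile name is an assumption.
ceph osd erasure-code-profile get myprofile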

osd_dump.txt (3.53 KB) Mark Nelson, 12/03/2014 09:02 PM

pg_dump.txt (876 KB) Mark Nelson, 12/03/2014 09:02 PM

crushmap.txt (2.36 KB) Mark Nelson, 12/03/2014 09:02 PM

History

#1 Updated by Sage Weil over 9 years ago

  • Status changed from New to Need More Info

Need the osdmap or crushmap that triggers the failed mapping.
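
One way to reproduce such a failed mapping offline from the attached crushmap is to compile it and run crushtool's test mode; this is a sketch, where the rule id is an assumption and --num-rep 8 matches k+m for a 6+2 profile:

# Compile the attached decompiled map.
crushtool -c crushmap.txt -o crushmap.bin
# List inputs that map to fewer than 8 OSDs (rule id 1 is an assumption).
crushtool -i crushmap.bin --test --rule 1 --num-rep 8 --show-bad-mappings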

#2 Updated by Loïc Dachary almost 9 years ago

  • Status changed from Need More Info to Resolved
  • Regression set to No

The rule needs 8 OSDs and only 9 are available, which can occasionally be a problem for CRUSH. This has been resolved (or made very rare) by a change a while ago that increased the number of tries from 50 to 100 for erasure-code crush rulesets.
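
For anyone hitting this on an existing cluster, a hedged sketch of how to check that retry setting: the change referred to above makes generated erasure-code rules carry a "step set_choose_tries 100" line (up from 50, as noted above). File names below are placeholders:

# Dump and decompile the live crushmap.
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# The erasure-code rule should contain "step set_choose_tries 100".
# If not, add it to that rule in crush.txt, then recompile and inject:
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new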
