Bug #12231

crush unable to generate 3 osds in teuthology run

Added by Samuel Just over 8 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Lately, wip-sam-testing (basically master) runs have been reliably hitting a case, once or twice per run, where 3 of the 6 osds are out and crush is unable to map more than 2 of the remaining 3 osds for at least one pg. I grabbed one of the osdmaps and found that in this one the bad pg is 1.37, while the neighbouring pg 1.36 maps to 3 osds as expected:

/home/sam/git-checkouts/ceph4/src/osdmaptool: osdmap file '/tmp/osdmap'
parsed '1.37' -> 1.37
1.37 raw ([4,1], p4) up ([1,4], p1) acting ([1,4], p1)

/home/sam/git-checkouts/ceph4/src/osdmaptool: osdmap file '/tmp/osdmap'
parsed '1.36' -> 1.36
1.36 raw ([4,1,3], p4) up ([4,1,3], p4) acting ([4,1,3], p4)

hashes (attached) has the draws for r=0 through 999999 on that pg, and you'll see that osd 3 indeed does not win for the first time until somewhere between draws 50 and 60.

I see nothing new with the crush tunables. osdmaptool compiled on firefly agrees with the output, so it's not a change in crush. The straw weights appear to be 65535, so there is nothing wonky with the crush map construction. The two questions are:
1) Is this simply an indication that the hash is really bad and we need to begin switching it (possibly before jewel)?
2) Why has this not come up before? We started testing regularly with size 3 pools in teuthology in February. I haven't seen it yet in hammer runs either. Odd.
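
As a rough sanity check on question 1, the sketch below estimates how unlikely it would be for one osd to lose 50 draws in a row if the hash behaved ideally. It assumes a flat bucket of 6 equally weighted items and independent, uniform draws, which is a simplification of what crush actually does. The function name prob_no_win is made up for illustration.

```python
def prob_no_win(num_items: int, draws: int) -> float:
    """Probability that one specific item never wins in `draws` draws.

    Assumption: each draw independently picks one of `num_items`
    equally weighted items uniformly at random (an idealised hash).
    """
    if num_items < 1 or draws < 0:
        raise ValueError("num_items must be >= 1 and draws >= 0")
    return (1.0 - 1.0 / num_items) ** draws


if __name__ == "__main__":
    # 6 osds in the bucket; osd 3 only wins somewhere after draw 50.
    p = prob_no_win(num_items=6, draws=50)
    print(f"P(no win in 50 draws) = {p:.2e}")
```

Under those assumptions the chance for any single pg comes out on the order of 1e-4, so a streak this long is rare but not impossible for an ideal hash.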

osdmap - osdmap extracted from mon after run which does not map the pg (6.5 KB) Samuel Just, 07/07/2015 11:42 PM

osdmaptool_debug - debug output from --test-map-pg on 1.37 (with some extra prints) (61.9 KB) Samuel Just, 07/07/2015 11:42 PM

test_jenkins.c - generates the first 10k draws for 1.37 (462 Bytes) Samuel Just, 07/07/2015 11:42 PM

hashes - first 10k draws for 1.37 (649 KB) Samuel Just, 07/07/2015 11:42 PM

osdmap2 (5.38 KB) Samuel Just, 07/07/2015 11:53 PM

Associated revisions

Revision 042bd117 (diff)
Added by Samuel Just over 8 years ago

3-size-2-min-size: keep 4 in during thrashing

Workaround for 12231.

Fixes: #12231
Signed-off-by: Samuel Just <>

History

#1 Updated by Samuel Just over 8 years ago

ubuntu@teuthology:/a/samuelj-2015-07-06_17:07:54-rados-wip-sam-testing-distro-basic-multi/963089

is the instance above.

#2 Updated by Samuel Just over 8 years ago

For the other instance in that run (ubuntu@teuthology:/a/samuelj-2015-07-06_17:07:54-rados-wip-sam-testing-distro-basic-multi/962909)

we have a different set of 3 osds that are in, but the set still includes osd 3. Thus, pg 1.37 once again has trouble:

~/git-checkouts/ceph4/src/osdmaptool --test-map-pg 1.37 /tmp/osdmap2
/home/sam/git-checkouts/ceph4/src/osdmaptool: osdmap file '/tmp/osdmap2'
parsed '1.37' -> 1.37
1.37 raw ([1,2], p1) up ([1,2], p1) acting ([1,2], p1)

The debug output shows the same value for x (the crush input) as in the first instance.

#3 Updated by Samuel Just over 8 years ago

I checked a few similar hammer runs and noticed that pgp_num didn't get high enough for 1.37 to have its own seed. Perhaps something in master is causing us to split more/faster in a single run?
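
The point about 1.37 needing its own seed can be sketched as follows. This is a simplified Python rendering of the stable-mod folding Ceph applies to a pg's seed before placement, written from memory of ceph_stable_mod, so treat it as an illustration rather than the exact code path. The helper placement_seed and the example pgp_num values are assumptions, and the later hashing of the folded seed with the pool id is left out. The seed in a pg id is hexadecimal, so 1.37 is seed 0x37 (55) in pool 1.

```python
def stable_mod(x: int, b: int, bmask: int) -> int:
    """Fold x into [0, b) so that growing b only remaps a few values."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def placement_seed(pg_seed: int, pgp_num: int) -> int:
    """Seed used to place a pg, given the pool's pgp_num (hypothetical helper).

    bmask is the smallest value of the form 2^n - 1 that covers pgp_num - 1.
    """
    if pgp_num < 1:
        raise ValueError("pgp_num must be >= 1")
    bmask = (1 << (pgp_num - 1).bit_length()) - 1
    return stable_mod(pg_seed, pgp_num, bmask)


if __name__ == "__main__":
    seed = 0x37  # pg 1.37
    for pgp_num in (32, 48, 55, 56, 64):
        print(pgp_num, hex(placement_seed(seed, pgp_num)))
```

In this sketch, with pgp_num at 55 or below, pg 1.37 is placed using seed 0x17, so the problematic input never reaches crush. Seed 0x37 is only used on its own once pgp_num reaches 56.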

#4 Updated by Samuel Just over 8 years ago

  • Status changed from New to Resolved

Updated ceph-qa-suite to keep 4 in for size 3 pools.
