Bug #8738

closed

divergent osdmaps crush tunables

Added by Samuel Just almost 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Immediate
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

555ed048e73024687fc8b106a570db4f osd-20_osdmap.13258__0_4E62BB79__none
6037911f31dc3c18b05499d24dcdbe5c osd-23_osdmap.13258__0_4E62BB79__none

~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
/tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
/tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1

Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
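
To narrow a comparison like the one above to just the tunables, the decompiled maps can be filtered before diffing. This is a minimal sketch, assuming both inputs are text files already produced by crushtool -d; the function names crush_tunables and diff_crush_tunables are made up for illustration and are not part of Ceph.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ceph): compare only the "tunable"
# lines of two CRUSH maps that were already decompiled with crushtool -d.

# Print the tunable lines of one decompiled map, sorted for a stable diff.
crush_tunables() {
    grep '^tunable ' "$1" | sort
}

# Diff the tunables of two decompiled maps.
# Returns 0 if they match, 1 if they differ, 2 if temp files cannot be made.
diff_crush_tunables() {
    a=$(mktemp) && b=$(mktemp) || return 2
    crush_tunables "$1" > "$a"
    crush_tunables "$2" > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

With the files from the loop above this would be called as diff_crush_tunables /tmp/crush20.d /tmp/crush23.d; a non-zero exit status means the two OSDs hold maps with different tunables.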

From ceph-users thread: [ceph-users] Some OSD and MDS crash

I suspect most of the peering state machine crashes after upgrade we have seen recently are due to this bug.

Actions #1

Updated by Greg Farnum almost 10 years ago

  • Status changed from New to 7
  • Assignee set to Greg Farnum

wip-7838-next has a patch on top of current "next" to resolve this by testing all CRUSH maps against the cluster features before accepting them. Testing it locally now; assuming this passes, I'll submit a PR and we can figure out what other testing we're interested in.
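
The idea of testing a map against the cluster features can be pictured as a bitmask subset check: a map is acceptable only if every feature it requires is also supported. This is a rough sketch of that idea, not the code in the branch; the FEATURE_* constants and the function name are hypothetical, and the bit values are not Ceph's real feature bits.

```shell
#!/bin/sh
# Hypothetical feature bits for illustration only; these are NOT the
# values of Ceph's real feature flags.
FEATURE_TUNABLES=1
FEATURE_TUNABLES2=2
FEATURE_TUNABLES3=4   # stands in for the bit covering chooseleaf_vary_r

# Succeed only if every bit set in "required" is also set in "supported".
# $1 = features the CRUSH map requires, $2 = features the cluster supports.
map_features_supported() {
    required=$1
    supported=$2
    [ $(( required & ~supported )) -eq 0 ]
}
```

Under these assumptions, a map requiring FEATURE_TUNABLES3 would be rejected by a cluster whose supported set is only FEATURE_TUNABLES | FEATURE_TUNABLES2, instead of being accepted and decoded differently by different daemons.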

Actions #2

Updated by Greg Farnum almost 10 years ago

  • Status changed from 7 to Fix Under Review
  • Source changed from other to Development

Actions #3

Updated by Sage Weil almost 10 years ago

  • Status changed from Fix Under Review to Pending Backport

Actions #4

Updated by Sage Weil almost 10 years ago

  • Status changed from Pending Backport to Resolved