Bug #8601

closed

erasure-code: default profile does not exist after upgrade

Added by Loïc Dachary almost 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Monitor
Target version:
% Done: 100%
Source: other
Tags:
Backport: firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Workaround

After upgrading to firefly, create the default profile with

$ ceph osd erasure-code-profile set default

And verify it is as expected with

$ ceph osd erasure-code-profile get default
directory=.libs
k=2
m=1
plugin=jerasure
ruleset-failure-domain=osd
technique=reed_sol_van
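
The verification step can also be scripted. The following is a minimal sketch, not part of the fix: it only compares the output of the get command against the k, m, plugin and technique values listed above. The function name check_default_profile is made up for illustration, and directory and ruleset-failure-domain are left out on the assumption that they may differ between installations.

```shell
# Sketch: succeed only if the default profile carries the values shown above.
# check_default_profile is a hypothetical helper name, not a ceph command.
check_default_profile() {
    # Fetch the profile as key=value lines; fail if the profile is missing.
    profile="$(ceph osd erasure-code-profile get default)" || return 1
    for expected in k=2 m=1 plugin=jerasure technique=reed_sol_van; do
        # -x matches the whole line, -F treats the pattern as a fixed string.
        if ! printf '%s\n' "$profile" | grep -qxF "$expected"; then
            echo "missing or different: $expected" >&2
            return 1
        fi
    done
}
```

Running check_default_profile after the workaround returns 0 when all four values match and prints the first mismatch to stderr otherwise.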

Description

When a firefly cluster is created, the default erasure code profile is created with it. When an existing cluster is upgraded to firefly, the default erasure code profile is not created.

  • The upgrade notes could be modified to document this
  • The default erasure code profile could be created as a side effect of osd pool create if it is not found
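
Until such a fix is available, the "create it if it is not found" idea from the second bullet can be approximated on the operator side. This is only a sketch of that idea in shell, not the monitor patch itself. It assumes that `ceph osd erasure-code-profile ls` prints one profile name per line, and ensure_default_profile is a made-up helper name.

```shell
# Sketch: create the default erasure code profile only when it is absent.
# ensure_default_profile is a hypothetical helper name, not a ceph command.
ensure_default_profile() {
    # Assumption: `ls` prints one profile name per line.
    if ceph osd erasure-code-profile ls | grep -qxF default; then
        return 0    # profile already present, nothing to do
    fi
    # Same command as in the workaround above.
    ceph osd erasure-code-profile set default
}
```

It could be called just before creating an erasure coded pool, for example `ensure_default_profile && ceph osd pool create ecpool 12 12 erasure`, where the pool name and placement group counts are placeholders.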

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #8599: Fix check of ruleset id on pool update (Resolved, Loïc Dachary, 06/14/2014)

Actions #1

Updated by Loïc Dachary almost 10 years ago

  • Description updated (diff)
  • Status changed from 12 to Fix Under Review
Actions #2

Updated by Loïc Dachary almost 10 years ago

  • Description updated (diff)
Actions #3

Updated by Loïc Dachary almost 10 years ago

  • % Done changed from 0 to 50
Actions #4

Updated by Sage Weil almost 10 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to firefly
Actions #5

Updated by Loïc Dachary almost 10 years ago

  • Status changed from Pending Backport to Fix Under Review
  • % Done changed from 50 to 80
Actions #6

Updated by Loïc Dachary almost 10 years ago

  • Target version set to 0.82
Actions #7

Updated by Loïc Dachary almost 10 years ago

  • Target version changed from 0.82 to 0.83 cont.
Actions #8

Updated by Sage Weil over 9 years ago

  • Priority changed from Normal to Urgent
Actions #10

Updated by Loïc Dachary over 9 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #12

Updated by Greg Farnum over 9 years ago

Apparently having an EC pool is still sufficient to prevent kernel clients from mounting, so I don't think we can backport this fix until that problem has been resolved.

Actions #13

Updated by Loïc Dachary over 9 years ago

<gregsfortytwo1> loicd: we've got an issue because (as a feature!), the presence of an EC rule in the osdmap will propagate out as a required feature bit for connections
<gregsfortytwo1> but kernel clients don't support it, so we're accidentally back to the place we were at where any cluster with an EC rule/pool can't be mounted by kernel clients
<loicd> gregsfortytwo: yes, I caught a glimpse of this discussion.
<gregsfortytwo1> which is…bad?
<gregsfortytwo1> so we don't want to automatically create the ec rules on upgrade until this is resolved
<gregsfortytwo1> or unwary upgraders will suddenly find they can't mount their RBD images :(
<loicd> I'm confused
<gregsfortytwo1> we have a bug right now which is preventing us from automatically creating EC rules, right? this is the one you're looking at
<loicd> I don't remember enough of this patch to be 100% sure but I think it only deals with erasure code profiles
<gregsfortytwo1> ah
<loicd> let me check once more ;-)
<gregsfortytwo1> I was under the impression it was about automatically creating the EC CRUSH rule
-*- loicd browsing https://github.com/ceph/ceph/pull/1990/files
<gregsfortytwo1> perhaps I am entirely mistaken
<gregsfortytwo1> (I mean, I thought that a profile included a crush rule)
<gregsfortytwo1> (that otherwise was not being created)
<loicd> gregsfortytwo: the profile is separate from the crush rule. The patch only creates it if it is missing and only when it is required by a command to create an erasure coded pool or ruleset.
<gregsfortytwo1> okay
<gregsfortytwo1> should probably just ignore me, then :)
<loicd> ok :-) Reading https://github.com/ceph/ceph/pull/1990/files#diff-0a5db46a44ae9900e226289a810f10e8R4367 it comes back to me ;-) The patch only addresses the absence of the default profile after an upgrade from emperor to firefly. It does not create anything spontaneously.
<gregsfortytwo1> so the creation of the profile doesn't create the corresponding crush rule?
<gregsfortytwo1> when is the rule created?
<loicd> profile management has its own set of commands https://github.com/ceph/ceph/blob/master/src/mon/MonCommands.h#L477 and its map https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.h#L136
<loicd> the ruleset is created either explicitly or implicitly when an erasure coded pool is created https://github.com/ceph/ceph/blob/master/src/mon/OSDMonitor.cc#L3261 and https://github.com/ceph/ceph/blob/master/src/mon/OSDMonitor.cc#L3036
<loicd> the explicit creation of the ruleset is with https://github.com/ceph/ceph/blob/master/src/mon/MonCommands.h#L464 and implemented using the same function via https://github.com/ceph/ceph/blob/master/src/mon/OSDMonitor.cc#L4355
<loicd> gregsfortytwo: ^ does that clarify the relationship between erasure code profile and the erasure code ruleset ? 
<loicd> I should add that an erasure coded ruleset is created by providing the erasure code profile to the erasure code plugin. Because the erasure code plugin is ultimately trusted to create a sensible ruleset.
<gregsfortytwo1> okay, that helps
<gregsfortytwo1> thanks!
Actions #14

Updated by Loïc Dachary over 9 years ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 80 to 100