Bug #8556 (closed)

CEPH_FEATURE_OSD_ERASURE_CODES feature bit

Added by Ilya Dryomov almost 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Urgent
% Done:
0%
Source:
Development
Severity:
3 - minor

Description

Currently OSDs require the CEPH_FEATURE_OSD_ERASURE_CODES feature bit for clients to be able to connect to the OSDs. This hurts the kernel client: users with an erasure-coded pool anywhere in the cluster can't use krbd/kcephfs at all, even against their replicated pools. The outcome of the discussion in the rbd standup was to stop requiring the OSD_ERASURE_CODES bit globally and figure out a way to do this on a per-pool basis, since rgw depends on it.
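
For context, the gate works roughly like the sketch below, assuming the usual rule that a peer must advertise every feature bit the server requires; the function name is illustrative, and the bit position (38) is taken from ceph_features.h.

    #include <cstdint>

    // Bit 38 per include/ceph_features.h.
    constexpr uint64_t CEPH_FEATURE_OSD_ERASURE_CODES = 1ULL << 38;

    // Illustrative check: the connection is allowed only if the client
    // advertises every feature bit the OSD requires, so one globally
    // required bit locks out every client that lacks it.
    bool connection_allowed(uint64_t client_features, uint64_t osd_required) {
      return (client_features & osd_required) == osd_required;
    }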

Actions #1

Updated by Greg Farnum almost 10 years ago

RBD is the only storage system that can really identify a given client instance as communicating with only a single pool, and even it can't, really, because we provide layering, and that might in the future include erasure-coded base images.

Do you have any suggestions for how we could actually go about implementing this?

Actions #2

Updated by Ilya Dryomov almost 10 years ago

Not really. The course of action I had in mind was to make an exception for the EC bit in the kernel (i.e. ignore the fact that it's required, on the premise that writing to an EC pool would somehow fail) for this and maybe the next kernel release, and to get proper EC support into the kernel client (whatever that would amount to) in the meantime. If writing to an EC pool from a non-EC-aware kernel doesn't fail, then punt on making the exception and bear with it. However, Sage suggested the per-pool approach, and it sounds cleaner too. Adding EC support was on the list, but it got postponed in favor of other things because we thought neither rbd nor the filesystem would benefit from it in the short term. The mere presence of an EC pool preventing the kernel client from working wasn't something I expected, though.
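
Concretely, the exception might look something like this sketch (illustrative names, not the actual kernel client code): compute the missing bits and tolerate only a missing EC bit.

    #include <cstdint>

    constexpr uint64_t CEPH_FEATURE_OSD_ERASURE_CODES = 1ULL << 38;

    // Illustrative stopgap: reject the connection for any missing feature
    // except OSD_ERASURE_CODES, on the premise that a plain write to an
    // EC pool would fail cleanly rather than corrupt anything.
    bool features_acceptable(uint64_t required, uint64_t supported) {
      uint64_t missing = required & ~supported;
      missing &= ~CEPH_FEATURE_OSD_ERASURE_CODES;
      return missing == 0;
    }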

Actions #3

Updated by Sage Weil almost 10 years ago

  • Status changed from New to 12
  • Source changed from other to Development

We talked about this in core standup and the consensus is currently to just drop the EC requirement for clients entirely. The client behavior is no different for EC and non-EC pools at this stage (we still send writes to the primary). We can also ignore dealing with per-pool feature requirements until we actually have a feature or use-case that requires them; otherwise we'll dream up a solution today that won't be quite right.

Actions #4

Updated by Greg Farnum almost 10 years ago

Hmm, so splitting OSDMap::get_features() into a get_features_client() and get_features_server()?
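
I.e., a minimal sketch along these lines (illustrative struct and member names, not the real OSDMap API), where the client-side mask simply omits the EC bit per comment #3:

    #include <cstdint>

    constexpr uint64_t CEPH_FEATURE_OSD_ERASURE_CODES = 1ULL << 38;

    struct OSDMapSketch {
      bool has_erasure_coded_pool = false;

      // Features other daemons (OSDs, mons) must support.
      uint64_t get_features_server() const {
        uint64_t features = 0;
        if (has_erasure_coded_pool)
          features |= CEPH_FEATURE_OSD_ERASURE_CODES;
        // ... other bits derived from pool and crush map state ...
        return features;
      }

      // Features clients must support: the same mask minus the EC bit,
      // since client behavior is identical for EC and non-EC pools today.
      uint64_t get_features_client() const {
        return get_features_server() & ~CEPH_FEATURE_OSD_ERASURE_CODES;
      }
    };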

Actions #5

Updated by Sage Weil almost 10 years ago

  • Status changed from 12 to Fix Under Review

Actions #6

Updated by Sage Weil almost 10 years ago

  • Status changed from Fix Under Review to Pending Backport

Actions #7

Updated by Dmitry Smirnov almost 10 years ago

I just want to say thanks for this change. The backport is trivial, and now I can finally use an erasure pool with kernel RBD clients on 0.80.1. Thank you, thank you. :)
IMHO this feature alone justifies a Firefly point release.

Actions #8

Updated by Dmitry Smirnov almost 10 years ago

With this applied to 0.80.1 and an erasure pool in use, all OSDs log:

crush map has features 2543728197632, adjusting msgr requires for mons

What is this about? Is this something to correct for the backport?
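
For reference, the logged value is a feature bitmask and can be decoded by checking which bits are set, as in this quick sketch:

    #include <cstdint>
    #include <cstdio>

    int main() {
      const uint64_t features = 2543728197632ULL;  // value from the OSD log
      for (int bit = 0; bit < 64; ++bit) {
        if (features & (1ULL << bit))
          std::printf("bit %d set\n", bit);
      }
      return 0;
    }

This prints bits 18, 25, 30, 36, 38 and 41, which in the ceph_features.h of this era appear to correspond to CRUSH_TUNABLES, CRUSH_TUNABLES2, OSDHASHPSPOOL, CRUSH_V2, OSD_ERASURE_CODES and CRUSH_TUNABLES3, i.e. the features implied by the crush map once an EC pool exists.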

Actions #9

Updated by Sage Weil almost 10 years ago

  • Status changed from Pending Backport to Resolved