Bug #8622

closed

erasure-code: rados command does not enforce alignment constraints

Added by Lluis PJ almost 10 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 90%
Source: Development
Tags:
Backport: Firefly
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Original title, for the record: "EC pool fails for certain (k,m) combinations for >4MB objs"

Steps to reproduce the error:

git clone https://github.com/ceph/ceph.git
cd ceph
./do_autogen.sh
make
cd src
OSD=5 ./vstart.sh -l -n -X
./ceph osd erasure-code-profile set ecprofile ruleset-failure-domain=osd k=3 m=2 plugin=jerasure
./ceph osd crush rule create-erasure ecruleset ecprofile
./ceph osd pool create ecpool 1 1 erasure ecprofile ecruleset

Once you have the development cluster working you can try:

dd if=/dev/urandom of=./test.dat bs=1MB count=5
./rados -p ecpool put test ./test.dat

And you get the following error:

error putting ecpool/test: (95) Operation not supported

However, the following case works perfectly:

dd if=/dev/urandom of=./test.dat bs=1MB count=4
./rados -p ecpool put test ./test.dat

If instead of (k=3, m=2, OSD=5) you try (k=2, m=2, OSD=4) it works for both 4MB and 5MB objects.

I observed this bug on an Ubuntu Precise machine and on an up-to-date Arch Linux machine.

This same error was first observed by Michael Nelson: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-March/028311.html

Actions #1

Updated by Lluis PJ almost 10 years ago

After some debugging, it seems to me that the problem is that the rados client reads and sends data in chunks of (1<<22) bytes (4194304 bytes, slightly less than 4.2 MB). However, this size might not be aligned to 'stripe_width', which changes with "k". For the first chunk that rados sends, the OSD zero-pads the request to make it a multiple of 'stripe_width'. The problem arises when the offset of the second chunk (which is always 1<<22) is not aligned to 'stripe_width': the OSD requires all writes to have an offset aligned to 'stripe_width'.
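To illustrate the alignment problem described in this comment, here is a minimal Python sketch. It is not Ceph code. The function names are hypothetical, and the assumption that 'stripe_width' equals k * 4096 is made only for illustration; the real value depends on the pool and the erasure-code plugin.

```python
OP_SIZE = 1 << 22  # 4194304 bytes, the chunk size mentioned in this comment


def write_offsets(object_size, op_size=OP_SIZE):
    """Offsets of the successive writes needed for an object of object_size bytes."""
    return list(range(0, object_size, op_size))


def misaligned_offsets(object_size, stripe_width, op_size=OP_SIZE):
    """Write offsets that are not a multiple of stripe_width."""
    return [off for off in write_offsets(object_size, op_size)
            if off % stripe_width != 0]


# ASSUMPTION (illustration only): stripe_width = k * 4096.
ASSUMED_UNIT = 4096

if __name__ == "__main__":
    for k in (2, 3):
        for size in (4_000_000, 5_000_000):  # dd bs=1MB count=4 / count=5
            bad = misaligned_offsets(size, k * ASSUMED_UNIT)
            print(f"k={k} size={size} misaligned offsets: {bad}")
```

Under that assumption, only the k=3, 5 MB case produces a misaligned offset (the second write, at 1<<22), which is consistent with the behaviour reported in the description.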

I see two possible solutions:
- Make the OSD read the last stripe of the first chunk, and replace the zero pad with the new data,
- or make rados client access the pool 'stripe_width' value and align all chunk sizes properly.

The second solution is IMHO the faster and easier one.
Any other solutions?
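The second option could be sketched as follows in Python. This is not the code from any pull request, only an illustration of the idea: the helper names are hypothetical, rounding the chunk size down (rather than up) is a choice made here, and the stripe_width value in the usage note is an assumed example.

```python
def aligned_op_size(op_size, stripe_width):
    """Largest multiple of stripe_width that fits in op_size (at least one stripe)."""
    if stripe_width <= 0:
        raise ValueError("stripe_width must be positive")
    aligned = (op_size // stripe_width) * stripe_width
    # If op_size is smaller than one stripe, fall back to a single stripe.
    return aligned if aligned > 0 else stripe_width


def split_writes(object_size, op_size, stripe_width):
    """Return (offset, length) pairs whose offsets are all stripe_width aligned."""
    step = aligned_op_size(op_size, stripe_width)
    return [(off, min(step, object_size - off))
            for off in range(0, object_size, step)]
```

For example, with an assumed stripe_width of 12288, `split_writes(5_000_000, 1 << 22, 12288)` yields writes at offsets 0 and 4190208, both multiples of 12288. Only the last write may have a length that is not a multiple of 'stripe_width', which matches the zero-padding behaviour described in the previous comment.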

Actions #2

Updated by Lluis PJ almost 10 years ago

I created a pull request for the second option.

https://github.com/ceph/ceph/pull/1981/

Actions #3

Updated by Loïc Dachary almost 10 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 80

Actions #4

Updated by Loïc Dachary almost 10 years ago

  • Subject changed from EC pool fails for certain (k,m) combinations for >4MB objs to erasure-code: rados command does not enforce alignment constraints
  • Description updated (diff)
  • Category changed from OSD to 26

Actions #5

Updated by Lluis PJ almost 10 years ago

Loic,

I had some problems squashing the commits, so I created a new pull request:
https://github.com/ceph/ceph/pull/1984/

Actions #6

Updated by Loïc Dachary almost 10 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100

This is fine, great work :-)

Actions #7

Updated by Loïc Dachary almost 10 years ago

Would you have time to review or try this: https://github.com/ceph/ceph/pull/1987 ?

Actions #8

Updated by Ian Colle almost 10 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to Firefly

Loic - this needs to be backported to Firefly.

Actions #9

Updated by Loïc Dachary almost 10 years ago

  • % Done changed from 100 to 90

Needs to be backported along with https://github.com/ceph/ceph/pull/2020 which fixes a bug introduced by the fix :-/

Actions #10

Updated by Loïc Dachary almost 10 years ago

  • Target version set to 0.82

Actions #11

Updated by Sage Weil over 9 years ago

  • Status changed from Pending Backport to Resolved

commit:7a58da53ebfcaaf385c21403b654d1d2f1508e1a

Actions #12

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (26)
  • Target version deleted (0.82)