Bug #6336

closed

xattrs limit breaks rgw large objects

Added by Yehuda Sadeh over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A 64k xattr limit was introduced in dumpling; however, it completely breaks rgw when uploading large objects. Currently the workaround is to set:

osd max attr size = <very big number here>
Actions #1

Updated by Yehuda Sadeh over 10 years ago

  • Description updated (diff)
Actions #2

Updated by Ian Colle over 10 years ago

  • Assignee set to Greg Farnum
  • Priority changed from High to Urgent
Actions #3

Updated by Greg Farnum over 10 years ago

  • Assignee changed from Greg Farnum to Ian Colle

We should have a discussion with Sam and Sage before taking any action on this. Things I can tell you:
1) The limit was put in place to prevent users from setting xattrs larger than Linux will accept. Linux has a 64k limit, so we set ours there as well.
2) This follows a general pattern: the OSD layer needs to make sure that writes will succeed before handing them to the FileStore/underlying FS, because we don't have a good feedback loop from there to the client replies.
3) Any large objects written prior to adding this check would have been silently broken if the OSD was depending on xattrs (instead of leveldb) for xattr storage.
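
For illustration, the kind of up-front check described in points 1 and 2 could look roughly like the sketch below. This is not the actual OSD code: the function name `check_setxattr`, the constant `MAX_ATTR_SIZE`, and the choice of `EFBIG` as the error are assumptions made for the example.

```python
import errno

# Assumption: 64 KiB mirrors the Linux per-xattr value limit mentioned above.
MAX_ATTR_SIZE = 64 * 1024


def check_setxattr(name: str, value: bytes, max_size: int = MAX_ATTR_SIZE) -> int:
    """Return 0 if the write may proceed, or a negative errno if it is rejected.

    Hypothetical helper: it rejects an oversized value before the write is
    handed to the backing store, so the client gets an explicit error rather
    than a failure the OSD cannot report back.
    """
    if len(value) > max_size:
        return -errno.EFBIG
    return 0
```

A value of exactly `max_size` bytes passes; one byte more is rejected. An rgw manifest that grows with the number of object parts would eventually trip a check like this.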

So we have two options going forward: we make the gateway deal with this properly, or we do something in the OSD to allow xattrs of unlimited size. We aren't going to allow unlimited-size xattrs in the OSD (because even with leveldb it will eventually start doing bad things), so what that actually becomes is "allow much larger xattrs and hope that no rgw user creates an object with enough pieces to hit that larger limit".

You can see where I'm going with this. ;)

If we do want to make the OSD handle larger xattrs, we need to store them in leveldb. We can either switch to using leveldb instead of xattrs, period, or we can do something where we link too-large xattrs to a leveldb entry. I'm not sure what the performance implications of the first option are; for the second it would definitely slow things down.

Either way, this isn't a quick fix, and whatever solution we come up with needs to be benchmarked for suitability under real use.
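
As a rough sketch of the second option (linking too-large xattrs to a leveldb entry), the toy model below uses plain dicts to stand in for both the filesystem xattrs and leveldb. The class name, the method names, the `inline`/`link` markers, and the 64 KiB threshold are all assumptions for illustration, not Ceph code.

```python
class HybridAttrStore:
    """Toy model: small values stay inline, large ones spill to a KV store."""

    def __init__(self, inline_limit: int = 64 * 1024):
        self.inline_limit = inline_limit  # assumed threshold
        self.fs_xattrs = {}  # stands in for filesystem xattrs
        self.kv = {}         # stands in for leveldb

    def set_attr(self, obj: str, name: str, value: bytes) -> None:
        kv_key = f"{obj}/{name}"
        if len(value) <= self.inline_limit:
            # Small value: store inline and drop any stale spilled copy.
            self.kv.pop(kv_key, None)
            self.fs_xattrs[(obj, name)] = ("inline", value)
        else:
            # Large value: store in the KV store and leave a link behind.
            self.kv[kv_key] = value
            self.fs_xattrs[(obj, name)] = ("link", kv_key)

    def get_attr(self, obj: str, name: str) -> bytes:
        kind, payload = self.fs_xattrs[(obj, name)]
        if kind == "inline":
            return payload
        # Linked value: this second lookup is the extra cost on the read path.
        return self.kv[payload]
```

The second lookup in `get_attr` is where the slowdown mentioned above would come from, which is why benchmarking matters before picking either option.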

Actions #4

Updated by Yehuda Sadeh over 10 years ago

1. First of all, we already split xattrs into multiple xattrs in the osd when they get beyond a certain size. There are certain file systems where this is a problem anyway, but I don't think that's true of either xfs or btrfs.
In any case, it doesn't really matter, because that's what we did up until now. So we cannot just push a backend-side change that breaks everything without changing the gateway.
2. As for what to do in the future, I'm sure we can come up with a different, better solution, but that doesn't change (1).
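
The splitting mentioned in (1) can be sketched as follows. This is a simplified model, not the osd's chaining code: a dict stands in for the file's xattrs, and the 2048-byte chunk size and the `name@N` naming scheme are assumptions for the example.

```python
CHUNK_SIZE = 2048  # assumed per-xattr chunk size


def chunk_name(name: str, index: int) -> str:
    """First chunk keeps the plain name; later chunks get an @N suffix (assumed scheme)."""
    return name if index == 0 else f"{name}@{index}"


def chain_remove(store: dict, name: str) -> None:
    """Delete every chunk belonging to `name`."""
    index = 0
    while chunk_name(name, index) in store:
        del store[chunk_name(name, index)]
        index += 1


def chain_set(store: dict, name: str, value: bytes, chunk: int = CHUNK_SIZE) -> None:
    """Write `value` as a chain of xattrs, each at most `chunk` bytes."""
    chain_remove(store, name)  # avoid leaving stale trailing chunks
    pieces = [value[i:i + chunk] for i in range(0, len(value), chunk)] or [b""]
    for index, piece in enumerate(pieces):
        store[chunk_name(name, index)] = piece


def chain_get(store: dict, name: str) -> bytes:
    """Reassemble a value by reading chunks until one is missing."""
    parts = []
    index = 0
    while chunk_name(name, index) in store:
        parts.append(store[chunk_name(name, index)])
        index += 1
    if not parts:
        raise KeyError(name)
    return b"".join(parts)
```

With chaining like this, no single xattr exceeds the chunk size, but the total still counts against whatever per-file xattr space the underlying file system allows.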

Actions #5

Updated by Greg Farnum over 10 years ago

Huh, I'd forgotten about that. You're right of course. I'm not sure what issue Sage ran into that he thought we needed to add this, but I presume we hit some problems (in particular, there are some commits since that was first implemented that reference getting ENOSPC back on xattr writes, despite the presence of the chaining code).

Actions #6

Updated by Sage Weil over 10 years ago

Greg Farnum wrote:

Huh, I'd forgotten about that. You're right of course. I'm not sure what issue Sage ran into that he thought we needed to add this, but I presume we hit some problems (in particular, there are some commits since that was first implemented that reference getting ENOSPC back on xattr writes, despite the presence of the chaining code).

Yeah, I was just wrong. This should either be reverted or, at the very least, we should set the limit way higher than 64 KB, maybe 1 MB or something. Yehuda, do you think a high limit of 1 MB is okay, or should we remove the limit entirely? It seems like at some point we should fail instead of going nuts with chained xattrs...

Actions #7

Updated by Yehuda Sadeh over 10 years ago

I don't think 1 MB is OK; we should revert it for dumpling. I opened two more issues for revisiting the object manifest.

Actions #8

Updated by Sage Weil over 10 years ago

  • Status changed from New to Fix Under Review

wip-6336

Actions #9

Updated by Ian Colle over 10 years ago

  • Assignee changed from Ian Colle to Greg Farnum

please review

Actions #10

Updated by Ian Colle over 10 years ago

  • Assignee changed from Greg Farnum to Yehuda Sadeh

Yehuda, please confirm

Actions #11

Updated by Sage Weil over 10 years ago

  • Status changed from Fix Under Review to Resolved