Bug #8442

closed

rgw: does not detect/adapt to erasure pool stripe size

Added by Jingjing Zhao almost 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Steps to reproduce:

1. Put a 4 MB object through the RADOS gateway (S3).
2. The request returns an error: HTTP/1.1 500 Internal Server Error.
3. The object is not written to the cluster.

RADOS gateway log:

2014-05-27 07:38:05.353186 7f65e73f1700 2 req 61:0.000547:s3:PUT /my_bucket/4MB:put_obj:verifying op params
2014-05-27 07:38:05.353189 7f65e73f1700 2 req 61:0.000550:s3:PUT /my_bucket/4MB:put_obj:executing
2014-05-27 07:38:05.357837 7f65e73f1700 1 -- IP1 --> IP2 -- osd_op(client.4273.0:12239 default.4273.1__shadow_.CRFflfLsbIteptc69HYb7vEaRhVsVVe_1 [writefull 0~524288] 7.623b17f2 ack+ondisk+write e210) v4 -- ?+0 0x7f64ec00c060 con 0x22e6fc0
2014-05-27 07:38:05.360712 7f65e73f1700 1 -- IP1 --> IP2 -- osd_op(client.4273.0:12240 default.4273.1__shadow_.CRFflfLsbIteptc69HYb7vEaRhVsVVe_1 [write 524288~524288] 7.623b17f2 ack+ondisk+write e210) v4 -- ?+0 0x7f64ec00cd30 con 0x22e6fc0
2014-05-27 07:38:05.370620 7f666f841700 1 -- IP1 <== osd.22 IP2 ==== osd_op_reply(12240 default.4273.1__shadow_.CRFflfLsbIteptc69HYb7vEaRhVsVVe_1 [write 524288~524288] v0'0 uv0 ondisk = -95 ((95) Operation not supported)) v6 ==== 224+0+0 (1784400611 0 0) 0x7f6618002480 con 0x22e6fc0

Analysis:

1. The log shows that the OP_WRITE operation failed
(e.g. for [write 524288~524288] the OSD replied -95 ((95) Operation not supported)).

2. Each OP_WRITE writes in units of 512 KB
(e.g. [write 524288~524288] [write 1048576~524288]; the first number is the offset of the write and the second number is its length).

3. The source code of OP_WRITE for erasure coding:

    if (pool.info.requires_aligned_append() &&
        (op.extent.offset % pool.info.required_alignment() != 0)) {
      result = -EOPNOTSUPP;
      break;
    }
Every write offset (a multiple of the basic 512 KB unit) is taken modulo the alignment. If the offset is not an exact multiple of the alignment, an error is returned. The alignment is derived from the erasure code profile (parameters such as w, packetsize and so on).

4. In my test, I use a 4 MB object and the alignment is 640 KB.
The object is divided into two parts:
512 KB --> the first chunk, by default
3.5 MB --> the second part, which is written in seven operations of 512 KB each.
The error occurs in the second part, because the 512 KB offsets are not all multiples of 640 KB.
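
The arithmetic in point 4 can be reproduced with a small stand-alone sketch. It is not Ceph code: `is_aligned` and `first_rejected_offset` are hypothetical helpers that mirror only the modulo test quoted in point 3, and the 512 KB write unit, 640 KB alignment and 4 MB object size are simply the values from this report.

```cpp
#include <cstdint>

// Values taken from this report (assumptions, not read from a real pool).
constexpr std::uint64_t kWriteUnit  = 512 * 1024;       // size of each write op
constexpr std::uint64_t kAlignment  = 640 * 1024;       // alignment reported for the pool
constexpr std::uint64_t kObjectSize = 4 * 1024 * 1024;  // size of the uploaded object

// Mirrors the quoted check: an offset passes only if it is an exact
// multiple of the alignment.
inline bool is_aligned(std::uint64_t offset, std::uint64_t alignment) {
  return offset % alignment == 0;
}

// Walks the write offsets 0, unit, 2*unit, ... below object_size and returns
// the first offset the check would reject, or object_size if all pass.
inline std::uint64_t first_rejected_offset(std::uint64_t object_size,
                                           std::uint64_t unit,
                                           std::uint64_t alignment) {
  for (std::uint64_t off = 0; off < object_size; off += unit) {
    if (!is_aligned(off, alignment)) {
      return off;
    }
  }
  return object_size;
}
```

With the values above, `first_rejected_offset(kObjectSize, kWriteUnit, kAlignment)` returns 524288, which is the offset of the failing `[write 524288~524288]` in the log. Offset 0 passes, which is consistent with the first `writefull 0~524288` not being rejected.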

Questions:

1. Why is this condition needed in the source? [op.extent.offset % pool.info.required_alignment() != 0]
2. Why does OP_WRITE write 512 KB at a time?
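
Related to question 2, one conceivable way for a client to avoid the rejected offsets is to round its write unit up to a multiple of the pool alignment. The sketch below is only an illustration under that assumption: `round_up_to_alignment` is a hypothetical helper, and nothing here describes how rgw actually handles this.

```cpp
#include <cstdint>

// Hypothetical helper: smallest multiple of `alignment` that is >= `size`.
// Returns `size` unchanged when no alignment is imposed (alignment == 0).
inline std::uint64_t round_up_to_alignment(std::uint64_t size,
                                           std::uint64_t alignment) {
  if (alignment == 0) {
    return size;
  }
  const std::uint64_t remainder = size % alignment;
  return remainder == 0 ? size : size + (alignment - remainder);
}
```

For the values in this report, `round_up_to_alignment(512 * 1024, 640 * 1024)` gives 655360 (640 KB). Writes issued in units of that size start at offsets 0, 640 KB, 1280 KB and so on, all of which satisfy the modulo check.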


Related issues: 1 (0 open, 1 closed)

Has duplicate: rgw - Bug #8693: rgw: doesn't not automatically detect stripe size (Duplicate, 06/29/2014)
