Bug #4242 (closed)

krbd: xfstest 259 failure (FS size near 4TB)

Added by Alex Elder about 11 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have now seen this at least twice. It seems to be intermittent,
but it might be important. I'm not marking it high priority yet,
but it might need to be.

Running xfstests #259 shows that mkfs.xfs is unable to make an
XFS file system on a loop device attached to a file in an XFS
file system when the underlying file has a size close to 2^42 (4 TB).

The mkfs.xfs is getting EIO for files at sizes 2^42-1,
2^42-512, 2^42-1024, 2^42-1025, 2^42-2048, and 2^42-4096
(all the sizes the test tries).
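
For context, here is a rough reproduction sketch of what the test exercises: create a sparse backing file just under 4 TB, attach it to a loop device, and run mkfs.xfs against that device. This is a hedged illustration in C, not the actual xfstests 259 script; the backing file path, the loop setup, and the mkfs invocation are all assumptions.

/* Sketch only: approximates what xfstest 259 exercises; not the test itself. */
#include <fcntl.h>
#include <linux/loop.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
        const char *backing = "/mnt/test/259.img";      /* assumed scratch path */
        unsigned long long size = (1ULL << 42) - 1;     /* 4 TB - 1, one of the sizes tested */
        char loopdev[32], cmd[128];
        int bfd, cfd, lfd, num;

        bfd = open(backing, O_RDWR | O_CREAT, 0644);
        if (bfd < 0 || ftruncate(bfd, (off_t)size) < 0) {
                perror("backing file");
                return 1;
        }

        /* Ask the kernel for a free loop device and attach the file to it. */
        cfd = open("/dev/loop-control", O_RDWR);
        if (cfd < 0 || (num = ioctl(cfd, LOOP_CTL_GET_FREE)) < 0) {
                perror("loop-control");
                return 1;
        }
        snprintf(loopdev, sizeof(loopdev), "/dev/loop%d", num);
        lfd = open(loopdev, O_RDWR);
        if (lfd < 0 || ioctl(lfd, LOOP_SET_FD, bfd) < 0) {
                perror("loop setup");
                return 1;
        }

        /* mkfs.xfs zeroes regions near the end of the device; that is where
         * the EIO / short writes in this bug were reported. */
        snprintf(cmd, sizeof(cmd), "mkfs.xfs -f %s", loopdev);
        if (system(cmd) != 0)
                fprintf(stderr, "mkfs failed!\n");

        ioctl(lfd, LOOP_CLR_FD, 0);     /* detach when done */
        return 0;
}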

Below is what the output looks like.

259 - output mismatch (see 259.out.bad)
--- 259.out 2013-02-22 11:22:27.606208645 -0800
+++ 259.out.bad 2013-02-22 12:54:43.832930774 -0800
@@ -1,7 +1,17 @@
QA output created by 259
Trying to make (4 TB - 1) B long xfs fs image
+mkfs.xfs: libxfs_device_zero write failed: Input/output error
+mkfs failed!
Trying to make (4 TB - 512) B long xfs fs image
+mkfs.xfs: libxfs_device_zero write failed: Input/output error
+mkfs failed!
Trying to make (4 TB - 1024) B long xfs fs image
+mkfs.xfs: libxfs_device_zero write failed: Input/output error
+mkfs failed!
Trying to make (4 TB - 1025) B long xfs fs image
+mkfs.xfs: libxfs_device_zero write failed: Input/output error
+mkfs failed!
Trying to make (4 TB - 2048) B long xfs fs image
+mkfs.xfs: libxfs_device_zero write failed: Input/output error
+mkfs failed!
Trying to make (4 TB - 4096) B long xfs fs image

Actions #1

Updated by Alex Elder about 11 years ago

Whoops, it was 4 TB (2^42) not 4 GB.

I'll update what I originally wrote shortly.

Actions #2

Updated by Alex Elder about 11 years ago

  • Priority changed from Normal to High

I"m seeing this fairly consistently now.

Actions #3

Updated by Alex Elder about 11 years ago

  • Status changed from New to In Progress

I tried to reproduce this with a smaller subset of tests
(just 258 and then 259 repeatedly) over the weekend but
the problem did not occur.

I added some printk() calls that will report information in
the event an error is ever going to get returned to the
initiator of an object request. I'm running a test now
(full set of xfstests) in hopes it will show up and I can
get more information about what the source of the error
is. My suspicion is it's due to an error coming back
from the osd but I need to see evidence of that.
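
Roughly the kind of instrumentation being described, sketched here with hypothetical names (the helper and its arguments are assumptions for illustration, not the actual patch to drivers/block/rbd.c):

#include <linux/printk.h>
#include <linux/types.h>

/*
 * Hypothetical debugging hook, called just before an object request's
 * result is handed back to whoever initiated it.  At this stage only
 * outright errors are reported.
 */
static void rbd_obj_request_debug_result(const char *which, int result,
                                         u64 offset, u64 length)
{
        if (result < 0)
                pr_err("rbd: %s request error %d at offset %llu, length %llu\n",
                       which, result, offset, length);
}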

Actions #4

Updated by Alex Elder about 11 years ago

The tests I started running this morning finally hit number 259.
It did have the problem, but I got no messages from rbd as I
had hoped. Here is everything in /var/log/syslog on plana09
related to running that test:

logger: run xfstest 259
kernel: [ 8766.063203] loop: Write error at byte offset 2199540084736, length 4096.
kernel: [ 8775.544935] loop: Write error at byte offset 2199540084736, length 4096.
kernel: [ 8783.747841] loop: Write error at byte offset 2199540084736, length 4096.
kernel: [ 8792.033191] loop: Write error at byte offset 2199540084736, length 4096.
kernel: [ 8793.332988] libceph: osd2 10.214.131.33:6806 socket closed (con state OPEN)
kernel: [ 8854.762482] loop: Write error at byte offset 2199540084736, length 4096.
udevd[12135]: failed to execute '/lib/udev/ceph-rbdnamer' 'ceph-rbdnamer %n': No such file or directory
libceph: osd0 10.214.131.33:6800 socket closed (con state OPEN)
kernel: [ 9008.283152] libceph: osd2 10.214.131.33:6806 socket closed (con state OPEN)
kernel: [ 9008.425846] libceph: osd3 10.214.131.29:6800 socket closed (con state OPEN)
Feb 25 10:43:21 plana09 kernel: [ 9008.518197] XFS (rbd2): Mounting Filesystem
kernel: [ 9012.473347] XFS (rbd2): Ending clean mount

(Note I am seeing the "udevd... ceph-rbdnamer" message for every test.)

Actions #5

Updated by Alex Elder about 11 years ago

I was only reporting when an error code came back from
the osd. However, the loop device function that emits the
"Write error" message prints it whenever the number of bytes
actually written differs from what was requested.

I'm going to update my code to look for that condition
and print information in that case as well.
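
In other words, the check needs to mirror the loop driver's condition (bytes actually transferred differing from bytes requested), not just a negative result code. A minimal sketch of the extended hook, again with hypothetical names:

#include <linux/printk.h>
#include <linux/types.h>

/*
 * Hypothetical extension of the earlier hook: flag short transfers
 * (xferred != length) as well as errors, since a short transfer is the
 * condition the loop driver turns into its "Write error" message.
 */
static void rbd_obj_request_debug_result(const char *which, int result,
                                         u64 offset, u64 length, u64 xferred)
{
        if (result < 0)
                pr_err("rbd: %s request error %d at offset %llu, length %llu\n",
                       which, result, offset, length);
        else if (xferred != length)
                pr_err("rbd: %s request short transfer: %llu of %llu bytes at offset %llu\n",
                       which, xferred, length, offset);
}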

Actions #6

Updated by Sage Weil about 11 years ago

BTW fixed the rbd namer thing; pull latest teuthology.git.

Actions #7

Updated by Alex Elder about 11 years ago

> BTW fixed the rbd namer thing; pull latest teuthology.git.

Well, "fixed" is generous. It had a bug, which I've repaired,
roughly:
- 'TESTDIR={tid}'.format(tdir=testdir),
+ 'TESTDIR={tdir}'.format(tdir=testdir),

Actions #8

Updated by Alex Elder about 11 years ago

Requests to write 4096 bytes came back from the osd with a
written count of 2048 bytes.

Now I'm afraid I have to determine whether the op sent to the
osd was in fact 4096 bytes, or whether the op got truncated
to 2048 bytes before getting sent to the osd (see the sketch
after the log excerpt below)...

kernel: [10684.388477] XFS (rbd2): Ending clean mount
logger: run xfstest 259
kernel: [10685.463899] loop: Write error (2048) at byte offset 2199538393088, length 4096.
kernel: [10694.227756] loop: Write error (2048) at byte offset 2199538393088, length 4096.
kernel: [10702.292614] loop: Write error (2048) at byte offset 2199538393088, length 4096.
kernel: [10703.190584] libceph: osd3 10.214.131.29:6800 socket closed (con state OPEN)
kernel: [10710.403547] loop: Write error (2048) at byte offset 2199538393088, length 4096.
kernel: [10720.122285] loop: Write error (2048) at byte offset 2199538393088, length 4096.
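
To distinguish the two cases, the natural place to check is at build/send time: compare the extent length encoded into the outgoing osd op with the length the block layer asked rbd to transfer. A hedged sketch; the parameters here are placeholders, not the actual struct members:

#include <linux/printk.h>
#include <linux/types.h>

/*
 * Hypothetical send-time check: warn if the length wired into the osd
 * op no longer matches what the original object request asked for,
 * i.e. the op was truncated before it ever reached the osd.
 */
static void rbd_debug_check_op_length(u64 op_extent_len, u64 requested_len,
                                      u64 image_offset)
{
        if (op_extent_len != requested_len)
                pr_warn("rbd: op truncated before send: %llu of %llu bytes at image offset %llu\n",
                        op_extent_len, requested_len, image_offset);
}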
Actions #9

Updated by Alex Elder about 11 years ago

The output did not indicate that rbd thought it
did something wrong.

That is, the object request came back and indicated
that it had transferred 2048 bytes, and that's how
many bytes the object request seemed to think it
was supposed to have.

That means that something's probably going wrong while
building up object (and image) requests.

I'm going to look a little more closely at that code,
especially given the specific request we know is
(sometimes) triggering the problem:
write 4096 bytes at image byte offset 2199538393088
which for some reason is writing only 2048 bytes
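
One quick sanity check on that request: does the 4096-byte write straddle an rbd object boundary? If it lands entirely inside one object, splitting across objects cannot explain a 2048-byte partial result. A small self-contained check, assuming rbd's default 4 MiB object size (the image under test may have used a different object order):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /* The failing request reported above. */
        uint64_t image_offset = 2199538393088ULL;
        uint64_t length = 4096;

        /* Assumption: default rbd object size of 4 MiB (order 22). */
        uint64_t obj_size = 1ULL << 22;

        uint64_t obj_num = image_offset / obj_size;
        uint64_t obj_off = image_offset % obj_size;
        int crosses = (obj_off + length) > obj_size;

        printf("object %llu, offset within object %llu, length %llu\n",
               (unsigned long long)obj_num,
               (unsigned long long)obj_off,
               (unsigned long long)length);
        printf("crosses object boundary: %s\n", crosses ? "yes" : "no");
        return 0;
}

Under those assumptions the write stays within a single object, which is consistent with the suspicion above that the object request itself is being built with the wrong length rather than being split across objects.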

Actions #10

Updated by Alex Elder about 11 years ago

Of course it's not going to be easy... The problem showed up
this time at offset 2199533527040 instead of 2199538393088.

The offsets written should be predictable because mkfs.xfs
chooses its write offsets algorithmically based on the size
of the volume. (But I'll go verify that, just to be sure.)

logger: run xfstest 259
kernel: [ 8791.773490] loop: Write error (2048) at byte offset 2199533527040, length 4096.
kernel: [ 8797.239992] libceph: osd5 10.214.131.29:6806 socket closed (con state OPEN)
kernel: [ 8800.002233] loop: Write error (2048) at byte offset 2199533527040, length 4096.
kernel: [ 8844.351473] loop: Write error (2048) at byte offset 2199533527040, length 4096.
kernel: [ 8849.588921] libceph: osd1 10.214.131.33:6803 socket closed (con state OPEN)
kernel: [ 8849.654201] libceph: osd4 10.214.131.29:6803 socket closed (con state OPEN)
kernel: [ 8849.694890] rbd: obj_request ffff88020d3eca00 was already done
kernel: [ 8849.694890]
kernel: [ 8849.759928] rbd: obj_request ffff88021dd17400 was already done
kernel: [ 8849.759928]
Feb 25 23:45:01 plana09 CRON[12510]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
kernel: [ 8879.369612] libceph: osd1 10.214.131.33:6803 socket closed (con state OPEN)
kernel: [ 8894.720698] loop: Write error (2048) at byte offset 2199533527040, length 4096.
kernel: [ 8935.213701] loop: Write error (2048) at byte offset 2199533527040, length 4096.
kernel: [ 8969.847434] libceph: osd1 10.214.131.33:6803 socket closed (con state OPEN)
udevd[12620]: failed to execute '/lib/udev/ceph-rbdnamer' 'ceph-rbdnamer %n': No such file or directory
kernel: [ 9056.789041] libceph: osd1 10.214.131.33:6803 socket closed (con state OPEN)
kernel: [ 9057.149973] XFS (rbd2): Mounting Filesystem
kernel: [ 9057.447096] XFS (rbd2): Ending clean mount

Oh, and the "obj_request ... was already done" is new code, and I
attribute this to the connection being closed on an error injection
and a lingering watch request was re-sent and completed again.
(There is a bug open to fix this but I can't find it right now.)

Actions #11

Updated by Alex Elder about 11 years ago

The issue for making sure lingering requests are not completed
more than once is http://tracker.ceph.com/issues/3967.
(Just following up on my previous post.)

Actions #12

Updated by Tamilarasi muthamizhan about 11 years ago

ubuntu@teuthology:/a/teuthology-2013-02-28_20:00:04-regression-argonaut-master-basic/14358

Actions #13

Updated by Alex Elder about 11 years ago

I have been running tests all afternoon on two different
clusters with some code in place to catch this sort
of error and print more information if it ever occurs.
I have yet to reproduce the problem (today).

The code I've been using is based on the current "testing"
branch, but I've just updated it and will continue testing
with the result.
testing d51342b libceph: initialize data fields on last msg put

branch wip-4242 adds these commits:
82903f9 rbd: look for the magic short request
199a8b7 rbd: add rbd_img_request_print()
75bfd63 loop: print more error info with write errors
4a31501 rbd: print something when errors occur
0268ae7 rbd: add rbd_obj_request_print()
f1bb004 libceph: verify requests queued in order
d51342b libceph: initialize data fields on last msg put

I ran tests 257-259 through 100 iterations on two clusters
and the problem didn't show up. I'm now running this set
of tests in hopes of getting it to happen:
214 221 222 236 241 243 245 246 247 248 249 253 257 258 259

Once it occurs I hope to have a little more to work with.

Actions #14

Updated by Alex Elder about 11 years ago

  • Priority changed from High to Normal

150 iterations of running this set of tests and I haven't
seen the problem reproduce:
214 221 222 236 241 243 245 246 247 248 249 253 257 258 259

I also ran a "full" set of tests and it didn't happen there
either.

I'm going to keep trying, but until it shows up again I'm
lowering the priority of this problem.

Actions #15

Updated by Alex Elder about 11 years ago

Another 100 iterations of these tests without reproducing
the original problem:
214 221 222 236 241 243 245 246 247 248 249 253 257 258 259

I think I'm going to set this problem aside for a little while.

Actions #16

Updated by Alex Elder about 11 years ago

I just hit this again with the current testing branch.
testing e7fce31 rbd: issue a copyup for layered writes

Actions #17

Updated by Ian Colle over 10 years ago

  • Status changed from In Progress to New
  • Assignee deleted (Alex Elder)
Actions #18

Updated by Sage Weil over 10 years ago

  • Status changed from New to Closed