
Tasks #2138

rbd: run xfstests on a local XFS filesystem over RBD

Added by Alex Elder almost 12 years ago. Updated almost 12 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Tags:
Reviewed:
Affected Versions:

Description

This still can't really be done cleanly in a teuthology script, but
I would like to run xfstests on an RBD client system using a local
XFS file system built on top of an RBD device (or devices).

It would also be good to run the tests on a local filesystem built
on a direct-attached spinning disk, to give a comparison result.
(Some xfstests produce expected failures--often just output differences
due to peculiarities of systems.) This could also be used for
comparing performance.

Our goal should be for all xfstests to run with essentially the same
success rate using RBD as is achieved using a local disk.

History

#1 Updated by Alex Elder almost 12 years ago

After setting up two rbd devices, making some fairly simple changes
to xfstests, and setting up the appropriate environment variables to
point the test and scratch filesystems at the rbd devices, I was able
to run xfstests up to test 049. At that point the system locked up.

Test 049 runs XFS on a loop device, which in this case may itself
be backed by an rbd device. So my first theory was that there was a
deadlock involved in this usage. But the messages on the console
may indicate another as-yet-unexplained problem, so we can't really
draw any deeper conclusions yet.

The other observation was that there appeared to be some strange,
long delays while the tests were running. The delays I noticed may
have occurred during mount processing, but having run through the
tests only once (and not having run them on a local device recently)
I need to look a little more closely to characterize this.

In any case this task is well underway, and I think there will be
lots to learn from this process. I will be submitting to the XFS
mailing list the changes needed to support this testing once I've
given it a little more exercise.

#2 Updated by Alex Elder almost 12 years ago

  • Status changed from New to Resolved

I have two files that implement automated testing using
xfstests over rbd devices.

One is now in the ceph git tree, under qa/run_xfstests.sh

commit bf8847e7c14a52467950c8f3bab88e43f660e3c0
Author: Alex Elder <>
Date: Fri Apr 13 21:26:22 2012 -0500
qa: add run_xfsests.sh script

The other is in the teuthology git tree, as a modification
to teuthology/task/rbd.py:

commit 6ba4efcd3adbd6ed7b21147df484b02a3aafbaf8
Author: Sage Weil <>
Date: Fri Apr 13 22:28:05 2012 -0700
rbd.py: add xfstests functionality

The shell script runs a set of xfstests scripts against two
devices that are provided as arguments. The arguments are:
    -s <scratch_dev>
    -t <test_dev>
    -f <fs_type> (default xfs)
    [<test_spec> [<test_spec>...]]

Tests in the suite will use the scratch device for destructive
testing. Tests are not expected to damage the test device; it
is expected to contain an existing filesystem, and it is used as a
second filesystem by tests that need one. Both of these devices are
mandatory arguments.
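The option handling described above might be sketched as follows. This is an illustrative fragment only, not the actual run_xfstests.sh code; the function and variable names are assumptions.

```shell
# Sketch of the option parsing described above (illustrative only;
# not the actual run_xfstests.sh implementation).
parse_args() {
    FS_TYPE="xfs"        # -f defaults to xfs
    SCRATCH_DEV=""
    TEST_DEV=""
    OPTIND=1             # reset so the function can be called repeatedly

    while getopts "s:t:f:" opt; do
        case "${opt}" in
            s) SCRATCH_DEV="${OPTARG}" ;;
            t) TEST_DEV="${OPTARG}" ;;
            f) FS_TYPE="${OPTARG}" ;;
            *) return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    TEST_SPECS="$*"      # remaining arguments are <test_spec>s

    # both the scratch and test devices are mandatory
    [ -n "${SCRATCH_DEV}" ] && [ -n "${TEST_DEV}" ]
}

parse_args -t /dev/rbd0 -s /dev/rbd1 1-9 11-15
echo "fs_type=${FS_TYPE} test=${TEST_DEV} scratch=${SCRATCH_DEV} specs=${TEST_SPECS}"
```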

By default, tests use xfs as the filesystem type, but btrfs or
ext4 can be used instead by indicating that with the "-f" flag.
(Other filesystem types may also work--now or in the future.)

Some subset of tests is used by default. Tests are put into
named groups, and normally xfstests will run everything in the
"auto" group (meant for automated regression testing--tests in
this group are expected to normally pass).

For the initial checkin I laid out the set of tests I found
worked when using an xfs filesystem over rbd devices. There
were 97 such tests. Other tests should be added back over
time, and ideally rbd devices should pass exactly the same
tests that local devices do.

In any case, specific tests can be specified. A <test_spec>
can indicate a named group, e.g.:
    g auto
It can also be a list of one or more test numbers:
    1 2 10
Inclusive ranges of tests can also be specified:
    1 3-5 9 200
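Expanding the numeric specs above into individual test numbers might look like this. This is a minimal sketch of the idea only, not the script's actual implementation; group specs ("g auto") are handled by xfstests itself and are not expanded here.

```shell
# Illustrative sketch: expand numeric test specs such as "1 3-5 9"
# into individual test numbers. Not the actual run_xfstests.sh code.
expand_specs() {
    for spec in "$@"; do
        case "${spec}" in
            *-*)  # inclusive range, e.g. 3-5 -> 3 4 5
                seq "${spec%-*}" "${spec#*-}"
                ;;
            *)    # single test number
                echo "${spec}"
                ;;
        esac
    done
}

expand_specs 1 3-5 9    # prints 1 3 4 5 9, one per line
```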

I created two teuthology tasks in "rbd.py". One is a top-level
entry for running xfstests on rbd devices. The devices can be
set up explicitly, but if that isn't done the task will automatically
set up the rbd devices to use for testing, using the default
filesystem type (xfs) and the default test list.

tasks:
- rbd.xfstests:
    client.0:
      test_image: 'test_image'
      test_size: 100
      scratch_image: 'scratch_image'
      scratch_size: 250
      fs_type: 'xfs'
      tests: '1-9 11-15 17 19-21 26-28 31-34 41 45-48'

The other task is used by the one above to actually run the
test script on the target client systems. It is oriented
toward devices generically (not just rbd devices), so it could
in principle be used to run xfstests on local disk
partitions, though I have not yet tried that. This task should
not really be part of the rbd code, but I don't know the right way
to restructure it, and hopefully someone else can shepherd this
code to wherever it belongs.

tasks:
- ceph:
- rbd.run_xfstests:
    client.0:
      test_dev: 'test_dev'
      scratch_dev: 'scratch_dev'
      fs_type: 'xfs'
      tests: '1-9 11-15 17 19-21 26-28 31-34 41 45-48'

As noted earlier, test 049 seems to cause a kernel problem, which
may in fact reproduce a problem we've been having a hard time
tracking down (http://tracker.newdream.net/issues/2260
and 2287).

In addition, the teuthology test typically reports a failure because
a warning message from XFS shows up in the log, something like:
    WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/kernel/mutex-debug.c:65 mutex_remove_waiter+0x93/0x130()

That problem is tracked here: http://tracker.newdream.net/issues/2302

In any case, I think that both of these issues--along with any ongoing
work to expand the default set of tests to be run--should be addressed
separately.

So... All that being said, I think it's time to mark this particular
issue resolved.
