Feature #626

qa: add IOR, rompio, or other parallel workloads suite

Added by Greg Farnum over 12 years ago. Updated about 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Target version:
% Done: 0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:

Description

We've had reports that rompio is just terrifically unstable, and shows serious scaling issues.

IOR is a more common benchmark in this area.

History

#1 Updated by Sage Weil over 11 years ago

  • Tracker changed from Tasks to Feature
  • Subject changed from Test rompio on cfuse to qa: add IOR, rompio, or other parallel workloads suite
  • Target version set to v0.37

#2 Updated by Sage Weil over 11 years ago

  • Target version deleted (v0.37)

#3 Updated by Sage Weil over 11 years ago

IOR depends on MPI. mpich2 is pretty easy to set up (there's a package).

I think an ior task would need to:
- take a list of clients
- push a machine list to a 'master' node that starts the job (probably not where teuth itself is running)
- set up a temp ssh key so they can connect to each other (or we could make this part of the teuth worker config?)
- set up a symlink on each client so that a single path (/tmp/cephtest/sharedmnt.0 or something) gets you into the mount point on all machines
- download the ior tarball and build it (it needs to build with mpicc, which requires mpich2 to be installed)
- run ior with whatever parameters the task specifies.

Might make sense to make a generic 'mpi' task that sets up the mpi environment, and separate that from the ior bits.
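
A rough sketch of how the steps above could be split into generic MPI setup and IOR-specific pieces. This is not teuthology code: the function names, host names, mount points, and the hostfile and test file paths are all hypothetical, and it assumes an MPICH-style mpiexec that takes a machine file via -f. It only builds the commands; the ssh key setup and the tarball download/build steps are left out.

```python
# Hypothetical helpers: they only build command lines, nothing is executed.
SHARED_PATH = "/tmp/cephtest/sharedmnt.0"  # single path from the comment above


def hostfile_text(clients):
    """Return the machine list, one host per line, to push to the master node."""
    return "".join(host + "\n" for host in clients)


def symlink_commands(mounts, shared_path=SHARED_PATH):
    """Map each client to a command linking shared_path to its own mount point."""
    return {
        host: ["ln", "-sfn", mount_point, shared_path]
        for host, mount_point in mounts.items()
    }


def ior_command(hostfile, nprocs, ior_args, shared_path=SHARED_PATH):
    """Build the mpiexec command line the master node would run.

    Assumes an MPICH-style mpiexec (-f machine file, -n process count)
    and an 'ior' binary on PATH; the test file name is made up.
    """
    return (
        ["mpiexec", "-f", hostfile, "-n", str(nprocs), "ior"]
        + ["-o", shared_path + "/ior.testfile"]
        + list(ior_args)
    )


if __name__ == "__main__":
    # Hypothetical clients and mount points.
    mounts = {"client0": "/mnt/ceph.0", "client1": "/mnt/ceph.1"}
    print(hostfile_text(sorted(mounts)), end="")
    for host, cmd in sorted(symlink_commands(mounts).items()):
        print(host, " ".join(cmd))
    print(" ".join(ior_command("/tmp/cephtest/mpi-hosts", 4, ["-w", "-r"])))
```

Keeping hostfile_text and symlink_commands free of anything IOR-specific is what would let a generic 'mpi' task reuse them for other MPI workloads.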

#4 Updated by Sage Weil over 11 years ago

  • Target version set to v0.36

#7 Updated by Sage Weil over 11 years ago

  • Target version changed from v0.36 to v0.37

#8 Updated by Sage Weil over 11 years ago

  • Target version deleted (v0.37)

#9 Updated by Sage Weil about 10 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (qa)

#11 Updated by Greg Farnum about 10 years ago

SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sure what the status of that work is, but whoever does this bug will need to check into that.

#12 Updated by Sage Weil about 10 years ago

  • Status changed from New to In Progress
  • Assignee set to Sam Lang

Yeah, that's what slang's working on to enable this. Assigning this to him.

#14 Updated by Sage Weil about 10 years ago

  • Target version set to v0.57b

#15 Updated by Sam Lang about 10 years ago

  • Status changed from In Progress to Closed

Added tests to the marginal qa suite that run IOR, mdtest, and fsx-mpi.
