Bug #5761

closed

teuthology: MPI test sometimes fails with a permission denied

Added by Greg Farnum over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

End of the log:

2013-07-25T05:22:00.273 INFO:teuthology.orchestra.run.out:access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
2013-07-25T05:22:00.273 INFO:teuthology.orchestra.run.out:------    ---------  ---------- ---------  --------   --------   --------   --------   ----
2013-07-25T05:22:00.331 INFO:teuthology.orchestra.run.out:ior ERROR: open64() failed, errno 13, Permission denied (aiori-POSIX.c:158)
2013-07-25T05:22:00.332 INFO:teuthology.orchestra.run.err:application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
2013-07-25T05:22:00.862 INFO:teuthology.orchestra.run.err:Fatal error in PMPI_Reduce: Other MPI error, error stack:
2013-07-25T05:22:00.862 INFO:teuthology.orchestra.run.err:PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0x7fff5018a9bc, rbuf=0x7fff5018a9c8, count=1, MPI_INT, MPI_SUM, root=0, comm=0x84000000) failed
2013-07-25T05:22:00.863 INFO:teuthology.orchestra.run.err:MPIR_Reduce_impl(1087)..........:
2013-07-25T05:22:00.863 INFO:teuthology.orchestra.run.err:MPIR_Reduce_intra(895)..........:
2013-07-25T05:22:00.863 INFO:teuthology.orchestra.run.err:MPIR_Reduce_binomial(144).......:
2013-07-25T05:22:00.863 INFO:teuthology.orchestra.run.err:MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2

What logs we have are at /a/teuthology-2013-07-25_01:01:06-fs-next-testing-basic/82667/teuthology.log.


Related issues 1 (0 open, 1 closed)

Related to CephFS - Bug #5367: multiclient tests: kernel mount gets EPERM (Resolved, 06/15/2013)

Actions #1

Updated by Sage Weil over 10 years ago

  • Priority changed from Normal to High
Actions #2

Updated by Sage Weil over 10 years ago

  • Status changed from New to Resolved
Actions #3

Updated by Greg Farnum over 10 years ago

Essentially a duplicate of #5637: mismatched UIDs on different nodes.
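For illustration only: when the same user name maps to different numeric UIDs on different nodes, a file created by one MPI rank may not be openable by a rank on another node, which would be consistent with the errno 13 from open64() in the log above. The sketch below is not part of teuthology. The helper names and the host/UID values are hypothetical, and it assumes the UIDs have already been collected, for example by running `id -u <user>` on each node.

```python
from collections import defaultdict


def group_hosts_by_uid(uid_by_host):
    """Group host names by the numeric UID each host reports for the test user."""
    hosts_by_uid = defaultdict(list)
    for host, uid in sorted(uid_by_host.items()):
        hosts_by_uid[uid].append(host)
    return dict(hosts_by_uid)


def uids_consistent(uid_by_host):
    """Return True when every host maps the test user to the same UID."""
    return len(set(uid_by_host.values())) <= 1


# Hypothetical values; real ones would come from each test node.
reported = {"node-a": 1000, "node-b": 1000, "node-c": 1001}

if not uids_consistent(reported):
    # More than one UID group means at least one node disagrees.
    for uid, hosts in group_hosts_by_uid(reported).items():
        print(uid, hosts)
```

If more than one UID group is printed, the nodes disagree about the user's identity, and the account would need to be aligned across nodes before a multi-node run.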
