Bug #9177

closed

ceph-fuse: failing MPI mdtest runs

Added by Greg Farnum over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: High
Category: -
Target version: -
% Done: 0%
Source: Q/A
Severity: 3 - minor

Description

2014-08-19T05:06:15.018 INFO:teuthology.orchestra.run.burnupi08.stdout:mdtest-1.8.3 was launched with 3 total task(s) on 3 nodes
2014-08-19T05:06:15.018 INFO:teuthology.orchestra.run.burnupi08.stdout:Command line used: /home/ubuntu/cephtest/mdtest-1.8.4/mdtest -d /home/ubuntu/cephtest/gmnt -I 20 -z 5 -b 2 -R
2014-08-19T05:06:15.019 INFO:teuthology.orchestra.run.burnupi08.stderr:*** buffer overflow detected ***: /home/ubuntu/cephtest/mdtest-1.8.4/mdtest terminated
2014-08-19T05:06:15.019 INFO:teuthology.orchestra.run.burnupi08.stderr:======= Backtrace: =========
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/lib/x86_64-linux-gnu/libc.so.6(+0x741df)[0x7f92d81ad1df]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f92d8244bac]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/lib/x86_64-linux-gnu/libc.so.6(+0x10aa70)[0x7f92d8243a70]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/lib/x86_64-linux-gnu/libc.so.6(+0x10b004)[0x7f92d8244004]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/home/ubuntu/cephtest/mdtest-1.8.4/mdtest[0x407306]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/home/ubuntu/cephtest/mdtest-1.8.4/mdtest[0x407584]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/home/ubuntu/cephtest/mdtest-1.8.4/mdtest[0x402bb3]
2014-08-19T05:06:15.020 INFO:teuthology.orchestra.run.burnupi08.stderr:/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f92d815aec5]
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:/home/ubuntu/cephtest/mdtest-1.8.4/mdtest[0x402e06]
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:======= Memory map: ========
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:00400000-0040a000 r-xp 00000000 08:01 22152126                           /home/ubuntu/cephtest/mdtest-1.8.4/mdtest
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:00609000-0060a000 r--p 00009000 08:01 22152126                           /home/ubuntu/cephtest/mdtest-1.8.4/mdtest
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:0060a000-0060b000 rw-p 0000a000 08:01 22152126                           /home/ubuntu/cephtest/mdtest-1.8.4/mdtest
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:0060b000-0060f000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:01e67000-01e88000 rw-p 00000000 00:00 0                                  [heap]
2014-08-19T05:06:15.021 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7306000-7f92d7707000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7707000-7f92d770a000 r-xp 00000000 08:01 57147596                   /lib/x86_64-linux-gnu/libdl-2.19.so
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d770a000-7f92d7909000 ---p 00003000 08:01 57147596                   /lib/x86_64-linux-gnu/libdl-2.19.so
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7909000-7f92d790a000 r--p 00002000 08:01 57147596                   /lib/x86_64-linux-gnu/libdl-2.19.so
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d790a000-7f92d790b000 rw-p 00003000 08:01 57147596                   /lib/x86_64-linux-gnu/libdl-2.19.so
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d790b000-7f92d7921000 r-xp 00000000 08:01 57147455                   /lib/x86_64-linux-gnu/libgcc_s.so.1
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7921000-7f92d7b20000 ---p 00016000 08:01 57147455                   /lib/x86_64-linux-gnu/libgcc_s.so.1
2014-08-19T05:06:15.022 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7b20000-7f92d7b21000 rw-p 00015000 08:01 57147455                   /lib/x86_64-linux-gnu/libgcc_s.so.1
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7b21000-7f92d7b29000 r-xp 00000000 08:01 47199200                   /usr/lib/libcr.so.0.5.5
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7b29000-7f92d7d28000 ---p 00008000 08:01 47199200                   /usr/lib/libcr.so.0.5.5
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7d28000-7f92d7d29000 r--p 00007000 08:01 47199200                   /usr/lib/libcr.so.0.5.5
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7d29000-7f92d7d2a000 rw-p 00008000 08:01 47199200                   /usr/lib/libcr.so.0.5.5
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7d2a000-7f92d7d2b000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7d2b000-7f92d7d32000 r-xp 00000000 08:01 57147603                   /lib/x86_64-linux-gnu/librt-2.19.so
2014-08-19T05:06:15.023 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7d32000-7f92d7f31000 ---p 00007000 08:01 57147603                   /lib/x86_64-linux-gnu/librt-2.19.so
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7f31000-7f92d7f32000 r--p 00006000 08:01 57147603                   /lib/x86_64-linux-gnu/librt-2.19.so
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7f32000-7f92d7f33000 rw-p 00007000 08:01 57147603                   /lib/x86_64-linux-gnu/librt-2.19.so
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7f33000-7f92d7f38000 r-xp 00000000 08:01 47199206                   /usr/lib/x86_64-linux-gnu/libmpl.so.1.0.0
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d7f38000-7f92d8137000 ---p 00005000 08:01 47199206                   /usr/lib/x86_64-linux-gnu/libmpl.so.1.0.0
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8137000-7f92d8138000 r--p 00004000 08:01 47199206                   /usr/lib/x86_64-linux-gnu/libmpl.so.1.0.0
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8138000-7f92d8139000 rw-p 00005000 08:01 47199206                   /usr/lib/x86_64-linux-gnu/libmpl.so.1.0.0
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8139000-7f92d82f5000 r-xp 00000000 08:01 57147610                   /lib/x86_64-linux-gnu/libc-2.19.so
2014-08-19T05:06:15.024 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d82f5000-7f92d84f4000 ---p 001bc000 08:01 57147610                   /lib/x86_64-linux-gnu/libc-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d84f4000-7f92d84f8000 r--p 001bb000 08:01 57147610                   /lib/x86_64-linux-gnu/libc-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d84f8000-7f92d84fa000 rw-p 001bf000 08:01 57147610                   /lib/x86_64-linux-gnu/libc-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d84fa000-7f92d84ff000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d84ff000-7f92d8518000 r-xp 00000000 08:01 57147612                   /lib/x86_64-linux-gnu/libpthread-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8518000-7f92d8717000 ---p 00019000 08:01 57147612                   /lib/x86_64-linux-gnu/libpthread-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8717000-7f92d8718000 r--p 00018000 08:01 57147612                   /lib/x86_64-linux-gnu/libpthread-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8718000-7f92d8719000 rw-p 00019000 08:01 57147612                   /lib/x86_64-linux-gnu/libpthread-2.19.so
2014-08-19T05:06:15.025 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8719000-7f92d871d000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d871d000-7f92d892f000 r-xp 00000000 08:01 47199209                   /usr/lib/x86_64-linux-gnu/libmpich.so.10.0.4
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d892f000-7f92d8b2e000 ---p 00212000 08:01 47199209                   /usr/lib/x86_64-linux-gnu/libmpich.so.10.0.4
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8b2e000-7f92d8b3b000 r--p 00211000 08:01 47199209                   /usr/lib/x86_64-linux-gnu/libmpich.so.10.0.4
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8b3b000-7f92d8b3e000 rw-p 0021e000 08:01 47199209                   /usr/lib/x86_64-linux-gnu/libmpich.so.10.0.4
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8b3e000-7f92d8b77000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8b77000-7f92d8c7c000 r-xp 00000000 08:01 57147604                   /lib/x86_64-linux-gnu/libm-2.19.so
2014-08-19T05:06:15.026 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8c7c000-7f92d8e7b000 ---p 00105000 08:01 57147604                   /lib/x86_64-linux-gnu/libm-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8e7b000-7f92d8e7c000 r--p 00104000 08:01 57147604                   /lib/x86_64-linux-gnu/libm-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8e7c000-7f92d8e7d000 rw-p 00105000 08:01 57147604                   /lib/x86_64-linux-gnu/libm-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d8e7d000-7f92d8ea0000 r-xp 00000000 08:01 57147595                   /lib/x86_64-linux-gnu/ld-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d908a000-7f92d9090000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d909c000-7f92d909f000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d909f000-7f92d90a0000 r--p 00022000 08:01 57147595                   /lib/x86_64-linux-gnu/ld-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d90a0000-7f92d90a1000 rw-p 00023000 08:01 57147595                   /lib/x86_64-linux-gnu/ld-2.19.so
2014-08-19T05:06:15.027 INFO:teuthology.orchestra.run.burnupi08.stderr:7f92d90a1000-7f92d90a2000 rw-p 00000000 00:00 0
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:7fff8c9c9000-7fff8c9ea000 rw-p 00000000 00:00 0                          [stack]
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:7fff8c9fd000-7fff8c9fe000 r-xp 00000000 00:00 0                          [vdso]
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:7fff8c9fe000-7fff8ca00000 r--p 00000000 00:00 0                          [vvar]
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:[proxy:0:1@burnupi61] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:886): assert (!closed) failed
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:[proxy:0:1@burnupi61] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:[proxy:0:1@burnupi61] [proxy:0:2@burnupi29] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:886): assert (!closed) failed
2014-08-19T05:06:15.028 INFO:teuthology.orchestra.run.burnupi08.stderr:[proxy:0:2@burnupi29] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stderr:[proxy:0:2@burnupi29] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stderr:main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stdout:
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stdout:===================================================================================
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stdout:=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stdout:=   EXIT CODE: 6
2014-08-19T05:06:15.029 INFO:teuthology.orchestra.run.burnupi08.stdout:=   CLEANING UP REMAINING PROCESSES
2014-08-19T05:06:15.030 INFO:teuthology.orchestra.run.burnupi08.stdout:=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
2014-08-19T05:06:15.030 INFO:teuthology.orchestra.run.burnupi08.stdout:===================================================================================
2014-08-19T05:06:15.030 INFO:teuthology.orchestra.run.burnupi08.stderr:[mpiexec@burnupi08] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
2014-08-19T05:06:15.031 INFO:teuthology.orchestra.run.burnupi08.stderr:[mpiexec@burnupi08] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
2014-08-19T05:06:15.031 INFO:teuthology.orchestra.run.burnupi08.stderr:[mpiexec@burnupi08] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:217): launcher returned error waiting for completion
2014-08-19T05:06:15.031 INFO:teuthology.orchestra.run.burnupi08.stderr:[mpiexec@burnupi08] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion
2014-08-19T05:06:15.031 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 51, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 39, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/mpi.py", line 136, in task
    master_remote.run(args=args, )
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 114, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 401, in run
    r.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 102, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on burnupi08 with status 255: 'mpiexec -f /home/ubuntu/cephtest/mpi-hosts /home/ubuntu/cephtest/mdtest-1.8.4/mdtest -d /home/ubuntu/cephtest/gmnt -I 20 -z 5 -b 2 -R'

http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-18_23:04:01-fs-master-testing-basic-multi/434755/
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-18_23:04:01-fs-master-testing-basic-multi/434753/
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-19_19:06:01-fs-dumpling-testing-basic-multi/436313/

This is happening on both the dumpling and master branches, which makes me suspect a new kind of infrastructure issue, though it could also be a bad backport or something similar.

Actions #1

Updated by John Spray over 9 years ago

mdtest calls getcwd into an unzeroed buffer and does not check the return value. If FUSE is failing the getcwd for any reason (permissions?), that could make mdtest crash. It doesn't explain why the failure would start spontaneously, though.
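
For illustration only, here is a minimal sketch of the defensive pattern this comment points at. It is not taken from mdtest's source, and the helper name checked_getcwd is hypothetical. It zeroes the buffer first and checks the return value of getcwd, so a failed lookup is reported instead of leaving uninitialized bytes for later string operations to read.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of mdtest): fetch the current working
 * directory into a zeroed buffer and report failure instead of ignoring it.
 * Returns 0 on success, -1 on failure with errno set. */
int checked_getcwd(char *buf, size_t size)
{
    if (buf == NULL || size == 0) {
        errno = EINVAL;
        return -1;
    }

    /* Zero the buffer so it always holds a terminated string. */
    memset(buf, 0, size);

    /* getcwd returns NULL on error, e.g. ERANGE (buffer too small) or
     * EACCES (a path component is not readable or searchable). */
    if (getcwd(buf, size) == NULL) {
        fprintf(stderr, "getcwd failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A caller would check the result before using the buffer, for example by aborting the run with a clear message when checked_getcwd returns -1, rather than passing an unset buffer on to sprintf or strcat.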

Actions #2

Updated by Greg Farnum over 9 years ago

How did you track it down to getcwd? If that is the issue, there are several avenues of attack here, and we should check how long it has been since this test actually passed (I dismissed many failures, but I think there have been days when everything, including this test, passed...).

Actions #4

Updated by John Spray over 9 years ago

The compiler emits a warning about getcwd, but there is no evidence that this is what it's actually hitting in this instance.

Actions #6

Updated by Greg Farnum over 9 years ago

teuthology-2014-09-08_19:06:01-fs-dumpling-testing-basic-multi/473897/
teuthology-2014-09-08_19:06:01-fs-dumpling-testing-basic-multi/473899/
teuthology-2014-09-08_23:04:01-fs-master-testing-basic-multi/474438/

Actions #7

Updated by John Spray over 9 years ago

  • Status changed from 12 to 7

Actions #9

Updated by Greg Farnum over 9 years ago

  • Status changed from 7 to Resolved
  • Assignee set to John Spray

John fixed this by updating mdtest in ceph-qa-suite, as of commit b1365a80982dba4160e861c28d887b066ca451b6.
