Project

General

Profile

Bug #16886

Updated by Patrick Donnelly over 7 years ago

There are strange pauses which are showing up in several tests for the kclient:

* http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327530/
Timeout 5h running fs/misc/dirfrag.sh
1 jobs: ['327530']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/misc.yaml}']

<pre>
2016-07-23T20:36:24.517 INFO:tasks.workunit:Running workunit fs/misc/dirfrag.sh... 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cpu_pm.h’
2016-07-23T20:36:24.517 INFO:teuthology.orchestra.run.mira066:Running (workunit test fs/misc/dirfrag.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=a0294e64507a7916fdd9707ae22ba40b0d7b65d1 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 5h /home/ubuntu/cephtest/workunit.client.0/fs/misc/dirfrag.sh' 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/bcd.h’
2016-07-23T20:36:24.643 INFO:tasks.workunit.client.0.mira066.stdout:creating folder hierarchy 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/shmem_fs.h’
2016-07-23T20:40:39.985 INFO:tasks.workunit.client.0.mira066.stdout:created hierarchy, now cleaning up 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/pm_clock.h’
2016-07-24T01:36:24.635 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/leds_pwm.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/timerfd.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/jz4740-adc.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/omapfb.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/marvell_phy.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/hid-sensor-hub.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/frontswap.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cgroup.h’
2016-07-19T17:37:29.506 INFO:tasks.workunit.client.0.mira112.stdout:r
2016-07-19T17:37:29.506
INFO:tasks.workunit:Stopping ['fs/misc'] ['kernel_untar_build.sh'] on client.0...
2016-07-24T01:36:24.635 INFO:teuthology.orchestra.run.mira066:Running: 2016-07-19T17:37:29.507 INFO:teuthology.orchestra.run.mira112:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone'
2016-07-24T01:36:24.913 2016-07-19T17:37:29.783 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 83, in __exit__
for result in self:
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
resurrect_traceback(result)
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
return func(*args, **kwargs)
File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/workunit.py", line 404, in _run_tests
label="workunit test {workunit}".format(workunit=workunit)
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 196, in run
r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 394, in run
r.wait()
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 121, in wait
label=self.label)
CommandFailedError: Command failed (workunit test fs/misc/dirfrag.sh) kernel_untar_build.sh) on mira066 mira112 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=a0294e64507a7916fdd9707ae22ba40b0d7b65d1 CEPH_REF=12c068365c43a140fe1fe23bf68318342710e84d TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 5h /home/ubuntu/cephtest/workunit.client.0/fs/misc/dirfrag.sh' 3h /home/ubuntu/cephtest/workunit.client.0/kernel_untar_build.sh'
</pre>

* http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339920/
Timeout 3h running suites/blogbench.sh
1 jobs: ['339920']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_blogbench.yaml}']

<pre>
2016-07-29T21:42:37.528 INFO:tasks.workunit:Running workunits matching suites/blogbench.sh on client.0...
2016-07-29T21:42:37.528 INFO:tasks.workunit:Running workunit suites/blogbench.sh...
2016-07-29T21:42:37.528 INFO:teuthology.orchestra.run.mira061:Running (workunit test suites/blogbench.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e7cfdc842e731ba964c72a8b3fd07459a1d27c39 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/suites/blogbench.sh'
2016-07-30T00:42:37.635 INFO:tasks.workunit.client.0.mira061.stdout:getting blogbench
2016-07-30T00:42:37.673 INFO:tasks.workunit.client.0.mira061.stdout:blogbench-1.0/
2016-07-30T00:42:37.673 INFO:tasks.workunit.client.0.mira061.stdout:blogbench-1.0/m4/
...
2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Connecting
This appears to download.ceph.com (download.ceph.com)|173.236.253.173|:80... connected.
2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:HTTP request sent, awaiting response... 200 OK
2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Length: 103355 (101K) [application/octet-stream]
2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Saving to: ‘blogbench-1.0.tar.bz2’
2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr:
2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr: 0K .......... .......... .......... .......... .......... 49% 4.43M 0s
2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr: 50K .......... .......... .......... .......... .......... 99% 5.73M 0s
2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr: 100K 100% 1779G=0.02s
2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr:
2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr:2016-07-30 04:42:37 (5.04 MB/s) - ‘blogbench-1.0.tar.bz2’ saved [103355/103355]
2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr:
2016-07-30T00:42:37.732 INFO:tasks.workunit:Stopping ['suites/blogbench.sh'] on client.0...
</pre>
be a problem in the kernel client build, only the kclient had problems:

Also:

http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_12:44:23-multimds-master---basic-mira/340842/
Timeout 3h running suites/blogbench.sh
1 jobs: ['340842']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_blogbench.yaml}']

* http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-18_20:02:54-multimds-master---basic-mira/321725/
Timeout 3h running suites/fsstress.sh
1 jobs: ['321725']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}']

<pre>
2016-07-18T22:36:20.009 INFO:tasks.workunit.client.0.mira054.stdout:5/995: creat d5/db/dc/d9f/db1/f134 x:0 0 0
2016-07-18T22:36:20.017 INFO:tasks.workunit.client.0.mira054.stdout:5/996: link d5/db/dc/d18/f41 d5/db/dc/d16/d68/dbb/dc3/dc6/f135 0
2016-07-18T22:36:20.017 INFO:tasks.workunit.client.0.mira054.stdout:5/997: write d5/db/dc/d16/d68/dbb/f92 [4623915,122047] 0
2016-07-18T22:36:20.106 INFO:tasks.workunit.client.0.mira054.stdout:5/998: dwrite d5/db/dc/d16/d89/d97/f122 [0,4194304] 0
2016-07-18T22:36:20.116 INFO:tasks.workunit.client.0.mira054.stdout:5/999: symlink d5/db/dc/d4e/d62/l136 0
2016-07-19T01:34:46.305 INFO:tasks.workunit.client.0.mira054.stderr:/home/ubuntu/cephtest/workunit.client.0/suites/fsstress.sh: line 1: 17105 Terminated $command
2016-07-19T01:34:46.306 INFO:tasks.workunit:Stopping ['suites/fsstress.sh'] on client.0...
</pre>

*
http://pulpito.ceph.com/pdonnell-2016-07-18_20:02:54-multimds-master---basic-mira/321819/
Timeout 3h running kernel_untar_build.sh
1 jobs: ['321819']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}']

<pre>
2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cpu_pm.h’
2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/bcd.h’
2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/shmem_fs.h’
2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/pm_clock.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/leds_pwm.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/timerfd.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/jz4740-adc.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/omapfb.h’
2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/marvell_phy.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/hid-sensor-hub.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/frontswap.h’
2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cgroup.h’
2016-07-19T17:37:29.506 INFO:tasks.workunit.client.0.mira112.stdout:r
2016-07-19T17:37:29.506 INFO:tasks.workunit:Stopping ['kernel_untar_build.sh'] on client.0...
2016-07-19T17:37:29.507 INFO:teuthology.orchestra.run.mira112:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone'
</pre>

Also:

http://pulpito.ceph.com/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327456/
1 jobs: ['327456']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}']

http://pulpito.ceph.com/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339899/
1 jobs: ['339899']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}']

http://pulpito.ceph.com/pdonnell-2016-07-29_12:44:23-multimds-master---basic-mira/340753/
1 jobs: ['340753']
suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}']

Note: not every one of these kernel_untar_build failures has a an obvious pause at the end of the test log. For example:

http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339899/teuthology.log

has the pause prior to unpacking the tarball:

<pre>
2016-07-29T16:46:25.488 INFO:tasks.workunit:Running workunit kernel_untar_build.sh...
2016-07-29T16:46:25.488 INFO:teuthology.orchestra.run.mira037:Running (workunit test kernel_untar_build.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e7cfdc842e731ba964c72a8b3fd07459a1d27c39 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/kernel_untar_build.sh'
2016-07-29T19:46:25.582 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/
2016-07-29T19:46:25.582 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/.gitignore
2016-07-29T19:46:25.583 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/.mailmap
2016-07-29T19:46:25.583 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/COPYING
...
2016-07-29T19:52:45.834 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/core.h’
2016-07-29T19:52:45.834 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/backlight.h’
2016-07-29T19:52:45.835 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/mbc.h’
2016-07-29T19:52:45.836 INFO:tasks.workunit.client.0.mira037.stdout:removed directory: ‘linux-4.0.5/include/linux/mfd/pcf50633’
2016-07-29T19:52:45.837 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/htc-pasic3.h’
2016-07-29T19:52:45.838 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/si476x-platform.h’
2016-07-29T19:52:45.860 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/tps65912.h’
2016-07-29T19:52:45.860 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/ezx-pcap.h’
2016-07-29T19:52:45.881 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/wm8400-private.h’
2016-07-29T19:52:45.881 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-
2016-07-29T19:52:45.948 INFO:tasks.workunit:Stopping ['kernel_untar_build.sh'] on client.0...
</pre>

Back