Project

General

Profile

Bug #16886

Updated by Patrick Donnelly almost 8 years ago

There are strange pauses which are showing up in several tests for the kclient: 

 * http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327530/ 
	 Timeout 5h running fs/misc/dirfrag.sh 
	 1 jobs: ['327530'] 
	 suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/misc.yaml}'] 

 <pre> 
 2016-07-23T20:36:24.517 INFO:tasks.workunit:Running workunit fs/misc/dirfrag.sh... 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cpu_pm.h’ 
 2016-07-23T20:36:24.517 INFO:teuthology.orchestra.run.mira066:Running (workunit test fs/misc/dirfrag.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=a0294e64507a7916fdd9707ae22ba40b0d7b65d1 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 5h /home/ubuntu/cephtest/workunit.client.0/fs/misc/dirfrag.sh' 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/bcd.h’ 
 2016-07-23T20:36:24.643 INFO:tasks.workunit.client.0.mira066.stdout:creating folder hierarchy 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/shmem_fs.h’ 
 2016-07-23T20:40:39.985 INFO:tasks.workunit.client.0.mira066.stdout:created hierarchy, now cleaning up 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/pm_clock.h’ 
 2016-07-24T01:36:24.635 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/leds_pwm.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/timerfd.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/jz4740-adc.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/omapfb.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/marvell_phy.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/hid-sensor-hub.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/frontswap.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cgroup.h’ 
 2016-07-19T17:37:29.506 INFO:tasks.workunit.client.0.mira112.stdout:r 
 2016-07-19T17:37:29.506 INFO:tasks.workunit:Stopping ['fs/misc'] ['kernel_untar_build.sh'] on client.0... 
 2016-07-24T01:36:24.635 INFO:teuthology.orchestra.run.mira066:Running: 2016-07-19T17:37:29.507 INFO:teuthology.orchestra.run.mira112:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone' 
 2016-07-24T01:36:24.913 2016-07-19T17:37:29.783 ERROR:teuthology.parallel:Exception in parallel execution 
 Traceback (most recent call last): 
   File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 83, in __exit__ 
     for result in self: 
   File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next 
     resurrect_traceback(result) 
   File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback 
     return func(*args, **kwargs) 
   File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/workunit.py", line 404, in _run_tests 
     label="workunit test {workunit}".format(workunit=workunit) 
   File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 196, in run 
     r = self._runner(client=self.ssh, name=self.shortname, **kwargs) 
   File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 394, in run 
     r.wait() 
   File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 121, in wait 
     label=self.label) 
 CommandFailedError: Command failed (workunit test fs/misc/dirfrag.sh) kernel_untar_build.sh) on mira066 mira112 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=a0294e64507a7916fdd9707ae22ba40b0d7b65d1 CEPH_REF=12c068365c43a140fe1fe23bf68318342710e84d TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 5h /home/ubuntu/cephtest/workunit.client.0/fs/misc/dirfrag.sh' 3h /home/ubuntu/cephtest/workunit.client.0/kernel_untar_build.sh' 
 </pre> 

 * http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339920/ 
	 Timeout 3h running suites/blogbench.sh 
	 1 jobs: ['339920'] 
	 suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_blogbench.yaml}'] 


 <pre> 
 2016-07-29T21:42:37.528 INFO:tasks.workunit:Running workunits matching suites/blogbench.sh on client.0... 
 2016-07-29T21:42:37.528 INFO:tasks.workunit:Running workunit suites/blogbench.sh... 
 2016-07-29T21:42:37.528 INFO:teuthology.orchestra.run.mira061:Running (workunit test suites/blogbench.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e7cfdc842e731ba964c72a8b3fd07459a1d27c39 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/suites/blogbench.sh' 
 2016-07-30T00:42:37.635 INFO:tasks.workunit.client.0.mira061.stdout:getting blogbench 
 2016-07-30T00:42:37.673 INFO:tasks.workunit.client.0.mira061.stdout:blogbench-1.0/ 
 2016-07-30T00:42:37.673 INFO:tasks.workunit.client.0.mira061.stdout:blogbench-1.0/m4/ 
 ... 
 2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Connecting This appears to download.ceph.com (download.ceph.com)|173.236.253.173|:80... connected. 
 2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:HTTP request sent, awaiting response... 200 OK 
 2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Length: 103355 (101K) [application/octet-stream] 
 2016-07-30T00:42:37.730 INFO:tasks.workunit.client.0.mira061.stderr:Saving to: ‘blogbench-1.0.tar.bz2’ 
 2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr: 
 2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr:       0K .......... .......... .......... .......... .......... 49% 4.43M 0s 
 2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr:      50K .......... .......... .......... .......... .......... 99% 5.73M 0s 
 2016-07-30T00:42:37.731 INFO:tasks.workunit.client.0.mira061.stderr:     100K                                                         100% 1779G=0.02s 
 2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr: 
 2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr:2016-07-30 04:42:37 (5.04 MB/s) - ‘blogbench-1.0.tar.bz2’ saved [103355/103355] 
 2016-07-30T00:42:37.732 INFO:tasks.workunit.client.0.mira061.stderr: 
 2016-07-30T00:42:37.732 INFO:tasks.workunit:Stopping ['suites/blogbench.sh'] on client.0... 
 </pre> be a problem in the kernel client build, only the kclient had problems: 

 Also: 

 http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_12:44:23-multimds-master---basic-mira/340842/ 
	 Timeout 3h running suites/blogbench.sh 
	 1 jobs: ['340842'] 
	 suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_blogbench.yaml}'] 


 * http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-18_20:02:54-multimds-master---basic-mira/321725/ 
	 Timeout 3h running suites/fsstress.sh 
	 1 jobs: ['321725'] 
	 suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}'] 

 <pre> 
 2016-07-18T22:36:20.009 INFO:tasks.workunit.client.0.mira054.stdout:5/995: creat d5/db/dc/d9f/db1/f134 x:0 0 0 
 2016-07-18T22:36:20.017 INFO:tasks.workunit.client.0.mira054.stdout:5/996: link d5/db/dc/d18/f41 d5/db/dc/d16/d68/dbb/dc3/dc6/f135 0 
 2016-07-18T22:36:20.017 INFO:tasks.workunit.client.0.mira054.stdout:5/997: write d5/db/dc/d16/d68/dbb/f92 [4623915,122047] 0 
 2016-07-18T22:36:20.106 INFO:tasks.workunit.client.0.mira054.stdout:5/998: dwrite d5/db/dc/d16/d89/d97/f122 [0,4194304] 0 
 2016-07-18T22:36:20.116 INFO:tasks.workunit.client.0.mira054.stdout:5/999: symlink d5/db/dc/d4e/d62/l136 0 
 2016-07-19T01:34:46.305 INFO:tasks.workunit.client.0.mira054.stderr:/home/ubuntu/cephtest/workunit.client.0/suites/fsstress.sh: line 1: 17105 Terminated                $command 
 2016-07-19T01:34:46.306 INFO:tasks.workunit:Stopping ['suites/fsstress.sh'] on client.0... 
 </pre> 

 * http://pulpito.ceph.com/pdonnell-2016-07-18_20:02:54-multimds-master---basic-mira/321819/ 
     Timeout 3h running kernel_untar_build.sh 
     1 jobs: ['321819'] 
     suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}'] 

 <pre> 
 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cpu_pm.h’ 
 2016-07-19T15:05:13.028 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/bcd.h’ 
 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/shmem_fs.h’ 
 2016-07-19T15:05:13.030 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/pm_clock.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/leds_pwm.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/timerfd.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/jz4740-adc.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/omapfb.h’ 
 2016-07-19T15:05:13.031 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/marvell_phy.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/hid-sensor-hub.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/frontswap.h’ 
 2016-07-19T15:05:13.032 INFO:tasks.workunit.client.0.mira112.stdout:removed ‘linux-4.0.5/include/linux/cgroup.h’ 
 2016-07-19T17:37:29.506 INFO:tasks.workunit.client.0.mira112.stdout:r 
 2016-07-19T17:37:29.506 INFO:tasks.workunit:Stopping ['kernel_untar_build.sh'] on client.0... 
 2016-07-19T17:37:29.507 INFO:teuthology.orchestra.run.mira112:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone' 
 </pre> 

 Also: 

 http://pulpito.ceph.com/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327456/ 
     1 jobs: ['327456'] 
     suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}'] 

 http://pulpito.ceph.com/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339899/ 
     1 jobs: ['339899'] 
     suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}'] 

 http://pulpito.ceph.com/pdonnell-2016-07-29_12:44:23-multimds-master---basic-mira/340753/ 
     1 jobs: ['340753'] 
     suites: ['clusters/3-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/kclient.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/kernel_untar_build.yaml}'] 

 Note: not every one of these kernel_untar_build failures has a an obvious pause at the end of the test log. For example: 

 http://qa-proxy.ceph.com/teuthology/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339899/teuthology.log 

 has the pause prior to unpacking the tarball: 

 <pre> 
 2016-07-29T16:46:25.488 INFO:tasks.workunit:Running workunit kernel_untar_build.sh... 
 2016-07-29T16:46:25.488 INFO:teuthology.orchestra.run.mira037:Running (workunit test kernel_untar_build.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e7cfdc842e731ba964c72a8b3fd07459a1d27c39 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/kernel_untar_build.sh' 
 2016-07-29T19:46:25.582 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/ 
 2016-07-29T19:46:25.582 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/.gitignore 
 2016-07-29T19:46:25.583 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/.mailmap 
 2016-07-29T19:46:25.583 INFO:tasks.workunit.client.0.mira037.stdout:linux-4.0.5/COPYING 
 ... 
 2016-07-29T19:52:45.834 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/core.h’ 
 2016-07-29T19:52:45.834 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/backlight.h’ 
 2016-07-29T19:52:45.835 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/pcf50633/mbc.h’ 
 2016-07-29T19:52:45.836 INFO:tasks.workunit.client.0.mira037.stdout:removed directory: ‘linux-4.0.5/include/linux/mfd/pcf50633’ 
 2016-07-29T19:52:45.837 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/htc-pasic3.h’ 
 2016-07-29T19:52:45.838 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/si476x-platform.h’ 
 2016-07-29T19:52:45.860 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/tps65912.h’ 
 2016-07-29T19:52:45.860 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/ezx-pcap.h’ 
 2016-07-29T19:52:45.881 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux-4.0.5/include/linux/mfd/wm8400-private.h’ 
 2016-07-29T19:52:45.881 INFO:tasks.workunit.client.0.mira037.stdout:removed ‘linux- 
 2016-07-29T19:52:45.948 INFO:tasks.workunit:Stopping ['kernel_untar_build.sh'] on client.0... 
 </pre>

Back