Bug #19636
upgrade:client-upgrade/{hammer,jewel}-client-x/rbd failing in kraken 11.2.1 integration testing
Status:
Resolved
Priority:
High
Assignee:
Jason Dillaman
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
kraken
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
test descriptions:
- upgrade:client-upgrade/hammer-client-x/rbd/{0-cluster/start.yaml 1-install/hammer-client-x.yaml 2-workload/rbd_notification_tests.yaml}
- upgrade:client-upgrade/jewel-client-x/rbd/{0-cluster/start.yaml 1-install/jewel-client-x.yaml 2-workload/rbd_notification_tests.yaml}
test failure reproducible? YES
test URLs:
- http://pulpito.ceph.com/smithfarm-2017-04-16_18:30:46-upgrade:client-upgrade-wip-kraken-backports-distro-basic-vps/1033423/
- http://pulpito.ceph.com/smithfarm-2017-04-16_18:30:46-upgrade:client-upgrade-wip-kraken-backports-distro-basic-vps/1033424/
test cluster:
roles:
- - mon.a
  - mon.b
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - client.1
what seems to happen:
- hammer is installed on both nodes
- the "client.1" node is upgraded to kraken (wip-kraken-backports)
- ceph task runs
- the "hammer" version of "rbd/notify_master.sh" is run on client.0 and the "hammer" version of "rbd/notify_slave.sh" is run on client.1 (afaict the hammer and kraken versions of these scripts are identical)
2017-04-16T18:44:20.785 INFO:tasks.workunit.client.0.vpm175.stderr:+ dirname /home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/notify_master.sh
2017-04-16T18:44:20.786 INFO:tasks.workunit.client.0.vpm175.stderr:+ relpath=/home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/../../../src/test/librbd
2017-04-16T18:44:20.787 INFO:tasks.workunit.client.0.vpm175.stderr:+ python /home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/../../../src/test/librbd/test_notify.py master
Three hours later, they are stopped (timeout):
2017-04-16T21:44:02.648 INFO:tasks.workunit:Stopping ['rbd/notify_master.sh'] on client.0...
2017-04-16T21:44:02.649 INFO:teuthology.orchestra.run.vpm089:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'
2017-04-16T21:44:02.668 INFO:tasks.workunit:Stopping ['rbd/notify_slave.sh'] on client.1...
2017-04-16T21:44:02.669 INFO:teuthology.orchestra.run.vpm101:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.1 /home/ubuntu/cephtest/clone.client.1'
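The "three hours" figure can be sanity-checked from the timestamps quoted above. The following is only a sketch: the two timestamps are copied from the log excerpts in this report, and the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpts quoted in this report.
START = "2017-04-16T18:44:20.787"  # test_notify.py master launched on client.0
STOP = "2017-04-16T21:44:02.648"   # "Stopping ['rbd/notify_master.sh']" logged

# Timestamp layout used at the start of each teuthology log line.
LOG_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def elapsed_seconds(start: str, stop: str) -> float:
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    stopped = datetime.strptime(stop, LOG_TIME_FORMAT)
    return (stopped - started).total_seconds()


if __name__ == "__main__":
    seconds = elapsed_seconds(START, STOP)
    # Compare against the 3h limit passed to `timeout` in the failing command.
    print(f"elapsed: {seconds:.3f} s (limit: {3 * 3600} s)")
```

The result is a little under 10800 seconds, which is consistent with the workunit having been wrapped in `timeout 3h` shortly before the Python test itself was started.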
And the job fails with an exception:
CommandFailedError: Command failed (workunit test rbd/notify_master.sh) on vpm089 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=hammer TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 RBD_FEATURES=13 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/notify_master.sh'
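For readers less familiar with the exit status: the failing command is wrapped in `timeout 3h`, and GNU coreutils `timeout` exits with 124 when the wrapped command runs past its limit. The sketch below pulls the status out of a message shaped like the one above and maps it to the meaning documented for `timeout`. The helper names and the abbreviated sample message are illustrative only and not part of teuthology.

```python
import re
from typing import Optional

# Exit statuses documented for GNU coreutils `timeout`.
TIMEOUT_STATUSES = {
    124: "command timed out",
    125: "the timeout command itself failed",
    126: "command was found but could not be invoked",
    127: "command could not be found",
    137: "command was sent the KILL signal (128 + 9)",
}


def extract_status(message: str) -> Optional[int]:
    """Return the integer after 'with status', or None if it is absent."""
    match = re.search(r"with status (\d+)", message)
    return int(match.group(1)) if match else None


def explain(message: str) -> str:
    """Describe the exit status, assuming the command ran under `timeout`."""
    status = extract_status(message)
    if status is None:
        return "no exit status found"
    meaning = TIMEOUT_STATUSES.get(status, "status of the wrapped command")
    return f"status {status}: {meaning}"


if __name__ == "__main__":
    # Abbreviated form of the error quoted in this report (paths elided).
    sample = (
        "CommandFailedError: Command failed (workunit test "
        "rbd/notify_master.sh) on vpm089 with status 124: "
        "'... timeout 3h .../qa/workunits/rbd/notify_master.sh'"
    )
    print(explain(sample))
```

Read this way, status 124 says only that `notify_master.sh` was still running when the three-hour limit expired; it does not by itself say why the test hung.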