Bug #10270
"[ FAILED ] LibRBD.ListChildren" in upgrade:firefly-x-giant-distro-basic-multi run
Description
2014-12-07T18:34:16.978 INFO:tasks.rados.rados.0.plana54.stdout:593: finishing copy_from to plana5413911-17
2014-12-07T18:34:16.978 INFO:tasks.rados.rados.0.plana54.stdout:update_object_version oid 17 v 247 (ObjNum 123 snap 28 seq_num 123) dirty exists
2014-12-07T18:34:16.984 INFO:tasks.rados.rados.0.plana54.stdout:596: expect (ObjNum 155 snap 37 seq_num 155)
2014-12-07T18:34:17.161 INFO:tasks.workunit.client.3.plana54.stdout:[ OK ] LibRBD.ZeroLengthRead (3557 ms)
2014-12-07T18:34:17.161 INFO:tasks.workunit.client.3.plana54.stdout:[----------] 26 tests from LibRBD (242947 ms total)
2014-12-07T18:34:17.161 INFO:tasks.workunit.client.3.plana54.stdout:
2014-12-07T18:34:17.161 INFO:tasks.workunit.client.3.plana54.stdout:[----------] Global test environment tear-down
2014-12-07T18:34:17.162 INFO:tasks.workunit.client.3.plana54.stdout:[==========] 26 tests from 1 test case ran. (242947 ms total)
2014-12-07T18:34:17.162 INFO:tasks.workunit.client.3.plana54.stdout:[ PASSED ] 25 tests.
2014-12-07T18:34:17.162 INFO:tasks.workunit.client.3.plana54.stdout:[ FAILED ] 1 test, listed below:
2014-12-07T18:34:17.162 INFO:tasks.workunit.client.3.plana54.stdout:[ FAILED ] LibRBD.ListChildren
2014-12-07T18:34:17.162 INFO:tasks.workunit.client.3.plana54.stdout:
2014-12-07T18:34:17.163 INFO:tasks.workunit.client.3.plana54.stdout: 1 FAILED TEST
2014-12-07T18:34:17.163 INFO:tasks.workunit:Stopping ['rbd/test_librbd.sh'] on client.3...
2014-12-07T18:34:17.164 INFO:teuthology.orchestra.run.plana54:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list /home/ubuntu/cephtest/workunit.client.3'
2014-12-07T18:34:17.237 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/workunit.py", line 359, in _run_tests
    args=args,
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 128, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 368, in run
    r.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on plana54 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && cd -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=firefly TESTDIR="/home/ubuntu/cephtest" CEPH_ID="3" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.3/rbd/test_librbd.sh'
2014-12-07T18:34:17.262 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/workunit.py", line 105, in task
    config.get('env'), timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/workunit.py", line 359, in _run_tests
    args=args,
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 128, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 368, in run
    r.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on plana54 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && cd -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=firefly TESTDIR="/home/ubuntu/cephtest" CEPH_ID="3" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.3/rbd/test_librbd.sh'
2014-12-07T18:34:17.263 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 98, in next
    result = self.results.get()
  File "/usr/lib/python2.7/dist-packages/gevent/queue.py", line 190, in get
    return waiter.get()
  File "/usr/lib/python2.7/dist-packages/gevent/hub.py", line 321, in get
    return get_hub().switch()
  File "/usr/lib/python2.7/dist-packages/gevent/hub.py", line 164, in switch
    return greenlet.switch(self)
GreenletExit
2014-12-07T18:34:17.264 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 98, in next
    result = self.results.get()
  File "/usr/lib/python2.7/dist-packages/gevent/queue.py", line 190, in get
    return waiter.get()
  File "/usr/lib/python2.7/dist-packages/gevent/hub.py", line 321, in get
    return get_hub().switch()
  File "/usr/lib/python2.7/dist-packages/gevent/hub.py", line 164, in switch
    return greenlet.switch(self)
GreenletExit
2014-12-07T18:34:17.264 INFO:tasks.workunit:Stopping ['rbd/test_librbd_python.sh'] on client.4...
2014-12-07T18:34:17.265 INFO:teuthology.orchestra.run.plana54:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list /home/ubuntu/cephtest/workunit.client.4'
2014-12-07T18:34:17.266 INFO:tasks.workunit:Stopping ['rados/load-gen-big.sh'] on client.2...
2014-12-07T18:34:17.266 INFO:teuthology.orchestra.run.plana54:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list /home/ubuntu/cephtest/workunit.client.2'
2014-12-07T18:34:17.288 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 43, in task
    p.spawn(_run_spawned, ctx, confg, taskname)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 89, in __exit__
    raise
CommandFailedError: Command failed on plana54 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && cd -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=firefly TESTDIR="/home/ubuntu/cephtest" CEPH_ID="3" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.3/rbd/test_librbd.sh'
2014-12-07T18:34:17.289 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 53, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 43, in task
    p.spawn(_run_spawned, ctx, confg, taskname)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 89, in __exit__
    raise
CommandFailedError: Command failed on plana54 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && cd -- /home/ubuntu/cephtest/mnt.3/client.3/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=firefly TESTDIR="/home/ubuntu/cephtest" CEPH_ID="3" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.3/rbd/test_librbd.sh'
2014-12-07T18:34:17.290 DEBUG:teuthology.run_tasks:Unwinding manager ceph
2014-12-07T18:34:17.290 INFO:teuthology.orchestra.run.burnupi31:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2014-12-07T18:34:17.461 INFO:teuthology.orchestra.run.burnupi31.stderr:dumped all in format json
2014-12-07T18:34:17.476 INFO:tasks.ceph:Waiting for all osds to be active and clean.
Related issues
Associated revisions
librbd: gracefully handle deleted/renamed pools
snap_unprotect and list_children both attempt to scan all
pools. If a pool is deleted or renamed during the scan,
the methods would previously return -ENOENT. Both methods
have been modified to more gracefully handle this condition.
Fixes: #10270
Backport: giant, firefly
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
librbd: gracefully handle deleted/renamed pools
snap_unprotect and list_children both attempt to scan all
pools. If a pool is deleted or renamed during the scan,
the methods would previously return -ENOENT. Both methods
have been modified to more gracefully handle this condition.
Fixes: #10270
Backport: giant, firefly
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 436923c68b77c900b7774fbef918c0d6e1614a36)
librbd: gracefully handle deleted/renamed pools
snap_unprotect and list_children both attempt to scan all
pools. If a pool is deleted or renamed during the scan,
the methods would previously return -ENOENT. Both methods
have been modified to more gracefully handle this condition.
Fixes: #10270, #10122
Backport: giant, firefly
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 436923c68b77c900b7774fbef918c0d6e1614a36)
History
#1 Updated by Sage Weil over 9 years ago
- Project changed from Ceph to rbd
- Priority changed from Normal to Urgent
#2 Updated by Yuri Weinstein over 9 years ago
Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-12-09_13:52:17-upgrade:firefly-x-giant-distro-basic-vps/
Jobs ['645326', '645329', '645331']
2014-12-09T14:09:25.852 INFO:tasks.workunit.client.3.vpm014.stdout:[ FAILED ] 1 test, listed below:
2014-12-09T14:09:25.853 INFO:tasks.workunit.client.3.vpm014.stdout:[ FAILED ] LibRBD.ListChildren
2014-12-09T14:09:25.853 INFO:tasks.workunit.client.3.vpm014.stdout:
2014-12-09T14:09:25.853 INFO:tasks.workunit.client.3.vpm014.stdout: 1 FAILED TEST
#3 Updated by Jason Dillaman over 9 years ago
- Assignee set to Jason Dillaman
#4 Updated by Jason Dillaman over 9 years ago
The 'LibRBD.ListChildren' test failed because other tests running in the background (cls_rgw and cls_rbd) deleted the temporary pool they created while rbd_list_children was attempting to iterate through all available pools.
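The race described above can be sketched in a few lines of Python. This is a toy model, not librbd code: `FakeCluster`, `open_pool`, `list_children_tolerant`, and the `on_pool_listed` hook are hypothetical names invented for illustration, and pools are plain dictionaries. It shows a scan that treats a pool vanishing mid-iteration as "no children here" instead of failing the whole call with ENOENT.

```python
import errno


class FakeCluster:
    """Toy stand-in for a cluster: pool name -> {parent: [children]}."""

    def __init__(self, pools):
        self.pools = dict(pools)

    def list_pools(self):
        # Snapshot of the pool names at the time of the call.
        return list(self.pools)

    def open_pool(self, name):
        # Mimic an ENOENT failure when the pool was deleted after list_pools().
        if name not in self.pools:
            raise OSError(errno.ENOENT, "pool does not exist", name)
        return self.pools[name]


def list_children_tolerant(cluster, parent, on_pool_listed=None):
    """Collect children of `parent` across all pools, skipping vanished pools."""
    children = []
    names = cluster.list_pools()
    if on_pool_listed is not None:
        on_pool_listed()  # hook used only to simulate a concurrent deletion
    for name in names:
        try:
            pool = cluster.open_pool(name)
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                continue  # pool went away mid-scan: nothing to report from it
            raise
        children.extend((name, child) for child in pool.get(parent, []))
    return children


cluster = FakeCluster({
    "rbd": {"parent@snap": ["clone1"]},
    "tmp-test-pool": {},
})
# Simulate another test deleting its temporary pool during the scan.
result = list_children_tolerant(
    cluster,
    "parent@snap",
    on_pool_listed=lambda: cluster.pools.pop("tmp-test-pool"),
)
print(result)  # [('rbd', 'clone1')]
```

Without the `except` branch, the deleted temporary pool would abort the scan even though it never held any children of the image being queried.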
#5 Updated by Jason Dillaman over 9 years ago
- Status changed from New to Fix Under Review
#6 Updated by Jason Dillaman over 9 years ago
- Backport set to giant,firefly
#8 Updated by Yuri Weinstein over 9 years ago
#9 Updated by Yuri Weinstein over 9 years ago
Same issue in run http://pulpito.ceph.com/teuthology-2014-12-21_17:13:02-upgrade:firefly-x-next-distro-basic-multi/
Jobs ['671668', '671669']
#10 Updated by Josh Durgin about 9 years ago
- Status changed from Fix Under Review to Pending Backport
commit:53929ba1751fad9c9cd8545c4cd6985982d2eb5f
#12 Updated by Loïc Dachary about 9 years ago
<loicd> jdillaman: regarding http://tracker.ceph.com/issues/10270 do you have a backport somewhere already ? I tried to cherry-pick -x commit:53929ba1751fad9c9cd8545c4cd6985982d2eb5f but it's non trivial.
<loicd> I mean for giant :-)
<jdillaman> loicd: i think the goal was to do something along the lines of revision ec5d8c7a for the backports
<jdillaman> loicd: but instead of skipping pools it has already checked on a retry, it should rescan all pools
<jdillaman> loicd: whoops -- meant revision commit:c94f1aae
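The backport approach mentioned in the chat (rescan all pools on a retry rather than skipping pools already checked) might look roughly like the following. This is an illustrative sketch only: `scan_all_pools`, the `list_pools` and `scan_pool` callables, and the `max_retries` bound are hypothetical and are not taken from the referenced commits.

```python
import errno


def scan_all_pools(list_pools, scan_pool, max_retries=5):
    """Scan every pool; if one disappears mid-scan, restart from a fresh list."""
    for _ in range(max_retries):
        results = []  # discard partial results so each attempt is a full rescan
        try:
            for name in list_pools():
                results.extend(scan_pool(name))
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise
            continue  # a pool was deleted or renamed: rescan everything
        return results
    raise OSError(errno.ENOENT, "pool list kept changing during scan")


# Toy data: "tmp" is deleted the first time the scan reaches it.
pools = {"rbd": ["clone1"], "tmp": []}
attempts = []


def list_pools():
    attempts.append(list(pools))
    return list(pools)


def scan_pool(name):
    if name == "tmp":
        del pools["tmp"]
        raise OSError(errno.ENOENT, "pool does not exist", name)
    return pools[name]


children = scan_all_pools(list_pools, scan_pool)
print(children, len(attempts))  # ['clone1'] 2
```

Discarding partial results keeps each attempt consistent with a single pool listing, at the cost of repeating work when the pool set changes.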
#13 Updated by Yuri Weinstein about 9 years ago
Same
Run http://pulpito.ceph.com/teuthology-2015-01-16_18:13:02-upgrade:firefly-x-giant-distro-basic-multi/
Jobs ['707806', '707807']
#15 Updated by Loïc Dachary about 9 years ago
- Backport changed from giant,firefly to giant,firefly,dumpling
#18 Updated by Loïc Dachary about 9 years ago
- Status changed from Pending Backport to Resolved
#19 Updated by Yuri Weinstein about 9 years ago
Run: http://pulpito.ceph.com/teuthology-2015-02-20_18:13:01-upgrade:firefly-x-giant-distro-basic-multi/
Job: 771823
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-02-20_18:13:01-upgrade:firefly-x-giant-distro-basic-multi/771823/teuthology.log
2015-02-21T01:23:33.614 INFO:tasks.rados.rados.0.plana78.stdout:611: write oid 381 current snap is 12
2015-02-21T01:23:33.614 INFO:tasks.rados.rados.0.plana78.stdout:611: seq_num 528 ranges {730457=772628,2121222=209235}
2015-02-21T01:23:33.614 INFO:tasks.workunit.client.3.plana78.stdout:[ FAILED ] LibRBD.ListChildren (10918 ms)
#20 Updated by Yuri Weinstein about 9 years ago
- Status changed from Resolved to New
#21 Updated by Jason Dillaman about 9 years ago
- Status changed from New to Pending Backport
Still awaiting backport to Firefly: https://github.com/ceph/ceph/pull/3404
#23 Updated by Josh Durgin about 9 years ago
- Status changed from Pending Backport to Resolved