Bug #15214
"IndexError: list index out of range" in powercycle-jewel-testing-basic-smithi
Status: closed
Description
Run: http://pulpito.ceph.com/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/
Jobs: ['72750', '72755', '72757', '72761', '72766', '72771', '72773']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/72757/teuthology.log
2016-03-19T04:00:57.919 INFO:teuthology.task.ansible.out:failed: [smithi008.front.sepia.ceph.com] => (item=boost-random,boost-program-options,leveldb,xmlstarlet,python-jinja2,python-ceph,python-flask,python-requests,boost-random,python-urllib3,python-babel,hdparm,python-markupsafe,python-werkzeug,python-itsdangerous) => {"changed": false, "failed": true, "item": "boost-random,boost-program-options,leveldb,xmlstarlet,python-jinja2,python-ceph,python-flask,python-requests,boost-random,python-urllib3,python-babel,hdparm,python-markupsafe,python-werkzeug,python-itsdangerous", "rc": 1, "results": ["xmlstarlet is not installed", "python-ceph is not installed", "python-flask is not installed", "hdparm is not installed", "Loaded plugins: fastestmirror, langpacks, priorities\n"]}
msg: Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 365, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 174, in main
    result, resultmsgs = base.doCommands()
2016-03-19T04:00:57.921 INFO:teuthology.task.ansible.out: File "/usr/share/yum-cli/cli.py", line 573, in doCommands
    return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
  File "/usr/share/yum-cli/yumcommands.py", line 878, in doCommand
    ret = base.erasePkgs(extcmds, pos=pos, basecmd=basecmd)
  File "/usr/share/yum-cli/cli.py", line 1198, in erasePkgs
    rms = self.remove(pattern=arg)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 5399, in remove
    (e,m,u) = self.rpmdb.matchPackageNames([kwargs['pattern']])
  File "/usr/lib/python2.7/site-packages/yum/packageSack.py", line 304, in matchPackageNames
    exactmatch.append(self.searchPkgTuple(pkgtup)[0])
IndexError: list index out of range
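The last frame of the traceback indexes the result of a package lookup with [0], which raises IndexError whenever the lookup returns an empty list. The sketch below reproduces that pattern in isolation. It is only an illustration: search_pkg_tuple, match_unguarded, match_guarded, and the rpmdb list are hypothetical stand-ins, not yum's actual code or data structures.

```python
# Hypothetical stand-in for an rpmdb lookup: returns every record whose
# (name, arch) matches, or an empty list if none does.
def search_pkg_tuple(rpmdb, pkgtup):
    return [pkg for pkg in rpmdb if (pkg["name"], pkg["arch"]) == pkgtup]


def match_unguarded(rpmdb, pkgtups):
    # Same shape as the failing line: assumes every lookup finds a record.
    return [search_pkg_tuple(rpmdb, t)[0] for t in pkgtups]


def match_guarded(rpmdb, pkgtups):
    # Skips tuples whose lookup comes back empty instead of indexing blindly.
    matches = []
    for t in pkgtups:
        found = search_pkg_tuple(rpmdb, t)
        if found:
            matches.append(found[0])
    return matches


# Illustrative data: hdparm is requested but has no record in the list.
rpmdb = [{"name": "leveldb", "arch": "x86_64"}]
wanted = [("leveldb", "x86_64"), ("hdparm", "x86_64")]

try:
    match_unguarded(rpmdb, wanted)
    unguarded_error = None
except IndexError as exc:
    unguarded_error = str(exc)

guarded = match_guarded(rpmdb, wanted)
```

The unguarded variant fails with the same message as the report as soon as one requested package has no matching record, which is consistent with a package database left in an inconsistent state.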
Updated by Yuri Weinstein about 8 years ago
- Project changed from ceph-cm-ansible to sepia
- Assignee set to David Galloway
I suspect smithi005, smithi007, smithi008, and smithi028 are at fault; I have marked them down.
Updated by David Galloway about 8 years ago
- Category set to Test Node
- Status changed from New to In Progress
Machines are getting forcefully rebooted during yum transactions. The tests need to be modified or the rpmdb is just going to continue getting corrupted during these powercycle runs.
Using smithi008 as an example...
http://pulpito.ceph.com/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/72776/
The testnode clock is in UTC and the teuthology logs are in PST, so keep that in mind when comparing timestamps.
[root@smithi008 ~]# yum history list
Loaded plugins: fastestmirror, langpacks, priorities
ID     | Command line             | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
...
  3977 | remove librados2 -y      | 2016-03-19 07:24 | Erase          |   11
  3976 | remove libcephfs1 -y     | 2016-03-19 07:24 | Erase          |    3 EE
  3975 | erase ceph-release-1-0.e | 2016-03-19 07:24 | Erase          |    1 PP
  3974 | install ceph-radosgw -y  | 2016-03-19 07:21 | I, U           |   28 **
  3973 | -y localinstall ceph-rel | 2016-03-19 07:21 | Install        |    1
  3972 | -d 2 -y install @core @b | 2016-03-19 07:17 | Install        |    5
  3971 | -d 2 -y remove boost-ran | 2016-03-19 07:16 | Erase          |   13
  3970 | erase ceph-release-1-0.e | 2016-03-19 07:09 | Erase          |    1
  3969 | remove librados2 -y      | 2016-03-19 07:08 | Erase          |   12
[root@smithi008 ~]# yum history info 3974
Loaded plugins: fastestmirror, langpacks, priorities
Transaction ID : 3974
Begin time     : Sat Mar 19 07:21:54 2016
Begin rpmdb    : 880:3a47cdc06c27deff05f23b1c6283ab364f4fa496
User           : <ubuntu>
Return-Code    : ** Aborted **
Command Line   : install ceph-radosgw -y
Transaction performed with:
    Installed rpm-4.11.3-17.el7.x86_64 @anaconda
    Installed yum-3.4.3-132.el7.centos.0.1.noarch @anaconda
    Installed yum-metadata-parser-1.1.4-10.el7.x86_64 @anaconda
    Installed yum-plugin-fastestmirror-1.1.31-34.el7.noarch @anaconda
Packages Altered:
    Dep-Install boost-program-options-1.53.0-25.el7.x86_64 @base
    Dep-Install boost-random-1.53.0-25.el7.x86_64 @base
 ** Dep-Install ceph-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install ceph-base-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install ceph-common-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install ceph-mds-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install ceph-mon-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install ceph-osd-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Install ceph-radosgw-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install ceph-selinux-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install hdparm-9.43-5.el7.x86_64 @base
    Dep-Install leveldb-1.12.0-5.el7.x86_64 @epel
    Dep-Install libcephfs1-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Updated librados2-1:0.80.7-3.el7.x86_64 @base
    Update 1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install libradosstriper1-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Updated librbd1-1:0.80.7-3.el7.x86_64 @base
    Update 1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install librgw2-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install python-babel-0.9.6-8.el7.noarch @base
    Dep-Install python-cephfs-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Dep-Install python-flask-1:0.10.1-4.el7.noarch @extras
 ** Dep-Install python-itsdangerous-0.23-2.el7.noarch @extras
    Dep-Install python-jinja2-2.7.2-2.el7.noarch @base
    Dep-Install python-markupsafe-0.11-10.el7.x86_64 @base
    Dep-Install python-rados-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install python-rbd-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
    Dep-Install python-requests-2.6.0-1.el7_1.noarch @base
    Dep-Install python-urllib3-1.10.2-2.el7_1.noarch @base
    Dep-Install python-werkzeug-0.9.1-2.el7.noarch @extras
history info
You can even see in the teuthology log that yum output is still being produced on smithi008 when a forceful reboot command is sent.
2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.386 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi004.front.sepia.ceph.com
2016-03-19T00:22:00.387 INFO:teuthology.nuke:rebooting ubuntu@smithi004.front.sepia.ceph.com
2016-03-19T00:22:00.387 INFO:teuthology.orchestra.run.smithi004:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.389 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.394 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.407 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi021.front.sepia.ceph.com
2016-03-19T00:22:00.407 INFO:teuthology.nuke:rebooting ubuntu@smithi021.front.sepia.ceph.com
2016-03-19T00:22:00.408 INFO:teuthology.orchestra.run.smithi021:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.414 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.470 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64 10/30
2016-03-19T00:22:00.490 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-urllib3-1.10.2-2.el7_1.noarch 12/30
2016-03-19T00:22:00.500 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi028.front.sepia.ceph.com
2016-03-19T00:22:00.500 INFO:teuthology.nuke:rebooting ubuntu@smithi028.front.sepia.ceph.com
2016-03-19T00:22:00.501 INFO:teuthology.orchestra.run.smithi028:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.507 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.614 INFO:teuthology.orchestra.run.smithi004.stdout: Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64 10/30
2016-03-19T00:22:02.979 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : 1:libradosstriper1-10.0.5-2514.g4b97cd7.el7.x86_64 11/30
2016-03-19T00:22:03.161 INFO:teuthology.orchestra.run.smithi004.stdout: Installing : 1:libradosstriper1-10.0.5-2514.g4b97cd7.el7.x86_64 11/30
2016-03-19T00:22:03.399 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : python-urllib3-1.10.2-2.el7_1.noarch 12/30
2016-03-19T00:22:03.600 INFO:teuthology.orchestra.run.smithi004.stdout: Installing : python-urllib3-1.10.2-2.el7_1.noarch 12/30
2016-03-19T00:22:04.402 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-requests-2.6.0-1.el7_1.noarch 13/30
2016-03-19T00:22:05.047 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : 1:ceph-common-10.0.5-2514.g4b97cd7.el7.x86_64 14/30
2016-03-19T00:22:05.438 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : python-requests-2.6.0-1.el7_1.noarch 13/30
2016-03-19T00:22:05.611 INFO:teuthology.orchestra.run.smithi021.stderr:Rebooting.
2016-03-19T00:22:05.698 INFO:teuthology.orchestra.run.smithi004.stdout: Installing : python-requests-2.6.0-1.el7_1.noarch 13/30
2016-03-19T00:22:05.904 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-werkzeug-0.9.1-2.el7.noarch 15/30
2016-03-19T00:22:06.158 INFO:teuthology.orchestra.run.smithi004.stderr:Rebooting.
2016-03-19T00:22:06.511 INFO:teuthology.orchestra.run.smithi008.stderr:Rebooting.
2016-03-19T00:22:06.631 INFO:teuthology.orchestra.run.smithi028.stderr:Rebooting.
2016-03-19T00:22:06.984 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-babel-0.9.6-8.el7.noarch 16/30
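The overlap visible in the excerpt above can also be checked mechanically. The sketch below scans teuthology.log lines and counts, per host, the yum "Installing" lines logged after that host's forced reboot command was issued. The regular expression is an assumption based only on the line format quoted in this report, and installs_after_reboot is a hypothetical helper, not part of teuthology.

```python
import re

# Matches lines such as:
#   <timestamp> INFO:teuthology.orchestra.run.<host>:Running: '...'
#   <timestamp> INFO:teuthology.orchestra.run.<host>.stdout: Installing : ...
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \w+:teuthology\.orchestra\.run\.(?P<host>\w+)(?P<rest>.*)$"
)


def installs_after_reboot(log_lines):
    """Map host -> number of 'Installing' lines seen after its 'reboot -f' command."""
    reboot_sent = set()
    late_installs = {}
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # e.g. teuthology.nuke lines, which carry no per-host output
        host, rest = m.group("host"), m.group("rest")
        if "reboot -f" in rest:
            reboot_sent.add(host)
        elif host in reboot_sent and "Installing" in rest:
            late_installs[host] = late_installs.get(host, 0) + 1
    return late_installs


# A few lines copied from the excerpt above.
sample = [
    "2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'",
    "2016-03-19T00:22:00.470 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64 10/30",
    "2016-03-19T00:22:00.490 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-urllib3-1.10.2-2.el7_1.noarch 12/30",
    "2016-03-19T00:22:00.501 INFO:teuthology.orchestra.run.smithi028:Running: 'sync & sleep 5 ; sudo reboot -f -n'",
    "2016-03-19T00:22:02.979 INFO:teuthology.orchestra.run.smithi008.stdout: Installing : 1:libradosstriper1-10.0.5-2514.g4b97cd7.el7.x86_64 11/30",
    "2016-03-19T00:22:04.402 INFO:teuthology.orchestra.run.smithi028.stdout: Installing : python-requests-2.6.0-1.el7_1.noarch 13/30",
]

result = installs_after_reboot(sample)
print(result)  # prints {'smithi008': 2, 'smithi028': 1}
```

Any host with a non-zero count was still inside a yum transaction when the forced reboot was requested.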
Updated by David Galloway about 8 years ago
- Related to Bug #15208: Reimage smithi022 added
Updated by Yuri Weinstein about 8 years ago
Maybe the smithi nodes are too fast and we need to add a sleep somewhere? Not sure.
David, it's worth filing a separate bug against the powercycle suite.
Updated by Sage Weil about 8 years ago
2016-03-19T00:21:59.229 DEBUG:teuthology.run_tasks:Unwinding manager internal.lock_machines
2016-03-19T00:21:59.262 DEBUG:teuthology.run_tasks:Exception was not quenched, exiting: CommandFailedError: Command failed on smithi021 with status 1: 'sudo yum install ceph-radosgw -y'
2016-03-19T00:21:59.262 INFO:teuthology.nuke:Checking targets against current locks
2016-03-19T00:21:59.451 DEBUG:teuthology.nuke:shortname: smithi008
Is something triggering a nuke that runs while the install is still running?
Updated by Yuri Weinstein about 8 years ago
The test does not seem to be rebooting a node during install, hmm...
tasks:
- install: null
- ceph: null
- thrashosds:
    chance_down: 1.0
    powercycle: true
    timeout: 600
- ceph-fuse: null
- workunit:
    clients:
      client.0:
      - rados/test.sh
Updated by Yuri Weinstein about 8 years ago
Sage Weil wrote:
[...]
Is something triggering a nuke that runs while the install is still running?
I was running nuke/stale over the weekend. If the stale state was misreported, could that have been the cause?
Updated by David Galloway about 8 years ago
Yuri Weinstein wrote:
Sage Weil wrote:
[...]
Is something triggering a nuke that runs while the install is still running?
I was running nuke/stale over the weekend. If the stale state was misreported, could that have been the cause?
I don't think that is the cause, since the reboot command appears in this job's own teuthology.log:
2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'
Updated by David Galloway about 8 years ago
I'm leaving smithi008 in its current bad state and reimaging the rest of the down smithis.
smithi005.front.sepia.ceph.com  unlocked  None  "by yuriw see http://tracker.ceph.com/issues/15214"
smithi007.front.sepia.ceph.com  unlocked  None  "by yuriw see http://tracker.ceph.com/issues/15214"
smithi008.front.sepia.ceph.com  unlocked  None  "by yuriw see http://tracker.ceph.com/issues/15214"
smithi010.front.sepia.ceph.com  unlocked  None  "None"
smithi011.front.sepia.ceph.com  unlocked  None  "None"
smithi012.front.sepia.ceph.com  unlocked  None  "yum failures"
smithi018.front.sepia.ceph.com  unlocked  None  "None"
smithi019.front.sepia.ceph.com  unlocked  None  "None"
smithi021.front.sepia.ceph.com  unlocked  None  "yum failures"
smithi022.front.sepia.ceph.com  unlocked  None  "rpmdb corruption"
smithi027.front.sepia.ceph.com  unlocked  None  "ansible_failures"
smithi028.front.sepia.ceph.com  unlocked  None  "yum_failing"
Updated by David Galloway about 8 years ago
- Project changed from sepia to teuthology
- Category deleted (Test Node)
Updated by David Galloway about 8 years ago
Related to http://tracker.ceph.com/issues/15229
Updated by David Galloway about 8 years ago
- Project changed from teuthology to ceph-cm-ansible
Updated by Nathan Cutler about 8 years ago
- Related to Bug #15272: ansible failure in [testnode | Ensure ceph dependency packages are not present] added
Updated by David Galloway almost 8 years ago
- Status changed from In Progress to Resolved