Bug #15214

closed

"IndexError: list index out of range" in powercycle-jewel-testing-basic-smithi

Added by Yuri Weinstein about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: powercycle
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/
Jobs: ['72750', '72755', '72757', '72761', '72766', '72771', '72773']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/72757/teuthology.log

2016-03-19T04:00:57.919 INFO:teuthology.task.ansible.out:failed: [smithi008.front.sepia.ceph.com] => (item=boost-random,boost-program-options,leveldb,xmlstarlet,python-jinja2,python-ceph,python-flask,python-requests,boost-random,python-urllib3,python-babel,hdparm,python-markupsafe,python-werkzeug,python-itsdangerous) => {"changed": false, "failed": true, "item": "boost-random,boost-program-options,leveldb,xmlstarlet,python-jinja2,python-ceph,python-flask,python-requests,boost-random,python-urllib3,python-babel,hdparm,python-markupsafe,python-werkzeug,python-itsdangerous", "rc": 1, "results": ["xmlstarlet is not installed", "python-ceph is not installed", "python-flask is not installed", "hdparm is not installed", "Loaded plugins: fastestmirror, langpacks, priorities\n"]}
msg: Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 365, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 174, in main
    result, resultmsgs = base.doCommands()

2016-03-19T04:00:57.921 INFO:teuthology.task.ansible.out:  File "/usr/share/yum-cli/cli.py", line 573, in doCommands
    return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
  File "/usr/share/yum-cli/yumcommands.py", line 878, in doCommand
    ret = base.erasePkgs(extcmds, pos=pos, basecmd=basecmd)
  File "/usr/share/yum-cli/cli.py", line 1198, in erasePkgs
    rms = self.remove(pattern=arg)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 5399, in remove
    (e,m,u) = self.rpmdb.matchPackageNames([kwargs['pattern']])
  File "/usr/lib/python2.7/site-packages/yum/packageSack.py", line 304, in matchPackageNames
    exactmatch.append(self.searchPkgTuple(pkgtup)[0])
IndexError: list index out of range
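
The last frame shows what actually failed: `searchPkgTuple(pkgtup)` returned an empty list, so indexing it with `[0]` raised the `IndexError`. The sketch below reproduces that pattern with a toy stand-in. `ToyRpmdb` and its fields are hypothetical and are not yum's real classes; the sketch assumes the name index and the package records disagree, which is one plausible state for a database left behind by an interrupted transaction.

```python
class ToyRpmdb:
    """Hypothetical stand-in for an rpmdb whose indexes disagree."""

    def __init__(self, name_index, records):
        self.name_index = name_index  # package name -> package tuple
        self.records = records        # package tuple -> list of package objects

    def search_pkg_tuple(self, pkgtup):
        # Returns an empty list when no record exists for the tuple.
        return self.records.get(pkgtup, [])

    def match_package_names(self, names):
        exact = []
        for name in names:
            pkgtup = self.name_index.get(name)
            if pkgtup is None:
                continue
            # Same shape as the failing line in the traceback:
            # [0] applied to a list that may be empty.
            exact.append(self.search_pkg_tuple(pkgtup)[0])
        return exact


# The name index still lists hdparm, but its record is missing.
broken = ToyRpmdb(
    name_index={"hdparm": ("hdparm", "x86_64", "0", "9.43", "5.el7")},
    records={},
)

try:
    broken.match_package_names(["hdparm"])
    outcome = "ok"
except IndexError as exc:
    outcome = "IndexError: %s" % exc

print(outcome)  # IndexError: list index out of range
```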

Related issues 1 (0 open, 1 closed)

Related to ceph-cm-ansible - Bug #15272: ansible failure in [testnode | Ensure ceph dependency packages are not present] - Resolved - Samuel Just - 03/24/2016

Actions #1

Updated by Yuri Weinstein about 8 years ago

  • Project changed from ceph-cm-ansible to sepia
  • Assignee set to David Galloway

I suspect smithi005, smithi007, smithi008, and smithi028 are at fault; marked them down.

Actions #2

Updated by David Galloway about 8 years ago

  • Category set to Test Node
  • Status changed from New to In Progress

Machines are getting forcefully rebooted during yum transactions. The tests need to be modified or the rpmdb is just going to continue getting corrupted during these powercycle runs.

Using smithi008 as an example...

http://pulpito.ceph.com/teuthology-2016-03-18_14:41:27-powercycle-jewel-testing-basic-smithi/72776/

The testnode clock is in UTC and the teuthology log is in Pacific time, so keep that in mind when comparing timestamps.
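
To line the two clocks up, the small sketch below converts a teuthology.log timestamp to UTC. The 7-hour offset is an assumption inferred from the timestamps in this report (yum transaction 3974 begins at 07:21:54 on the node, while the matching install output is logged around 00:22), and `log_time_to_utc` is a hypothetical helper, not part of teuthology.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log timestamps are 7 hours behind UTC, inferred from
# the timestamps quoted in this report.
LOG_TZ = timezone(timedelta(hours=-7))


def log_time_to_utc(stamp):
    """Convert a teuthology.log timestamp string to an aware UTC datetime."""
    local = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    return local.replace(tzinfo=LOG_TZ).astimezone(timezone.utc)


# Timestamp of the forced reboot command sent to smithi008.
reboot_sent = log_time_to_utc("2016-03-19T00:22:00.382")
print(reboot_sent.strftime("%Y-%m-%d %H:%M:%S"))  # 2016-03-19 07:22:00
```

Under that assumption, the reboot command went out only a few seconds after transaction 3974 started on the node.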

[root@smithi008 ~]# yum history list
Loaded plugins: fastestmirror, langpacks, priorities
ID     | Command line             | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
...

  3977 | remove librados2 -y      | 2016-03-19 07:24 | Erase          |   11   
  3976 | remove libcephfs1 -y     | 2016-03-19 07:24 | Erase          |    3 EE
  3975 | erase ceph-release-1-0.e | 2016-03-19 07:24 | Erase          |    1 PP
  3974 | install ceph-radosgw -y  | 2016-03-19 07:21 | I, U           |   28 **
  3973 | -y localinstall ceph-rel | 2016-03-19 07:21 | Install        |    1   
  3972 | -d 2 -y install @core @b | 2016-03-19 07:17 | Install        |    5   
  3971 | -d 2 -y remove boost-ran | 2016-03-19 07:16 | Erase          |   13   
  3970 | erase ceph-release-1-0.e | 2016-03-19 07:09 | Erase          |    1   
  3969 | remove librados2 -y      | 2016-03-19 07:08 | Erase          |   12   
[root@smithi008 ~]# yum history info 3974
Loaded plugins: fastestmirror, langpacks, priorities
Transaction ID : 3974
Begin time     : Sat Mar 19 07:21:54 2016
Begin rpmdb    : 880:3a47cdc06c27deff05f23b1c6283ab364f4fa496
User           :  <ubuntu>
Return-Code    : ** Aborted **
Command Line   : install ceph-radosgw -y
Transaction performed with:
    Installed     rpm-4.11.3-17.el7.x86_64                      @anaconda
    Installed     yum-3.4.3-132.el7.centos.0.1.noarch           @anaconda
    Installed     yum-metadata-parser-1.1.4-10.el7.x86_64       @anaconda
    Installed     yum-plugin-fastestmirror-1.1.31-34.el7.noarch @anaconda
Packages Altered:
    Dep-Install boost-program-options-1.53.0-25.el7.x86_64         @base
    Dep-Install boost-random-1.53.0-25.el7.x86_64                  @base
 ** Dep-Install ceph-1:10.0.5-2514.g4b97cd7.el7.x86_64             @Ceph
 ** Dep-Install ceph-base-1:10.0.5-2514.g4b97cd7.el7.x86_64        @Ceph
    Dep-Install ceph-common-1:10.0.5-2514.g4b97cd7.el7.x86_64      @Ceph
 ** Dep-Install ceph-mds-1:10.0.5-2514.g4b97cd7.el7.x86_64         @Ceph
 ** Dep-Install ceph-mon-1:10.0.5-2514.g4b97cd7.el7.x86_64         @Ceph
 ** Dep-Install ceph-osd-1:10.0.5-2514.g4b97cd7.el7.x86_64         @Ceph
 ** Install     ceph-radosgw-1:10.0.5-2514.g4b97cd7.el7.x86_64     @Ceph
 ** Dep-Install ceph-selinux-1:10.0.5-2514.g4b97cd7.el7.x86_64     @Ceph
 ** Dep-Install hdparm-9.43-5.el7.x86_64                           @base
    Dep-Install leveldb-1.12.0-5.el7.x86_64                        @epel
    Dep-Install libcephfs1-1:10.0.5-2514.g4b97cd7.el7.x86_64       @Ceph
 ** Updated     librados2-1:0.80.7-3.el7.x86_64                    @base
    Update                1:10.0.5-2514.g4b97cd7.el7.x86_64        @Ceph
    Dep-Install libradosstriper1-1:10.0.5-2514.g4b97cd7.el7.x86_64 @Ceph
 ** Updated     librbd1-1:0.80.7-3.el7.x86_64                      @base
    Update              1:10.0.5-2514.g4b97cd7.el7.x86_64          @Ceph
    Dep-Install librgw2-1:10.0.5-2514.g4b97cd7.el7.x86_64          @Ceph
    Dep-Install python-babel-0.9.6-8.el7.noarch                    @base
    Dep-Install python-cephfs-1:10.0.5-2514.g4b97cd7.el7.x86_64    @Ceph
 ** Dep-Install python-flask-1:0.10.1-4.el7.noarch                 @extras
 ** Dep-Install python-itsdangerous-0.23-2.el7.noarch              @extras
    Dep-Install python-jinja2-2.7.2-2.el7.noarch                   @base
    Dep-Install python-markupsafe-0.11-10.el7.x86_64               @base
    Dep-Install python-rados-1:10.0.5-2514.g4b97cd7.el7.x86_64     @Ceph
    Dep-Install python-rbd-1:10.0.5-2514.g4b97cd7.el7.x86_64       @Ceph
    Dep-Install python-requests-2.6.0-1.el7_1.noarch               @base
    Dep-Install python-urllib3-1.10.2-2.el7_1.noarch               @base
    Dep-Install python-werkzeug-0.9.1-2.el7.noarch                 @extras
history info
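
A quick way to spot this on other nodes is to scan the `yum history list` table for the `**` marker, which `yum history info 3974` reports above as `** Aborted **`. The sketch below parses rows in the layout shown above; `parse_history` and `aborted_ids` are hypothetical helper names, and the meaning of the other flags (`EE`, `PP`) is deliberately not interpreted here.

```python
# Sample rows copied from the `yum history list` output above.
HISTORY = """\
ID     | Command line             | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
  3977 | remove librados2 -y      | 2016-03-19 07:24 | Erase          |   11
  3976 | remove libcephfs1 -y     | 2016-03-19 07:24 | Erase          |    3 EE
  3975 | erase ceph-release-1-0.e | 2016-03-19 07:24 | Erase          |    1 PP
  3974 | install ceph-radosgw -y  | 2016-03-19 07:21 | I, U           |   28 **
  3973 | -y localinstall ceph-rel | 2016-03-19 07:21 | Install        |    1
"""


def parse_history(text):
    """Return one dict per transaction row; skip headers and separators."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        altered = parts[4].split()
        rows.append({
            "id": int(parts[0]),
            "command": parts[1],
            "date": parts[2],
            "actions": parts[3],
            "altered": int(altered[0]),
            "flags": altered[1] if len(altered) > 1 else "",
        })
    return rows


def aborted_ids(rows):
    """IDs of transactions whose flag column contains '*'."""
    return [row["id"] for row in rows if "*" in row["flags"]]


rows = parse_history(HISTORY)
print(aborted_ids(rows))  # [3974]
```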

You can even see in the teuthology log that yum output is still being produced on smithi008 when a forceful reboot command is sent.

2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.386 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi004.front.sepia.ceph.com
2016-03-19T00:22:00.387 INFO:teuthology.nuke:rebooting ubuntu@smithi004.front.sepia.ceph.com
2016-03-19T00:22:00.387 INFO:teuthology.orchestra.run.smithi004:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.389 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.394 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.407 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi021.front.sepia.ceph.com
2016-03-19T00:22:00.407 INFO:teuthology.nuke:rebooting ubuntu@smithi021.front.sepia.ceph.com
2016-03-19T00:22:00.408 INFO:teuthology.orchestra.run.smithi021:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.414 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.470 INFO:teuthology.orchestra.run.smithi008.stdout:  Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64               10/30
2016-03-19T00:22:00.490 INFO:teuthology.orchestra.run.smithi028.stdout:  Installing : python-urllib3-1.10.2-2.el7_1.noarch                       12/30
2016-03-19T00:22:00.500 DEBUG:teuthology.nuke:no kernel mount on ubuntu@smithi028.front.sepia.ceph.com
2016-03-19T00:22:00.500 INFO:teuthology.nuke:rebooting ubuntu@smithi028.front.sepia.ceph.com
2016-03-19T00:22:00.501 INFO:teuthology.orchestra.run.smithi028:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.507 INFO:teuthology.nuke:waiting for nodes to reboot
2016-03-19T00:22:00.614 INFO:teuthology.orchestra.run.smithi004.stdout:  Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64               10/30
2016-03-19T00:22:02.979 INFO:teuthology.orchestra.run.smithi008.stdout:  Installing : 1:libradosstriper1-10.0.5-2514.g4b97cd7.el7.x86_64         11/30
2016-03-19T00:22:03.161 INFO:teuthology.orchestra.run.smithi004.stdout:  Installing : 1:libradosstriper1-10.0.5-2514.g4b97cd7.el7.x86_64         11/30
2016-03-19T00:22:03.399 INFO:teuthology.orchestra.run.smithi008.stdout:  Installing : python-urllib3-1.10.2-2.el7_1.noarch                       12/30
2016-03-19T00:22:03.600 INFO:teuthology.orchestra.run.smithi004.stdout:  Installing : python-urllib3-1.10.2-2.el7_1.noarch                       12/30
2016-03-19T00:22:04.402 INFO:teuthology.orchestra.run.smithi028.stdout:  Installing : python-requests-2.6.0-1.el7_1.noarch                       13/30
2016-03-19T00:22:05.047 INFO:teuthology.orchestra.run.smithi028.stdout:  Installing : 1:ceph-common-10.0.5-2514.g4b97cd7.el7.x86_64              14/30
2016-03-19T00:22:05.438 INFO:teuthology.orchestra.run.smithi008.stdout:  Installing : python-requests-2.6.0-1.el7_1.noarch                       13/30
2016-03-19T00:22:05.611 INFO:teuthology.orchestra.run.smithi021.stderr:Rebooting.
2016-03-19T00:22:05.698 INFO:teuthology.orchestra.run.smithi004.stdout:  Installing : python-requests-2.6.0-1.el7_1.noarch                       13/30
2016-03-19T00:22:05.904 INFO:teuthology.orchestra.run.smithi028.stdout:  Installing : python-werkzeug-0.9.1-2.el7.noarch                         15/30
2016-03-19T00:22:06.158 INFO:teuthology.orchestra.run.smithi004.stderr:Rebooting.
2016-03-19T00:22:06.511 INFO:teuthology.orchestra.run.smithi008.stderr:Rebooting.
2016-03-19T00:22:06.631 INFO:teuthology.orchestra.run.smithi028.stderr:Rebooting.
2016-03-19T00:22:06.984 INFO:teuthology.orchestra.run.smithi028.stdout:  Installing : python-babel-0.9.6-8.el7.noarch                            16/30
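
The same check can be done mechanically. The sketch below walks log lines in order and reports hosts that still print yum `Installing` output after a forced reboot command has been issued to them. The regular expressions are assumptions based only on the line formats in the excerpt above, and `hosts_rebooted_mid_install` is a hypothetical helper, not a teuthology function.

```python
import re

# A few lines copied from the excerpt above.
LOG = """\
2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.408 INFO:teuthology.orchestra.run.smithi021:Running: 'sync & sleep 5 ; sudo reboot -f -n'
2016-03-19T00:22:00.470 INFO:teuthology.orchestra.run.smithi008.stdout:  Installing : 1:python-rbd-10.0.5-2514.g4b97cd7.el7.x86_64               10/30
2016-03-19T00:22:05.611 INFO:teuthology.orchestra.run.smithi021.stderr:Rebooting.
2016-03-19T00:22:06.511 INFO:teuthology.orchestra.run.smithi008.stderr:Rebooting.
"""

# Assumed patterns, derived from the excerpt rather than from teuthology itself.
REBOOT = re.compile(r"orchestra\.run\.(\w+):Running: '.*reboot -f")
INSTALL = re.compile(r"orchestra\.run\.(\w+)\.stdout:\s+Installing\s+:")


def hosts_rebooted_mid_install(lines):
    """Hosts with yum install output logged after their reboot command."""
    reboot_requested = set()
    interrupted = set()
    for line in lines:
        match = REBOOT.search(line)
        if match:
            reboot_requested.add(match.group(1))
            continue
        match = INSTALL.search(line)
        if match and match.group(1) in reboot_requested:
            interrupted.add(match.group(1))
    return sorted(interrupted)


result = hosts_rebooted_mid_install(LOG.splitlines())
print(result)  # ['smithi008']
```

Log lines can be buffered, so a hit here is a strong hint rather than proof that the package transaction was cut short; `yum history` on the node is the confirmation.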
Actions #3

Updated by David Galloway about 8 years ago

Actions #4

Updated by Yuri Weinstein about 8 years ago

Maybe the smithi nodes are too fast and we need to add a sleep somewhere? Not sure.
David, this is worth a separate bug against the powercycle suite.

Actions #5

Updated by Sage Weil about 8 years ago

2016-03-19T00:21:59.229 DEBUG:teuthology.run_tasks:Unwinding manager internal.lock_machines
2016-03-19T00:21:59.262 DEBUG:teuthology.run_tasks:Exception was not quenched, exiting: CommandFailedError: Command failed on smithi021 with status 1: 'sudo yum install ceph-radosgw -y'
2016-03-19T00:21:59.262 INFO:teuthology.nuke:Checking targets against current locks
2016-03-19T00:21:59.451 DEBUG:teuthology.nuke:shortname: smithi008

Is something triggering a nuke while the install is still running?

Actions #6

Updated by Yuri Weinstein about 8 years ago

The test does not seem to be rebooting a node during install, hmm...

tasks:
- install: null
- ceph: null
- thrashosds:
    chance_down: 1.0
    powercycle: true
    timeout: 600
- ceph-fuse: null
- workunit:
    clients:
      client.0:
      - rados/test.sh
Actions #7

Updated by Yuri Weinstein about 8 years ago

Sage Weil wrote:

[...]

Is something triggering a nuke while the install is still running?

I was running nuke/stale over the weekend. If the stale state was misreported, could that have been the cause?

Actions #8

Updated by David Galloway about 8 years ago

Yuri Weinstein wrote:

Sage Weil wrote:

[...]

Is something triggering a nuke while the install is still running?

I was running nuke/stale over the weekend. If the stale state was misreported, could that have been the cause?

I don't think this is it since this is in teuthology.log:

2016-03-19T00:22:00.382 INFO:teuthology.orchestra.run.smithi008:Running: 'sync & sleep 5 ; sudo reboot -f -n'

Actions #9

Updated by David Galloway about 8 years ago

I'm leaving smithi008 in its current bad state and reimaging the rest of the down smithis.

smithi005.front.sepia.ceph.com unlocked None "by yuriw see http://tracker.ceph.com/issues/15214" 
smithi007.front.sepia.ceph.com unlocked None "by yuriw see http://tracker.ceph.com/issues/15214" 
smithi008.front.sepia.ceph.com unlocked None "by yuriw see http://tracker.ceph.com/issues/15214" 
smithi010.front.sepia.ceph.com unlocked None "None" 
smithi011.front.sepia.ceph.com unlocked None "None" 
smithi012.front.sepia.ceph.com unlocked None "yum failures" 
smithi018.front.sepia.ceph.com unlocked None "None" 
smithi019.front.sepia.ceph.com unlocked None "None" 
smithi021.front.sepia.ceph.com unlocked None "yum failures" 
smithi022.front.sepia.ceph.com unlocked None "rpmdb corruption" 
smithi027.front.sepia.ceph.com unlocked None "ansible_failures" 
smithi028.front.sepia.ceph.com unlocked None "yum_failing" 
Actions #10

Updated by David Galloway about 8 years ago

  • Project changed from sepia to teuthology
  • Category deleted (Test Node)
Actions #11

Updated by David Galloway about 8 years ago

  • Assignee deleted (David Galloway)
Actions #13

Updated by David Galloway about 8 years ago

  • Project changed from teuthology to ceph-cm-ansible
Actions #14

Updated by Nathan Cutler about 8 years ago

  • Related to Bug #15272: ansible failure in [testnode | Ensure ceph dependency packages are not present] added
Actions #15

Updated by David Galloway almost 8 years ago

  • Status changed from In Progress to Resolved