Bug #37335
Status: Closed
QA run failures "Command failed on smithi with status 1: '\n sudo yum -y install ceph-radosgw\n ' "
Description
When running the QA tests for dashboard PRs we often get the above mentioned error.
More detailed error message from the logs:
2018-11-19T16:22:00.644 INFO:teuthology.orchestra.run.smithi047.stderr:Error: Package: 2:ceph-selinux-14.0.1-881.g09f2bb4.el7.x86_64 (ceph)
2018-11-19T16:22:00.644 INFO:teuthology.orchestra.run.smithi047.stderr: Requires: selinux-policy-base >= 3.13.1-229.el7
2018-11-19T16:22:00.644 INFO:teuthology.orchestra.run.smithi047.stderr: Installed: selinux-policy-targeted-3.13.1-192.el7_5.6.noarch (@updates)
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.6
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-minimum-3.13.1-192.el7.noarch (base)
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-minimum-3.13.1-192.el7_5.3.noarch (updates)
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.3
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-minimum-3.13.1-192.el7_5.4.noarch (updates)
2018-11-19T16:22:00.645 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.4
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-minimum-3.13.1-192.el7_5.6.noarch (updates)
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.6
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-mls-3.13.1-192.el7.noarch (base)
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-mls-3.13.1-192.el7_5.3.noarch (updates)
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.3
2018-11-19T16:22:00.646 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-mls-3.13.1-192.el7_5.4.noarch (updates)
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.4
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-mls-3.13.1-192.el7_5.6.noarch (updates)
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.6
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-targeted-3.13.1-192.el7.noarch (base)
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-targeted-3.13.1-192.el7_5.3.noarch (updates)
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.3
2018-11-19T16:22:00.647 INFO:teuthology.orchestra.run.smithi047.stderr: Available: selinux-policy-targeted-3.13.1-192.el7_5.4.noarch (updates)
2018-11-19T16:22:00.648 INFO:teuthology.orchestra.run.smithi047.stderr: selinux-policy-base = 3.13.1-192.el7_5.4
2018-11-19T16:22:00.648 INFO:teuthology.orchestra.run.smithi047.stdout: You could try using --skip-broken to work around the problem
2018-11-19T16:22:08.189 INFO:teuthology.orchestra.run.smithi187.stdout: You could try running: rpm -Va --nofiles --nodigest
2018-11-19T16:22:08.267 DEBUG:teuthology.orchestra.run:got remote process result: 1
2018-11-19T16:22:08.268 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/contextutil.py", line 30, in nested
vars.append(enter())
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/__init__.py", line 258, in install
install_packages(ctx, package_list, config)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/__init__.py", line 125, in install_packages
ctx, remote, pkgs[system_type], config)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 85, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 99, in next
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 22, in capture_traceback
return func(*args, **kwargs)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/rpm.py", line 180, in _update_package_list_and_install
cpack=cpack))
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 194, in run
r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 430, in run
r.wait()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 162, in wait
self._raise_for_status()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 184, in _raise_for_status
node=self.hostname, label=self.label
CommandFailedError: Command failed on smithi187 with status 1: '\n sudo yum -y install ceph-radosgw\n '
2018-11-19T16:22:08.269 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 89, in run_tasks
manager.__enter__()
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/__init__.py", line 627, in task
lambda: ship_utilities(ctx=ctx, config=None),
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/contextutil.py", line 30, in nested
vars.append(enter())
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/__init__.py", line 258, in install
install_packages(ctx, package_list, config)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/__init__.py", line 125, in install_packages
ctx, remote, pkgs[system_type], config)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 85, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 99, in next
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 22, in capture_traceback
return func(*args, **kwargs)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/rpm.py", line 180, in _update_package_list_and_install
cpack=cpack))
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 194, in run
r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 430, in run
r.wait()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 162, in wait
self._raise_for_status()
File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 184, in _raise_for_status
node=self.hostname, label=self.label
CommandFailedError: Command failed on smithi187 with status 1: '\n sudo yum -y install ceph-radosgw\n
Also see:
http://pulpito.ceph.com/laura-2018-11-16_13:47:15-rados:mgr-wip-lpaduano-testing-24851-distro-basic-smithi/3261399/
http://pulpito.ceph.com/pnawracay-2018-11-20_07:28:07-rados:mgr-pna-fix-safe-to-destroy-distro-basic-smithi/3273594/
http://pulpito.ceph.com/tdehler-2018-11-19_15:27:38-rados:mgr-wip-tdehler-testing-25121-distro-basic-smithi/3270274/
Updated by Kefu Chai over 5 years ago
It seems the CentOS build host was more up to date than the test nodes: it built the ceph packages with a requirement of "selinux-policy-base >= 3.13.1-229.el7", while the test node still had selinux-policy-base = 3.13.1-192.el7, which is older than 3.13.1-229.el7.
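The version mismatch Kefu describes can be sanity-checked without yum. A minimal sketch, using GNU `sort -V` as a stand-in for RPM's version comparison (the two agree for these particular strings):

```shell
# Which of the two selinux-policy-base versions is newer?
installed="3.13.1-192.el7_5.6"   # what the smithi test node had
required="3.13.1-229.el7"        # what the built ceph-selinux RPM demanded
newest=$(printf '%s\n' "$installed" "$required" | sort -V | tail -n1)
echo "newest: $newest"   # prints: newest: 3.13.1-229.el7
```

Since the required version sorts newer than anything the test node's configured repos could supply, the yum transaction fails exactly as shown in the log above.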
Updated by Kefu Chai over 5 years ago
- Project changed from mgr to sepia
- Category deleted (testing)
- Assignee set to Brad Hubbard
Updated by Brad Hubbard over 5 years ago
It looks to me as though the build machine has the Continuous Release repo [1] enabled, as that is the only place I can find the 3.13.1-229.el7 packages (although there could be other explanations along similar lines). If that's the case, then the test nodes need access to this repo as well. I think only the infrastructure folks can confirm this theory and do anything about it. I'll pursue it in the morning my time, when most of them should be around.
[1] https://wiki.centos.org/AdditionalResources/Repositories/CR
Updated by David Galloway over 5 years ago
The version of selinux-policy-base required is based on what the build host had installed, and not on what's in a spec file somewhere? [1] OK.
I don't see where in the slave-building Ansible [2] the CR repo gets added. How did it get there? Do we expect Ceph users to add it as well?
Adding the repo to the testnodes is relatively easy. I just feel like we've been adding random packages (python-jwt rings a bell [3]), and now repos, which add additional barriers to entry for users.
[1] https://github.com/ceph/ceph-ci/blob/wip-lpaduano-testing-24851-2/ceph.spec.in#L914
[2] https://github.com/ceph/ceph-build/blob/master/ansible/slave.yml
[3] https://tracker.ceph.com/issues/36653
Updated by Ken Dreyer over 5 years ago
I know of no reason to enable the CentOS CR repository on our build slaves. I recommend disabling it.
It would be great to build the RPMs with mock to isolate the packages from changes on the build slaves. (Like we do with pbuilder on Ubuntu.) This would make it easier to build for el7, el8, and Fedora on the same slaves.
Updated by Brad Hubbard over 5 years ago
After talking to David, Ken, and Alfredo, we believe this was caused by one of the build hosts being "dirty" at the time the build was done. The build log [1] shows "--> Already installed : selinux-policy-devel-3.13.1-229.el7.noarch", so it seems some '3.13.1-229' packages were already installed on the build host at that time. It was not my intention to suggest adding the CR repo to the test node as a permanent fix, but rather as a short-term solution. Obviously, the way to move forward is to work out how the newer packages are being introduced to the build host.
We may need to track these to try and establish a pattern.
Laura (and anyone else), can you post the build here when you see this again?
Updated by Brad Hubbard over 5 years ago
- Status changed from New to Need More Info
Updated by Brad Hubbard over 5 years ago
- Status changed from Need More Info to 12
So, with a few tips from Kefu, I found the following.
$ ag "install-deps" ceph-dev-new-setup/build/build
42:if [ -x install-deps.sh ]; then
44:    ./install-deps.sh
So ceph-build's ceph-dev-new-setup step invokes install-deps.sh.
$ ag --nopager -B1 "enable cr" install-deps.sh
312-    if test $ID = centos -a $VERSION_ID = 7 ; then
313:        $SUDO yum-config-manager --enable cr
--
329-    elif test $ID = virtuozzo -a $MAJOR_VERSION = 7 ; then
330:        $SUDO yum-config-manager --enable cr
We are, therefore, enabling the CR repo on the build hosts for CentOS 7.
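For illustration, the gating condition found above can be reduced to a self-contained sketch. Here ID and VERSION_ID are set inline rather than coming from the distro detection install-deps.sh actually performs (presumably via /etc/os-release), so only the decision logic is shown:

```shell
# Stand-in values for what install-deps.sh detects on a CentOS 7 build host.
ID=centos
VERSION_ID=7

# Same shape as the test at install-deps.sh line 312: on CentOS 7 (and
# Virtuozzo 7) the CR repo gets enabled before dependencies are installed.
if test "$ID" = centos -a "$VERSION_ID" = 7 ; then
    echo "would run: yum-config-manager --enable cr"
else
    echo "CR repo left alone"
fi
```

This is why the build hosts ended up with CR packages such as selinux-policy-devel-3.13.1-229.el7 while the test nodes, which never run install-deps.sh, stayed on the base/updates versions.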
Why haven't we noticed this before?
1. There hadn't previously been a newer version of the selinux packages in the CR repo.
2. Not every job runs on CentOS, so the failure appears intermittent.
I'm building packages now to test this theory.
Updated by Brad Hubbard over 5 years ago
So I think that one possible solution for this on the build nodes would be to do something like the following before any call to install-deps.sh.
# yum-config-manager --enable cr
# TMP=$(grep -m1 ^#baseurl /etc/yum.repos.d/CentOS-Base.repo|sed -e 's/#//'); sed -i -e "s@^baseurl=.*@$TMP@" /etc/yum.repos.d/CentOS-CR.repo
That would set the baseurl for the CR repo to the same value as the Base repo, rendering the 'yum-config-manager --enable cr' command in install-deps.sh a no-op. Not particularly pretty, but it could accomplish what we want? Anyway, throwing it out there for comment.
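To make the proposal concrete, the same grep/sed dance can be run against throwaway copies of the two repo files. The file contents below are stand-ins for the real /etc/yum.repos.d entries, trimmed to the one line that matters:

```shell
# Throwaway stand-ins for CentOS-Base.repo and CentOS-CR.repo.
base=$(mktemp)
cr=$(mktemp)
printf '#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/\n' > "$base"
printf 'baseurl=http://mirror.centos.org/centos/$releasever/cr/$basearch/\n' > "$cr"

# Lift the (commented-out) baseurl from the Base repo, uncomment it, and
# force it into the CR repo, so "enabling" CR just points back at Base.
TMP=$(grep -m1 '^#baseurl' "$base" | sed -e 's/#//')
sed -i -e "s@^baseurl=.*@$TMP@" "$cr"

cat "$cr"   # now shows the Base repo's baseurl
rm -f "$base" "$cr"
```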
Updated by Alfredo Deza over 5 years ago
The commit that added that line mentions:
To get libunwind from the CR repositories until CentOS 7.2.1511 is released.
And references tracker issue: http://tracker.ceph.com/issues/13997
IMO we should remove that CR line, since the condition no longer seems to apply, and then test whether we still see the problem.
Updated by Brad Hubbard over 5 years ago
Haha, and not only that, I was even involved in getting that line put there! Talk about your fails :P Thanks Alfredo, I'll get that organised.
Updated by Brad Hubbard over 5 years ago
- Project changed from sepia to Ceph
- Status changed from 12 to In Progress
- Backport set to mimic, luminous
Updated by Brad Hubbard over 5 years ago
- Project changed from Ceph to sepia
- Status changed from In Progress to Resolved
Updated by Ken Dreyer over 5 years ago
I've begun setting up the pieces to build within mock. https://wiki.centos.org/SpecialInterestGroup/Storage/Ceph/Mock
Updated by Brad Hubbard over 5 years ago
Good move Ken. Does CentOS have an equivalent of Fedora's copr?
Updated by Ken Dreyer over 5 years ago
Great question Brad. I looked at Copr's settings, and I think it's possible.
When you set up a Copr project in the web UI, if you de-select all the Chroots for the project, then choose "custom-1-x86_64", and then insert all the Yum repositories from the Storage SIG mock config (https://github.com/CentOS-Storage-SIG/mock-ceph-config/blob/master/storage7-ceph-nautilus-el7-x86_64.cfg) into the "External Repositories" section, it should work.
Updated by Brad Hubbard over 5 years ago
Thanks Ken, I'll give this a try when I do my weekly Fedora Copr run.
Updated by Nathan Cutler over 4 years ago
- Status changed from Resolved to Pending Backport
oops - this didn't get backported to mimic, causing #41603
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41644: luminous: QA run failures "Command failed on smithi with status 1: '\n sudo yum -y install ceph-radosgw\n ' " added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41645: mimic: QA run failures "Command failed on smithi with status 1: '\n sudo yum -y install ceph-radosgw\n ' " added
Updated by Nathan Cutler over 4 years ago
- Related to Bug #41603: "make check" failing in GitHub due to python packaging conflict added
Updated by Brad Hubbard over 4 years ago
Sorry about that. Strange that we didn't notice this in mimic for 10+ months.
Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved".
Updated by David Galloway over 4 years ago
Just to log this somewhere... @Dimitri Savineau reached out to me because ceph-container and ceph-ansible jobs were failing due to this.
slave-centos01 and slave-centos05 still had selinux-policy 3.13.1-252.el7 installed. I downgraded the package to 3.13.1-229 on those two hosts and ran yum-config-manager --disable cr on all of the slave-centos* Jenkins slaves.