Bug #18089

closed

Various official builds missing from CI/Shaman

Added by Abhishek Lekshmanan over 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Urgent
Category:
Infrastructure Service
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

While testing the current jewel branch, many tests are failing with a reason similar to:
Failed to fetch package version from
http://gitbuilder.ceph.com/ceph-deb-xenial-x86_64-basic/ref/infernalis/version

(The same happens for earlier versions such as firefly and dumpling.)
For example, in these runs:

http://pulpito.ceph.com/abhi-2016-11-30_10:12:21-rados-wip-jewel-10-2-4-distro-basic-smithi/

http://pulpito.ceph.com/abhi-2016-11-30_09:07:29-rados-wip-jewel-10-2-4-distro-basic-smithi/

Actions #1

Updated by Alfredo Deza over 7 years ago

  • Project changed from CI to Infrastructure
Actions #2

Updated by Abhishek Lekshmanan over 7 years ago

  • Description updated (diff)
Actions #3

Updated by David Galloway over 7 years ago

  • Project changed from Infrastructure to sepia
  • Category set to Gitbuilder
  • Assignee set to David Galloway
Actions #4

Updated by David Galloway over 7 years ago

  • Related to Bug #18069: rados suite upgrade tests fail on xenial nodes added
Actions #5

Updated by David Galloway over 7 years ago

Building infernalis on Xenial is stuck due to the distribute library not installing - http://tracker.ceph.com/issues/18069#note-6

I'm not quite sure where/how to fix this yet.

I forced a rebuild on v0.80.8. It had been marked PASS from when the gitbuilder was created.

I'm working on getting v0.67.10 on Xenial - http://tracker.ceph.com/issues/18069#note-4

Actions #6

Updated by David Galloway over 7 years ago

  • Status changed from New to Closed
Actions #7

Updated by Nathan Cutler about 7 years ago

  • Status changed from Closed to 12

This bug is still present in jewel integration testing. We have a branch, wip-jewel-backports, which is based on jewel. This test fails: http://pulpito.ceph.com/loic-2017-01-26_22:01:29-rados-wip-jewel-backports-distro-basic-smithi/753358/

The failure reason is "Failed to fetch package version from https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&ref=v0.80.8"

The test yaml has:

os_type: ubuntu
os_version: 14.04
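
For reference, the failing query can be reproduced outside teuthology. The following is a minimal sketch, not teuthology's own code: it assumes the distros parameter is simply os_type/os_version/arch joined with slashes, as the URL above suggests, and the function name shaman_search_url is hypothetical.

```python
from urllib.parse import urlencode


def shaman_search_url(os_type, os_version, arch, ref,
                      base="https://shaman.ceph.com/api/search/"):
    """Build a Shaman search URL shaped like the one in the failure reason.

    Hypothetical helper for illustration; parameter names are taken from
    the URL quoted in this comment.
    """
    params = [
        ("status", "ready"),
        ("project", "ceph"),
        ("flavor", "default"),
        # Slashes are percent-encoded as %2F by urlencode.
        ("distros", "{}/{}/{}".format(os_type, os_version, arch)),
        ("ref", ref),
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(shaman_search_url("ubuntu", "14.04", "x86_64", "v0.80.8"))
```

Fetching the printed URL and checking whether the response is an empty list should show quickly whether Shaman knows about a build for that ref and distro.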
Actions #8

Updated by Nathan Cutler about 7 years ago

Here's another instance of the same failure. This job installs dumpling, then upgrades to firefly, and then to the test branch (wip-jewel-backports in this case): http://pulpito.ceph.com/loic-2017-01-26_22:01:29-rados-wip-jewel-backports-distro-basic-smithi/753382/

Actions #9

Updated by Nathan Cutler about 7 years ago

jewel PR: https://github.com/ceph/ceph/pull/13153

This should fix upgrade:hammer-x/f-h-x-offline in jewel, but only once the firefly packages for Ubuntu 14.04 are fixed in Shaman.

Actions #10

Updated by David Galloway about 7 years ago

I may just try to push an existing build of Firefly to the new CI. Is there a particular version or sha1 needed?

Actions #11

Updated by Nathan Cutler about 7 years ago

@David: as far as firefly is concerned, one job specifies v0.80.8 (sha1 69eaad7f8308f21573c604f121956e64679a52a7) and others just want the firefly branch (sha1 8abf95af405e117298c5012aeaa4c60caf86a4fd).

The job that installs dumpling and then upgrades it to firefly uses v0.67.10 (sha1 9d446bd416c52cd785ccf048ca67737ceafcdd7f) for dumpling and then branch firefly (sha1 8abf95af405e117298c5012aeaa4c60caf86a4fd) for the upgrade to firefly.

Actions #13

Updated by Nathan Cutler about 7 years ago

@David Thanks for trying. I scheduled a run with just these two tests, but the jobs still fail: http://pulpito.ceph.com/smithfarm-2017-01-29_02:56:52-rados-wip-jewel-backports-distro-basic-smithi/

Actions #14

Updated by Nathan Cutler about 7 years ago

Failure looks like this:

2017-01-29T02:59:06.656 WARNING:teuthology.packaging:More than one of ref, tag, branch, or sha1 supplied; using branch
2017-01-29T02:59:06.656 INFO:teuthology.packaging:ref: None
2017-01-29T02:59:06.656 INFO:teuthology.packaging:tag: None
2017-01-29T02:59:06.656 INFO:teuthology.packaging:branch: v0.80.8
2017-01-29T02:59:06.656 INFO:teuthology.packaging:sha1: 566dea345152702387becaff00e019bdb088a7fe
2017-01-29T02:59:06.657 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&ref=v0.80.8
2017-01-29T02:59:06.878 INFO:teuthology.task.install.deb:Pulling from https://2.chacra.ceph.com/r/ceph/v0.80.8/69eaad7f8308f21573c604f121956e64679a52a7/ubuntu/trusty/flavors/default/
2017-01-29T02:59:06.878 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 83, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/install/deb.py", line 54, in _update_package_list_and_install
    version = builder.version
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/packaging.py", line 538, in version
    self._version = self._get_package_version()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/packaging.py", line 946, in _get_package_version
    return self._result.json()[0]['extra']['package_manager_version']
KeyError: 'package_manager_version'
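
The traceback shows the lookup assuming that the first search result always carries extra.package_manager_version. A minimal sketch of a more explicit check follows. It is an illustration only, not the code in teuthology/packaging.py, and the function name and error wording are hypothetical.

```python
def package_manager_version(results):
    """Return the package version from a decoded Shaman search response.

    `results` is the list obtained from the response JSON. Raises
    LookupError with a descriptive message instead of a bare KeyError.
    """
    if not results:
        raise LookupError("Shaman returned no builds for this query")
    build = results[0]
    version = build.get("extra", {}).get("package_manager_version")
    if not version:
        raise LookupError(
            "build %s has no 'package_manager_version' in its 'extra' "
            "metadata" % build.get("sha1", "<unknown sha1>")
        )
    return version
```

With a check like this, a build imported without the extra JSON would be reported by sha1, which makes it easier to tell which repo needs to be re-pushed.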
Actions #15

Updated by David Galloway about 7 years ago

Ah, I see I forgot some additional required JSON. I'll work on that.

Actions #16

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David Thanks for trying. I scheduled a run with just these two tests, but the jobs still fail: http://pulpito.ceph.com/smithfarm-2017-01-29_02:56:52-rados-wip-jewel-backports-distro-basic-smithi/

Fixed and updated my documentation on this process for next time: http://wiki.front.sepia.ceph.com/doku.php?id=production:chacra.ceph.com#pushing_a_gitbuilder-built_repo_to_a_chacra_host

Please retry.

Actions #17

Updated by Nathan Cutler about 7 years ago

Actions #18

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David - thanks, will do. Does your fix cover these two failures, too? http://pulpito.ceph.com/smithfarm-2017-01-30_12:05:09-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/

The package_manager_version error, yes. The second one, no. I'll need to push that to a chacra host. Will let you know when that's done.

Actions #19

Updated by Nathan Cutler about 7 years ago

filter="rados/singleton-nomsgr/{all/11429.yaml rados.yaml},rados/singleton-nomsgr/{all/13234.yaml rados.yaml}" ./virtualenv/bin/teuthology-suite -k distro --priority 101 --suite rados --ceph jewel --machine-type smithi --ceph-repo http://github.com/ceph/ceph.git --suite-repo http://github.com/ceph/ceph.git --email ncutler@suse.com --filter="$filter"

pass http://pulpito.ceph.com:80/smithfarm-2017-01-30_18:24:07-rados-jewel-distro-basic-smithi/

WOOHOO!

Actions #20

Updated by David Galloway about 7 years ago

Actually, I'm a bit confused about that second one. Both the tag and the sha1 are specified, and it looks like it defaults to the tag.

2017-01-30T12:17:35.251 WARNING:teuthology.packaging:More than one of ref, tag, branch, or sha1 supplied; using tag
2017-01-30T12:17:35.252 INFO:teuthology.packaging:ref: None
2017-01-30T12:17:35.252 INFO:teuthology.packaging:tag: v0.94.3
2017-01-30T12:17:35.252 INFO:teuthology.packaging:branch: None
2017-01-30T12:17:35.252 INFO:teuthology.packaging:sha1: b671230f7f70b620905eb02c6dbd93d051b53fb7
2017-01-30T12:17:35.303 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph-ci v0.94.3^{} -> None
2017-01-30T12:17:35.304 INFO:teuthology.packaging:Tag 'v0.94.3' not found in git://git.ceph.com/ceph-ci; will also look in git://git.ceph.com/ceph
2017-01-30T12:17:35.748 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph v0.94.3^{} -> 95cefea9fd9ab740263bf8bb4796fd864d9afe2b
2017-01-30T12:17:35.749 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=95cefea9fd9ab740263bf8bb4796fd864d9afe2b

The sha1 requested is built [1] but does not match the version/tag. Which version needs to be installed there?

I see on the gitbuilder host that v0.94.3 -> ../sha1/95cefea9fd9ab740263bf8bb4796fd864d9afe2b

[1] https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=b671230f7f70b620905eb02c6dbd93d051b53fb7

Actions #21

Updated by Nathan Cutler about 7 years ago

Weird. The test yaml stipulates:

- print: "**** Install version lower than v0.94.4" 
- install:
    tag: v0.94.3
- ceph:
    fs: xfs

Teuthology somehow converts this into:

- print: **** Install version lower than v0.94.4
- install:
      tag: v0.94.3
      sha1: b671230f7f70b620905eb02c6dbd93d051b53fb7
- ceph:
      fs: xfs

Where b671230f7f70b620905eb02c6dbd93d051b53fb7 is the SHA1 of wip-jewel-backports (the job was run with --ceph wip-jewel-backports).

Actions #22

Updated by Nathan Cutler about 7 years ago

The sha1 requested is built [1] but does not match the version/tag. Which version needs to be installed there?

This is trying to install v0.94.3, i.e. 95cefea9fd9ab740263bf8bb4796fd864d9afe2b, so if you push that one to chacra that will do the trick.

The sha1 is not actually requested, though. Teuthology fills that in for some reason, but it is rightfully ignored.
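
The two log excerpts in this thread (branch plus sha1 resolving to the branch, tag plus sha1 resolving to the tag) are consistent with a fixed precedence among the four keys. The sketch below assumes the order is the one listed in the warning message (ref, tag, branch, sha1); that order is an inference from the logs, and choose_reference is a hypothetical name rather than teuthology's API.

```python
# Assumed precedence, inferred from the warning text
# "More than one of ref, tag, branch, or sha1 supplied".
PRECEDENCE = ("ref", "tag", "branch", "sha1")


def choose_reference(install_config):
    """Pick which key of an install task config would be honoured.

    Returns a (key, value) pair for the first non-empty key in
    PRECEDENCE, so an injected sha1 only matters if nothing else is set.
    """
    supplied = [(key, install_config[key])
                for key in PRECEDENCE if install_config.get(key)]
    if not supplied:
        raise ValueError("none of ref, tag, branch or sha1 supplied")
    return supplied[0]


if __name__ == "__main__":
    print(choose_reference({
        "tag": "v0.94.3",
        "sha1": "b671230f7f70b620905eb02c6dbd93d051b53fb7",
    }))
```

Under that assumption the injected sha1 of wip-jewel-backports is harmless here, because the tag is chosen first and then resolved to its own sha1 via git ls-remote.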

Actions #23

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

The sha1 requested is built [1] but does not match the version/tag. Which version needs to be installed there?

This is trying to install v0.94.3, i.e. 95cefea9fd9ab740263bf8bb4796fd864d9afe2b, so if you push that one to chacra that will do the trick.

The sha1 is not actually requested, though. Teuthology fills that in for some reason, but it is rightfully ignored.

Done!

https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=95cefea9fd9ab740263bf8bb4796fd864d9afe2b

Actions #24

Updated by Nathan Cutler about 7 years ago

David, please push this one as well:

  • v10.2.0 3a9fba20ec743699b69bd0181dd6c54dc01c64b9
Actions #25

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

David, please push this one as well:

  • v10.2.0 3a9fba20ec743699b69bd0181dd6c54dc01c64b9

That one's already built: https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=3a9fba20ec743699b69bd0181dd6c54dc01c64b9

Actions #26

Updated by Nathan Cutler about 7 years ago

@David: I found a small problem with the v0.94.3 Ubuntu 14.04 packages (sha1 95cefea9fd9ab740263bf8bb4796fd864d9afe2b). They appear to be 0.80.8 instead.

https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=95cefea9fd9ab740263bf8bb4796fd864d9afe2b returns (note "package_manager_version" field):

[{"status": "ready", "sha1": "95cefea9fd9ab740263bf8bb4796fd864d9afe2b", "extra": {"build_url": "", "root_build_cause": "Imported from gitbuilder build", "version": "0.94.3", "node_name": "", "job_name": "", "package_manager_version": "0.80.8-1trusty"}, "url": "https://2.chacra.ceph.com/r/ceph/v0.94.3/95cefea9fd9ab740263bf8bb4796fd864d9afe2b/ubuntu/trusty/flavors/default/", "distro_codename": "trusty", "modified": "2017-01-30 20:08:55.219279", "distro_version": "14.04", "project": "ceph", "flavor": "default", "ref": "v0.94.3", "chacra_url": "https://2.chacra.ceph.com/repos/ceph/v0.94.3/95cefea9fd9ab740263bf8bb4796fd864d9afe2b/ubuntu/trusty/flavors/default/", "archs": ["x86_64"], "distro": "ubuntu"}]

and at least one upgrade/hammer-x job goes south:

2017-01-31T21:22:34.377 DEBUG:teuthology.misc:System to be installed: Ubuntu
2017-01-31T21:22:34.377 INFO:teuthology.task.install.deb:Installing packages: ceph, ceph-mds, ceph-common, ceph-fuse, ceph-test, radosgw, python-ceph, libcephfs1, libcephfs-java, libcephfs-jni, librados2, librbd1, rbd-fuse on remote deb x86_64
2017-01-31T21:22:34.377 WARNING:teuthology.packaging:More than one of ref, tag, branch, or sha1 supplied; using tag
2017-01-31T21:22:34.377 INFO:teuthology.packaging:ref: None
2017-01-31T21:22:34.377 INFO:teuthology.packaging:tag: v0.94.3
2017-01-31T21:22:34.377 INFO:teuthology.packaging:branch: None
2017-01-31T21:22:34.377 INFO:teuthology.packaging:sha1: 6d7b95218200aaa6629c6de9bd29b8fcae30760e
2017-01-31T21:22:34.402 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph-ci v0.94.3^{} -> None
2017-01-31T21:22:34.403 INFO:teuthology.packaging:Tag 'v0.94.3' not found in git://git.ceph.com/ceph-ci; will also look in git://git.ceph.com/ceph
2017-01-31T21:22:34.859 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph v0.94.3^{} -> 95cefea9fd9ab740263bf8bb4796fd864d9afe2b
2017-01-31T21:22:34.859 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=95cefea9fd9ab740263bf8bb4796fd864d9afe2b
2017-01-31T21:22:34.862 INFO:teuthology.orchestra.run.vpm005.stdout:Ubuntu
2017-01-31T21:22:34.863 DEBUG:teuthology.misc:System to be installed: Ubuntu
2017-01-31T21:22:34.863 INFO:teuthology.task.install.deb:Installing packages: ceph, ceph-mds, ceph-common, ceph-fuse, ceph-test, radosgw, python-ceph, libcephfs1, libcephfs-java, libcephfs-jni, librados2, librbd1, rbd-fuse on remote deb x86_64
2017-01-31T21:22:34.863 WARNING:teuthology.packaging:More than one of ref, tag, branch, or sha1 supplied; using tag
2017-01-31T21:22:34.863 INFO:teuthology.packaging:ref: None
2017-01-31T21:22:34.863 INFO:teuthology.packaging:tag: v0.94.3
2017-01-31T21:22:34.863 INFO:teuthology.packaging:branch: None
2017-01-31T21:22:34.863 INFO:teuthology.packaging:sha1: 6d7b95218200aaa6629c6de9bd29b8fcae30760e
2017-01-31T21:22:34.883 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph-ci v0.94.3^{} -> None
2017-01-31T21:22:34.883 INFO:teuthology.packaging:Tag 'v0.94.3' not found in git://git.ceph.com/ceph-ci; will also look in git://git.ceph.com/ceph
2017-01-31T21:22:35.336 DEBUG:teuthology.repo_utils:git ls-remote git://git.ceph.com/ceph v0.94.3^{} -> 95cefea9fd9ab740263bf8bb4796fd864d9afe2b
2017-01-31T21:22:35.336 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&sha1=95cefea9fd9ab740263bf8bb4796fd864d9afe2b
2017-01-31T21:22:35.534 INFO:teuthology.task.install.deb:Pulling from https://2.chacra.ceph.com/r/ceph/v0.94.3/95cefea9fd9ab740263bf8bb4796fd864d9afe2b/ubuntu/trusty/flavors/default/
2017-01-31T21:22:35.534 INFO:teuthology.task.install.deb:Package version is 0.80.8-1trusty
2017-01-31T21:22:35.553 INFO:teuthology.task.install.deb:Pulling from https://2.chacra.ceph.com/r/ceph/v0.94.3/95cefea9fd9ab740263bf8bb4796fd864d9afe2b/ubuntu/trusty/flavors/default/
2017-01-31T21:22:35.554 INFO:teuthology.task.install.deb:Package version is 0.80.8-1trusty

finally failing with Command failed on vpm005 with status 100: u'sudo DEBIAN_FRONTEND=noninteractive apt-get -y --force-yes -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install ceph=0.80.8-1trusty ceph-mds=0.80.8-1trusty ceph-common=0.80.8-1trusty ceph-fuse=0.80.8-1trusty ceph-test=0.80.8-1trusty radosgw=0.80.8-1trusty python-ceph=0.80.8-1trusty libcephfs1=0.80.8-1trusty libcephfs-java=0.80.8-1trusty libcephfs-jni=0.80.8-1trusty librados2=0.80.8-1trusty librbd1=0.80.8-1trusty rbd-fuse=0.80.8-1trusty'
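
A mismatch like this one can be caught before scheduling a run by comparing the two version fields in the Shaman response. The sketch below is a rough check, not part of teuthology or shaman. It assumes package_manager_version has the form <upstream version>-<packaging suffix> with no epoch prefix, and the function name is hypothetical.

```python
def upstream_version_matches(build):
    """Check that 'package_manager_version' agrees with 'version'.

    `build` is one entry of a decoded Shaman search response. Returns
    False when either field is missing or when the part of
    package_manager_version before the first '-' differs from version.
    """
    extra = build.get("extra", {})
    version = extra.get("version")
    pkg_version = extra.get("package_manager_version")
    if not version or not pkg_version:
        return False
    return pkg_version.split("-", 1)[0] == version


if __name__ == "__main__":
    # Fields copied from the response quoted above.
    build = {"extra": {"version": "0.94.3",
                       "package_manager_version": "0.80.8-1trusty"}}
    print(upstream_version_matches(build))
```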

Actions #27

Updated by David Galloway about 7 years ago

That was a typo on my part. I re-pushed the extra job info. Should sync with shaman in a bit.

Actions #28

Updated by David Galloway about 7 years ago

@Nathan Cutler, are you satisfied with the outcome of this, or do you need anything else?

Actions #29

Updated by Nathan Cutler about 7 years ago

  • Status changed from 12 to 4
  • Assignee changed from David Galloway to Nathan Cutler

@David Progress is being made, but I need to do some more runs to verify that there are no more instances of this bug. Assigning to myself and setting status to "Feedback".

Actions #30

Updated by Nathan Cutler about 7 years ago

  • Assignee changed from Nathan Cutler to David Galloway

@David: Can you look at this test failure and determine if it's caused by this bug? http://pulpito.ceph.com/smithfarm-2017-02-13_20:36:56-rados-wip-jewel-backports-distro-basic-smithi/811008/

Actions #31

Updated by David Galloway about 7 years ago

  • Assignee changed from David Galloway to Nathan Cutler

Yep that was my fault. I somehow managed to delete all the packages for that repo.

I re-pushed them and added the extra json needed by teuthology.

https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F14.04%2Fx86_64&ref=v0.80.8

Actions #33

Updated by Nathan Cutler about 7 years ago

@David, ISTR reading somewhere that Shaman keeps builds for only a few days or weeks. Are you safeguarding these builds against that?

Actions #34

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David, ISTR reading somewhere that Shaman keeps builds for only a few days or weeks. Are you safeguarding these builds against that?

Yeah, I thought there wasn't anything implemented to clean up old builds yet but I checked with Andrew and these are getting deleted. Figuring out a fix now.

Actions #35

Updated by David Galloway about 7 years ago

@Nathan Cutler, can you guys use use_shaman: False for these runs or are the newer versions of Ceph you're upgrading to only built by Jenkins slaves, and thus on chacra nodes, now?

There's no way to flag that a build should be kept on a chacra node.

It sounds like you need to install an upstream release, maybe from download.ceph.com, then upgrade to a CI-built version of Ceph. Is that right?

Actions #36

Updated by Nathan Cutler about 7 years ago

@Nathan Cutler, can you guys use use_shaman: False for these runs or are the newer versions of Ceph you're upgrading to only built by Jenkins slaves, and thus on chacra nodes, now?

Correct. The way we work is we stage backport PRs in an integration branch, push it to ceph-ci/ceph.git for Shaman to build it, and then run tests on it.

There's no way to flag that a build should be kept on a chacra node.

That's a bug. It means that the migration to Shaman breaks all tests that upgrade from legacy releases.

Actions #37

Updated by Nathan Cutler about 7 years ago

  • Subject changed from tests failing with failure to get older package versions to No way to prevent Shaman from deleting builds, breaks upgrade test
  • Priority changed from Normal to Urgent
Actions #38

Updated by Nathan Cutler about 7 years ago

  • Subject changed from No way to prevent Shaman from deleting builds, breaks upgrade test to No way to prevent Shaman from deleting builds, breaks a number of upgrade tests
Actions #39

Updated by David Galloway about 7 years ago

  • Project changed from sepia to CI
  • Subject changed from No way to prevent Shaman from deleting builds, breaks a number of upgrade tests to No way to prevent Chacra from deleting builds, breaks a number of upgrade tests
  • Category deleted (Gitbuilder)
  • Assignee deleted (David Galloway)

Nathan Cutler wrote:

@Nathan Cutler, can you guys use use_shaman: False for these runs or are the newer versions of Ceph you're upgrading to only built by Jenkins slaves, and thus on chacra nodes, now?

Correct. The way we work is we stage backport PRs in an integration branch, push it to ceph-ci/ceph.git for Shaman to build it, and then run tests on it.

There's no way to flag that a build should be kept on a chacra node.

That's a bug. It means that the migration to Shaman breaks all tests that upgrade from legacy releases.

Well, I should clarify. Chacra nodes can, individually, be set to keep all builds. The 4 chacra nodes we use for test builds, however, are all set to delete builds after a set number of days: https://github.com/ceph/chacra/blob/master/deploy/playbooks/roles/common/templates/prod.py.j2#L85-L135

Shaman's just an aggregator. Chacra's the tool that actually removes repos. I've renamed the bug as such and I'll move this to the CI queue.

Actions #40

Updated by Nathan Cutler about 7 years ago

Thanks, David. We really need a way to protect individual builds from being deleted. See http://tracker.ceph.com/issues/19080 for another example - a test is installing "branch: infernalis" and then upgrading. Shaman gives it the v9.2.1 build, but this is missing a commit that was added to the infernalis branch after the v9.2.1 release. I can fix the test to specify the SHA1, but then the test will fail because Shaman/Chacra will not be able to reliably provide a build for that SHA1.

Actions #41

Updated by Nathan Cutler about 7 years ago

  • Status changed from 4 to 12
Actions #42

Updated by Nathan Cutler about 7 years ago

How does Shaman/Chacra determine that a build is a "test build"? Builds specified in test YAML should not fall into this category.

Actions #43

Updated by Nathan Cutler about 7 years ago

@David, it sounds like you could simply put the legacy builds used in the upgrade tests on a Chacra host that is not configured to delete builds after a certain number of days?

Actions #44

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David, it sounds like you could simply put the legacy builds used in the upgrade tests on a Chacra host that is not configured to delete builds after a certain number of days?

The only Chacra host that doesn't delete builds automatically is chacra.ceph.com and according to Andrew, it was intentionally removed from Shaman's config by Alfredo. I'm not sure what the reasoning is or what the ramifications might be of re-adding it to Shaman and pushing permanent builds to it.

Actions #45

Updated by Andrew Schoen about 7 years ago

David Galloway wrote:

Nathan Cutler wrote:

@David, it sounds like you could simply put the legacy builds used in the upgrade tests on a Chacra host that is not configured to delete builds after a certain number of days?

The only Chacra host that doesn't delete builds automatically is chacra.ceph.com and according to Andrew, it was intentionally removed from Shaman's config by Alfredo. I'm not sure what the reasoning is or what the ramifications might be of re-adding it to Shaman and pushing permanent builds to it.

I've updated the config on chacra.ceph.com to update shaman with repo status so that repos stored there can be used. For any newer releases (0.94.10, 12.0.0) you'll have to use chacractl to force a rebuild so shaman is notified. Older repos that do not exist on chacra.ceph.com will have to be uploaded from download.ceph.com using chacractl.

Actions #46

Updated by Andrew Schoen about 7 years ago

Nathan Cutler wrote:

Thanks, David. We really need a way to protect individual builds from being deleted. See http://tracker.ceph.com/issues/19080 for another example - a test is installing "branch: infernalis" and then upgrading. Shaman gives it the v9.2.1 build, but this is missing a commit that was added to the infernalis branch after the v9.2.1 release. I can fix the test to specify the SHA1, but then the test will fail because Shaman/Chacra will not be able to reliably provide a build for that SHA1.

Nathan,

The dev chacra instances are configured to keep a minimum of 5 infernalis repos, and repos for the infernalis ref are also configured to be retained for 30 days: https://github.com/ceph/chacra/blob/master/deploy/playbooks/roles/common/templates/prod.py.j2#L108

I just attempted to build infernalis and the problem with infernalis is that any attempt to rebuild fails with errors like: https://jenkins.ceph.com/job/ceph-dev-setup/5877/console

This would mean that any merge to the infernalis branch also failed to build. IIRC, this problem is related to trying to build infernalis on a xenial node which is the primary node type picked up by ceph-dev-setup.
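
To make the retention settings mentioned above concrete, here is a small sketch of one way a "keep a minimum of N repos, retain for D days" rule could be evaluated. How chacra actually combines the two settings is not stated in this thread, so the logic below (a repo is purgeable only if it is older than D days and not among the N newest) is an assumption, and all names are hypothetical.

```python
from datetime import datetime, timedelta


def purgeable(repos, now, keep_minimum=5, max_age_days=30):
    """Return names of repos that the assumed retention rule would delete.

    `repos` is a list of (name, modified) tuples, where `modified` is a
    datetime. The newest `keep_minimum` repos are always protected; the
    rest are purgeable once older than `max_age_days`.
    """
    newest_first = sorted(repos, key=lambda repo: repo[1], reverse=True)
    protected = {name for name, _ in newest_first[:keep_minimum]}
    cutoff = now - timedelta(days=max_age_days)
    return [name for name, modified in newest_first
            if name not in protected and modified < cutoff]


if __name__ == "__main__":
    now = datetime(2017, 3, 1)
    ages = [1, 2, 3, 40, 50, 60, 70]
    repos = [("r%d" % age, now - timedelta(days=age)) for age in ages]
    print(purgeable(repos, now))
```

Under this reading, a legacy build that is never rebuilt stays available only while fewer than N newer builds of the same ref exist, which matches the concern raised earlier about upgrade tests that pin an old sha1.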

Actions #47

Updated by David Galloway about 7 years ago

Andrew Schoen wrote:

David Galloway wrote:

Nathan Cutler wrote:

@David, it sounds like you could simply put the legacy builds used in the upgrade tests on a Chacra host that is not configured to delete builds after a certain number of days?

The only Chacra host that doesn't delete builds automatically is chacra.ceph.com and according to Andrew, it was intentionally removed from Shaman's config by Alfredo. I'm not sure what the reasoning is or what the ramifications might be of re-adding it to Shaman and pushing permanent builds to it.

I've updated the config on chacra.ceph.com to update shaman with repo status so that repos stored there can be used. For any newer releases (0.94.10, 12.0.0) you'll have to use chacractl to force a rebuild so shaman is notified. Older repos that do not exist on chacra.ceph.com will have to be uploaded from download.ceph.com using chacractl.

You sure? I don't see it in https://shaman.ceph.com/api/nodes/

Actions #48

Updated by Andrew Schoen about 7 years ago

David Galloway wrote:

Andrew Schoen wrote:

David Galloway wrote:

Nathan Cutler wrote:

@David, it sounds like you could simply put the legacy builds used in the upgrade tests on a Chacra host that is not configured to delete builds after a certain number of days?

The only Chacra host that doesn't delete builds automatically is chacra.ceph.com and according to Andrew, it was intentionally removed from Shaman's config by Alfredo. I'm not sure what the reasoning is or what the ramifications might be of re-adding it to Shaman and pushing permanent builds to it.

I've updated the config on chacra.ceph.com to update shaman with repo status so that repos stored there can be used. For any newer releases (0.94.10, 12.0.0) you'll have to use chacractl to force a rebuild so shaman is notified. Older repos that do not exist on chacra.ceph.com will have to be uploaded from download.ceph.com using chacractl.

You sure? I don't see it in https://shaman.ceph.com/api/nodes/

It's purposely out of the shaman rotation, we don't want dev builds being sent there. We just want the upstream releases to register themselves with shaman.

Actions #49

Updated by David Galloway about 7 years ago

Andrew Schoen wrote:

It's purposely out of the shaman rotation, we don't want dev builds being sent there. We just want the upstream releases to register themselves with shaman.

Maybe I'm missing a step but shouldn't Shaman know about these repos now then? https://chacra.ceph.com/repos/ceph/firefly/

Actions #50

Updated by Andrew Schoen about 7 years ago

David Galloway wrote:

Andrew Schoen wrote:

It's purposely out of the shaman rotation, we don't want dev builds being sent there. We just want the upstream releases to register themselves with shaman.

Maybe I'm missing a step but shouldn't Shaman know about these repos now then? https://chacra.ceph.com/repos/ceph/firefly/

The repos you linked to have been there a long time, so they were there before the configuration change I made today. Only one has a valid sha1, and it's only for CentOS 6: https://chacra.ceph.com/repos/ceph/firefly/8424145d49264624a3b0a204aedb127835161070/centos/6/

If you want to upload old firefly repos, you'll need to import them from download.ceph.com and then use chacractl to force a rebuild of the repo. When you rebuild the repo, it should register with shaman.

Actions #51

Updated by David Galloway about 7 years ago

  • Project changed from CI to sepia
  • Category set to Infrastructure Service
  • Assignee set to David Galloway

Okay, I can try again to push these old builds to the chacra node that won't delete builds. Can you provide the refs and sha1s that need to be pushed, please? I've gotten confused about exactly what versions/refs/sha1s are needed.

Actions #52

Updated by Nathan Cutler about 7 years ago

@David: OK, I started http://pad.ceph.com/p/upgrade-builds and will put the references/SHA1s there. Please update that pad if/when you fix a particular build.

Actions #53

Updated by David Galloway about 7 years ago

  • Subject changed from No way to prevent Chacra from deleting builds, breaks a number of upgrade tests to Pre-hammer official builds missing from CI/Shaman
Actions #55

Updated by David Galloway about 7 years ago

How's this looking?

Actions #56

Updated by Nathan Cutler about 7 years ago

@David, please keep the bug open until we do another round of upgrade tests in hammer and jewel. Since that may take a while, feel free to reassign the bug to me.

Actions #57

Updated by Nathan Cutler about 7 years ago

Hi David, could you work your magic on:

tag: v10.2.0
sha1: 3a9fba20ec743699b69bd0181dd6c54dc01c64b9

(xenial build is needed - see http://pad.ceph.com/p/upgrade-builds)

Actions #58

Updated by Nathan Cutler about 7 years ago

  • Subject changed from Pre-hammer official builds missing from CI/Shaman to Various official builds missing from CI/Shaman
Actions #60

Updated by David Galloway about 7 years ago

I imported the firefly build for CentOS.

If I import all the 10.2.0-1xenial_amd64 packages from http://download.ceph.com/debian-jewel/pool/main/c/ceph/ would that suffice for "3a9fba20ec743699b69bd0181dd6c54dc01c64b9 (v10.2.0 tag) - xenial" ?

Actions #61

Updated by David Galloway about 7 years ago

  • Assignee changed from David Galloway to Nathan Cutler

Pushed and updated etherpad. Let me know if you need any others.

Actions #62

Updated by Nathan Cutler about 7 years ago

@David - please

baf17c90980f3c61f75775f561ced5b2a1d2141c (tip of infernalis) (trusty, xenial, centos)

Actions #63

Updated by Nathan Cutler about 7 years ago

  • Assignee changed from Nathan Cutler to David Galloway
Actions #64

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David - please

baf17c90980f3c61f75775f561ced5b2a1d2141c (tip of infernalis) (trusty, xenial, centos)

Will this sha1 eventually be made into an official release? I'm rebuilding it now using the CI but am wondering if I need to push them to the main chacra node so the builds don't get deleted.

Actions #65

Updated by Jason Dillaman about 7 years ago

@David: Infernalis is an end-of-life release so we won't be doing any official builds. However, we would want to keep the current tip of the infernalis branch around for testing.

Actions #66

Updated by Nathan Cutler about 7 years ago

@David: what @Jason Dillaman said - the issue here is that when we specify "branch: infernalis" in the test yaml (install task), Shaman/Chacra gives us the "official" build (v9.2.1) which is a different SHA1 than the current tip of infernalis.

Actions #67

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David: what @Jason Dillaman said - the issue here is that when we specify "branch: infernalis" in the test yaml (install task), Shaman/Chacra gives us the "official" build (v9.2.1) which is a different SHA1 than the current tip of infernalis.

OK, I rebuilt the tip of infernalis in the CI and pushed copies of trusty, xenial, and centos to the chacra node that won't delete builds.

https://shaman.ceph.com/repos/ceph/infernalis/baf17c90980f3c61f75775f561ced5b2a1d2141c/default/24096/
https://shaman.ceph.com/repos/ceph/infernalis/baf17c90980f3c61f75775f561ced5b2a1d2141c/default/24064/
https://shaman.ceph.com/repos/ceph/infernalis/baf17c90980f3c61f75775f561ced5b2a1d2141c/default/18766/

Actions #68

Updated by Nathan Cutler about 7 years ago

@David - the "libcephfs-java" package appears to be missing in the Xenial 10.2.0 repo:

2017-04-17T10:39:28.867 INFO:teuthology.orchestra.run.smithi117:Running: u'sudo DEBIAN_FRONTEND=noninteractive apt-get -y --force-yes -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install ceph-mds=10.2.0-1xenial rbd-fuse=10.2.0-1xenial librbd1=10.2.0-1xenial ceph-fuse=10.2.0-1xenial python-ceph=10.2.0-1xenial ceph-common=10.2.0-1xenial libcephfs-java=10.2.0-1xenial ceph=10.2.0-1xenial libcephfs-jni=10.2.0-1xenial ceph-test=10.2.0-1xenial radosgw=10.2.0-1xenial librados2=10.2.0-1xenial'
2017-04-17T10:39:28.929 INFO:teuthology.orchestra.run.smithi117.stdout:Reading package lists...
2017-04-17T10:39:29.164 INFO:teuthology.orchestra.run.smithi117.stdout:Building dependency tree...
2017-04-17T10:39:29.164 INFO:teuthology.orchestra.run.smithi117.stdout:Reading state information...
2017-04-17T10:39:29.189 INFO:teuthology.orchestra.run.smithi117.stderr:E: Version '10.2.0-1xenial' for 'libcephfs-java' was not found

Actions #69

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David - the "libcephfs-java" package appears to be missing in the Xenial 10.2.0 repo:

[...]

That build was pushed directly from download.ceph.com and the package doesn't appear to have been built/pushed there: http://download.ceph.com/debian-jewel/pool/main/c/ceph/

I was able to find a repo on chacra.ceph.com of the same sha1 using ref 'jewel' instead of 'v10.2.0' and copied the package from there.

https://chacra.ceph.com/r/ceph/jewel/3a9fba20ec743699b69bd0181dd6c54dc01c64b9/ubuntu/xenial/flavors/default/pool/main/c/ceph/
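
For anyone comparing repos by ref and sha1, here is a minimal sketch of splitting a chacra repo URL into its parts. It assumes the path layout seen in the URLs in this ticket (/r/project/ref/sha1/distro/release/flavors/flavor/); the function name `parse_chacra_repo_url` is made up for illustration and is not part of chacra or teuthology.

```python
from urllib.parse import urlparse


def parse_chacra_repo_url(url):
    """Split a chacra repo URL into its path components.

    Assumed layout (taken from the URLs quoted in this ticket):
    /r/<project>/<ref>/<sha1>/<distro>/<release>/flavors/<flavor>/...
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 8 or parts[0] != "r" or parts[6] != "flavors":
        raise ValueError("unexpected chacra URL layout: %s" % url)
    project, ref, sha1, distro, release = parts[1:6]
    return {
        "project": project,
        "ref": ref,
        "sha1": sha1,
        "distro": distro,
        "release": release,
        "flavor": parts[7],
    }


url = ("https://chacra.ceph.com/r/ceph/jewel/"
       "3a9fba20ec743699b69bd0181dd6c54dc01c64b9/"
       "ubuntu/xenial/flavors/default/pool/main/c/ceph/")
info = parse_chacra_repo_url(url)
print(info["ref"], info["sha1"], info["distro"], info["release"])
```

The same sha1 can live under more than one ref (here 'jewel' rather than 'v10.2.0'), so comparing the `sha1` field rather than the `ref` field is what identifies identical builds.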

Actions #70

Updated by Nathan Cutler about 7 years ago

Thanks, David. I realized later that I could have modified the test yaml to exclude that package.

Actions #71

Updated by Nathan Cutler about 7 years ago

@David: Hmm, now it's failing in a slightly different way. Could you have a look?

http://pulpito.ceph.com/smithfarm-2017-04-18_13:23:55-upgrade:jewel-x-wip-kraken-backports-distro-basic-vps/

(I ran the test 5 times to be sure)

Actions #72

Updated by David Galloway about 7 years ago

Nathan Cutler wrote:

@David: Hmm, now it's failing in a slightly different way. Could you have a look?

http://pulpito.ceph.com/smithfarm-2017-04-18_13:23:55-upgrade:jewel-x-wip-kraken-backports-distro-basic-vps/

(I ran the test 5 times to be sure)

This worked:

sudo DEBIAN_FRONTEND=noninteractive apt-get -y --force-yes -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install ceph-mds=10.2.0-1xenial rbd-fuse=10.2.0-1xenial librbd1=10.2.0-1xenial ceph-fuse=10.2.0-1xenial python-ceph=10.2.0-1xenial ceph-common=10.2.0-1xenial libcephfs-java=10.2.0-1xenial ceph=10.2.0-1xenial libcephfs-jni=10.2.0-1xenial ceph-test=10.2.0-1xenial radosgw=10.2.0-1xenial librados2=10.2.0-1xenial libradosstriper1=10.2.0-1xenial librgw2=10.2.0-1xenial python-rados=10.2.0-1xenial python-cephfs=10.2.0-1xenial python-rbd=10.2.0-1xenial libcephfs1=10.2.0-1xenial

Just had to add libradosstriper1=10.2.0-1xenial librgw2=10.2.0-1xenial python-rados=10.2.0-1xenial python-cephfs=10.2.0-1xenial python-rbd=10.2.0-1xenial libcephfs1=10.2.0-1xenial
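
A rough sketch of how the pinned package list above could be assembled, in case it helps when adjusting the test yaml. It assumes the first twelve packages are the ones teuthology requested in the log in note #68 and the last six are the additions listed above; the names `REQUESTED`, `ADDED`, `pinned` and `apt_install_command` are hypothetical helpers, not teuthology code.

```python
import shlex

VERSION = "10.2.0-1xenial"

# Packages from the install command in the teuthology log (note #68).
REQUESTED = [
    "ceph-mds", "rbd-fuse", "librbd1", "ceph-fuse", "python-ceph",
    "ceph-common", "libcephfs-java", "ceph", "libcephfs-jni",
    "ceph-test", "radosgw", "librados2",
]

# Packages that additionally had to be pinned for the install to work.
ADDED = [
    "libradosstriper1", "librgw2", "python-rados", "python-cephfs",
    "python-rbd", "libcephfs1",
]


def pinned(names, version):
    """Return apt 'name=version' arguments, one per package."""
    return ["%s=%s" % (name, version) for name in names]


def apt_install_command(names, version):
    """Build the argv for an apt-get install with every package pinned."""
    base = [
        "sudo", "DEBIAN_FRONTEND=noninteractive", "apt-get", "-y",
        "--force-yes",
        "-o", "Dpkg::Options::=--force-confdef",
        "-o", "Dpkg::Options::=--force-confold",
        "install",
    ]
    return base + pinned(names, version)


cmd = apt_install_command(REQUESTED + ADDED, VERSION)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Pinning every Ceph package to the same version keeps apt from mixing 10.2.0 packages with newer dependencies from another repo, which is presumably why the six extra packages were needed.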

Actions #73

Updated by Nathan Cutler almost 7 years ago

@David: Please add e832001feaf8c176593e0325c8298e3f16dfb403 (v0.94.6) - trusty build needed for https://github.com/ceph/ceph/pull/14930

Actions #75

Updated by Nathan Cutler almost 7 years ago

@David Please add the CentOS 7.3, Ubuntu 14.04, and Ubuntu 16.04 builds of v10.2.7 to the Chacra node that doesn't delete builds.

Ideally, this would happen automatically for every point release.

Actions #76

Updated by David Galloway almost 7 years ago

Nathan Cutler wrote:

@David Please add the CentOS 7.3, Ubuntu 14.04, and Ubuntu 16.04 builds of v10.2.7 to the Chacra node that doesn't delete builds.

Ideally, this would happen automatically for every point release.

Can you send me the teuthology log where this failed? All point releases are pushed to chacra.ceph.com and this shouldn't have failed because that release /is/ available.

git ls-remote git://git.ceph.com/ceph v10.2.7^{}
50e863e0f4bc8f4b9e31156de690d765af245185    refs/tags/v10.2.7^{}

dgalloway@w541 ~ ()$ curl "https://shaman.ceph.com/api/search/?status=ready&project=ceph&sha1=50e863e0f4bc8f4b9e31156de690d765af245185" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3715  100  3715    0     0  10806      0 --:--:-- --:--:-- --:--:-- 10830
[
  {
    "status": "ready",
    "sha1": "50e863e0f4bc8f4b9e31156de690d765af245185",
    "extra": {
      "build_url": "https://jenkins.ceph.com/job/ceph-build/ARCH=arm64,AVAILABLE_ARCH=arm64,AVAILABLE_DIST=centos7,DIST=centos7,MACHINE_SIZE=huge/219/",
      "root_build_cause": "MANUALTRIGGER",
      "version": "10.2.7",
      "node_name": "172.21.4.53+omani003",
      "job_name": "ceph-build/ARCH=arm64,AVAILABLE_ARCH=arm64,AVAILABLE_DIST=centos7,DIST=centos7,MACHINE_SIZE=huge",
      "package_manager_version": "10.2.7-0" 
    },
    "url": "https://chacra.ceph.com/r/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/centos/7/flavors/default/",
    "distro_codename": null,
    "modified": "2017-04-10 17:26:41.629376",
    "distro_version": "7",
    "project": "ceph",
    "flavor": "default",
    "ref": "jewel",
    "chacra_url": "https://chacra.ceph.com/repos/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/centos/7/flavors/default/",
    "archs": [
      "source",
      "x86_64",
      "arm64" 
    ],
    "distro": "centos" 
  },
  {
    "status": "ready",
    "sha1": "50e863e0f4bc8f4b9e31156de690d765af245185",
    "extra": {
      "build_url": "https://jenkins.ceph.com/job/ceph-build/ARCH=arm64,AVAILABLE_ARCH=arm64,AVAILABLE_DIST=xenial,DIST=xenial,MACHINE_SIZE=huge/219/",
      "root_build_cause": "MANUALTRIGGER",
      "version": "10.2.7",
      "node_name": "172.21.4.52+omani002",
      "job_name": "ceph-build/ARCH=arm64,AVAILABLE_ARCH=arm64,AVAILABLE_DIST=xenial,DIST=xenial,MACHINE_SIZE=huge",
      "package_manager_version": "10.2.7-1xenial" 
    },
    "url": "https://chacra.ceph.com/r/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/ubuntu/xenial/flavors/default/",
    "distro_codename": "xenial",
    "modified": "2017-04-10 16:51:31.647774",
    "distro_version": "16.04",
    "project": "ceph",
    "flavor": "default",
    "ref": "jewel",
    "chacra_url": "https://chacra.ceph.com/repos/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/ubuntu/xenial/flavors/default/",
    "archs": [
      "x86_64",
      "arm64" 
    ],
    "distro": "ubuntu" 
  },
  {
    "status": "ready",
    "sha1": "50e863e0f4bc8f4b9e31156de690d765af245185",
    "extra": {
      "build_url": "https://jenkins.ceph.com/job/ceph-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=jessie,DIST=jessie,MACHINE_SIZE=huge/218/",
      "root_build_cause": "MANUALTRIGGER",
      "version": "10.2.7",
      "node_name": "172.21.15.124+smithi124",
      "job_name": "ceph-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=jessie,DIST=jessie,MACHINE_SIZE=huge",
      "package_manager_version": "10.2.7-1~bpo80+1" 
    },
    "url": "https://chacra.ceph.com/r/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/debian/jessie/flavors/default/",
    "distro_codename": "jessie",
    "modified": "2017-04-10 13:40:13.307632",
    "distro_version": "8",
    "project": "ceph",
    "flavor": "default",
    "ref": "jewel",
    "chacra_url": "https://chacra.ceph.com/repos/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/debian/jessie/flavors/default/",
    "archs": [
      "x86_64" 
    ],
    "distro": "debian" 
  },
  {
    "status": "ready",
    "sha1": "50e863e0f4bc8f4b9e31156de690d765af245185",
    "extra": {
      "build_url": "https://jenkins.ceph.com/job/ceph-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=trusty,DIST=trusty,MACHINE_SIZE=huge/218/",
      "root_build_cause": "MANUALTRIGGER",
      "version": "10.2.7",
      "node_name": "172.21.1.39+slave-ubuntu03",
      "job_name": "ceph-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=trusty,DIST=trusty,MACHINE_SIZE=huge",
      "package_manager_version": "10.2.7-1trusty" 
    },
    "url": "https://chacra.ceph.com/r/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/ubuntu/trusty/flavors/default/",
    "distro_codename": "trusty",
    "modified": "2017-04-10 13:26:44.133870",
    "distro_version": "14.04",
    "project": "ceph",
    "flavor": "default",
    "ref": "jewel",
    "chacra_url": "https://chacra.ceph.com/repos/ceph/jewel/50e863e0f4bc8f4b9e31156de690d765af245185/ubuntu/trusty/flavors/default/",
    "archs": [
      "x86_64" 
    ],
    "distro": "ubuntu" 
  }
]
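
The same check can be scripted. The sketch below builds the search URL used in the curl command above and summarizes a response by distro. It is only an illustration: `search_url` and `summarize` are made-up names, the sample data is a trimmed copy of two entries from the output above, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

SHAMAN_SEARCH = "https://shaman.ceph.com/api/search/"
SHA1 = "50e863e0f4bc8f4b9e31156de690d765af245185"  # v10.2.7


def search_url(sha1, project="ceph", status="ready"):
    """Build the shaman search URL for builds of a given sha1."""
    query = urlencode({"status": status, "project": project, "sha1": sha1})
    return SHAMAN_SEARCH + "?" + query


def summarize(results):
    """Return (distro, distro_version, package version, archs) per build."""
    return [
        (
            build["distro"],
            build["distro_version"],
            build["extra"]["package_manager_version"],
            build["archs"],
        )
        for build in results
    ]


# Trimmed copy of two entries from the response above.
sample = json.loads("""
[
  {"distro": "centos", "distro_version": "7", "ref": "jewel",
   "extra": {"package_manager_version": "10.2.7-0"},
   "archs": ["source", "x86_64", "arm64"]},
  {"distro": "ubuntu", "distro_version": "14.04", "ref": "jewel",
   "extra": {"package_manager_version": "10.2.7-1trusty"},
   "archs": ["x86_64"]}
]
""")

print(search_url(SHA1))
for row in summarize(sample):
    print(row)
```

An empty list from the search would mean no ready build exists for that sha1, which is the case where a build needs to be pushed by hand.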

Actions #77

Updated by Nathan Cutler almost 7 years ago

Oh, I didn't realize that the release workflow already has this covered. Thanks, David, and sorry for the trouble.

And I can confirm that the v10.2.7 upgrade test (which I thought might fail) succeeded.

Actions #78

Updated by David Galloway over 6 years ago

  • Status changed from 12 to Resolved

Think this one's sorted now. Feel free to reopen if you come across builds that need pushing.
