Documentation #57858

v17.2.4 release does not contain latest cherry-picks

Added by Laura Flores 4 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Immediate
Category:
-
Target version:
-
% Done:
0%

Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Earlier today, I ran one of the Telemetry commands on the Long Running Cluster (LRC), and the command crashed the mgr.
We had already seen this crash during our testing of the Quincy RC, and we had already pushed a commit to fix it.

After further investigation, I found that this commit, along with 4 other commits, was not included in the v17.2.4 release when it should have been.
These commits include:
mgr/telemetry: handle daemons with complex ids
ceph-volume: fix regression in activate
mgr/rook: fix error when trying to get the list of nfs services
Revert "osd/PeeringState: proc_lease_ack break once found from OSD"
Revert "osd/PeeringState: fix missed `recheck_readable` from laggy"

Adam King also verified this by checking a 17.2.4 container, which turned out not to contain the ceph-volume patch.

After the Quincy RC was initially declared, we upgraded the upstream clusters (Gibba and LRC) and found a handful of regressions. We fixed these regressions with the above patches and added them to the Quincy RC, which we verified again on our upstream clusters. Unfortunately, due to a flaw in the release process, these last-minute patches were not included in v17.2.4. Currently, v17.2.4 is based on the older Quincy RC when it should be based on the latest version.

Here is a link to the latest Quincy RC branch, which should have been released as v17.2.4:
https://github.com/ceph/ceph/commits/quincy-release

Here is a link to v17.2.4, which is missing the 5 commits:
https://github.com/ceph/ceph/commits/v17.2.4

Here is a link to the diff between v17.2.4 and quincy-release. The 5 commits shown should have been included in v17.2.4:
https://github.com/ceph/ceph/compare/v17.2.4...quincy-release
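
The same comparison can be reproduced locally with git. This is a minimal sketch, assuming a local clone of ceph.git in which both the v17.2.4 tag and the quincy-release branch have been fetched; the function name `commits_missing_from_tag` is hypothetical and not part of any Ceph tooling.

```shell
# Hypothetical helper: list commits that are reachable from a release
# branch but not from a release tag (one line per commit).
# $1 = tag (e.g. v17.2.4), $2 = branch (e.g. origin/quincy-release)
commits_missing_from_tag() {
    tag="$1"
    branch="$2"
    # "A..B" selects commits reachable from B that are not reachable from A.
    git log --oneline "${tag}..${branch}"
}
```

For example, `commits_missing_from_tag v17.2.4 origin/quincy-release` (assuming the remote is named `origin`) would be expected to list the same commits as the compare link above.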

One missing piece of the puzzle is that the ceph-jenkins PR for v17.2.4 was never merged into the quincy branch, which means the v17.2.4 tag is not present in the quincy branch: https://github.com/ceph/ceph/pull/48290
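
Whether a tag is present in a branch can also be checked locally. The following is a rough sketch, assuming a clone of ceph.git with tags fetched; the function name `tag_is_in_branch` is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) if the commit a tag points
# to is reachable from the given branch, and fail otherwise.
# $1 = tag (e.g. v17.2.4), $2 = branch (e.g. origin/quincy)
tag_is_in_branch() {
    tag="$1"
    branch="$2"
    # ^{commit} peels an annotated or signed tag down to its commit.
    git merge-base --is-ancestor "${tag}^{commit}" "${branch}"
}
```

A call such as `tag_is_in_branch v17.2.4 origin/quincy && echo present || echo missing` (remote name assumed) would print `missing` for the situation described here.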

History

#1 Updated by Laura Flores 4 months ago

The signed v17.2.4 tag was also not included in https://github.com/ceph/ceph/pull/48290. This seems to have been caused by a force-push that added the cherry-picks but dropped the signed version commit.

#2 Updated by Laura Flores 4 months ago

  • Priority changed from Normal to Immediate

#3 Updated by Laura Flores 4 months ago

Here's how I think we should go about this.

We know that the v17.2.4 tag is missing from the quincy branch. We should get this merged first. Instead of including "old quincy-release" + "the 5 last-minute cherry-picks", the v17.2.4 tag should cover just the "old quincy-release" commits, since this is what we've officially released.

On PR https://github.com/ceph/ceph/pull/48290:
  1. `git reset --hard 1353ed3` (this will reset the branch so it only contains the signed tag commit)
  2. Verify that this tag encompasses just the "old quincy-release" commits
  3. Merge PR https://github.com/ceph/ceph/pull/48290 into the quincy branch
    e.g., something like this draft PR: https://github.com/ceph/ceph/pull/48469

Next, we can start v17.2.5 by following the hotfix guidelines in https://docs.ceph.com/en/latest/dev/release-process/.

#4 Updated by Vikhyat Umrao 4 months ago

  • Tracker changed from Bug to Tasks
  • Project changed from Ceph to Stable releases

#5 Updated by Laura Flores 4 months ago

Yuri:
In the doc you said: "Once QE has determined a stopping point in the working (e.g., quincy) branch, that commit should be pushed to the corresponding quincy-release branch." Does that mean ceph-ci.git or ceph.git?

In other words, does the Jenkins job take "quincy-release" from ceph-ci or from the ceph GitHub repo?

David:
ceph.git. So you should check out either the tip of quincy or whatever commit should be the tip of the next release, force-push that to quincy-release, and then run the Jenkins job.

Also, I noticed the ceph-tag job failed on your first attempt. If you had opened the main “ceph” job, there should have been a “Resume build” button, and you wouldn’t have had to wait for it to build all over again.
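
David's suggestion about updating the release branch could look roughly like the sketch below. It assumes a git remote that points at ceph.git (not ceph-ci.git) and push access to it; the function name `update_release_branch` and its argument order are hypothetical.

```shell
# Hypothetical helper: force-push a chosen commit so that it becomes the
# tip of the release branch on the given remote.
# $1 = remote (must point at ceph.git), $2 = commit, $3 = release branch
update_release_branch() {
    remote="$1"
    commit="$2"
    branch="$3"
    # "<src>:refs/heads/<dst>" pushes the commit to the named branch;
    # --force overwrites whatever the branch pointed at before.
    git push --force "${remote}" "${commit}:refs/heads/${branch}"
}
```

For example, `update_release_branch origin <commit> quincy-release`, where `<commit>` is the stopping point QE has chosen, followed by running the Jenkins job. git also offers `--force-with-lease` as a more cautious alternative to `--force`.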

#6 Updated by Laura Flores 4 months ago

Bottom line: The quincy-release branch (and future release branches) must be up-to-date in the Ceph repository for the release build to include the right commits. Updating the branch on ceph-ci just creates a build for us to test on; the place where it actually needs to be updated is the Ceph repo.

Keeping this Tracker open for now so we can use it to improve the Ceph release documentation.

#7 Updated by Laura Flores 4 months ago

  • Tracker changed from Tasks to Documentation
  • Project changed from Stable releases to Ceph

#8 Updated by Yuri Weinstein 4 months ago

  • Status changed from New to Resolved
  • Assignee set to Yuri Weinstein
  • Affected Versions v17.2.5 added

v17.2.5 was released with all of the missing commits.
