Conversation

@dgrisonnet
Member

@dgrisonnet dgrisonnet commented Jul 10, 2025

Problem

When upgrading OpenShift to a version that includes revision readiness checks, the revision controller could become blocked if existing revision-status configmaps lack the operator.openshift.io/revision-ready annotation. The getLatestAvailableRevision() function only considers revisions with revision-ready set to "true" as valid, meaning older revision-status configmaps created before this feature would be ignored.

This could cause issues during version upgrades where:

  1. An existing revision-status configmap exists (e.g., revision-status-1) without the readiness annotation
  2. The new controller version checks for readiness but finds no "ready" revisions
  3. The controller cannot properly determine the latest available revision
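
The selection rule described above can be sketched as follows. This is a minimal illustration, not the library code: the `configMap` struct and the `latestReadyRevision` helper are hypothetical stand-ins, and only the annotation key and the `revision-status-N` naming come from this description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// revisionReadyAnnotation is the annotation key named in the PR description.
const revisionReadyAnnotation = "operator.openshift.io/revision-ready"

// configMap is a hypothetical stand-in for a Kubernetes ConfigMap; only the
// fields needed for this illustration are modelled.
type configMap struct {
	Name        string
	Annotations map[string]string
}

// latestReadyRevision (hypothetical helper) returns the highest revision
// number among revision-status-N configmaps annotated as ready, or 0 when
// none qualify.
func latestReadyRevision(cms []configMap) int32 {
	const prefix = "revision-status-"
	var latest int32
	for _, cm := range cms {
		if !strings.HasPrefix(cm.Name, prefix) {
			continue
		}
		// Configmaps created before the readiness feature carry no
		// annotation, so they are skipped here.
		if cm.Annotations[revisionReadyAnnotation] != "true" {
			continue
		}
		n, err := strconv.ParseInt(strings.TrimPrefix(cm.Name, prefix), 10, 32)
		if err != nil {
			continue
		}
		if int32(n) > latest {
			latest = int32(n)
		}
	}
	return latest
}

func main() {
	// A configmap left over from a version without readiness checks.
	legacy := []configMap{{Name: "revision-status-1"}}
	fmt.Println(latestReadyRevision(legacy))

	// The same configmap once the annotation is present.
	legacy[0].Annotations = map[string]string{revisionReadyAnnotation: "true"}
	fmt.Println(latestReadyRevision(legacy))
}
```

With only an unannotated `revision-status-1` present, the helper reports no ready revision at all, which is the situation listed in step 2.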

Solution

This PR ensures backward compatibility by automatically setting the operator.openshift.io/revision-ready annotation to "true" on the current latest available revision's configmap during the sync process.

Changes made:

  1. Added revision readiness backfill logic in sync() method:

    • When a LatestAvailableRevision exists, check if its corresponding configmap has the readiness annotation
    • If the annotation is missing or not set to "true", update it accordingly
    • This prevents the controller from being blocked when upgrading to versions that require revision readiness
  2. Updated tests to reflect the new behavior:

    • Added proper revision-ready annotations to test fixtures
    • Added a new test case latest-revision-ready-annotation-set to verify the annotation is correctly applied
    • Updated existing test objects to include the expected annotation
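
As a rough sketch of the backfill step, assuming an in-memory map in place of the real configmap client: the `configMap` type, the `store` map, and the `backfillRevisionReady` function are hypothetical names for illustration, and returning an error for a missing configmap is an assumption, not necessarily what the controller does.

```go
package main

import "fmt"

// revisionReadyAnnotation is the annotation key named in the PR description.
const revisionReadyAnnotation = "operator.openshift.io/revision-ready"

// configMap is a hypothetical stand-in for a Kubernetes ConfigMap.
type configMap struct {
	Name        string
	Annotations map[string]string
}

// backfillRevisionReady (hypothetical) marks the revision-status configmap
// of latestAvailableRevision as ready. It reports whether an update was
// needed. The store map stands in for the API server.
func backfillRevisionReady(store map[string]*configMap, latestAvailableRevision int32) (bool, error) {
	if latestAvailableRevision == 0 {
		return false, nil // no revision has been made available yet
	}
	name := fmt.Sprintf("revision-status-%d", latestAvailableRevision)
	cm, ok := store[name]
	if !ok {
		// Assumption: a missing configmap is surfaced as an error.
		return false, fmt.Errorf("configmap %q not found", name)
	}
	if cm.Annotations[revisionReadyAnnotation] == "true" {
		return false, nil // already ready, nothing to write
	}
	if cm.Annotations == nil {
		cm.Annotations = map[string]string{}
	}
	cm.Annotations[revisionReadyAnnotation] = "true"
	return true, nil
}

func main() {
	store := map[string]*configMap{
		"revision-status-1": {Name: "revision-status-1"},
	}
	updated, err := backfillRevisionReady(store, 1)
	fmt.Println(updated, err)

	// A second sync finds the annotation in place and writes nothing.
	updated, err = backfillRevisionReady(store, 1)
	fmt.Println(updated, err)
}
```

The early return when the annotation is already "true" keeps the step idempotent, so repeated syncs do not issue redundant updates.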

Testing

  • All existing tests pass with the new behavior
  • New test case validates that the readiness annotation is properly set during sync
  • Manual testing confirmed that upgrades from versions without revision readiness work correctly

This change is backward compatible and prevents upgrade blocking scenarios while maintaining the existing revision management functionality.


Related: OCPBUGS-58412

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 10, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 10, 2025
@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Jul 10, 2025
@openshift-ci-robot

@dgrisonnet: This pull request references Jira Issue OCPBUGS-58412, which is invalid:

  • expected the bug to target the "4.18.z" version, but no target version was set
  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-58412 to depend on a bug targeting a version in 4.19.0, 4.19.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from jsafrane and p0lyn0mial July 10, 2025 10:24
@dgrisonnet dgrisonnet force-pushed the OCPBUGS-58412-4.18 branch from 682151c to d9b942c on July 23, 2025 12:54
Comment on lines 369 to 372
// Make sure that revision-ready is set to true on the
// revision-status{status.latestAvailableRevision} configmap. This prevents
// blocking the controller when upgrading to a version that checks revision
// readiness.
Contributor

I think this is correct. The other controllers in the revision system don't care about the revision-status configmaps, so if latestAvailableRevision has already been updated then the revision has already "shipped" and might be installed.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 15, 2025
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 15, 2025
@dgrisonnet dgrisonnet force-pushed the OCPBUGS-58412-4.18 branch 3 times, most recently from 22d8b91 to 4f3cb82, on October 15, 2025 11:35
@openshift-ci
Contributor

openshift-ci bot commented Oct 15, 2025

@dgrisonnet: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-encryption | f74e9fc | link | true | /test e2e-aws-encryption |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@dgrisonnet dgrisonnet changed the title from "WIP: [release-4.18] OCPBUGS-58412: fix incomplete revisions" to "[release-4.18] OCPBUGS-58412: Fix revision readiness annotation for upgrade compatibility" Oct 24, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 24, 2025
Contributor

@benluddy benluddy left a comment

If I understand, the problem here is that the pre-4.18 revision controller would update the latest available revision on the operator status whether or not the latest revision-status configmap had the revision-ready annotation. In 4.18, 0c1a7b5 causes the revision controller's sync loop to exclude revisions with unready statuses when determining the latest available revision, but it's possible that we'd already incremented latest available revision while on 4.17.

With openshift/api@b4969bd#diff-8b3c4bb83b89154fa4b10cb549b4fb2ae0edb4f75a419fec94365f255f2e23b0, another 4.18 change, API validation prevents the revision controller from decreasing the latest available revision if it had already been written, and the revision controller gets stuck early in its control loop.

The current PR forces revision-ready to true in this specific case where we've already incremented the latest available revision (i.e. it's "shipped", the installer controller might already be attempting to roll it out), which corrects the condition that we could have entered whilst on 4.17. We don't need this in 4.19+, because the latest available revision will no longer get ahead of the latest ready revision-status.
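
The stuck state described in this comment can be illustrated with a simplified model. Everything here is a sketch under stated assumptions: `validateLatestAvailableRevision` and `syncStatus` are hypothetical functions that imitate the "must not decrease" rule and the status write, not the actual openshift/api validation or controller code.

```go
package main

import (
	"errors"
	"fmt"
)

var errRevisionDecrease = errors.New("latestAvailableRevision must not decrease")

// validateLatestAvailableRevision (hypothetical) imitates, in simplified
// form, an API rule that rejects lowering an already written value.
func validateLatestAvailableRevision(oldRev, newRev int32) error {
	if newRev < oldRev {
		return errRevisionDecrease
	}
	return nil
}

// syncStatus (hypothetical) tries to record the latest ready revision as
// the latest available revision and returns the value stored afterwards.
func syncStatus(current, latestReady int32) (int32, error) {
	if err := validateLatestAvailableRevision(current, latestReady); err != nil {
		return current, fmt.Errorf("cannot set latestAvailableRevision to %d: %w", latestReady, err)
	}
	return latestReady, nil
}

func main() {
	// State carried over from 4.17: revision 3 was already published, but
	// its revision-status configmap has no readiness annotation, so the
	// latest ready revision computes to 0.
	_, err := syncStatus(3, 0)
	fmt.Println("before backfill:", err)

	// After revision-ready is forced to "true" on revision-status-3, the
	// latest ready revision matches what was already published.
	rev, err := syncStatus(3, 3)
	fmt.Println("after backfill:", rev, err)
}
```

In this model every sync fails at the same validation step until the readiness annotation catches up with the published revision, which is why the backfill is applied to the configmap rather than to the status.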

@benluddy
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 21, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 21, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: benluddy, dgrisonnet
Once this PR has been reviewed and has the lgtm label, please assign p0lyn0mial for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@benluddy
Contributor

/assign @p0lyn0mial

Labels

  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.

5 participants