-
Notifications
You must be signed in to change notification settings - Fork 18
test: refactor operatorpolicy tests to run faster #419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
test: refactor operatorpolicy tests to run faster #419
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jan-law The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
e3ead21 to
4fb0eb9
Compare
Refactors the case38 setup methods to deploy each test into one namespace per Ordered test. The namespaces and policies are suffixed with a hash of the test name. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
These tests intentionally break the CatalogSource and expect the PackageManifests to disappear. Run as Serial to prevent interference from tests that frequently query the PackageManifest API ref: https://issues.redhat.com/browse/ACM-23597 Signed-off-by: Janelle Law <[email protected]>
ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Prepare to refactor the tests that enforce an operatorpolicy for the same operator while relying on the default operator group behavior ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
These tests deploy the same operator with no OperatorGroup, which defaults to AllNamespaces mode. We need to run these tests one at a time relative to each other. Otherwise, the status displays unexpected messages about the same operator managing resources in another test's namespace. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Move tests that delete and recreate an operator crd to the same location ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
These tests delete and recreate the same CRD. They will need to run one at a time relative to each other. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Replaces the operator under test with another operator exclusively used for the crd tests. Prevents conflicts in other tests when the operator's crd is deleted as part of the crd test setup. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Match the same catalog source cleanup logic as "Testing an all default operator policy" ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Signed-off-by: Janelle Law <[email protected]>
| @@ -672,22 +832,28 @@ var _ = Describe("Testing OperatorPolicy", Ordered, Label("supports-hosted"), fu | |||
| ) | |||
| }) | |||
| }) | |||
|
|
|||
| Describe("Testing Subscription behavior for musthave mode while enforcing", Ordered, func() { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test flakes occasionally. Both times have the same events, passed the Eventually() in checkFunc, but failed at Consistently(). The last It() step in the test intentionally breaks the spec/subscription/sourceNamespace, but then an install plan starts installing during Consistently(). The subscription status changes to Compliant instead of remaining at the expected ConstraintsNotSatisfiable. Any ideas why?
{
"lastTransitionTime": "2025-12-04T19:14:57Z",
"message": "the Subscription matches what is required by the policy",
"reason": "SubscriptionMatches",
"status": "True",
"type": "SubscriptionCompliant"
},
{
"lastTransitionTime": "2025-12-04T19:14:23Z",
"message": "the policy spec is valid",
"reason": "PolicyValidated",
"status": "True",
"type": "ValidPolicySpec"
}
],
"history": [
{
"lastTimestamp": "2025-12-04T19:14:58.015269Z", # unexpected
"message": "NonCompliant; the policy spec is valid, the OperatorGroup matches what is required by the policy, the Subscription matches what is required by the policy, a relevant InstallPlan is actively installing, the ClusterServiceVersion required by the policy was not found, no CRDs were found for the operator, there are no relevant deployments because the ClusterServiceVersion is missing, CatalogSource 'operatorhubio-catalog' was not found"
},
{
"lastTimestamp": "2025-12-04T19:14:55.701762Z", # this should be the last message
"message": "NonCompliant; the policy spec is valid, the OperatorGroup matches what is required by the policy, constraints not satisfiable: refer to the Subscription for more details, there are no relevant InstallPlans in the namespace, the ClusterServiceVersion required by the policy was not found, no CRDs were found for the operator, there are no relevant deployments because the ClusterServiceVersion is missing, CatalogSource 'operatorhubio-catalog' was not found"
},
...
wanted related objects: [{Properties:<nil> Object:{Metadata:{Name:project-quay Namespace:operator-policy-testns-124d3267} APIVersion:operators.coreos.com/v1alpha1 Kind:Subscription} Compliant:NonCompliant Reason:ConstraintsNotSatisfiable}]
wanted condition: {Type:SubscriptionCompliant Status:False ObservedGeneration:0 LastTransitionTime:0001-01-01 00:00:00 +0000 UTC Reason:ConstraintsNotSatisfiable Message:constraints not satisfiable: refer to the Subscription for more details}
...
[FAILED] Failed after 2.327s.
The function passed to Consistently failed at /home/runner/work/config-policy-controller/config-policy-controller/test/e2e/case38_install_operator_test.go:169 with:
Expected
<string>: Compliant
to equal
<string>: NonCompliant
Retry the patch commands to prevent a test flake timeout when the patch commands hung without retrying Signed-off-by: Janelle Law <[email protected]>
| utils.Kubectl("patch", "operatorpolicy", opPolName, "-n", testNamespace, "--type=json", "-p", | ||
| `[{"op": "replace", "path": "/spec/subscription/startingCSV", "value": "`+goodVersion+`"},`+ | ||
| `{"op": "replace", "path": "/spec/remediationAction", "value": "inform"}]`) | ||
| KubectlTarget("patch", "subscription.operator", subName, "-n", opPolTestNS, "--type=json", "-p", | ||
| `[{"op": "replace", "path": "/spec/startingCSV", "value": "`+goodVersion+`"}]`) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one test flake, the logs indicated the subscription wasn't patched, then timed out. Hopefully retrying helps. https://github.com/open-cluster-management-io/config-policy-controller/actions/runs/19972532656/job/57291714147#logs
2025-12-05T21:04:04.310Z debug controllers/operatorpolicy_controller.go:1711 ..."controllerKind": "OperatorPolicy", "OperatorPolicy": {"name":"oppol-manual-upgrades-e8e934d1","namespace":"managed"}, ... "subscriptionConditionMessage": "constraints not satisfiable: no operators found with name strimzi-cluster-operator.v0.0.0.1337 in channel strimzi-0.36.x of package strimzi-kafka-operator in the catalog referenced by subscription strimzi-kafka-operator, subscription strimzi-kafka-operator exists"}
Aiming to reduce total e2e test time by 10min without adding more flakes
refactor tests ref: https://issues.redhat.com/browse/ACM-26529
test flake ref: https://issues.redhat.com/browse/ACM-23597
Summary of changes
Ordereddecorator. Tests will run in parallel test processes unless marked withSerialDescribecontainer with its own parent policy and namespace suffixed with a hash of the test nameutils.KubectlJSONPatchToFileto update the namespaces and policy name suffixes before creating the resourcesexample-operatorwithAllNamespacesmode into anOrderedDescribecontainerOrderedDescribecontainer. Replaced the operator in these tests with a different operator than the other testsSerial: 2 tests that break the catalog source (including ACM-23597), and 1 test that counts controller reconcilesAssisted by Claude 4.5