OCPBUGS-56281: gatewayapicontroller: Clean up resources when done #29900

Miciah · 2025-06-09T13:27:16Z

gatewayapicontroller: Add checks for empty slices

Check whether the slice of parent resource references in an httproute's status is empty before indexing the slice.

Before this commit, the "Ensure HTTPRoute object is created" test sometimes panicked with "runtime error: index out of range [0] with length 0".

Similarly, check whether the slice of load-balancer ingress points in a service's status is empty before indexing it.
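
The empty-slice guard described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `parentStatus`, `firstParent`, and `errNoParents` are hypothetical names standing in for the httproute status types used by the test.

```go
package main

import (
	"errors"
	"fmt"
)

// parentStatus is a hypothetical stand-in for one entry in an
// httproute's status.parents slice.
type parentStatus struct {
	ControllerName string
}

// errNoParents signals that the status has not been populated yet.
var errNoParents = errors.New("route status has no parent references yet")

// firstParent returns the first parent status, or an error if the slice
// is empty, instead of indexing parents[0] unconditionally and panicking.
func firstParent(parents []parentStatus) (parentStatus, error) {
	if len(parents) == 0 {
		return parentStatus{}, errNoParents
	}
	return parents[0], nil
}

func main() {
	// An empty status yields an error that a polling loop can retry on.
	if _, err := firstParent(nil); err != nil {
		fmt.Println("not ready:", err)
	}

	// A populated status is safe to index.
	p, _ := firstParent([]parentStatus{{ControllerName: "example.com/controller"}})
	fmt.Println("first parent controller:", p.ControllerName)
}
```

Returning an error rather than panicking lets the caller treat an unpopulated status as "not ready yet" and try again.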

gatewayapicontroller: Clean up resources when done

Delete the gatewayclass and uninstall OSSM after all the Gateway API controller tests are done.

Before this change, the Gateway API controller tests left OSSM installed, including the subscription, CSV, installplan, bundled CRDs, RBAC resources, deployment, service, serviceaccount, etc., when the tests were finished. This clutter could cause problems for other tests, or for the same test if it was run again.

The new cleanup logic uses the OperatorsV1 client from github.com/operator-framework/operator-lifecycle-manager. Importing this package requires a replace stanza for openshift/api in go.mod.

This vendors github.com/operator-framework/operator-lifecycle-manager v0.30.1-0.20250114164243-1b6752ec65fa rather than the newest revision in order to avoid bringing in additional problematic vendor bumps that the newest revision would bring in.

gatewayapicontroller: Always log errors

Add the error value to some log messages that were missing it.

Check whether the slice of parent resource references in an httproute's status is empty before indexing the slice.

Before this commit, the "Ensure HTTPRoute object is created" test sometimes panicked with "runtime error: index out of range [0] with length 0".

Similarly, check whether the slice of load-balancer ingress points in a service's status is empty before indexing it.

* test/extended/router/gatewayapicontroller.go (buildGateway) (createHttpRoute): Add checks.

openshift-ci-robot · 2025-06-09T13:27:22Z

@Miciah: This pull request references Jira Issue OCPBUGS-56281, which is invalid:

expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-trt · 2025-06-09T23:15:18Z

Job Failure Risk Analysis for sha: bf853bf

Job Name	Failure Risk
pull-ci-openshift-origin-main-e2e-gcp-disruptive	IncompleteTests Tests for this run (19) are below the historical average (1505): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-fips-serial-1of2	IncompleteTests Tests for this run (19) are below the historical average (1822): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-kube-apiserver-rollout	IncompleteTests Tests for this run (29) are below the historical average (1778): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

rhamini3 · 2025-06-11T17:45:39Z

LGTM, @melvinjoseph86 PTAL

melvinjoseph86 · 2025-06-12T07:11:17Z

/lgtm

openshift-ci · 2025-06-12T07:13:34Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: melvinjoseph86, Miciah
Once this PR has been reviewed and has the lgtm label, please assign bertinatto for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

melvinjoseph86 · 2025-06-12T07:14:05Z

/retest

openshift-trt · 2025-06-12T15:26:42Z

Job Failure Risk Analysis for sha: 1967dd2

Job Name	Failure Risk
pull-ci-openshift-origin-main-4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback	IncompleteTests Tests for this run (94) are below the historical average (209): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-aws-ovn-edge-zones	High [sig-network-edge][OCPFeatureGate:GatewayAPIController][Feature:Router][apigroup:gateway.networking.k8s.io] Ensure custom gatewayclass can be accepted [Suite:openshift/conformance/parallel] This test has passed 98.38% of 2463 runs on release 4.20 [Overall] in the last week.
pull-ci-openshift-origin-main-e2e-azure-ovn-upgrade	IncompleteTests Tests for this run (196) are below the historical average (3374): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn-etcd-scaling	Low [bz-kube-storage-version-migrator] clusteroperator/kube-storage-version-migrator should not change condition/Available This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:gcp SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs [CI] e2e-openstack-ovn-etcd-scaling job permanent fails at many openshift-test tests etcd-scaling jobs failing ~60% of the time --- [bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:gcp SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time
pull-ci-openshift-origin-main-e2e-vsphere-ovn-etcd-scaling	Medium [sig-instrumentation] disruption/metrics-api connection/new should be available throughout the test Potential external regression detected for High Risk Test analysis

Delete the gatewayclass and uninstall OSSM after all the Gateway API controller tests are done.

Before this commit, the Gateway API controller tests left OSSM installed, including the subscription, CSV, installplan, bundled CRDs, RBAC resources, deployment, service, serviceaccount, etc., when the tests were finished. This clutter could cause problems for other tests, or for the same test if it was run again.

The new cleanup logic uses the OperatorsV1 client from github.com/operator-framework/operator-lifecycle-manager. Importing this package requires a replace stanza for openshift/api in go.mod.

This vendors github.com/operator-framework/operator-lifecycle-manager v0.30.1-0.20250114164243-1b6752ec65fa rather than the newest revision in order to avoid bringing in additional problematic vendor bumps that the newest revision would bring in.

This commit fixes OCPBUGS-56281.

https://issues.redhat.com/browse/OCPBUGS-56281

* test/extended/router/gatewayapicontroller.go: Delete the gatewayclass that the test creates. Use the OperatorsV1 client to look up the Operator object for OSSM, and delete all the resources that the Operator object references.
* go.mod: Vendor the operatorsv1 client code from github.com/operator-framework/operator-lifecycle-manager.
* go.sum:
* vendor/*: Regenerate.
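
The cleanup approach in this commit message (look up the Operator object, then delete everything it references) might look roughly like the sketch below. The types here are hypothetical stand-ins, not the OLM API: `componentRef` and `deleteFunc` only model "a list of references" and "something that can delete one", and the resource names in `main` are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// componentRef is a hypothetical stand-in for one resource reference
// listed on an Operator object.
type componentRef struct {
	Kind, Namespace, Name string
}

// Sentinel errors used by the sketch; a real client would return API errors.
var (
	errNotFound  = errors.New("not found")
	errForbidden = errors.New("forbidden")
)

// deleteFunc deletes one referenced resource. It is a hypothetical
// abstraction over whatever client the test actually uses.
type deleteFunc func(ref componentRef) error

// deleteComponents tries to delete every referenced resource. A resource
// that is already gone is not an error; any other failure is collected so
// that one bad delete does not stop the rest of the cleanup.
func deleteComponents(refs []componentRef, del deleteFunc) []error {
	var errs []error
	for _, ref := range refs {
		err := del(ref)
		if err == nil || errors.Is(err, errNotFound) {
			continue
		}
		errs = append(errs, fmt.Errorf("failed to delete %s %s/%s: %w",
			ref.Kind, ref.Namespace, ref.Name, err))
	}
	return errs
}

func main() {
	// Hypothetical references; real ones would come from the Operator object.
	refs := []componentRef{
		{Kind: "Subscription", Namespace: "example-ns", Name: "example-sub"},
		{Kind: "Deployment", Namespace: "example-ns", Name: "example-operator"},
		{Kind: "ServiceAccount", Namespace: "example-ns", Name: "example-sa"},
	}
	errs := deleteComponents(refs, func(ref componentRef) error {
		switch ref.Kind {
		case "Deployment":
			return errNotFound // already deleted; tolerated
		case "ServiceAccount":
			return errForbidden // reported to the caller
		}
		return nil
	})
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Collecting errors instead of failing on the first one keeps the cleanup best-effort, which matters when the goal is to leave as little clutter as possible for later tests.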

* test/extended/router/gatewayapicontroller.go: Add the error value to some log messages that were missing it.

openshift-ci · 2025-06-13T01:53:50Z

New changes are detected. LGTM label has been removed.

openshift-trt · 2025-06-13T08:14:44Z

Job Failure Risk Analysis for sha: ab81b79

Job Name	Failure Risk
pull-ci-openshift-origin-main-4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback	MissingData
pull-ci-openshift-origin-main-e2e-aws-ovn-edge-zones	High [sig-network-edge][OCPFeatureGate:GatewayAPIController][Feature:Router][apigroup:gateway.networking.k8s.io] Ensure custom gatewayclass can be accepted [Suite:openshift/conformance/parallel] This test has passed 99.76% of 2503 runs on release 4.20 [Overall] in the last week.
pull-ci-openshift-origin-main-e2e-aws-ovn-etcd-scaling	Low [bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded This test has passed 50.00% of 2 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time
pull-ci-openshift-origin-main-e2e-azure-ovn-upgrade	IncompleteTests Tests for this run (2125) are below the historical average (3401): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Miciah · 2025-06-13T19:35:15Z

https://github.com/openshift/origin/compare/1967dd22c83963e780eb9953bc38da760e090dc8..1dcc98a3c2ec7c38dcee818e750e14ce57d70892 made these changes:

Add logic to delete the Istio CR in the test cleanup.
Declare package consts for istioName and ingressNamespace and use these instead of function-local variables and string literals.
Omit the namespace when getting the Istio CR, which is cluster-scoped.

Before these changes, pods.json from e2e-aws #1932229162710339584 had the istiod pod. After these changes, pods.json from e2e-aws #1933552902287134720 does not have the istiod pod. It appears that the istiod pod cleanup is working properly.

Also, comparing must-gather.tar from 1933552902287134720 and must-gather.tar from 1932229162710339584, the older must-gather archive has the istiorevisions.sailoperator.io.yaml CRD whereas the newer must-gather archive does not. Neither must-gather archive has any other istio.io or sailoperator.io CRDs. I believe that deleting the Istio CR enables the cleanup to delete all OSSM-installed CRDs successfully.

openshift-trt · 2025-06-13T23:17:41Z

Job Failure Risk Analysis for sha: 1dcc98a

Job Name	Failure Risk
pull-ci-openshift-origin-main-e2e-aws-ovn	High [sig-network-edge][OCPFeatureGate:GatewayAPIController][Feature:Router][apigroup:gateway.networking.k8s.io] Ensure HTTPRoute object is created [Suite:openshift/conformance/parallel] This test has passed 99.22% of 2451 runs on release 4.20 [Overall] in the last week. Open Bugs Component Readiness: [Networking / router] [OCPFeatureGate:GatewayAPIController] test regressed on HyperShift Azure AKS
pull-ci-openshift-origin-main-e2e-aws-ovn-etcd-scaling	Low [bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded This test has passed 50.00% of 2 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift	High [sig-api-machinery] API priority and fairness should ensure that requests can be classified by adding FlowSchema and PriorityLevelConfiguration [Suite:openshift/conformance/parallel] [Suite:k8s] This test has passed 99.97% of 3060 runs on release 4.20 [Overall] in the last week.
pull-ci-openshift-origin-main-e2e-azure-ovn-upgrade	IncompleteTests Tests for this run (2125) are below the historical average (3318): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn-rt-upgrade	IncompleteTests Tests for this run (19) are below the historical average (1620): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-vsphere-ovn-etcd-scaling	Low [bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:vsphere SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time --- [sig-api-machinery] disruption/cache-openshift-api apiserver/openshift-apiserver connection/new should be available throughout the test This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:vsphere SecurityMode:default Topology:ha Upgrade:none] in the last week. --- [sig-api-machinery] disruption/cache-oauth-api apiserver/oauth-apiserver connection/new should be available throughout the test This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:vsphere SecurityMode:default Topology:ha Upgrade:none] in the last week.

abhat · 2025-06-16T16:00:44Z

/payload-aggregate periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade 5

openshift-ci · 2025-06-16T16:00:48Z

@abhat: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/101e8ee0-4acb-11f0-928a-4bd1c2be89d0-0

alebedev87 · 2025-06-16T22:01:29Z

test/extended/router/gatewayapicontroller.go

+			e2e.Failf("Failed to delete GatewayClass %q", gatewayClassName)
+		}
+
+		g.By("Deleting the OSSM Operator resources")


I'm curious: why don't we use an owner reference for the Subscription? We could set an owner reference to the gatewayclass on the Subscription and let Kube do the cascading deletion.

Update: Deleting the Subscription doesn't delete the CSV or CRDs. The CRD part is understandable: there could be some data loss. But the CSV part is kinda interesting.

There are a few reasons not to put or rely on an owner reference on the subscription:

You could create the subscription manually; we cannot assume that the operator created it.

You could have multiple gatewayclasses with our controller name, and then it isn't clear how we would configure the owner references on the subscription. Would we add only the first gatewayclass with our controller name? Would we add all gatewayclasses with our controller name? If we added more than one owner reference, would we need to delete old owner references when the corresponding gatewayclasses were deleted? If we did delete stale owner references, would that prevent garbage collection, or would we always leave one non-stale reference to trigger garbage collection?

I don't know for sure that OLM doesn't look at the owner reference. We would need to check this.

I am not confident that an owner reference would cause the subscription to be deleted, as the owner reference on the Istio CR didn't cause it to be deleted (see #29900 (comment)).

Deleting the Istio CR only requires changing the test, it is more explicit than relying on garbage collection, and it is more obviously safe to backport.

alebedev87 · 2025-06-16T22:20:09Z

test/extended/router/gatewayapicontroller.go

+		g.By("Deleting the Istio CR")
+
+		o.Expect(oc.AsAdmin().Run("delete").Args("--ignore-not-found=true", "istio", istioName).Execute()).Should(o.Succeed())


Istio CR is supposed to be garbage collected since its owner reference is gatewayclass.

The owner reference on the Istio CR didn't cause it to be deleted (see #29900 (comment)).

The owner reference on the Istio CR didn't cause it to be deleted

I didn't manage to reproduce this behavior. I saw the Istio CR get deleted after the GatewayClass was deleted:

$ oc get gc
NAME                CONTROLLER                           ACCEPTED   AGE
openshift-default   openshift.io/gateway-controller/v1   True       4m12s
04:57:08 $ oc get istio
NAME                REVISIONS   READY   IN USE   ACTIVE REVISION     STATUS    VERSION   AGE
openshift-gateway   1           1       0        openshift-gateway   Healthy   v1.24.3   4m18s
04:57:14 $ oc get istio openshift-gateway -o yaml | yq .metadata.ownerReferences[0]
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
name: openshift-default
uid: 3f6ef6ed-9e6b-4821-9706-221ff0bca83e
04:57:34 $ oc -n openshift-ingress get pods
NAME                                        READY   STATUS    RESTARTS      AGE
istiod-openshift-gateway-7b567bc8b4-z9972   1/1     Running   0             4m48s
router-default-76c4888886-fmtzq             1/1     Running   0             77m
router-default-76c4888886-nm9mb             1/1     Running   2 (78m ago)   89m
04:57:52 $ oc delete gc openshift-default
gatewayclass.gateway.networking.k8s.io "openshift-default" deleted
04:58:07 $ oc get istio
No resources found
04:58:14 $ oc -n openshift-ingress get pods
NAME                              READY   STATUS    RESTARTS      AGE
router-default-76c4888886-fmtzq   1/1     Running   0             78m
router-default-76c4888886-nm9mb   1/1     Running   2 (78m ago)   89m

alebedev87 · 2025-06-16T22:44:51Z

test/extended/router/gatewayapicontroller.go

+			if err != nil && strings.Contains(err.Error(), "not found") {
+				e2e.Logf("Subscription %q not found; retrying...", expectedSubscriptionName)
+				return false, nil
+			}


I think that we should be consistent among all the polls we do in this block. I personally prefer how it's done for the OSSM deployment below:

if err != nil {
	e2e.Logf("Failed to get OSSM operator deployment %q: %v; retrying...", deploymentOSSMName, err)
	return false, nil
}

No assertions, just a retry for any error until the timeout is triggered. I think that some errors (not only "Not Found") can be temporary or intermittent.

I was trying to keep my changes more narrowly focused. All right, I can make the polling loop for the subscription retry on all errors.

Fixed in https://github.com/openshift/origin/compare/1dcc98a3c2ec7c38dcee818e750e14ce57d70892..38d8018dfd320088688bd559b77c7f73e998ef13.

Log errors and then retry in the polling loops for the Subscription and Istio CRs.

Before this commit, the gatewayapicontroller tests sometimes failed because OSSM was still installing when these polling loops ran, and the polling loops would fail on a "not found" error (if the CR had not yet been created) or a "server doesn't have a resource type" error (if the CRD had not yet been created). In order to make the tests more reliable, they need to retry on these errors.

For consistency with other polling loops, this commit makes these polling loops retry on all errors (not just "not found" or "doesn't have a resource type" errors).

* test/extended/router/gatewayapicontroller.go: Retry when the test fails to get the OSSM subscription or the Istio CR.
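
The "log and retry on any error" pattern from this commit can be sketched without any Kubernetes dependencies. This is an illustration under stated assumptions: `pollUntil` is a simplified, hypothetical stand-in for the polling helper the test uses, and `retryOnAnyError` and `errNotFound` are names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound simulates the transient error seen while OSSM is installing.
var errNotFound = errors.New("not found")

// pollUntil calls cond every interval until cond reports done, cond
// returns an error, or the timeout elapses. It is a simplified stand-in
// for a real polling helper.
func pollUntil(interval, timeout time.Duration, cond func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := cond()
		if err != nil {
			return err // a non-nil error aborts the poll immediately
		}
		if done {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for the condition")
		}
		time.Sleep(interval)
	}
}

// retryOnAnyError wraps a getter so that every error is logged and then
// reported as "not done yet" rather than as a failure. Only the timeout
// can end the poll unsuccessfully.
func retryOnAnyError(name string, get func() error) func() (bool, error) {
	return func() (bool, error) {
		if err := get(); err != nil {
			fmt.Printf("Failed to get %q: %v; retrying...\n", name, err)
			return false, nil
		}
		return true, nil
	}
}

func main() {
	attempts := 0
	get := func() error {
		attempts++
		if attempts < 3 {
			return errNotFound
		}
		return nil
	}
	err := pollUntil(10*time.Millisecond, time.Second, retryOnAnyError("example-subscription", get))
	fmt.Println("attempts:", attempts, "err:", err)
}
```

The design choice is that the condition function never returns a non-nil error, so a transient failure of any kind cannot abort the loop early; the trade-off is that a permanent error is only surfaced as a timeout, which is why logging the error value on each attempt matters.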

* test/extended/router/gatewayapicontroller.go: Increase the timeouts on some polling loops that have been observed to flake but then succeed on retry.

* test/extended/router/gatewayapicontroller.go (ingressNamespace): New const. (waitForIstioHealthy, createAndCheckGateway) (assertGatewayLoadbalancerReady, assertDNSRecordStatus, createHttpRoute) (assertHttpRouteSuccessful): Use the new const instead of function-level variables or string literals.

* test/extended/router/gatewayapicontroller.go: Omit the namespace when getting the Istio CR, which is cluster-scoped.

* test/extended/router/gatewayapicontroller.go (istioName): Declare const. (waitForIstioHealthy): Use the new const instead of a string literal.

* test/extended/router/gatewayapicontroller.go: Delete the Istio CR and wait for the istiod pod to be deleted as part of the test cleanup.

Thealisyed · 2025-06-17T10:52:18Z

LGTM, holding off for @alebedev87 comments

openshift-ci · 2025-06-17T12:46:02Z

@Miciah: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-metal-ipi-ovn	`38d8018`	link	false	`/test e2e-metal-ipi-ovn`
ci/prow/e2e-aws-ovn-serial-publicnet-1of2	`38d8018`	link	false	`/test e2e-aws-ovn-serial-publicnet-1of2`
ci/prow/e2e-azure-ovn-etcd-scaling	`38d8018`	link	false	`/test e2e-azure-ovn-etcd-scaling`
ci/prow/e2e-aws-disruptive	`38d8018`	link	false	`/test e2e-aws-disruptive`
ci/prow/e2e-gcp-ovn-etcd-scaling	`38d8018`	link	false	`/test e2e-gcp-ovn-etcd-scaling`
ci/prow/e2e-gcp-csi	`38d8018`	link	false	`/test e2e-gcp-csi`
ci/prow/e2e-gcp-ovn-rt-upgrade	`38d8018`	link	false	`/test e2e-gcp-ovn-rt-upgrade`
ci/prow/e2e-aws-ovn-single-node	`38d8018`	link	false	`/test e2e-aws-ovn-single-node`
ci/prow/e2e-hypershift-conformance	`38d8018`	link	false	`/test e2e-hypershift-conformance`
ci/prow/e2e-vsphere-ovn-etcd-scaling	`38d8018`	link	false	`/test e2e-vsphere-ovn-etcd-scaling`
ci/prow/e2e-gcp-fips-serial-2of2	`38d8018`	link	false	`/test e2e-gcp-fips-serial-2of2`
ci/prow/e2e-gcp-fips-serial-1of2	`38d8018`	link	false	`/test e2e-gcp-fips-serial-1of2`
ci/prow/e2e-aws-ovn-etcd-scaling	`38d8018`	link	false	`/test e2e-aws-ovn-etcd-scaling`
ci/prow/e2e-metal-ipi-serial-1of2	`38d8018`	link	false	`/test e2e-metal-ipi-serial-1of2`
ci/prow/e2e-vsphere-ovn-dualstack-primaryv6	`38d8018`	link	false	`/test e2e-vsphere-ovn-dualstack-primaryv6`
ci/prow/e2e-azure	`38d8018`	link	false	`/test e2e-azure`
ci/prow/e2e-aws-ovn-kube-apiserver-rollout	`38d8018`	link	false	`/test e2e-aws-ovn-kube-apiserver-rollout`
ci/prow/e2e-gcp-ovn	`38d8018`	link	true	`/test e2e-gcp-ovn`
ci/prow/e2e-openstack-ovn	`38d8018`	link	false	`/test e2e-openstack-ovn`
ci/prow/e2e-gcp-ovn-upgrade	`38d8018`	link	true	`/test e2e-gcp-ovn-upgrade`
ci/prow/e2e-aws-ovn-single-node-upgrade	`38d8018`	link	false	`/test e2e-aws-ovn-single-node-upgrade`
ci/prow/e2e-aws-ovn-serial-publicnet-2of2	`38d8018`	link	false	`/test e2e-aws-ovn-serial-publicnet-2of2`
ci/prow/e2e-aws-ovn	`38d8018`	link	false	`/test e2e-aws-ovn`
ci/prow/e2e-aws	`38d8018`	link	false	`/test e2e-aws`
ci/prow/e2e-azure-ovn-upgrade	`38d8018`	link	false	`/test e2e-azure-ovn-upgrade`
ci/prow/e2e-aws-ovn-microshift	`38d8018`	link	true	`/test e2e-aws-ovn-microshift`
ci/prow/e2e-aws-ovn-edge-zones	`38d8018`	link	true	`/test e2e-aws-ovn-edge-zones`
ci/prow/e2e-gcp-disruptive	`38d8018`	link	false	`/test e2e-gcp-disruptive`
ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway	`38d8018`	link	false	`/test e2e-metal-ipi-ovn-dualstack-local-gateway`
ci/prow/okd-e2e-gcp	`38d8018`	link	false	`/test okd-e2e-gcp`
ci/prow/4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback	`38d8018`	link	false	`/test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-trt · 2025-06-17T13:13:39Z

Job Failure Risk Analysis for sha: 38d8018

Job Name	Failure Risk
pull-ci-openshift-origin-main-e2e-gcp-csi	IncompleteTests Tests for this run (19) are below the historical average (1374): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-disruptive	IncompleteTests Tests for this run (19) are below the historical average (1140): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-fips-serial-1of2	IncompleteTests Tests for this run (18) are below the historical average (1403): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-fips-serial-2of2	IncompleteTests Tests for this run (19) are below the historical average (1430): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn	IncompleteTests Tests for this run (19) are below the historical average (1146): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn-etcd-scaling	IncompleteTests Tests for this run (19) are below the historical average (1343): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn-rt-upgrade	IncompleteTests Tests for this run (19) are below the historical average (1315): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade	IncompleteTests Tests for this run (19) are below the historical average (810): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Miciah · 2025-06-19T03:44:12Z

The aggregated jobs each failed while building the tests-openshift.origin-amd64 image, with the error message, "Error: Unable to find a match: python3-cinderclient" (missing RPM package). I'll retry in case it was a glitch with the Yum repository.

/payload-aggregate periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade 5

openshift-ci · 2025-06-19T03:44:17Z

@Miciah: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/aad73360-4cbf-11f0-9efa-6a57a5235fed-0

Miciah · 2025-06-19T19:51:10Z

This time all the aggregated jobs failed to build the image with the error message, "Error: Unable to find a match: realtime-tests rteval".

/payload-aggregate periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade 5

openshift-ci · 2025-06-19T19:51:13Z

@Miciah: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c0124e40-4d46-11f0-9623-e230cc269dc8-0

Miciah · 2025-06-20T15:51:52Z

This time all the aggregated jobs failed with, "Error: Unable to find a match: python3-cinderclient realtime-tests rteval". I have filed OCPBUGS-57921 for these failures.

Miciah · 2025-06-20T18:21:51Z

/payload-aggregate periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade 5

openshift-ci · 2025-06-20T18:21:58Z

@Miciah: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7029ee70-4e03-11f0-8a1d-a44ec557b951-0

openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 9, 2025

openshift-ci bot requested review from knobunc and p0lyn0mial June 9, 2025 13:29

openshift-ci bot added the vendor-update Touching vendor dir or related files label Jun 9, 2025

Miciah force-pushed the OCPBUGS-56281-gatewayapicontroller-clean-up-resources-when-done branch from fc08232 to bf853bf Compare June 9, 2025 16:11

openshift-ci bot assigned melvinjoseph86 Jun 12, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 12, 2025

Miciah added 2 commits June 12, 2025 21:43

gatewayapicontroller: Always log errors

ad7b2f9

* test/extended/router/gatewayapicontroller.go: Add the error value to some log messages that were missing it.

Miciah force-pushed the OCPBUGS-56281-gatewayapicontroller-clean-up-resources-when-done branch from 1967dd2 to ab81b79 Compare June 13, 2025 01:53

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2025

Miciah force-pushed the OCPBUGS-56281-gatewayapicontroller-clean-up-resources-when-done branch from ab81b79 to 1dcc98a Compare June 13, 2025 15:51

alebedev87 reviewed Jun 16, 2025

Miciah added 2 commits June 17, 2025 03:46

gatewayapicontroller: Increase some timeouts

d776348

* test/extended/router/gatewayapicontroller.go: Increase the timeouts on some polling loops that have been observed to flake but then succeed on retry.

Miciah added 4 commits June 17, 2025 03:46

gatewayapicontroller: Istio CR is cluster-scoped

afad3a4

* test/extended/router/gatewayapicontroller.go: Omit the namespace when getting the Istio CR, which is cluster-scoped.

gatewayapicontroller: Add istioName const

40c645b

* test/extended/router/gatewayapicontroller.go (istioName): Declare const. (waitForIstioHealthy): Use the new const instead of a string literal.

gatewayapicontroller: Delete the Istio CR

38d8018

* test/extended/router/gatewayapicontroller.go: Delete the Istio CR and wait for the istiod pod to be deleted as part of the test cleanup.

Miciah force-pushed the OCPBUGS-56281-gatewayapicontroller-clean-up-resources-when-done branch from 1dcc98a to 38d8018 Compare June 17, 2025 07:48
