OCPBUGS-77493: e2e: Increase GatewayClass acceptance timeout to 5m#1372

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from Thealisyed:fix-gwapi-flaky-timeout
Mar 10, 2026

Conversation

@Thealisyed
Contributor

@Thealisyed Thealisyed commented Mar 3, 2026

Summary

  • Increase assertGatewayClassSuccessful timeout from 2 minutes to 5 minutes
  • The 2-minute timeout was originally set for OSSM 2.x (Maistra). After the OSSM 3.0 bump (Sail Operator, Istio 1.24), the installation chain is longer but this timeout was never updated
  • The GatewayClass Accepted condition is set by Istiod, requiring the full OSSM install pipeline to complete first (Subscription → OLM install → Istio CR → Istiod startup)
  • CI failures show "Waiting for controller" — confirming the timeout expires before Istiod registers
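
The check behind the last two bullets can be sketched as below. This is a minimal illustration, not the code in the PR: the `condition` struct and the `accepted` helper are hypothetical stand-ins for a Kubernetes-style status condition, and the only detail taken from the description above is that a pending GatewayClass reports "Waiting for controller" until Istiod sets the Accepted condition.

```go
package main

import "fmt"

// condition is a hypothetical, trimmed-down stand-in for a Kubernetes-style
// status condition. It is not the real Gateway API type.
type condition struct {
	Type    string
	Status  string
	Message string
}

// accepted reports whether conds contains an Accepted condition with status
// "True". The returned string is the condition's message, which is useful in
// a test failure log when the condition is still pending.
func accepted(conds []condition) (bool, string) {
	for _, c := range conds {
		if c.Type == "Accepted" {
			return c.Status == "True", c.Message
		}
	}
	return false, "no Accepted condition found"
}

func main() {
	// Assumed shape of the status before any controller has claimed the class.
	pending := []condition{{Type: "Accepted", Status: "Unknown", Message: "Waiting for controller"}}
	ok, msg := accepted(pending)
	fmt.Println(ok, msg)
}
```

A test that keeps seeing the pending message until its deadline is consistent with the controller not having started yet, which is the failure mode the summary describes.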

Test plan

  • Verify Gateway API e2e tests pass consistently on CI (azure, gcp, aws)
  • Confirm no new "Waiting for controller" timeouts in testGatewayAPIObjects

🤖 Generated with Claude Code

The 2-minute timeout was set for OSSM 2.x but the OSSM 3.0
installation pipeline takes longer, causing flaky "Waiting for
controller" failures. Increase to 5 minutes to accommodate the
full Istiod startup chain.

Assisted with Claude
@coderabbitai

coderabbitai bot commented Mar 3, 2026

Walkthrough

A test utility function in the Gateway API test suite was modified to increase the polling timeout for waiting on GatewayClass acceptance. The timeout was extended from 2 minutes to 5 minutes. Explanatory comments were added describing the dependent installation chain (Subscription, OSSM operator, Istio CR, Istiod) that can delay GatewayClass acceptance. The polling interval remains at 2 seconds, and the status checking logic is unchanged. No public API declarations were modified in this change.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description provides clear context about the timeout change, explaining the rationale (OSSM 3.0 upgrade), the dependency chain causing delays, and the test plan for verification. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Title check | ✅ Passed | The title directly and specifically describes the main change: increasing the GatewayClass acceptance timeout from 2m to 5m in e2e tests. |


@openshift-ci openshift-ci bot requested review from jcmoraisjr and rfredette March 3, 2026 12:31
@Thealisyed
Contributor Author

/retest

@rikatz
Member

rikatz commented Mar 5, 2026

/test e2e-vsphere-static-metallb-operator-gwapi

@rikatz
Member

rikatz commented Mar 5, 2026

The test worked fine again. I think we can try this and see if the flakes are reduced.

/lgtm
/approve

Thanks @Thealisyed

@rikatz
Member

rikatz commented Mar 5, 2026

/retest-required

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 5, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rikatz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2026
@rikatz
Member

rikatz commented Mar 5, 2026

/retest-required
/retitle OCPBUGS-77493: e2e: Increase GatewayClass acceptance timeout to 5m

@openshift-ci openshift-ci bot changed the title e2e: Increase GatewayClass acceptance timeout to 5m OCPBUGS-77493: e2e: Increase GatewayClass acceptance timeout to 5m Mar 5, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Mar 5, 2026
@openshift-ci-robot
Contributor

@Thealisyed: This pull request references Jira Issue OCPBUGS-77493, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


@rikatz
Member

rikatz commented Mar 5, 2026

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Mar 5, 2026
@openshift-ci-robot
Contributor

@rikatz: This pull request references Jira Issue OCPBUGS-77493, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.


@rikatz
Member

rikatz commented Mar 5, 2026

/test e2e-aws-ovn

@rikatz
Member

rikatz commented Mar 6, 2026

/verified by e2e

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 6, 2026
@openshift-ci-robot
Contributor

@rikatz: This PR has been marked as verified by e2e.


@rikatz
Member

rikatz commented Mar 6, 2026

/test e2e-hypershift

@gcs278
Contributor

gcs278 commented Mar 6, 2026

@Thealisyed I also came to the conclusion that 3 minutes is too short before finding this PR. We should definitely bump to 5 minutes to ride out slow CCM/api-server/etcd blips in CI.

/lgtm

@rikatz
Member

rikatz commented Mar 8, 2026

/test e2e-hypershift

@Thealisyed
Contributor Author

/retest

TestCreateCluster/ValidateHostedCluster:

 Failed to wait for 3 nodes to become ready in 45m0s: context deadline exceeded 
  expected 3 nodes, got 0
  DNS resolution failures and i/o timeouts connecting to the guest API server

TestAutoscaling/Teardown:

Failed to wait for infra resources in guest cluster to be deleted: context deadline exceeded

This is a flaky HyperShift CI infrastructure issue.
The hosted cluster simply failed to provision its worker nodes within the 45-minute window.

@gcs278
Contributor

gcs278 commented Mar 9, 2026

TestUpgradeControlPlane failed on e2e-hypershift, not related (this is just an E2E change anyways)...

/test e2e-hypershift

@rikatz
Member

rikatz commented Mar 9, 2026

I have the feeling that this hypershift test is broken. I am considering overriding it right now, given that the bug fixed here will avoid retests of other jobs that are failing today due to Gateway API tests.

@rikatz
Member

rikatz commented Mar 9, 2026

/test e2e-hypershift

@rikatz
Member

rikatz commented Mar 10, 2026

/retest-required

@rikatz
Member

rikatz commented Mar 10, 2026

/test hypershift-e2e-aks

@rikatz
Member

rikatz commented Mar 10, 2026

/test e2e-gcp-operator

@openshift-ci
Contributor

openshift-ci bot commented Mar 10, 2026

@Thealisyed: all tests passed!


@openshift-merge-bot openshift-merge-bot bot merged commit 4d7cbb0 into openshift:master Mar 10, 2026
18 checks passed
@openshift-ci-robot
Contributor

@Thealisyed: Jira Issue Verification Checks: Jira Issue OCPBUGS-77493
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-77493 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓


@openshift-merge-robot
Contributor

Fix included in accepted release 4.22.0-0.nightly-2026-03-11-034211


Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria.


5 participants