Skip to content

kube: relax flaky GOAWAY concurrent test assertion#66672

Merged
jakealti merged 2 commits into
masterfrom
jakealti/goaway-flaky
May 13, 2026
Merged

kube: relax flaky GOAWAY concurrent test assertion#66672
jakealti merged 2 commits into
masterfrom
jakealti/goaway-flaky

Conversation

@jakealti
Copy link
Copy Markdown
Contributor

@jakealti jakealti commented May 12, 2026

Follow-up to #66605. The soft require.NotZero(retried.Load(), ...) check in TestGOAWAYHandling_Concurrent was flaky: some scheduler timings produce only network-level failures without exercising the rewind path, so the 429 count legitimately reaches zero.

Demoted to t.Logf. The strict assertion that catches the actual bug (require.Zero(rewindLeaks, "rewind-body error must never reach the client")) is unchanged.

Flake: https://github.com/gravitational/teleport.e/actions/runs/25729502557/job/75550786265

Local testing

  • go test -count=200 -race -run '^TestGOAWAYHandling_Concurrent$' ./lib/kube/proxy/ — 200/200 pass in 61s.
  • stress -p 8 ./goaway.test -test.run='^TestGOAWAYHandling_Concurrent$' -test.timeout=30s958 runs, 0 failures over ~3 minutes.
  • Reverted the underlying fix in formatStatusResponseError and re-ran -count=20 -race: the test fails reliably (20/20) with rewind-body error must never reach the client. The strict assertion still catches the original bug.

@jakealti jakealti added kubernetes-access no-changelog Indicates that a PR does not require a changelog entry no-test-plan Bypasses the test plan validation bot labels May 12, 2026
@jakealti jakealti marked this pull request as ready for review May 12, 2026 15:48
@jakealti jakealti requested review from rosstimothy and tigrato May 12, 2026 15:48
Comment thread lib/kube/proxy/forwarder_test.go Outdated
// Some scheduler timings produce only network-level failures
// (broken pipe, conn reset) with no rewind path traversed.
// That's fine for what this test asserts, so log instead of requiring.
t.Logf("requests caught by GOAWAY 429 translation: %d/%d", retried.Load(), concurrency)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does logging this provide any value? If we don't care to assert a particular value, then should we just remove the log and assertion?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. I've removed it.

@jakealti jakealti requested a review from rosstimothy May 12, 2026 20:18
@jakealti jakealti added this pull request to the merge queue May 13, 2026
Merged via the queue into master with commit 19b9456 May 13, 2026
47 checks passed
@jakealti jakealti deleted the jakealti/goaway-flaky branch May 13, 2026 13:59
ivan-bax pushed a commit to ivan-bax/teleport that referenced this pull request May 22, 2026
…6710)

* kube: handle net/http rewind body GOAWAY error (gravitational#66605)

* kube: handle net/http rewind body GOAWAY error

* kube: fix data race in GOAWAY concurrent test

* kube: match wrapped rewind body errors too

* kube: set GetBody on buffered create requests

* kube: relax flaky GOAWAY concurrent test assertion (gravitational#66672)

* kube: relax flaky goaway concurrent assertion

* kube: drop unused retried counter and log
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kubernetes-access no-changelog Indicates that a PR does not require a changelog entry no-test-plan Bypasses the test plan validation bot size/sm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants