Skip to content

kube: handle goaway test fixture conns concurrently#66934

Merged
jakealti merged 3 commits into
masterfrom
jakealti/goaway-concurrent-fixture
May 22, 2026
Merged

kube: handle goaway test fixture conns concurrently#66934
jakealti merged 3 commits into
masterfrom
jakealti/goaway-concurrent-fixture

Conversation

@jakealti
Copy link
Copy Markdown
Contributor

Follow-up to #66672. The fake goawayServer in forwarder_test.go calls handleConn inline, so concurrent clients in TestGOAWAYHandling_Concurrent are serialized. Under -race + slow CI, the 50-deep backlog can outlast the per-request 10s context, leaving the httptest handler stuck in TLS handshake and forwarderServer.Close() waiting on it until Go's 10-minute test timeout fires.

Example: https://github.com/gravitational/teleport/actions/runs/26193145084/job/77066278419?pr=66930

Fix: spawn a goroutine per accepted conn. No production code touched.

Stress: stress -p 8 -test.cpu=1 -race for ~4 min — 1,317 runs, 0 failures.

@jakealti jakealti added kubernetes-access no-changelog Indicates that a PR does not require a changelog entry no-test-plan Bypasses the test plan validation bot labels May 20, 2026
@jakealti jakealti marked this pull request as ready for review May 20, 2026 23:03
@github-actions github-actions Bot requested review from atburke and codingllama May 20, 2026 23:04
@jakealti jakealti enabled auto-merge May 20, 2026 23:11
Copy link
Copy Markdown
Contributor

@espadolini espadolini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a way to synchronously wait for all the handling goroutines to exit? It's something that we indirectly get from synctest but TestGOAWAYHandling and TestGOAWAYHandling_Concurrent use real listeners and don't use synctest.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 795722f11b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +2470 to +2471
g.mu.Unlock()
g.wg.Wait()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prevent Close from returning before accepted conn handlers start

A connection accepted just before Close() takes g.mu can be appended and launched after the close loop unlocks, because Serve() does append+wg.Go under the same mutex but Close() releases the mutex before wg.Wait(). In that interleaving, Close() can miss closing that conn and return without waiting for its handler, leaving a goroutine blocked in TLS/frame reads and reintroducing teardown flakiness under concurrent tests.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but I addressed in 8e73d4d

@jakealti jakealti requested a review from espadolini May 21, 2026 21:39
@jakealti jakealti added this pull request to the merge queue May 22, 2026
Merged via the queue into master with commit b977685 May 22, 2026
47 checks passed
@jakealti jakealti deleted the jakealti/goaway-concurrent-fixture branch May 22, 2026 11:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kubernetes-access no-changelog Indicates that a PR does not require a changelog entry no-test-plan Bypasses the test plan validation bot size/sm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants