fix: add idle timeout to upstream WS ReadMessage (#2998)#3013

Closed
thiscantbeserious wants to merge 1 commit into maximhq:main from thiscantbeserious:fix/2998-ws-upstream-idle-timeout

Conversation

@thiscantbeserious

Summary

Fixes #2998. When a native WS upstream accepts the upgrade and then sends no frames, upstream.ReadMessage() blocks indefinitely. This change adds a per-read idle deadline so that the blocked goroutine is released and the client receives a 504 instead of hanging silently.

Changes

  • Add wsUpstreamIdleTimeout constant (60s, matching DefaultStreamIdleTimeoutInSeconds).
  • Add upstreamWSIdleTimeout() method that reads NetworkConfig.StreamIdleTimeoutInSeconds from the provider config, falling back to the constant when no override is set.
  • Before each upstream.ReadMessage() in tryNativeWSUpstream, set SetReadDeadline(now + idleTimeout). Clear the deadline on successful read so active streams that send a frame periodically are not interrupted.
  • On net.Error.Timeout(), discard the upstream from the pool, write a 504 upstream_timeout WS error frame to the client, and return cleanly.

Type of change

  • Bug fix

Affected areas

  • Transports (HTTP)

How to test

go test -cover github.com/maximhq/bifrost/transports/bifrost-http/handlers
go test -cover github.com/maximhq/bifrost/transports/bifrost-http/websocket

Manual: start a mock WS upstream that completes the upgrade and then stays silent, and send a request through Bifrost. Before this fix the request hangs indefinitely. After this fix it returns a 504 within stream_idle_timeout_in_seconds.

Tests added cover the stalling-server timeout, periodic-frame keepalive, and one-then-silent cases.

Breaking changes

  • No

Related issues

Closes #2998. Extracted from #2775: that PR's branch contains this fix as a side effect of the OAuth feature work, and it has been pulled out here as a focused change.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added tests
  • Build passes

Before each upstream.ReadMessage() call in tryNativeWSUpstream, set a
per-read deadline sourced from NetworkConfig.StreamIdleTimeoutInSeconds
(default 60s). Clear the deadline immediately after each successful read
so long-running streams that send a frame every <60s are never
interrupted. On net.Error.Timeout(), discard the upstream connection,
write a 504 upstream_timeout WS error frame to the client, and return
cleanly. This prevents goroutine pile-up when an upstream accepts the
WS upgrade and then goes silent.

Closes maximhq#2998
Extracted from maximhq#2775

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 24, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

thiscantbeserious added a commit to thiscantbeserious/bifrost that referenced this pull request Apr 24, 2026
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thiscantbeserious
Author

Superseded by #3018 (merged 2026-04-24), which bundles the fix for this issue and several other native WS reliability bugs. Closing this draft as redundant.

Development

Successfully merging this pull request may close these issues.

[Bug]: Native WebSocket upstream requests hang indefinitely when upstream accepts but sends no frames
