fix: correct SSE subgraph subscription complete behaviour #2262

Merged
alepane21 merged 6 commits into main from jesse/eng-8249-sse-complete on Oct 22, 2025
Conversation

@endigma
Member

@endigma endigma commented Oct 6, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Improved streaming reliability by adding start markers, periodic heartbeat signals, and more robust termination handling to reduce dropped or stalled streams.
  • Chores

    • Updated project dependencies to incorporate recent upstream fixes and build metadata adjustments.

Checklist

@endigma endigma marked this pull request as draft October 6, 2025 12:22
@github-actions github-actions Bot added the router label Oct 6, 2025
@coderabbitai
Contributor

coderabbitai Bot commented Oct 6, 2025

Walkthrough

Updated two go.mod files to change a GraphQL tool dependency and adjusted websocket/SSE tests to emit an initial SSE "next" marker and run periodic :heartbeat emissions with context-aware shutdown.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency updates: router-tests/go.mod, router/go.mod | Updated github.com/wundergraph/graphql-go-tools/v2 from v2.0.0-rc.231 → v2.0.0-rc.232. Also changed indirect github.com/shoenig/go-m1cpu from v0.1.7 → v0.1.6 in router/go.mod. |
| Websocket / SSE test changes: router-tests/websocket_test.go | Reworked SSE test handlers to emit an initial "next" marker and run a periodic heartbeat loop that writes :heartbeat (every ~50ms), flushes, and observes context cancellation/close signaling instead of emitting a single data payload. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "fix: correct SSE subgraph subscription complete behaviour" directly aligns with the primary changes in the changeset. The main modifications are in router-tests/websocket_test.go, where SSE streaming markers and a heartbeat mechanism have been added to websocket tests, addressing the completion behavior of SSE subgraph subscriptions. The title is concise, specific, and clearly communicates the intent of fixing SSE subscription behavior. A reviewer scanning the repository history would understand this PR is addressing a bug in how SSE subgraph subscriptions complete their lifecycle. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba6f5df and 047d859.

⛔ Files ignored due to path filters (2)
  • router-tests/go.sum is excluded by !**/*.sum
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • router-tests/go.mod (1 hunks)
  • router/go.mod (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • router-tests/go.mod
  • router/go.mod
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: build_test
  • GitHub Check: build_push_image
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: integration_test (./events)
  • GitHub Check: Analyze (go)

@github-actions

github-actions Bot commented Oct 6, 2025

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-c45cba852e4240c0613961660178508f45d1a6c6

@alepane21 alepane21 marked this pull request as ready for review October 21, 2025 21:16
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router-tests/websocket_test.go (1)

938-997: SSE handlers: fix require calls in non-test goroutines, use close(closeCh) with t.Cleanup, and increase heartbeat timing

The verification confirms all three issues:

  1. Buffered channel + send pattern (L938, L1074): Both POST and GET tests use closeCh := make(chan struct{}, 1) and signal via closeCh <- struct{}{} (L1058). GET test never signals closeCh, leaving it unhandled. Use close(closeCh) with t.Cleanup instead—it's idiomatic, broadcast-safe, and avoids edge cases.

  2. Aggressive heartbeat (L982, L1119): Both handlers use time.NewTicker(50 * time.Millisecond). Increase to 150ms+ to reduce CI flakiness.

  3. require inside handlers (L961, L963, L964, L975, L977 in POST; L1098, L1099, L1101, L1115, L1117 in GET): Using require.Equal() and require.NoError() inside http.HandlerFunc callbacks violates Go testing semantics. FailNow must be called from the goroutine running the test; called from a handler goroutine it exits only that goroutine and does not stop the test. Use assertions with explicit error handling or early return.

Apply the suggested diff for both POST (L938–997) and GET (L1075–1133) tests, and refactor handler assertions to avoid require/FailNow calls.

🧹 Nitpick comments (1)
router/go.mod (1)

150-150: Align go-m1cpu versions across modules for consistency

Router pins github.com/shoenig/go-m1cpu to v0.1.6 (indirect) while router-tests and demo use v0.1.7, and aws-lambda-router also uses v0.1.6. v0.1.6 added macOS SDK compatibility fixes; v0.1.7 includes code quality updates and testing improvements. While Go's module resolution will consolidate these to v0.1.7 at build time, aligning all modules to a single version reduces maintenance overhead and eliminates unnecessary go.mod churn.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15eb23 and 1525ca9.

⛔ Files ignored due to path filters (2)
  • router-tests/go.sum is excluded by !**/*.sum
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • router-tests/go.mod (1 hunks)
  • router-tests/websocket_test.go (5 hunks)
  • router/go.mod (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
PR: wundergraph/cosmo#2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router-tests/go.mod
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_test
🔇 Additional comments (2)
router-tests/go.mod (1)

30-30: LGTM on graphql-go-tools alignment

router-tests matches router’s pseudo-version; this keeps planner/runtime behavior consistent.

router-tests/websocket_test.go (1)

1075-1075: Signal closeCh in the GET test to match POST test pattern, or drop closeCh and rely on r.Context().Done()

The POST test at line 938 correctly signals closeCh at line 1058, allowing the middleware to exit gracefully. The GET test at line 1075 allocates closeCh but never signals it—it relies solely on r.Context().Done(). While this works (context cleanup will eventually exit the loop), it's inconsistent with the POST test and leaves a dangling channel.

Either:

  1. Add the signal at the end of the GET test (after line 1191): closeCh <- struct{}{} to mirror the POST test pattern
  2. Remove closeCh entirely and rely only on r.Context().Done() for cleanup

The suggested refactor with t.Cleanup(func() { close(closeCh) }) (changing to unbuffered channel) is also valid and idiomatic—close(closeCh) unblocks all receivers waiting on it.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
router-tests/websocket_test.go (2)

976-994: Guard heartbeat writes against ctx cancel to avoid race/flaky failures

If the request context is canceled at the same time the ticker fires, select may take the ticker branch and attempt a write/flush on a closed stream (broken pipe). Add a fast ctx.Err() check in the ticker case to exit cleanly.

 case <-ticker.C:
-  _, err = io.WriteString(w, ":heartbeat\n\n")
-  require.NoError(t, err)
-  flusher.Flush()
+  if r.Context().Err() != nil {
+    return
+  }
+  _, err = io.WriteString(w, ":heartbeat\n\n")
+  require.NoError(t, err)
+  flusher.Flush()

1109-1127: Same ctx-cancel race in SSE GET heartbeat loop

Apply the same defensive check here to prevent writes after cancellation.

 case <-ticker.C:
-  _, err = io.WriteString(w, ":heartbeat\n\n")
-  require.NoError(t, err)
-  flusher.Flush()
+  if r.Context().Err() != nil {
+    return
+  }
+  _, err = io.WriteString(w, ":heartbeat\n\n")
+  require.NoError(t, err)
+  flusher.Flush()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1525ca9 and ba6f5df.

📒 Files selected for processing (1)
  • router-tests/websocket_test.go (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: image_scan
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: build_push_image
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_test

@alepane21 alepane21 merged commit e849bb9 into main Oct 22, 2025
29 checks passed
@alepane21 alepane21 deleted the jesse/eng-8249-sse-complete branch October 22, 2025 10:23
@coderabbitai coderabbitai Bot mentioned this pull request Feb 11, 2026
5 tasks
3 participants