fix: correct SSE subgraph subscription complete behaviour#2262
fix: correct SSE subgraph subscription complete behaviour#2262
Conversation
WalkthroughUpdated two go.mod files to change a GraphQL tool dependency and adjusted websocket/SSE tests to emit an initial SSE "next" marker and run periodic Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Pre-merge checks✅ Passed checks (3 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
Comment |
Router image scan passed✅ No security vulnerabilities found in image: |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router-tests/websocket_test.go (1)
938-997: SSE handlers: fix require calls in non-test goroutines, use close(closeCh) with t.Cleanup, and increase heartbeat timingThe verification confirms all three issues:
Buffered channel + send pattern (L938, L1074): Both POST and GET tests use
closeCh := make(chan struct{}, 1)and signal viacloseCh <- struct{}{}(L1058). GET test never signals closeCh, leaving it unhandled. Useclose(closeCh)witht.Cleanupinstead—it's idiomatic, broadcast-safe, and avoids edge cases.Aggressive heartbeat (L982, L1119): Both handlers use
time.NewTicker(50 * time.Millisecond). Increase to 150ms+ to reduce CI flakiness.require inside handlers (L961, L963, L964, L975, L977 in POST; L1098, L1099, L1101, L1115, L1117 in GET): Using
require.Equal()andrequire.NoError()inside http.HandlerFunc callbacks violates Go testing semantics—FailNowcannot stop non-test goroutines. Use assertions with explicit error handling or early return.Apply the suggested diff for both POST (L938–997) and GET (L1075–1133) tests, and refactor handler assertions to avoid require/FailNow calls.
🧹 Nitpick comments (1)
router/go.mod (1)
150-150: Align go-m1cpu versions across modules for consistencyRouter pins github.com/shoenig/go-m1cpu to v0.1.6 (indirect) while router-tests and demo use v0.1.7, and aws-lambda-router also uses v0.1.6. v0.1.6 added macOS SDK compatibility fixes; v0.1.7 includes code quality updates and testing improvements. While Go's module resolution will consolidate these to v0.1.7 at build time, aligning all modules to a single version reduces maintenance overhead and eliminates unnecessary go.mod churn.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
router-tests/go.sumis excluded by!**/*.sumrouter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (3)
router-tests/go.mod(1 hunks)router-tests/websocket_test.go(5 hunks)router/go.mod(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
PR: wundergraph/cosmo#2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.
Applied to files:
router-tests/go.mod
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_test
🔇 Additional comments (2)
router-tests/go.mod (1)
30-30: LGTM on graphql-go-tools alignmentrouter-tests matches router’s pseudo-version; this keeps planner/runtime behavior consistent.
router-tests/websocket_test.go (1)
1075-1075: Signal closeCh in the GET test to match POST test pattern, or drop closeCh and rely on r.Context().Done()The POST test at line 938 correctly signals
closeChat line 1058, allowing the middleware to exit gracefully. The GET test at line 1075 allocatescloseChbut never signals it—it relies solely onr.Context().Done(). While this works (context cleanup will eventually exit the loop), it's inconsistent with the POST test and leaves a dangling channel.Either:
- Add the signal at the end of the GET test (after line 1191):
closeCh <- struct{}{}to mirror the POST test pattern- Remove
closeChentirely and rely only onr.Context().Done()for cleanupThe suggested refactor with
t.Cleanup(func() { close(closeCh) })(changing to unbuffered channel) is also valid and idiomatic—close(closeCh)unblocks all receivers waiting on it.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
router-tests/websocket_test.go (2)
976-994: Guard heartbeat writes against ctx cancel to avoid race/flaky failuresIf the request context is canceled at the same time the ticker fires, select may take the ticker branch and attempt a write/flush on a closed stream (broken pipe). Add a fast ctx.Err() check in the ticker case to exit cleanly.
case <-ticker.C: - _, err = io.WriteString(w, ":heartbeat\n\n") - require.NoError(t, err) - flusher.Flush() + if r.Context().Err() != nil { + return + } + _, err = io.WriteString(w, ":heartbeat\n\n") + require.NoError(t, err) + flusher.Flush()
1109-1127: Same ctx-cancel race in SSE GET heartbeat loopApply the same defensive check here to prevent writes after cancellation.
case <-ticker.C: - _, err = io.WriteString(w, ":heartbeat\n\n") - require.NoError(t, err) - flusher.Flush() + if r.Context().Err() != nil { + return + } + _, err = io.WriteString(w, ":heartbeat\n\n") + require.NoError(t, err) + flusher.Flush()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
router-tests/websocket_test.go(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: image_scan
- GitHub Check: image_scan (nonroot)
- GitHub Check: build_push_image (nonroot)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: build_push_image
- GitHub Check: integration_test (./telemetry)
- GitHub Check: build_test
Summary by CodeRabbit
Bug Fixes
Chores
Checklist