Merged
35 commits
e3a5093
wip: subs v2
endigma Mar 18, 2026
c71f609
generate + bump
endigma Mar 23, 2026
e451658
fix subscriptions modules tests to expect non-terminal errors
endigma Mar 23, 2026
cfb10bc
chore: use wip go-tools 4c8ddc30
endigma Mar 23, 2026
e278731
chore: use wip go-tools cb2348a6
endigma Mar 23, 2026
d10b066
chore: use wip go-tools d78d83c6
endigma Mar 23, 2026
585c1e8
Merge branch 'main' into jesse/eng-8566-new-subscription-client
endigma Apr 9, 2026
c6bc729
update engine, add complete echo check scenario
endigma Apr 9, 2026
ef0b85b
chore: use wip go-tools e37eeb18
endigma Apr 9, 2026
a4274bb
Remove complete-after-error hack
endigma Apr 9, 2026
3a9ad1f
update testdata config
endigma Apr 10, 2026
7d759e2
Merge branch 'main' into jesse/eng-8566-new-subscription-client
endigma Apr 10, 2026
e870af4
feat: wire DefaultErrorExtensionCode to subscription client
endigma Apr 16, 2026
18fdc95
chore: use wip go-tools af907712
endigma Apr 16, 2026
b5a0d8c
review feedback, bump engine
endigma Apr 22, 2026
b7027fc
pr feedback
endigma Apr 22, 2026
45e4a55
remove odd comment
endigma Apr 22, 2026
c4a5aa2
feedback
endigma Apr 22, 2026
e5959c0
wip
endigma Apr 22, 2026
eea83e3
Merge remote-tracking branch 'origin/main' into jesse/eng-8566-new-su…
endigma Apr 22, 2026
c71bfa0
fix: formatting
endigma Apr 22, 2026
fa86f32
format again
endigma Apr 22, 2026
01bd8d3
update snapshots
endigma Apr 22, 2026
4f3123f
minor cleanup
endigma Apr 22, 2026
856daac
wip: docs
endigma Apr 22, 2026
28ac955
fix: Handle terminal subscription errors
endigma Apr 23, 2026
618be77
Update docs-website/router/subscriptions-migration.mdx
endigma Apr 29, 2026
a265d2c
fix: improve ws close in router websocket handling
endigma Apr 29, 2026
f183e83
docs: improve speedtrap README
endigma Apr 29, 2026
50bf944
Merge branch 'main' into jesse/eng-8566-new-subscription-client
endigma Apr 29, 2026
e0c0c8e
Merge branch 'main' into jesse/eng-8566-new-subscription-client
endigma Apr 29, 2026
6fa20f7
use latest go tools
endigma Apr 29, 2026
2f20ac8
docs improvements
endigma Apr 30, 2026
2a37938
chore: use release go-tools
endigma Apr 30, 2026
1abe588
Merge branch 'main' into jesse/eng-8566-new-subscription-client
endigma Apr 30, 2026
349 changes: 344 additions & 5 deletions demo/pkg/subgraphs/test1/subgraph/generated/generated.go

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions demo/pkg/subgraphs/test1/subgraph/model/models_gen.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 10 additions & 2 deletions demo/pkg/subgraphs/test1/subgraph/schema.graphqls
@@ -75,6 +75,12 @@ type TimestampedString {
"Total number of responses to be sent"
total: Int!
initialPayload: Map
extensions: Map
}

type SubscribeMetadata {
initialPayload: Map
subscribeExtensions: Map
}

type Subscription {
@@ -84,7 +90,9 @@ type Subscription {
initPayloadValue(key: String!, repeat: Int): TimestampedString!
"Returns a stream with the value of the WS initial payload."
initialPayload(repeat: Int): Map
returnsError: String
returnsError: String!
"Returns a stream with the values of the WS initial payload and subscribe extensions"
metadata(repeat: Int! = 1): SubscribeMetadata!
}

type Employee @key(fields: "id") {
@@ -883,4 +891,4 @@ type ZBigObject {
xFieldOnZBigObject: Float!
yFieldOnZBigObject: String!
zFieldOnZBigObject: Int!
}
}
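
The new `metadata` subscription echoes back the WebSocket initial payload and the `extensions` sent with the subscribe message. As a minimal sketch of what a client would send to exercise it, assuming the `graphql-transport-ws` subprotocol, the Go snippet below builds the `connection_init` and `subscribe` frames. The `frame` and `subscribePayload` types, the `buildFrames` helper, and the `token` and `traceId` values are hypothetical and not part of this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// frame is one graphql-transport-ws message (hypothetical helper type).
type frame struct {
	ID      string `json:"id,omitempty"`
	Type    string `json:"type"`
	Payload any    `json:"payload,omitempty"`
}

// subscribePayload is the payload of a "subscribe" frame.
type subscribePayload struct {
	Query      string         `json:"query"`
	Extensions map[string]any `json:"extensions,omitempty"`
}

// buildFrames returns the connection_init frame and the subscribe frame for
// the metadata subscription. The subscription ID "1" is arbitrary.
func buildFrames(initPayload, extensions map[string]any, repeat int) (initFrame, subFrame []byte, err error) {
	initFrame, err = json.Marshal(frame{Type: "connection_init", Payload: initPayload})
	if err != nil {
		return nil, nil, err
	}
	query := fmt.Sprintf("subscription { metadata(repeat: %d) { initialPayload subscribeExtensions } }", repeat)
	subFrame, err = json.Marshal(frame{
		ID:      "1",
		Type:    "subscribe",
		Payload: subscribePayload{Query: query, Extensions: extensions},
	})
	if err != nil {
		return nil, nil, err
	}
	return initFrame, subFrame, nil
}

func main() {
	initFrame, subFrame, err := buildFrames(
		map[string]any{"token": "abc"},   // hypothetical initial payload
		map[string]any{"traceId": "t-1"}, // hypothetical subscribe extensions
		2,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(initFrame))
	fmt.Println(string(subFrame))
}
```

If the subgraph behaves as the field description says, each event should carry the initial payload under `initialPayload` and the subscribe extensions under `subscribeExtensions`.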
43 changes: 42 additions & 1 deletion demo/pkg/subgraphs/test1/subgraph/schema.resolvers.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion docs-website/docs.json
@@ -162,7 +162,8 @@
"router/subscriptions/server-sent-events-sse",
"router/subscriptions/multipart-http-requests",
"router/subscriptions/websocket-subprotocols",
"router/subscriptions/router-configuration-for-subscriptions"
"router/subscriptions/router-configuration-for-subscriptions",
"router/subscriptions-migration"
]
},
{
32 changes: 24 additions & 8 deletions docs-website/router/configuration.mdx
@@ -1846,14 +1846,16 @@ Configure the GraphQL Execution Engine of the Router.
| ENGINE_ENABLE_REQUEST_TRACING | enable_request_tracing | <Icon icon="square" /> | Enable [Advanced Request Tracing (ART)](/router/advanced-request-tracing-art). This config is not correlated to OTEL tracing. | true |
| ENGINE_ENABLE_EXECUTION_PLAN_CACHE_RESPONSE_HEADER | enable_execution_plan_cache_response_header | <Icon icon="square" /> | **Deprecated**, use ["enable_cache_response_headers"](/router/configuration#debug-configuration) instead. When enabled, the Router sets the response Header "X-WG-Execution-Plan-Cache" to "HIT" or "MISS" | false |
| ENGINE_MAX_CONCURRENT_RESOLVERS | max_concurrent_resolvers | <Icon icon="square" /> | Use this to limit the concurrency in the GraphQL Engine. A high number will lead to more memory usage. A number too low will slow down your Router. | 32 |
| ENGINE_ENABLE_NET_POLL | enable_net_poll | <Icon icon="square" /> | Enables the more efficient poll implementation for all WebSocket implementations (client, server) of the router. This is only available on Linux and MacOS. On Windows or when the host system is limited, the default synchronous implementation is used. | true |
| ENGINE_WEBSOCKET_CLIENT_POLL_TIMEOUT | websocket_client_poll_timeout | <Icon icon="square" /> | The timeout for the poll loop of the WebSocket client implementation. The period is specified as a string with a number and a unit | 1s |
| ENGINE_WEBSOCKET_CLIENT_CONN_BUFFER_SIZE | websocket_client_conn_buffer_size | <Icon icon="square" /> | The buffer size for the poll buffer of the WebSocket client implementation. The buffer size determines how many connections can be handled in one loop. | 128 |
| ENGINE_WEBSOCKET_CLIENT_READ_TIMEOUT | websocket_client_read_timeout | <Icon icon="square" /> | The timeout for the websocket read of the WebSocket client implementation. | 5s |
| ENGINE_ENABLE_NET_POLL | enable_net_poll | <Icon icon="square" /> | Enables the more efficient poll implementation for the server-side WebSocket handler of the router. This is only available on Linux and macOS. On Windows or when the host system is limited, the default synchronous implementation is used. Has no effect on the router's upstream connections to subgraphs. | true |
| ENGINE_WEBSOCKET_SERVER_READ_TIMEOUT | websocket_server_read_timeout | <Icon icon="square" /> | Read timeout on the server-side WebSocket handler (router accepting clients). Specified as a Go duration string, e.g. `10ms`, `1s`, `1m`. | 5s |
| ENGINE_WEBSOCKET_SERVER_WRITE_TIMEOUT | websocket_server_write_timeout | <Icon icon="square" /> | Write timeout on the server-side WebSocket handler (router accepting clients). | 10s |
| ENGINE_WEBSOCKET_SERVER_POLL_TIMEOUT | websocket_server_poll_timeout | <Icon icon="square" /> | The timeout for the poll loop of the server-side WebSocket handler. The period is specified as a string with a number and a unit. | 1s |
| ENGINE_WEBSOCKET_SERVER_CONN_BUFFER_SIZE | websocket_server_conn_buffer_size | <Icon icon="square" /> | The buffer size for the poll buffer of the server-side WebSocket handler. The buffer size determines how many connections can be handled in one loop. | 128 |
| ENGINE_WEBSOCKET_CLIENT_WRITE_TIMEOUT | websocket_client_write_timeout | <Icon icon="square" /> | The timeout for the websocket write of the WebSocket client implementation. | 10s |
| ENGINE_WEBSOCKET_CLIENT_PING_INTERVAL | websocket_client_ping_interval | <Icon icon="square" /> | The WebSocket client ping interval to the subgraph. Defines how often the router pings the subgraph to signal that the connection is still alive. The interval needs to be coordinated with the subgraph and is specified as a string with a number and a unit, e.g. 10ms, 1s, 1m, 1h. The supported units are 'ms', 's', 'm', 'h'. | 15s |
| ENGINE_WEBSOCKET_CLIENT_PING_TIMEOUT | websocket_client_ping_timeout | <Icon icon="square" /> | The WebSocket client ping timeout to the subgraph. Defines how long the router waits for a ping response from the subgraph. The timeout is specified as a string with a number and a unit, e.g. 10ms, 1s, 1m, 1h. The supported units are 'ms', 's', 'm', 'h'. | 30s |
| ENGINE_WEBSOCKET_CLIENT_FRAME_TIMEOUT | websocket_client_frame_timeout | <Icon icon="square" /> | The Websocket client frame timeout to the subgraph. Defines how long the router will wait for a frame response from the subgraph. The timeout is specified as a string with a number and a unit, e.g. 10ms, 1s, 1m, 1h. The supported units are 'ms', 's', 'm', 'h'. | 100ms |
| ENGINE_WEBSOCKET_CLIENT_ACK_TIMEOUT | websocket_client_ack_timeout | <Icon icon="square" /> | How long the router waits for a `connection_ack` from a subgraph after sending `connection_init`. Minimum `1s`. | 30s |
| ENGINE_WEBSOCKET_CLIENT_READ_LIMIT | websocket_client_read_limit | <Icon icon="square" /> | Maximum size of an incoming WebSocket message from a subgraph. Specified as a byte string, e.g. `512KB`, `1MB`. Minimum `1KB`. | 1MB |
| ENGINE_EXECUTION_PLAN_CACHE_SIZE | execution_plan_cache_size | <Icon icon="square" /> | Define how many GraphQL Operations should be stored in the execution plan cache. A low number will lead to more frequent cache misses, which will lead to increased latency. | 1024 |
| ENGINE_SLOW_PLAN_CACHE_SIZE | slow_plan_cache_size | <Icon icon="square" /> | The maximum number of entries in the slow plan cache. This cache protects slow-to-plan queries from being evicted by the main plan cache's LFU policy. Only used when `in_memory_fallback` is enabled. See [Slow Plan Cache](/concepts/cache-warmer#slow-plan-cache). | 300 |
| ENGINE_SLOW_PLAN_CACHE_THRESHOLD | slow_plan_cache_threshold | <Icon icon="square" /> | The minimum planning duration for a query to be promoted into the slow plan cache. Queries that take longer than this threshold to plan are considered expensive and protected from eviction. The period is specified as a string with a number and a unit, e.g. 10ms, 1s, 5s. The supported units are 'ms', 's', 'm', 'h'. | 100ms |
@@ -1872,6 +1874,14 @@ Configure the GraphQL Execution Engine of the Router.
| ENGINE_SUBSCRIPTION_FETCH_TIMEOUT | subscription_fetch_timeout | <Icon icon="square" /> | The maximum time a subscription fetch can take before it is considered timed out. | 30s |
| ENGINE_RELAX_SUBGRAPH_OPERATION_FIELD_SELECTION_MERGING_NULLABILITY | relax_subgraph_operation_field_selection_merging_nullability | <Icon icon="square" /> | Relaxes nullability validation for [field selection merging](/router/relaxed-field-selection-merging-nullability) across union member types. | false |

<Warning>
**Subject to change.** `enable_net_poll`, `websocket_server_poll_timeout`, and `websocket_server_conn_buffer_size` may be removed or changed in a future release without a standard deprecation cycle. Avoid depending on them.
</Warning>

<Note>
If you're upgrading from a previous release and set `websocket_client_read_timeout`, `websocket_client_poll_timeout`, `websocket_client_conn_buffer_size`, `websocket_client_frame_timeout`, or `websocket_client_write_timeout`, see the [subscriptions overhaul migration guide](/router/subscriptions-migration). Several of these were removed and one (`websocket_client_write_timeout`) kept its name but changed scope.
</Note>

### Example YAML config:


@@ -1885,11 +1895,17 @@ engine:
enable_websocket_epoll_kqueue: true
epoll_kqueue_poll_timeout: "1s"
epoll_kqueue_conn_buffer_size: 128
websocket_client_read_timeout: "1s"
websocket_client_write_timeout: "5s"
# Server-side WebSocket handler (router accepting clients)
websocket_server_read_timeout: "5s"
websocket_server_write_timeout: "10s"
websocket_server_poll_timeout: "1s"
websocket_server_conn_buffer_size: 128
# Upstream subscription client (router connecting to subgraphs)
websocket_client_write_timeout: "10s"
websocket_client_ping_interval: "15s"
websocket_client_ping_timeout: "30s"
websocket_client_frame_timeout: "100ms"
websocket_client_ack_timeout: "30s"
websocket_client_read_limit: "1MB"
execution_plan_cache_size: 10000
slow_plan_cache_size: 300
slow_plan_cache_threshold: 100ms
40 changes: 40 additions & 0 deletions docs-website/router/subgraph-error-propagation.mdx
@@ -178,6 +178,46 @@ subgraph_error_propagation:
}
```

## Upstream WebSocket errors during subscriptions

When a subgraph WebSocket connection carrying active subscriptions fails — TCP drop, EOF, or a close frame from the subgraph — the router delivers a GraphQL error on each affected subscription and terminates that subscription. The downstream client's WebSocket stays open and other subscriptions on that connection continue normally.

The error's `message` is `"upstream service error"`, and its `code` extension is set to the value configured in `subgraph_error_propagation.default_extension_code` (default `DOWNSTREAM_SERVICE_ERROR`).

**Sudden closure (TCP drop, EOF):**

```json
{
  "errors": [
    {
      "message": "upstream service error",
      "extensions": { "code": "DOWNSTREAM_SERVICE_ERROR" }
    }
  ]
}
```

**Close frame from subgraph:** when the failure carries a WebSocket close frame, the router additionally populates `closeCode` and `closeReason` extensions:

```json
{
  "errors": [
    {
      "message": "upstream service error",
      "extensions": {
        "code": "DOWNSTREAM_SERVICE_ERROR",
        "closeCode": 1011,
        "closeReason": "internal error"
      }
    }
  ]
}
```

<Note>
Prior to the [subscription overhaul release](/router/subscriptions-migration), upstream WebSocket failures during an active subscription terminated the entire downstream client WebSocket with close code `1001` and no GraphQL error payload. Clients that detected upstream failures via that close code must now inspect the `code` extension on GraphQL errors.
</Note>
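
For clients that previously relied on the `1001` close code, detection now means inspecting the error payload. As a minimal sketch in Go, assuming the payload shapes shown above, the snippet below looks for an error whose `code` extension matches the configured value and reads the optional `closeCode` and `closeReason`. The `gqlError` and `gqlResponse` types and the `findUpstreamFailure` helper are hypothetical names, not part of the router.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gqlError models the error fields shown in the payloads above.
type gqlError struct {
	Message    string `json:"message"`
	Extensions struct {
		Code        string `json:"code"`
		CloseCode   *int   `json:"closeCode"` // nil when no close frame was received
		CloseReason string `json:"closeReason"`
	} `json:"extensions"`
}

type gqlResponse struct {
	Errors []gqlError `json:"errors"`
}

// findUpstreamFailure returns the first error whose code extension equals
// wantCode. Hypothetical helper; wantCode should match the router's
// subgraph_error_propagation.default_extension_code setting.
func findUpstreamFailure(payload []byte, wantCode string) (gqlError, bool, error) {
	var resp gqlResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return gqlError{}, false, err
	}
	for _, e := range resp.Errors {
		if e.Extensions.Code == wantCode {
			return e, true, nil
		}
	}
	return gqlError{}, false, nil
}

func main() {
	payload := []byte(`{"errors":[{"message":"upstream service error","extensions":{"code":"DOWNSTREAM_SERVICE_ERROR","closeCode":1011,"closeReason":"internal error"}}]}`)
	e, found, err := findUpstreamFailure(payload, "DOWNSTREAM_SERVICE_ERROR")
	if err != nil {
		panic(err)
	}
	if found && e.Extensions.CloseCode != nil {
		fmt.Printf("upstream closed with %d: %s\n", *e.Extensions.CloseCode, e.Extensions.CloseReason)
	}
}
```

Because `closeCode` is only present when the subgraph sent a close frame, the sketch uses a pointer so a sudden closure (no close frame) can be told apart from a close code of `0`.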

## Fallback status code errors

In cases where the router cannot parse a properly formed error from the subgraph response, e.g.