prep release: v2.8.0 by abernix · Pull Request #8495 · apollographql/router

abernix · 2025-10-27T18:47:16Z

Note

When approved, this PR will merge into the 2.8.0 branch which will — upon being approved itself — merge into main.

Things to review in this PR:

Changelog correctness (There is a preview below, but it is not necessarily up to date. See Files Changed for the authoritative version.)

Version bumps

That it targets the right release branch (2.8.0 in this case!).

🚀 Features

Support per-stage coprocessor URLs (PR #8384)

You can now configure different coprocessor URLs for each stage of request/response processing (router, supergraph, execution, subgraph). Each stage can specify its own url field that overrides the global default URL.

Changes:

Add optional url field to all stage configuration structs
Update all stage as_service methods to accept and resolve URLs
Add tests for URL validation and per-stage configuration

This change maintains full backward compatibility—existing configurations with a single global URL continue to work unchanged.
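The override rule can be pictured as a simple fallback from the stage URL to the global one. The following is a minimal Python sketch for illustration only, not the router's Rust implementation; `resolve_stage_url` and the example URLs are hypothetical names chosen for this sketch.

```python
from typing import Optional


def resolve_stage_url(stage_url: Optional[str], global_url: str) -> str:
    """Return the stage-specific URL if one is set, else the global default."""
    return stage_url if stage_url is not None else global_url


# Hypothetical configuration: only the subgraph stage overrides the default.
global_url = "http://localhost:8081"
stage_urls = {
    "router": None,
    "supergraph": None,
    "execution": None,
    "subgraph": "http://localhost:9000",
}

# Every stage ends up with exactly one URL to call.
resolved = {
    stage: resolve_stage_url(url, global_url)
    for stage, url in stage_urls.items()
}
```

With this fallback, a configuration that sets only the global URL resolves every stage to that URL, which is the backward-compatible case.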

By @cgati in #8384

Add automatic unit conversion for duration instruments with non-second units

The router now automatically converts duration measurements to match the configured unit for telemetry instruments.
Previously, duration instruments always recorded values in seconds regardless of the configured unit field.
When you specify units like "ms" (milliseconds), "us" (microseconds), or "ns" (nanoseconds),
the router automatically converts the measured duration to the appropriate scale.

Supported units:

"s" - seconds (default)
"ms" - milliseconds
"us" - microseconds
"ns" - nanoseconds
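As a rough illustration of the scaling involved, the Python sketch below converts a duration measured in seconds into each supported unit. It is not the router's code: `convert_duration` and `UNIT_SCALE` are hypothetical names, and rejecting unknown unit strings is an assumption of this sketch rather than documented router behavior.

```python
# Multipliers from seconds to each supported unit string.
UNIT_SCALE = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def convert_duration(seconds: float, unit: str = "s") -> float:
    """Scale a duration measured in seconds to the configured unit."""
    try:
        return seconds * UNIT_SCALE[unit]
    except KeyError:
        # Assumption of this sketch: unknown units are an error.
        raise ValueError(f"unsupported duration unit: {unit!r}") from None
```

For example, a request that took 0.25 seconds is recorded as 250 when the instrument's unit is "ms".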

Note

Use this feature only when you need to integrate with an observability platform that doesn't properly translate from source time units to target time units (for example, seconds to milliseconds). In all other cases, follow the OTLP convention that you "SHOULD" use seconds as the unit.

Example:

telemetry:
  instrumentation:
    instruments:
      subgraph:
        acme.request.duration:
          value: duration
          type: histogram
          unit: ms # Values are now automatically converted to milliseconds
          description: "Metric to get the request duration in milliseconds"

By Jon Christiansen in #8415

Add response reformatting and result coercion errors (PR #8441)

All subgraph responses are checked and corrected to ensure alignment with the schema and query. When a misaligned value is returned, it's nullified. When the feature is enabled, errors for this nullification are now included in the errors array in the response.

By @TylerBloom in #8441

Add router overhead metric (PR #8455)

The apollo.router.overhead histogram provides a direct measurement of router processing overhead. This metric tracks the time the router spends on tasks other than waiting for downstream HTTP requests—including GraphQL parsing, validation, query planning, response composition, and plugin execution.

The overhead calculation excludes time spent waiting for downstream HTTP services (subgraphs and connectors), giving you visibility into the router's actual processing time versus downstream latency. This metric helps identify when the router itself is a bottleneck versus when delays are caused by downstream services.
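One way to picture the calculation is total request time minus the time during which at least one downstream request was in flight. The Python sketch below makes that assumption explicit (overlapping downstream requests are counted once); the function names are hypothetical, and the router's actual accounting may differ.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def downstream_wait(intervals: List[Interval]) -> float:
    """Time covered by at least one downstream request (union of intervals)."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def router_overhead(request: Interval, downstream: List[Interval]) -> float:
    """Request duration minus time spent waiting on downstream services."""
    start, end = request
    return (end - start) - downstream_wait(downstream)
```

For a 10-second request with downstream calls at (1, 4), (3, 6) and (8, 9) seconds, the merged wait is 6 seconds, leaving 4 seconds attributed to the router.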

Note: Coprocessor request time is currently included in the overhead calculation. In a future release, coprocessor time may be excluded similar to subgraphs and connectors.

telemetry:
  instrumentation:
    instruments:
      router:
        apollo.router.overhead: true

Note that the use of this metric is nuanced, and there is a risk of misinterpretation. See the full docs for this metric to understand how it can be used.

By @BrynCooke in #8455

Include invalid Trace ID values in error logs (PR #8149)

Error messages for malformed Trace IDs now include the invalid value to help with debugging. Previously, when the router received an unparseable Trace ID in incoming requests, error logs only indicated that the Trace ID was invalid without showing the actual value.

Trace IDs can be unparseable due to invalid hexadecimal characters, incorrect length, or non-standard formats. Including the invalid value in error logs makes it easier to diagnose and resolve tracing configuration issues.
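The Python sketch below illustrates the idea of echoing the rejected value in the error. It assumes a 32-character hexadecimal trace ID and accepts either letter case; `parse_trace_id` and the message wording are hypothetical, and the router's own parsing rules and log format are not reproduced here.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def parse_trace_id(value: str) -> int:
    """Parse a 32-character hexadecimal trace ID, naming the bad value on failure."""
    if len(value) != 32:
        raise ValueError(
            f"invalid trace ID {value!r}: expected 32 characters, got {len(value)}"
        )
    if not set(value) <= HEX_DIGITS:
        raise ValueError(f"invalid trace ID {value!r}: non-hexadecimal character")
    return int(value, 16)
```

Because the message carries the offending value, a log line alone is enough to tell a truncated ID apart from one in a different format.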

By @juancarlosjr97 in #8149

Add ability to rename metrics (PR #8424)

The router can now rename instruments via OpenTelemetry views.

Benefits:

Cost optimization: Some observability platforms only allow tag indexing controls on a per-metric name basis. Using OTLP semantic naming conventions and having the same metric name emitted by different services can prevent effective use of these controls.
Convention alignment: Many customers have specific metric naming conventions across their organization—this feature allows them to align with those conventions.

By Jon Christiansen in #8412

🐛 Fixes

Reload telemetry only when configuration changes (PR #8328)

Previously, schema or config reloads would always reload telemetry, dropping existing exporters and creating new ones.

Telemetry exporters are now only recreated when relevant configuration has changed.

By @BrynCooke in #8328

Replace Redis connections metric with clients metric (PR #8161)

The apollo.router.cache.redis.connections metric has been removed and replaced with the apollo.router.cache.redis.clients metric.

The connections metric was implemented with an up-down counter that would sometimes not be collected properly (it could go negative). The name connections was also inaccurate since Redis clients each make multiple connections, one to each node in the Redis pool (if in clustered mode).

The new clients metric counts the number of clients across the router via an AtomicU64 and surfaces that value in a gauge.

Note: The old metric included a kind attribute to reflect the number of clients in each pool (for example, entity caching, query planning). The new metric doesn't include this attribute; the purpose of the metric is to ensure the number of clients isn't growing unbounded (#7319).

By @carodewig in #8161

Prevent entity caching of expired data based on Age header (PR #8456)

When the Age header is higher than the max-age directive in Cache-Control, the router no longer caches the data because it's already expired.

For example, with these headers:

Cache-Control: max-age=5
Age: 90

The data won't be cached since Age (90) exceeds max-age (5).
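The comparison can be sketched in Python as follows. This is an illustration of the rule stated above, not the router's implementation; `max_age_seconds` and `is_already_expired` are hypothetical names, and treating missing or malformed headers as "not expired" is an assumption of the sketch.

```python
import re
from typing import Mapping, Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract the max-age directive from a Cache-Control value, if present."""
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_already_expired(headers: Mapping[str, str]) -> bool:
    """True when Age is strictly greater than max-age."""
    max_age = max_age_seconds(headers.get("Cache-Control", ""))
    age = headers.get("Age")
    if max_age is None or age is None or not re.fullmatch(r"\s*\d+\s*", age):
        # Assumption of this sketch: without both values, nothing is rejected.
        return False
    return int(age) > max_age
```

With `Cache-Control: max-age=5` and `Age: 90` the function returns `True`, so the response would be skipped rather than cached.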

By @bnjjj in #8456

Reduce config and schema reload log noise (PR #8336)

File watch events during an existing hot reload no longer spam the logs. Hot reload continues as usual after the existing reload finishes.

By @goto-bus-stop in #8336

Prevent query planning errors for `@shareable` mutation fields (PR #8352)

Previously, query planning a mutation operation that executes a @shareable mutation field at the top level could unexpectedly error when attempting to generate a plan where that mutation field is called more than once across multiple subgraphs. Query planning now avoids generating such plans.

By @sachindshinde in #8352

Prevent UpDownCounter drift using RAII guards (PR #8379)

UpDownCounters now use RAII guards instead of manual incrementing and decrementing, ensuring they're always decremented when dropped.

This fix resolves drift in apollo.router.opened.subscriptions that occurred due to manual incrementing and decrementing.
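The router is written in Rust, where an RAII guard decrements the counter in its `Drop` implementation. The closest Python analogue is a context manager, sketched below to show why tying the decrement to scope exit prevents drift. `UpDownCounter` and `counted` here are hypothetical stand-ins, not the OpenTelemetry or router types.

```python
from contextlib import contextmanager


class UpDownCounter:
    """Minimal stand-in for a metric that can go up and down."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, delta: int) -> None:
        self.value += delta


@contextmanager
def counted(counter: UpDownCounter):
    """Increment on entry and always decrement on exit, even on error."""
    counter.add(1)
    try:
        yield
    finally:
        counter.add(-1)


opened_subscriptions = UpDownCounter()

try:
    with counted(opened_subscriptions):
        in_flight = opened_subscriptions.value  # 1 while the guard is held
        raise RuntimeError("subscription failed")
except RuntimeError:
    pass
```

Even though the body raised, the counter is back at zero afterwards. With manual increment and decrement calls, an early return or error path that skips the decrement would leave the counter permanently too high.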

By @BrynCooke in #8379

Reduce Rhai short circuit response log noise (PR #8364)

Rhai scripts that short-circuit the pipeline by throwing now only log an error if a response body isn't present.

For example, the following will NOT log:

    throw #{
        status: 403,
        body: #{
            errors: [#{
                message: "Custom error with body",
                extensions: #{
                    code: "FORBIDDEN"
                }
            }]
        }
    };

By contrast, the following WILL log:

    throw "An error occurred without a body";

By @BrynCooke in #8364

Prevent query planning error where `@requires` subgraph jump fetches `@key` from wrong subgraph (PR #8016)

During query planning, a subgraph jump added due to a @requires field may try to collect the necessary @key fields from an upstream subgraph fetch as an optimization. Previously, it didn't properly check whether that subgraph had those fields. This is now fixed and resolves query planning errors with messages like "Cannot add selection of field T.id to selection set of parent type T".

By @sachindshinde in #8016

Reduce log level for interrupted WebSocket streams (PR #8344)

The router now logs interrupted WebSocket streams at trace level instead of error level.

Previously, WebSocket stream interruptions logged at error level, creating excessive noise in logs when clients disconnected normally or networks experienced transient issues. Client disconnections and network interruptions are expected operational events that don't require immediate attention.

Your logs will now be cleaner and more actionable, making genuine errors easier to spot. You can enable trace level logging when debugging WebSocket connection issues.

By @bnjjj in #8344

Respect Redis cluster slots when inserting multiple items (PR #8185)

The existing insert code would silently fail when trying to insert multiple values that correspond to different Redis cluster hash slots. This change corrects that behavior, raises errors when inserts fail, and adds new metrics to track Redis client health.
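For background, Redis Cluster assigns each key to one of 16384 hash slots using CRC16 (XMODEM variant) of the key, or of its `{hash tag}` if one is present, and multi-key commands must stay within a single slot. The Python sketch below shows one way to respect that by grouping keys per slot before writing. It is an illustration only; `hash_slot` and `group_by_slot` are hypothetical names, and the router's actual fix may be structured differently.

```python
import binascii
from collections import defaultdict
from typing import Dict, Iterable, List


def hash_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty hash tag is used
            data = data[start + 1:end]
    # binascii.crc_hqx with initial value 0 is CRC-CCITT (XMODEM).
    return binascii.crc_hqx(data, 0) % 16384


def group_by_slot(keys: Iterable[str]) -> Dict[int, List[str]]:
    """Group keys so each batch targets exactly one hash slot."""
    groups = defaultdict(list)
    for key in keys:
        groups[hash_slot(key)].append(key)
    return dict(groups)
```

Each group can then be written with its own multi-key command, and a failure in any group can be reported instead of being silently dropped.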

New metrics:

apollo.router.cache.redis.unresponsive: counter for 'unresponsive' events raised by the Redis library
- kind: Redis cache purpose (APQ, query planner, entity)
- server: Redis server that became unresponsive
apollo.router.cache.redis.reconnection: counter for 'reconnect' events raised by the Redis library
- kind: Redis cache purpose (APQ, query planner, entity)
- server: Redis server that required client reconnection

By @carodewig in #8185

Prevent unnecessary precomputation during query planner construction (PR #8373)

A regression introduced in v2.5.0 caused query planner construction to unnecessarily precompute metadata, leading to increased CPU and memory utilization during supergraph loading. Query planner construction now correctly avoids this unnecessary precomputation.

By @sachindshinde in #8373

Update cache key version for entity caching (PR #8458)

Important

If you have enabled Entity caching, this release contains changes that necessarily alter the hashing algorithm used for the cache keys. You should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.

The entity cache key version has been bumped to avoid keeping invalid cached data for too long (fixed in #8456).

By @bnjjj in #8458

📃 Configuration

Add telemetry instrumentation config for `http_client` headers (PR #8349)

A new telemetry instrumentation configuration for http_client spans allows request headers added by Rhai scripts to be attached to the http_client span. The some_rhai_response_header value remains available on the subgraph span as before.

telemetry:
  instrumentation:
    spans:
      mode: spec_compliant
      subgraph:
        attributes:
          http.response.header.some_rhai_response_header:
            subgraph_response_header: "some_rhai_response_header"
      http_client:
        attributes:
          http.request.header.some_rhai_request_header:
            request_header: "some_rhai_request_header"

By @bonnici in #8349

Promote Subgraph Insights metrics flag to general availability (PR #8392)

The subgraph_metrics config flag that powers the Studio Subgraph Insights feature is now promoted from preview to general availability.
The flag name has been updated from preview_subgraph_metrics to subgraph_metrics:

telemetry:
  apollo:
    subgraph_metrics: true

By @david_castaneda in #8392

🛠 Maintenance

Add export destination details to trace and metrics error messages (PR #8363)

Error messages raised during tracing and metric exports now indicate whether the error occurred when exporting to Apollo Studio or to your configured OTLP or Zipkin endpoint. For example, errors that occur when exporting Apollo Studio traces look like:
OpenTelemetry trace error occurred: [apollo traces] <etc>
while errors that occur when exporting traces to your configured OTLP endpoint look like:
OpenTelemetry trace error occurred: [otlp traces] <etc>

By @bonnici in #8363

📚 Documentation

Change MCP default port from 5000 to 8000 (PR #8375)

MCP's default port has changed from 5000 to 8000.

Add Render and Railway deployment guides (PR #8242)

Two new deployment guides are now available for popular hosting platforms: Render and Railway.

By @the-gigi-apollo in #8242

Add comprehensive context key reference (PR #8420)

The documentation now includes a comprehensive reference for all context keys the router supports.

By @faisalwaseem in #8420

Reorganize observability documentation structure (PR #8183)

Restructured the router observability and telemetry documentation to improve content discoverability and user experience. GraphOS insights documentation and router OpenTelemetry telemetry documentation are now in separate sections, with APM-specific documentation organized in dedicated folders for each APM provider (Datadog, Dynatrace, Jaeger, Prometheus, New Relic, Zipkin). This reorganization makes it easier for users to find relevant monitoring and observability configuration for their specific APM tools.

By @Robert113289 in #8183

Add comprehensive Datadog integration documentation (PR #8319)

The Datadog APM guide has been expanded to include the OpenTelemetry Collector, recommended router telemetry configuration, and out-of-the-box dashboard templates:

New pages: Connection methods overview, OpenTelemetry Collector setup, router instrumentation, and dashboard template
Structure: Complete configurations upfront, followed by detailed explanations and best practices

By @Robert113289 in #8319

Clarify timeout hierarchy for traffic shaping (PR #8203)

The documentation now states more clearly that subgraph timeouts should not be higher than the router timeout; otherwise, the router timeout fires before the subgraph timeout does.

By @abernix in #8203

apollo-librarian · 2025-10-27T18:47:24Z

✅ Docs preview ready

The preview is ready to be viewed. View the preview

File Changes

0 new, 2 changed, 0 removed

* graphos/routing/(latest)/query-planning/query-planning-best-practices.mdx
* graphos/routing/(latest)/self-hosted/managed-hosting/railway.mdx

Build ID: 5797befc74f64338b0b5d4fd
Build Logs: View logs

URL: https://www.apollographql.com/docs/deploy-preview/5797befc74f64338b0b5d4fd

lrlna · 2025-10-28T10:08:24Z

CHANGELOG.md

+All subgraph responses are checked and corrected to ensure alignment with the schema and query. When a misaligned value is returned, it's nullified. When the feature is enabled, errors for this nullification are now included in the errors array in the response.
+
+By [@TylerBloom](https://github.com/TylerBloom) in https://github.com/apollographql/router/pull/8441


@TylerBloom the changelog should add how to enable the feature.

Good eye. I'll chase down the change here right now.

CHANGELOG.md

Co-authored-by: Iryna Shestak <shestak.irina@gmail.com>

abernix · 2025-10-28T13:18:29Z

merging based on @lrlna's previous approval.

prep release: v2.8.0

407a2e0

abernix requested a review from a team October 27, 2025 18:47

abernix requested review from a team as code owners October 27, 2025 18:47

lrlna reviewed Oct 28, 2025

abernix commented Oct 28, 2025

Apply suggestions from code review

f0bfe93

Co-authored-by: Iryna Shestak <shestak.irina@gmail.com>

lrlna previously approved these changes Oct 28, 2025

Add details for #8441

a8fac14

abernix dismissed lrlna’s stale review via a8fac14 October 28, 2025 13:16

abernix merged commit 484c62c into 2.8.0 Oct 28, 2025
9 of 10 checks passed

abernix deleted the prep-2.8.0 branch October 28, 2025 13:18

abernix merged 3 commits into 2.8.0 from prep-2.8.0

2 participants
