🐛 Fixes
Redis connection leak on schema changes (PR #7319)
The router performs a 'hot reload' whenever it detects a schema update. During this reload, it instantiates a new internal router, optionally warms it up, redirects all traffic to this new router, and drops the old internal router.
This change fixes a bug in that "drop" process where the Redis connections are never told to terminate, even though the Redis client pool is dropped. This leads to an ever-increasing number of inactive Redis connections as each new schema comes in and goes out of service, which eats up memory.
The solution adds a new up-down counter metric, `apollo.router.cache.redis.connections`, to track the number of open Redis connections. This metric includes a `kind` label to discriminate between different Redis connection pools, which mirrors the `kind` label on other cache metrics (e.g. `apollo.router.cache.hit.time`).
By @carodewig in #7319
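To make the idea of an up-down counter with a `kind` label concrete, here is a minimal sketch using only the Rust standard library. It is not the router's implementation: `ConnectionGauge` and `TrackedConnection` are hypothetical names, and the gauge is a plain in-memory map rather than a real metrics instrument. It only shows how tying the decrement to `Drop` keeps the count of open connections accurate when a pool goes out of service.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an up-down counter keyed by a `kind` label.
#[derive(Default)]
pub struct ConnectionGauge {
    open_by_kind: Mutex<HashMap<&'static str, i64>>,
}

impl ConnectionGauge {
    /// Adds `delta` (positive or negative) to the count for `kind`.
    fn add(&self, kind: &'static str, delta: i64) {
        let mut counts = self.open_by_kind.lock().unwrap();
        *counts.entry(kind).or_insert(0) += delta;
    }

    /// Returns the number of connections currently counted for `kind`.
    pub fn open(&self, kind: &'static str) -> i64 {
        let counts = self.open_by_kind.lock().unwrap();
        counts.get(kind).copied().unwrap_or(0)
    }
}

/// Hypothetical connection handle: counted when created, uncounted when dropped.
pub struct TrackedConnection {
    gauge: Arc<ConnectionGauge>,
    kind: &'static str,
}

impl TrackedConnection {
    pub fn connect(gauge: Arc<ConnectionGauge>, kind: &'static str) -> Self {
        gauge.add(kind, 1);
        Self { gauge, kind }
    }
}

impl Drop for TrackedConnection {
    fn drop(&mut self) {
        // A real client would also ask the underlying connection to terminate here.
        self.gauge.add(self.kind, -1);
    }
}
```

With this shape, dropping every `TrackedConnection` of a given kind returns that kind's count to zero, which is the signal to look for after a schema reload.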
Propagate client name and version modifications through telemetry (PR #7369)
The router accepts modifications to the client name and version (`apollo::telemetry::client_name` and `apollo::telemetry::client_version`), but those modifications are not currently propagated through the telemetry layers to update spans and traces.
This PR moves where the client name and version are bound to the span, so that the modifications from plugins on the `router` service are propagated.
By @carodewig in #7369
Progressive overrides are not disabled when connectors are used (PR #7351)
Prior to this fix, introducing a connector disabled the progressive override plugin.
By @lennyburdette in #7351
Avoid unnecessary cloning in the deduplication plugin (PR #7347)
The deduplication plugin always cloned responses, even if there were not multiple simultaneous requests that would benefit from the cloned response.
We now check to see if deduplication will provide a benefit before we clone the subgraph response.
There was also an undiagnosed race condition which meant that a notification could be missed. This would have resulted in additional work being performed as the missed notification would have led to another subgraph request.
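The clone-avoidance idea can be sketched as follows. This is an illustration only, not the plugin's code: `fan_out` is a hypothetical helper, and the number of waiting requests is passed in directly rather than discovered from a notification channel. The point is that the response is cloned once per additional waiter, so a request with no duplicates incurs no clone at all.

```rust
/// Returns the original response for the first caller plus one clone per
/// additional waiter. `waiters` is the number of other in-flight requests
/// that asked for the same response (an assumption of this sketch).
pub fn fan_out<T: Clone>(response: T, waiters: usize) -> (T, Vec<T>) {
    // With zero waiters the iterator is empty, so `clone` is never called
    // and the response is simply moved back to the caller.
    let copies: Vec<T> = (0..waiters).map(|_| response.clone()).collect();
    (response, copies)
}
```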
By @garypen in #7347
Spans should only include path in `http.route` (PR #7390)
Per the OpenTelemetry spec, the `http.route` should only include "the matched route, that is, the path template used in the format used by the respective server framework."
The router currently sends the full URI in `http.route`, which can be high cardinality (e.g. `/graphql?operation=one_of_many_values`). After this change, the router will only include the path (`/graphql`).
By @carodewig in #7390
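As a rough illustration of the difference between the two values, the sketch below trims a request target down to its path. It is not the router's code: `route_from_target` is a hypothetical helper, and it assumes the target is in origin form (it starts with `/` and carries no scheme or host).

```rust
/// Returns the path portion of an origin-form request target, dropping any
/// query string (`?...`) or fragment (`#...`).
pub fn route_from_target(target: &str) -> &str {
    // Cut at the first `?` or `#`; if neither is present, keep the whole string.
    let end = target
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(target.len());
    &target[..end]
}
```

For the example above, `route_from_target("/graphql?operation=one_of_many_values")` returns `"/graphql"`, so every operation shares a single low-cardinality attribute value.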
Decrease log level for JWT authentication failure (PR #7396)
A recent change inadvertently increased the log level of JWT authentication failures from `info` to `error`. This change reverts that, restoring the previous behavior.
By @carodewig in #7396
Avoid fractional decimals when generating `apollo.router.operations.batching.size` metrics for GraphQL request batch sizes (PR #7306)
Corrects the calculation of the `apollo.router.operations.batching.size` metric to reflect accurate batch sizes rather than occasionally returning fractional numbers.
By @bnjjj in #7306
📃 Configuration
Log warnings for deprecated coprocessor `context` configuration usage (PR #7349)
`context: true` is an alias for `context: deprecated` but should not be used. The router now logs a runtime warning on startup if you do use it.
Instead of `context: true`, explicitly use `context: deprecated` or `context: all`.
See the 2.x upgrade guide for more detailed upgrade steps.
By @goto-bus-stop in #7349
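For orientation, a coprocessor configuration using the explicit value might look like the fragment below. Treat it as a sketch: the URL is a placeholder, and the `router` / `request` nesting is assumed for illustration. The only part taken from the note above is replacing `context: true` with `deprecated` or `all`.

```yaml
coprocessor:
  url: http://127.0.0.1:8081 # placeholder address
  router: # assumed stage; adjust to the stages you actually configure
    request:
      context: deprecated # or `all`; previously written as `context: true`
```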
🛠 Maintenance
Linux: Compatibility with glibc 2.28 or newer (PR #7355)
The default build images provided in our CI environment have a relatively modern version of `glibc` (2.35). This means that on some distributions, notably those based around RedHat, it wasn't possible to use our binaries since the version of `glibc` was older than 2.35.
We now maintain a build image which is based on a distribution with `glibc` 2.28. This is old enough that recent releases of either of the main Linux distribution families (Debian and RedHat) can make use of our binary releases.
By @garypen in #7355
Reject `@skip`/`@include` on subscription root fields in validation (PR #7338)
This implements a GraphQL spec RFC, rejecting subscriptions in validation that can be invalid during execution.
By @goto-bus-stop in #7338
📚 Documentation
Query planning best practices (PR #7263)
Added a new page under Routing docs about Query Planning Best Practices.
By @smyrick in #7263