feat(router): entity caching with L1/L2, shadow mode, and analytics #2677
Conversation
- router@0.284.2
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- aws-lambda-router@0.39.2 - router@0.285.0
- cdn@0.16.2 - controlplane@0.191.3 - keycloak@0.11.3
- wgc@0.105.5 - @wundergraph/cosmo-connect@0.131.0 - controlplane@0.192.0 - keycloak@0.12.0 - @wundergraph/protographic@0.16.0 - router@0.286.0 - @wundergraph/cosmo-shared@0.43.4 - studio@0.155.1
The Cosmo signup page can now be controlled with the `uc` param to display corresponding content, e.g. `uc=apollo` for users arriving from an Apollo user campaign.
Co-authored-by: Ludwig Bedacht <ludwig.bedacht@gmail.com>
- controlplane@0.192.1 - studio@0.156.0
- wgc@0.105.6 - @wundergraph/composition@0.51.2 - controlplane@0.192.2 - @wundergraph/protographic@0.16.1 - @wundergraph/cosmo-shared@0.43.5 - studio@0.156.1
Co-authored-by: Dustin Deus <deusdustin@gmail.com>
- wgc@0.105.7 - @wundergraph/cosmo-connect@0.132.0 - controlplane@0.193.0 - @wundergraph/protographic@0.16.2 - @wundergraph/cosmo-shared@0.43.6 - studio@0.157.0
- router@0.286.1 - studio@0.158.0
…changes in control plane (#2578)
- controlplane@0.193.1
Resolve conflicts: keep local entity-caching changes (graphql-go-tools rc.270 with @requestScoped, EntityCachingConfiguration, MemoryStorageProvider, normalization factory typing) and merge upstream additions (CostControl ValidateSliceArguments, StorageProvider envPrefix, PQLManifestConfig tests, Cosmo Connect args + @requires).

Regenerate proto bindings (make generate-go), the composition-go bundle, and demo plugins (cd demo && make plugin-build-ci). Add staticcheck.conf to the remaining entity_caching subgraphs to silence SA4004 in gqlgen-generated federation code. Apply gofmt to attributes.go and router_entity_cache_test.go.

Includes a regression test (TestEntityCaching) that asserts @cachePopulate must not write to L2 when router L2 caching is disabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
See TODO.md for the per-item resolution log. Highlights:
- Namespace the 6 new entity-caching directives as @openfed__ (Aenimus).
- Proto field rename negative_cache_ttl_seconds -> not_found_cache_ttl_seconds (field #6 preserved, old name reserved) + TTL-semantics field comments.
- Composition correctness: nested-key mapping iterates all keys + carries correct isBatch on list returns; renamed root types attach cache config under the canonical root; TypeScript hygiene (branded aliases, Array<T>, ordinal helper, alphabetical constants, reuse isTypeNodeListType).
- Router: default cache stored under both "default" and its provider_id so overrides reuse the instance; executor nil-safety + skip+warn on unknown subgraph ID; buildRedisClient rename; logEntityCacheOverrideIssues rename; metrics source labels via CacheOperationSource; otel attribute constants; config.schema.json memory max_size bytes-string format.
- Test truthfulness: 10 router-tests subtests now assert counters.* and/or cache.Len(); 9 of 10 fail with L2 disabled (was 0). negative_cache tests redesigned with mutation-created IDs; items subgraph mutations persist and emit events. Circuit breaker concurrent tests; redis post-close Set and Delete coverage; memory concurrency test error propagation; ristretto bounded assertion; post-close len guard.
- New unit coverage: entity_cache_metrics_test.go (0% -> ~94%), graphql_handler caching-options tests, router_entity_cache_test.go covering default-cache reuse via require.Same.
- Playground: functional setHeaders (no stale closure), Math.max zero-duration clamp, controlled Tabs.
- Docs: PII redaction on raw keys, namespace sweep across all references, ENTITY_CACHING_DEMO SDL refresh, CLAUDE.md cheatsheet + code-block tags.

Flagged follow-up: @openfed__cachePopulate writes still land in L2 when L2.Enabled=false. The read path gates correctly, so the user-visible impact is wasted writes only. Fix location: the cache-populate write path must respect EntityCachingConfiguration.L2.Enabled.
Verified:
- composition: 1194 tests pass.
- router, router-tests: -race clean.
- router-tests/entity_caching: 63/64 pass (1 pre-existing skip).
- cache-demo e2e: every documented query + all three X-WG-Disable-Entity-Cache* header gates behave as documented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TODO.md - per-item resolution for every Aenimus / SkArchon / Noroth / CodeRabbit review comment: what was done, where disagreements stand, what is deferred, and the final verification matrix (composition / router / router-tests / cache-demo e2e all green).

FEEDBACK.md - raw extracted feedback (47 human + 29 bot inline comments, review-level summaries, and other signals) captured for traceability so reviewers can cross-reference disposition entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dependency Review
The following issues were found:
graphqlmetrics - uncommitted changes detected
Seems like you forgot to commit some code. Possible causes:
Dirty files
Actionable comments posted: 11
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
♻️ Duplicate comments (2)
composition/CLAUDE.md (1)
103-109: ⚠️ Potential issue | 🟡 Minor
Add language tags to these fenced blocks.
These unlabeled fences will keep tripping MD040/markdownlint. Use `graphql` for the directive example and `bash` for the command snippets.
Also applies to: 208-225
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/CLAUDE.md` around lines 103 - 109: the fenced code blocks showing GraphQL directives (e.g. `@openfed__entityCache`, `@openfed__queryCache`, `@openfed__cachePopulate`, `@openfed__cacheInvalidate`, `@openfed__is`) should be labeled with a language tag. Change the directive block fences to use ```graphql and any command/snippet blocks mentioned elsewhere (like shell examples) to use ```bash; update the other unlabeled fences referenced around lines 208-225 in the same file to use the appropriate ```graphql or ```bash tags so markdownlint/MD040 no longer flags them.

composition/src/v1/normalization/normalization-factory.ts (1)
3797-3829: ⚠️ Potential issue | 🟠 Major
Store root-field cache config under the canonical root type.
`processRootFieldCacheDirectives()` still passes the raw map key into the cache extractors, so renamed roots like `schema { query: MyQuery }` write cache config under `MyQuery` while the rest of normalization keys root config under `Query`/`Mutation`/`Subscription`. Those entries then miss the later root-node handling in `normalize()` and the router never sees the cache directives on the actual root.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/src/v1/normalization/normalization-factory.ts` around lines 3797 - 3829, processRootFieldCacheDirectives() is using the raw map key (parentTypeName) when calling extractQueryCacheConfig / extractCacheInvalidateConfig / extractCachePopulateConfig, which causes renamed schema roots to be stored under the wrong type; compute the canonical root type name from the operationType node (e.g. the NamedType node on operationType, such as operationType.type.name.value) and pass that canonicalRootTypeName to extractQueryCacheConfig/extractCacheInvalidateConfig/extractCachePopulateConfig instead of parentTypeName so root-field cache config is stored under the standard Query/Mutation/Subscription types.
🟡 Minor comments (10)
demo/pkg/subgraphs/cachegraph_ext/subgraph/data.go (1)
46-53: ⚠️ Potential issue | 🟡 Minor
Add a nil guard in `toArticle` to prevent panics.
Lines 49-51 dereference `ext` unconditionally. If a missing extension reaches this helper, it will panic.
Proposed fix
```diff
 func toArticle(id string, ext *articleExtData) *model.Article {
+	if ext == nil {
+		return &model.Article{ID: id}
+	}
 	return &model.Article{
 		ID:            id,
 		ViewCount:     ext.ViewCount,
 		Rating:        ext.Rating,
 		ReviewSummary: ext.ReviewSummary,
 	}
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/pkg/subgraphs/cachegraph_ext/subgraph/data.go` around lines 46 - 53: the toArticle function currently dereferences ext (type *articleExtData) without checking for nil, which can panic; add a nil guard at the start of toArticle: if ext is nil, return &model.Article{ID: id} (or populate only safe defaults), otherwise proceed to use ext.ViewCount, ext.Rating, ext.ReviewSummary; keep the function name toArticle and the types articleExtData and model.Article unchanged so callers remain compatible.

TODO.md (1)
21-24: ⚠️ Potential issue | 🟡 Minor
Replace hardcoded absolute paths with portable commands.
The verification commands contain absolute paths specific to one developer's machine (`/Users/jens/repos/...`, `/Users/jens/go/pkg/mod/...`), which prevents other developers from running these verification steps. Use relative paths from the repository root or environment variables.
📝 Suggested portable rewrite
```diff
 ### Verification rerun on 2026-04-20
-- `cd /Users/jens/repos/wundergraph-projects/claude-space-2/graphql-go-tools/v2 && PATH=/Users/jens/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.26.0.darwin-arm64/bin:$PATH go test ./pkg/engine/resolve -run 'TestDetectMutationEntityImpact|TestMutationCacheTTLOverride' -count=1`
-- `cd router && PATH=/Users/jens/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.26.0.darwin-arm64/bin:$PATH go test ./core ./pkg/entitycache ./pkg/metric -count=1`
-- `cd router-tests && PATH=/Users/jens/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.26.0.darwin-arm64/bin:$PATH go test ./entity_caching -count=1`
+- `cd ../graphql-go-tools/v2 && go test ./pkg/engine/resolve -run 'TestDetectMutationEntityImpact|TestMutationCacheTTLOverride' -count=1`
+- `cd router && go test ./core ./pkg/entitycache ./pkg/metric -count=1`
+- `cd router-tests && go test ./entity_caching -count=1`
 - `cd playground && pnpm test && pnpm build`
```
(The PATH override is unnecessary if the correct Go toolchain is already in your environment.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TODO.md` around lines 21 - 24: replace the hardcoded absolute paths in the verification commands (the four shell lines that start with cd /Users/jens/repos/... and PATH=/Users/jens/go/pkg/mod/...) with portable alternatives: use repository-relative paths (e.g., cd to the local subfolders like graphql-go-tools/v2, router, router-tests, playground) and drop the machine-specific PATH override or replace it with a generic toolchain lookup (rely on the developer's PATH or use environment variables such as GOMODCACHE or GOROOT/bin if needed); ensure the commands remain the same otherwise (go test ./pkg/engine/resolve -run 'TestDetectMutationEntityImpact|TestMutationCacheTTLOverride' -count=1, go test ./core ./pkg/entitycache ./pkg/metric -count=1, go test ./entity_caching -count=1, pnpm test && pnpm build) so they run from repo root on any machine.

demo/pkg/injector/latency.go (1)
22-25: ⚠️ Potential issue | 🟡 Minor
Honor request cancellation during injected delay.
If the client disconnects/cancels, the middleware still sleeps the full duration before returning. Prefer a context-aware wait.
Suggested fix
```diff
 if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
 	log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
-	time.Sleep(time.Duration(ms) * time.Millisecond)
+	timer := time.NewTimer(time.Duration(ms) * time.Millisecond)
+	defer timer.Stop()
+	select {
+	case <-timer.C:
+	case <-r.Context().Done():
+		return
+	}
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/pkg/injector/latency.go` around lines 22 - 25: replace the unconditional time.Sleep in the latency injection middleware with a context-aware wait so the handler honors client cancellation: after parsing ms (from strconv.Atoi), use the request context (r.Context()) and wait with a select between time.After(time.Duration(ms)*time.Millisecond) and ctx.Done(), returning immediately (or skipping the sleep) if ctx.Done() fires; keep the existing log.Printf call but consider logging that the delay was aborted when ctx.Done() occurs.

demo/pkg/subgraphs/subgraphs.go (1)
227-228: ⚠️ Potential issue | 🟡 Minor
Viewer is bypassing the latency injector in the shared demo path.
`cache-demo` wraps the viewer server with `injector.Latency(...)`, and the benchmark router propagates `X-Artificial-Latency` to `viewer`, but both of these paths return `viewer.NewHandler()` directly. Requests started through `subgraphs.New()` or `ViewerHandler()` will ignore the latency header and behave differently from the dedicated cache-demo binary.
Suggested fix
```diff
 func ViewerHandler() http.Handler {
-	return viewer.NewHandler()
+	return injector.Latency(viewer.NewHandler())
 }
@@
 	if config.Ports.Viewer != 0 {
 		servers = append(servers, &http.Server{
 			Addr:    ":" + strconv.Itoa(config.Ports.Viewer),
-			Handler: viewer.NewHandler(),
+			Handler: ViewerHandler(),
 		})
 	}
```
Also applies to: 312-316
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/pkg/subgraphs/subgraphs.go` around lines 227 - 228: ViewerHandler and the handler returned by subgraphs.New currently return viewer.NewHandler() directly and therefore bypass the latency injector; wrap the viewer handler with the shared latency middleware so X-Artificial-Latency is respected. Concretely, replace returns of viewer.NewHandler() in ViewerHandler and in the viewer route inside subgraphs.New with injector.Latency(viewer.NewHandler()) (or the appropriate injector.Latency(...) call used elsewhere) so the injected latency middleware is applied to requests coming through those code paths.

benchmark/scripts/start_stack.sh (1)
57-86:⚠️ Potential issue | 🟡 Minor
`BENCHMARK_REDIS_PORT` is not fully wired through the stack.
`start_redis.sh` lets callers move Redis off `6399`, but this script always launches the router with `benchmark/router-cache.redis.yaml`, which still points at `redis://localhost:6399/2`. Any non-default port will start Redis successfully and then leave the router talking to the wrong address.
Low-friction guard until the config is parameterized
```diff
 bash "${SCRIPT_DIR}/start_redis.sh"
+
+if [ "${BENCHMARK_REDIS_PORT:-6399}" != "6399" ]; then
+  echo "BENCHMARK_REDIS_PORT is not wired through benchmark/router-cache.redis.yaml yet" >&2
+  exit 1
+fi

 (
   cd "${ROOT_DIR}/demo"
   env -u GOROOT go build -o "${RUN_DIR}/cache-demo.bin" ./cmd/cache-demo
 )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/scripts/start_stack.sh` around lines 57 - 86: the router is still pointed at redis://localhost:6399/2 via the static benchmark/router-cache.redis.yaml while start_redis.sh can start Redis on a different port (BENCHMARK_REDIS_PORT); update start_stack.sh so the spawn_detached call that launches "${RUN_DIR}/router.bin" uses the actual BENCHMARK_REDIS_PORT: either export a REDIS_URL (or similar) env var constructed from BENCHMARK_REDIS_PORT before the spawn_detached call, or generate a temporary router config from router-cache.redis.yaml with the port substituted and pass that config to "${RUN_DIR}/router.bin"; modify the code around spawn_detached in start_stack.sh (and any place that references router-cache.redis.yaml or RUN_DIR/router.bin) so the router always receives the correct Redis address.

composition/AGENTS.md (1)
11-25: ⚠️ Potential issue | 🟡 Minor
Add a language to this fenced block.
This fence currently trips MD040. Using `text` here is enough and keeps the docs lint-clean.
Suggested fix
````diff
-```
+```text
 `@key`(fields: "...") on entity type
   ↓
 extractKeyFieldSets() → KeyFieldSetData { documentNode, normalizedFieldSet, isUnresolvable }
   ↓
 keyFieldSetDatasByTypeName: Map<TypeName, Map<normalizedFieldSet, KeyFieldSetData>>
   ↓
 validateKeyFieldSets() → keyFieldPathsByTypeNameByFieldSet (per-key dot-notation paths)
   ↓
 extractPerKeyFieldPaths() → Array<{normalizedFieldSet, isUnresolvable, fieldPaths: Set<string>}>
   ↓
 buildArgumentKeyMappings() → EntityKeyMappingConfig[] (one per fully-satisfiable key)
   ↓
 RootFieldCacheConfig.entityKeyMappings → serialized to protobuf → router
 ```
````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/AGENTS.md` around lines 11 - 25: the fenced code block in AGENTS.md is missing a language tag which triggers MD040; update the opening fence from ``` to ```text (and leave the closing fence as ```), so the block becomes a fenced "text" code block; no changes needed to the block contents; only change the opening fence to include "text".

composition/COMPOSITION_CONVENTIONS.md (1)
16-51: ⚠️ Potential issue | 🟡 Minor
Label this fenced block to satisfy Markdown lint.
The directory tree fence is missing a language tag, so MD040 will keep firing here.
`text` is the simplest fix.
Suggested fix
````diff
-```
+```text
 composition/src/
 ├── index.ts                        # public exports — minimal, additive
 ├── buildASTSchema/                 # schema construction helpers
 ├── errors/                         # error factories + typed param interfaces
 ├── federation/                     # shared federation types (public result shapes)
 ├── normalization/                  # shared normalization types
 ├── resolvability-graph/            # reachability validation (graph.ts, walker/)
 ├── router-compatibility-version/   # ROUTER_COMPATIBILITY_VERSION_ONE
 ├── router-configuration/types.ts   # ConfigurationData + cache/field mapping types
 ├── schema-building/                # build router + client schemas from merged model
 ├── subgraph/types.ts               # Subgraph, SubgraphConfig, InternalSubgraph
 ├── types/                          # shared branded/alias types (SubgraphName, ...)
 ├── utils/                          # string-constants.ts, params.ts, shared helpers
 ├── v1/                             # router-compatibility-version=1 implementation
 │   ├── constants/                  # split into dedicated modules (see §3)
 │   │   ├── constants.ts
 │   │   ├── directive-definitions.ts
 │   │   ├── integers.ts
 │   │   ├── non-directive-definitions.ts
 │   │   ├── strings.ts
 │   │   └── type-nodes.ts
 │   ├── federation/                 # federation-factory.ts + params
 │   ├── normalization/              # normalization-factory.ts + walkers + utils
 │   ├── schema-building/
 │   ├── subgraph/
 │   ├── utils/
 │   └── warnings/
 └── warnings/                       # shared warning types
 composition/tests/v1/               # mirrors v1/ layout
 ├── directives/                     # one file per directive (authorization, entity-caching, ...)
 ├── types/                          # per-GraphQL-type tests (interfaces, unions, ...)
 ├── utils/                          # shared test helpers + SDL fragments
 └── test-data/
 ```
````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/COMPOSITION_CONVENTIONS.md` around lines 16 - 51: the Markdown fenced code block containing the directory tree in COMPOSITION_CONVENTIONS.md is missing a language tag (triggers MD040); update the opening fence from ``` to ```text so the block is labeled as text and ensure the closing fence remains ```; locate the block by searching for the directory-tree snippet starting with "composition/src/" and change only the opening fence to include the language tag.

benchmark/scripts/run_suite.ts (1)
65-87: ⚠️ Potential issue | 🟡 Minor
Reject missing or malformed flag values during parsing.
`--scenario --all`, `--duration --all`, or `--vus foo` are all accepted here and only fail much later, sometimes after the benchmark stack has already started. Validate that each option has a non-flag value, and reject non-numeric `--vus`, before returning from `parseArgs()`.
🤖 Prompt for AI Agents
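A minimal sketch of the fail-fast validation this comment asks for; `requireValue` and `requirePositiveInt` are hypothetical helper names, not the actual `parseArgs()` internals of run_suite.ts:

```typescript
// Hypothetical helpers (not the real run_suite.ts API): a flag's value
// must exist, must not itself look like a flag, and numeric options such
// as --vus must parse to a positive integer before parseArgs() returns.
function requireValue(flag: string, value: string | undefined): string {
  if (value === undefined || value.startsWith("--")) {
    throw new Error(`missing value for ${flag}`);
  }
  return value;
}

function requirePositiveInt(flag: string, value: string | undefined): number {
  const raw = requireValue(flag, value);
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${flag} expects a positive integer, got "${raw}"`);
  }
  return n;
}
```

`parseArgs()` would call these while scanning `args[i + 1]`, so `--scenario --all` and `--vus foo` are rejected before the benchmark stack starts.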
Verify each finding against the current code and only fix it if needed. In `@benchmark/scripts/run_suite.ts` around lines 65 - 87: the argument parsing in parseArgs currently accepts missing or flag-like values (e.g., "--scenario --all" or non-numeric "--vus foo"); update parseArgs to validate each option's next token before assigning to scenario, vus, duration, rampUp, and rampDown: ensure args[i+1] exists and does not startWith("--"), and for vus (and any numeric options) convert to Number and verify it's a finite number (and positive integer if required); if validation fails, throw or return a descriptive error immediately instead of continuing so invalid flags are rejected early.

composition/CLAUDE.md (1)
165-168: ⚠️ Potential issue | 🟡 Minor
Reference the current helper name here.
`buildArgumentKeyMappingsV2` does not exist in `composition/src/v1/normalization/normalization-factory.ts`; the implementation is `buildArgumentKeyMappings()` plus `getIsFieldValue()`. This note will send readers to the wrong symbol.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/CLAUDE.md` around lines 165 - 168: the documentation incorrectly references buildArgumentKeyMappingsV2; update the CLAUDE.md note to reference the actual helper names used in code: buildArgumentKeyMappings() and getIsFieldValue() (instead of buildArgumentKeyMappingsV2), and call out that the argument name is FIELDS (plural) so ensure IS_DEFINITION and IS_DEFINITION_DATA use the FIELDS constant and that buildArgumentKeyMappings() reads arg.name.value === FIELDS to extract `@openfed__is` values.

benchmark/scripts/lib.ts (1)
189-206: ⚠️ Potential issue | 🟡 Minor
Reject half-configured multi-profile scenarios up front.
If a manifest provides `responseFixtures` without `authProfiles` (or the reverse), this falls through to the single-fixture branch. With `responseFixture` also present, the suite will silently benchmark the wrong matrix instead of failing fast.
🛡️ Proposed fix
```diff
 export function resolveRequestMatrix(scenario: Scenario): RequestVariant[] {
+  const hasMultiProfileFixtures = Boolean(scenario.responseFixtures);
+  const hasMultiProfileAuthProfiles = Boolean(scenario.authProfiles?.length);
+
+  if (hasMultiProfileFixtures !== hasMultiProfileAuthProfiles) {
+    throw new Error(
+      `scenario ${scenario.name} must define both responseFixtures and authProfiles together`,
+    );
+  }
+
   if (scenario.responseFixtures && scenario.authProfiles?.length) {
     return scenario.authProfiles.map((authProfile) => {
       const fixturePath = scenario.responseFixtures?.[authProfile];
       if (!fixturePath) {
         throw new Error(
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/scripts/lib.ts` around lines 189 - 206, In resolveRequestMatrix, detect and reject partially-specified multi-profile scenarios: if exactly one of scenario.responseFixtures or scenario.authProfiles is present (i.e., one exists but the other is missing/empty), throw an error indicating the manifest is half-configured for that scenario; otherwise keep existing behavior (use authProfiles+responseFixtures to build the matrix, or fall back to single responseFixture/scenario.authProfile). Reference resolveRequestMatrix and the properties scenario.responseFixtures, scenario.authProfiles, scenario.responseFixture, and scenario.authProfile when implementing the check.
🧹 Nitpick comments (5)
TODO.md (1)
167-167: Minor word-order nitpick.
Consider rewording "user asked to not defer" → "user asked not to defer" for smoother reading.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TODO.md` at line 167: replace the phrase "user asked to not defer" with "user asked not to defer" in the TODO.md text to fix the word-order nitpick; search for the exact substring "user asked to not defer" (the resolved sentence mentioning "another PR" and metrics consolidation) and update it to "user asked not to defer" for smoother reading.

benchmark/k6/cache_demo.js (1)
5-9: Fail cleanly on malformed `BENCHMARK_PAYLOAD`.
A bad env payload currently throws during module init, so you lose the explicit `fail(...)` message. Wrapping the parse makes this much easier to diagnose.
Suggested fix
```diff
-const payload = JSON.parse(__ENV.BENCHMARK_PAYLOAD || "{}");
+const payload = (() => {
+  try {
+    return JSON.parse(__ENV.BENCHMARK_PAYLOAD || "{}");
+  } catch {
+    fail("BENCHMARK_PAYLOAD must be valid JSON");
+  }
+})();
```
As per coding guidelines, "Add proper error handling with try-catch blocks".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/k6/cache_demo.js` around lines 5 - 9: the module currently parses BENCHMARK_PAYLOAD at init, which can throw; wrap the JSON.parse of __ENV.BENCHMARK_PAYLOAD in a try-catch so malformed JSON triggers a clear fail(...) instead of an uncaught exception: attempt to parse into the payload variable (keeping the "{}" fallback), and in the catch call fail with a message that includes the parse error and the raw __ENV.BENCHMARK_PAYLOAD value so it's easy to diagnose; update references to payload as before.

benchmark/scripts/lib.ts (2)
13-103: Prefer `interface` for these exported object shapes.
`Scenario`, `Manifest`, `RequestVariant`, `K6Stage`, `ComparableModeSummary`, `ModeComparison`, `ScenarioVariantSummary`, and `SuiteSummary` are all object-shape exports. In this repo those should be interfaces, not type aliases.
As per coding guidelines, "Prefer interfaces over type aliases for object shapes in TypeScript".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/scripts/lib.ts` around lines 13 - 103, Replace the exported object-shape type aliases with exported interfaces: convert Scenario, Manifest, RequestVariant, K6Stage, ComparableModeSummary, ModeComparison, ScenarioVariantSummary, and SuiteSummary from "export type X = { ... }" to "export interface X { ... }", preserving all property names, optional modifiers, nested object structures (e.g., cache, router, redis), and exported status; ensure any places that import or reference these remain valid (no runtime changes needed) and adjust only the declaration syntax for those specific symbols.
115-125: Make the mode tables exhaustive over `Mode`.
Both `MODE_ORDER` and `COMPARISON_PAIRS` are widened to `string`, so adding or renaming a mode will not fail compilation and will silently fall back to the default ordering/comparison behavior.
♻️ Proposed fix
```diff
-const MODE_ORDER: Record<string, number> = {
+const MODE_ORDER: Record<Mode, number> = {
   cache_disabled: 0,
   cache_enabled: 1,
   request_scoped_l1_disabled: 2,
   request_scoped_default: 3,
 };

-const COMPARISON_PAIRS: Array<readonly [baselineMode: string, candidateMode: string]> = [
+const COMPARISON_PAIRS: Array<readonly [baselineMode: Mode, candidateMode: Mode]> = [
   ["cache_disabled", "cache_enabled"],
   ["request_scoped_l1_disabled", "request_scoped_default"],
 ];
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/scripts/lib.ts` around lines 115 - 125: MODE_ORDER and COMPARISON_PAIRS are typed too broadly as string, which lets missing/renamed modes slip through; change their types to be explicit over the Mode union so the compiler enforces exhaustiveness. Replace Record<string, number> with Record<Mode, number> for MODE_ORDER and change COMPARISON_PAIRS to Array<readonly [baselineMode: Mode, candidateMode: Mode]> (or readonly (readonly [Mode, Mode])[]), and ensure the object/array literal actually contains an entry for every member of the Mode union so the compiler will error if a Mode is added or renamed; update the declarations of MODE_ORDER and COMPARISON_PAIRS (and any const assertions) accordingly.

composition/src/errors/errors.ts (1)
1862-1871: Do not hardcode composite-key diagnostics to two arguments.
Composite keys and decomposed input objects can contribute more than two mapped argument paths. This factory only reports `firstArgument` and `secondArgument`, so 3+ field keys will emit incomplete diagnostics right when someone is debugging mapping validity.
♻️ Proposed fix
```diff
 export function explicitCompositeAdditionalNonKeyArgumentErrorMessage(
   fieldCoords: string,
-  firstArgument: string,
-  secondArgument: string,
+  mappedArguments: string[],
   compositeKey: string,
   entityType: string,
   extraArgument: string,
 ): string {
-  return `Field "${fieldCoords}" has arguments "${firstArgument}" and "${secondArgument}" with \`@openfed__is\` mappings covering composite \`@key\` "${compositeKey}" on entity "${entityType}", but also has additional argument "${extraArgument}" which is not mapped to a key field. All arguments must be key arguments — additional arguments may filter the response, making the cache key incomplete.`;
+  return `Field "${fieldCoords}" has arguments "${mappedArguments.join('", "')}" with \`@openfed__is\` mappings covering composite \`@key\` "${compositeKey}" on entity "${entityType}", but also has additional argument "${extraArgument}" which is not mapped to a key field. All arguments must be key arguments — additional arguments may filter the response, making the cache key incomplete.`;
 }
```
Based on learnings, input object decomposition in argument mapping must decompose into per-field mappings, with nested input objects mapping recursively.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composition/src/errors/errors.ts` around lines 1862 - 1871, The message builder explicitCompositeAdditionalNonKeyArgumentErrorMessage currently assumes exactly two mapped arguments (firstArgument, secondArgument) which hides additional mapped paths; change its signature to accept a list of mapped argument paths (e.g., mappedArguments: string[]) instead of firstArgument/secondArgument and build the diagnostic by joining all mappedArguments (comma-separated, with proper "and" before last item) so the message includes every mapped key path along with the extraArgument and compositeKey/entityType; update all call sites that pass firstArgument/secondArgument to instead pass the array of decomposed per-field mapping paths (recursing into nested input objects where necessary) so diagnostics correctly report 3+ mapped arguments.
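The AI prompt above asks for the mapped arguments to be joined "comma-separated, with proper 'and' before last item". A small sketch of that joining logic (the helper name `formatArgumentList` is hypothetical, not part of errors.ts):

```typescript
// Hypothetical helper: renders ["a"] as "a", ["a","b"] as "a" and "b",
// and ["a","b","c"] as "a", "b" and "c" (each item quoted) for use in
// the composite-key diagnostic message.
function formatArgumentList(args: string[]): string {
  const quoted = args.map((a) => `"${a}"`);
  if (quoted.length <= 1) {
    return quoted.join("");
  }
  return `${quoted.slice(0, -1).join(", ")} and ${quoted[quoted.length - 1]}`;
}
```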
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benchmark/k6/cache_demo.js`:
- Around line 70-72: The current buildK6Payload implementation replaces the
default JSON header when payload.headers is provided, causing requests to lose
"content-type: application/json"; change the header assignment in buildK6Payload
to merge the defaults with any custom headers (e.g., combine { "content-type":
"application/json" } with payload.headers using a shallow merge so custom keys
override defaults but default content-type remains unless explicitly
overridden).
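The merge described above can be sketched as follows; `buildHeaders` is a hypothetical name, not the actual cache_demo.js function:

```typescript
// Hypothetical sketch of the shallow merge: defaults first, then custom
// keys, so a caller can still override content-type explicitly while the
// JSON default survives when no override is given.
function buildHeaders(custom?: Record<string, string>): Record<string, string> {
  return { "content-type": "application/json", ...custom };
}
```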
In `@benchmark/scripts/lib.ts`:
- Around line 651-678: In fetchJsonWithRetry, wrap each attempt with a
per-attempt timeout by creating an AbortController and passing its signal into
the fetch call without discarding any existing input.init.signal (merge: if
input.init.signal exists, race it with the timeout controller and forward a
composite signal or listen to input.init.signal to abort the timeout
controller); ensure the timeout controller is aborted/cleared after each attempt
and that the signal is included in the RequestInit passed to fetchImpl (do not
overwrite input.init directly), so the timeout-triggered AbortError flows into
isRetryableFetchError and the retry logic continues correctly.
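One way to sketch the per-attempt timeout with signal forwarding described above; the function name and the 10-second default are assumptions, not the real fetchJsonWithRetry code:

```typescript
// Hypothetical sketch: each attempt gets its own AbortController whose
// signal is what fetch sees; a timer aborts it on timeout, and any
// caller-supplied signal is forwarded so external cancellation still works.
async function fetchWithAttemptTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 10_000,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const outer = init.signal;
  const onAbort = () => controller.abort();
  if (outer?.aborted) {
    controller.abort(); // caller already cancelled before this attempt
  } else {
    outer?.addEventListener("abort", onAbort, { once: true });
  }
  try {
    // Pass the composite signal without overwriting the caller's init object.
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer);
    outer?.removeEventListener("abort", onAbort);
  }
}
```

A timeout then surfaces as an AbortError, which the surrounding retry logic can route through `isRetryableFetchError` as the prompt requires.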
In `@benchmark/scripts/scrape_metrics.ts`:
- Around line 54-57: The fetch for metrics currently has no timeout and can
hang; wrap the fetch in an AbortController: create a controller, pass
controller.signal to fetch("http://127.0.0.1:8088/metrics"), start a timer
(e.g., 5000ms) that calls controller.abort() on expiry, clear the timer after
fetch resolves, and handle AbortError to throw a clear "metrics fetch timeout"
error; update the code around the response/response.ok checks to use this
abort-enabled fetch flow (reference AbortController, controller.signal,
response).
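A compact sketch of the abort-enabled flow for this single call; the 5000 ms figure comes from the comment above, while `fetchMetricsWithTimeout` is a hypothetical name rather than the real scrape_metrics.ts code:

```typescript
// Hypothetical sketch: abort the metrics fetch after a deadline and
// convert the resulting AbortError into a clear "metrics fetch timeout".
async function fetchMetricsWithTimeout(url: string, timeoutMs = 5000): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url, { signal: controller.signal });
    if (!response.ok) {
      throw new Error(`metrics fetch failed: ${response.status}`);
    }
    return await response.text();
  } catch (err) {
    if ((err as { name?: string })?.name === "AbortError") {
      throw new Error("metrics fetch timeout");
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```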
In `@benchmark/scripts/stop_stack.sh`:
- Around line 13-16: The current shutdown in stop_stack.sh uses wait "${pid}"
which is ineffective for detached processes started with start_new_session=True
and can race; replace the wait call for "${pid}" with a polling loop that
repeatedly checks kill -0 "${pid}" (or /proc/${pid} existence) until the process
no longer exists or a timeout elapses, sleeping briefly between checks and then
proceed to remove the pid file; ensure the loop uses the same ${pid} variable
and enforces a reasonable timeout and graceful fallback (e.g., log/warn and
continue) so stop_stack.sh reliably waits for the detached process to exit.
In `@composition/src/v1/constants/directive-definitions.ts`:
- Around line 766-789: The new REQUEST_SCOPED_DEFINITION and its name symbol
REQUEST_SCOPED are not registered in the directive registry used during
normalization; import REQUEST_SCOPED and REQUEST_SCOPED_DEFINITION into the
directive registry module (where DIRECTIVE_DEFINITION_BY_NAME is defined) and
add an entry mapping REQUEST_SCOPED to REQUEST_SCOPED_DEFINITION in
DIRECTIVE_DEFINITION_BY_NAME so the `@openfed__requestScoped` directive is
discoverable and emitted during normalization.
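A minimal sketch of the registry wiring this comment asks for. The names REQUEST_SCOPED, REQUEST_SCOPED_DEFINITION, and DIRECTIVE_DEFINITION_BY_NAME come from the comment itself; the node shape and the Map-based registry are assumptions, not the real composition types.

```typescript
// Hypothetical stand-in for the real directive-definition node type.
interface DirectiveDefinitionNode {
  name: string;
}

const REQUEST_SCOPED = "openfed__requestScoped";

const REQUEST_SCOPED_DEFINITION: DirectiveDefinitionNode = {
  name: REQUEST_SCOPED,
};

// Normalization only discovers and emits directives with an entry in the
// registry, which is why defining REQUEST_SCOPED_DEFINITION alone is not
// enough — it must also be registered by name.
const DIRECTIVE_DEFINITION_BY_NAME = new Map<string, DirectiveDefinitionNode>([
  [REQUEST_SCOPED, REQUEST_SCOPED_DEFINITION],
]);
```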
In `@composition/src/v1/normalization/normalization-factory.ts`:
- Around line 4333-4339: The current early return from buildExplicitMappings
when isCompositeIsSpec is true causes a composite `@openfed__is` mapping to
suppress other satisfiable keys; instead of returning the result of
buildCompositeIsMapping, call buildCompositeIsMapping with (fieldCoords,
entityTypeName, keyFieldSets, isListReturn, argInfo) and append its produced
EntityKeyMappingConfig(s) to the mappings collection used by
buildExplicitMappings (do not merge into a single composite that enforces AND),
then continue processing remaining explicit/auto-mappable `@key` directives so
each independent `@key` yields its own EntityKeyMappingConfig (OR semantics).
Ensure you reference and update the same mappings accumulator used later in
buildExplicitMappings and remove the early return path.
- Around line 4846-4863: The current mismatch branch in normalization-factory.ts
aborts all auto-mapping by returning [] when a single argument vs key field
mismatch is found; instead, remove the early return so a mismatched field only
marks the current key as not fully mapped (set keyFullyMapped = false), emit the
warning via this.warnings.push(autoMappingTypeMismatchWarning(...)) if not
already emitted, and continue evaluating the rest of the key fields so other
candidate `@key` combinations can still produce separate EntityKeyMappingConfig
entries; keep use of namedTypesMatch(argTypeNode, keyTypeNode), the
argIsList/keyIsList/isListReturn checks, and the typeMismatchWarningEmitted flag
as-is but do not clear or abort outer mappings—only skip this key by ensuring
the loop continues and that fully-satisfiable keys are still collected.
- Around line 4185-4231: The current leaf-type acceptance uses
this.namedTypesMatch(inputFieldData.type, keyTypeNode) which ignores list
wrappers and lets mismatched list shapes (e.g. [ID!]! vs ID!) through; update
the acceptance to require both the named type and list shape match by also
calling this.listStructureMatches(...) (i.e. change the if to fail when either
namedTypesMatch(...) is false OR listStructureMatches(inputFieldData.type,
keyTypeNode) is false), so the error branch runs for nested list mismatches and
prevents emitting the invalid FieldMappingConfig used when building
fieldMappings/entityKeyField/argumentPath.
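The combined check can be sketched as follows. The simplified TypeNode model and helper names here are illustrative stand-ins (the real code walks graphql-js AST nodes and uses namedTypesMatch/listStructureMatches); the point is that both the named type and the list nesting must agree, so `[ID!]!` vs `ID!` is rejected.

```typescript
// Simplified type-node model; NonNull wrappers are ignored when counting
// list depth, so only list nesting (not nullability) is compared here.
type TypeNode =
  | { kind: "Named"; name: string }
  | { kind: "List" | "NonNull"; ofType: TypeNode };

function namedType(node: TypeNode): string {
  return node.kind === "Named" ? node.name : namedType(node.ofType);
}

function listDepth(node: TypeNode): number {
  if (node.kind === "Named") return 0;
  return (node.kind === "List" ? 1 : 0) + listDepth(node.ofType);
}

// Accept a leaf mapping only when BOTH the named type and the list shape
// agree — failing either check should route into the error branch.
function leafTypesCompatible(a: TypeNode, b: TypeNode): boolean {
  return namedType(a) === namedType(b) && listDepth(a) === listDepth(b);
}
```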
In `@demo/cmd/cache-demo/main.go`:
- Around line 40-47: gqlServer currently mounts the gqlgen server directly at
"/graphql" (srv created by handler.New) which bypasses injector.HTTP; change the
registration to wrap that handler with injector.HTTP so requests go through the
same middleware chain (i.e., replace mux.Handle("/graphql", srv) with
mux.Handle("/graphql", injector.HTTP(srv)) while keeping the existing
injector.Latency(mux) return). This ensures gqlServer, the handler.New-created
srv, and the "/graphql" route use injector.HTTP for consistent
request-context/auth behavior.
In `@demo/pkg/subgraphs/cachegraph/subgraph/schema.resolvers.go`:
- Around line 17-46: The package-level slices/maps (articlesData, listingsData)
are mutated without synchronization causing races and possible "concurrent map
read and map write" panics; protect access with a sync.RWMutex (e.g., declare a
package-level var articlesMu sync.RWMutex and listingsMu sync.RWMutex) and use
articlesMu.Lock()/Unlock() for CreateArticle and UpdateArticle mutations and
articlesMu.RLock()/RUnlock() for query readers, and similarly use
listingsMu.Lock()/Unlock() around DeleteListing (and any listing reads) when
accessing listingsData; alternatively implement copy-on-write snapshots for
writes and atomically replace the store if you prefer immutability.
In `@Makefile`:
- Around line 195-196: The demo target currently backgrounds the subgraph
process started by "go run cmd/all/main.go" and never cleans it up; modify the
Makefile commands to capture the background PID (from the "go run
cmd/all/main.go &" invocation), install a shell trap for INT/EXIT that kills
that PID (and waits for any children) before exiting, and ensure the router
process ("go run cmd/router/main.go --config ../demo/router-cache.yaml") runs in
the foreground so the trap can fire; use the captured PID (or a temp file) to
reliably kill the background subgraph on normal exit, interrupt, or error.
---
Minor comments:
In `@benchmark/scripts/lib.ts`:
- Around line 189-206: In resolveRequestMatrix, detect and reject
partially-specified multi-profile scenarios: if exactly one of
scenario.responseFixtures or scenario.authProfiles is present (i.e., one exists
but the other is missing/empty), throw an error indicating the manifest is
half-configured for that scenario; otherwise keep existing behavior (use
authProfiles+responseFixtures to build the matrix, or fall back to single
responseFixture/scenario.authProfile). Reference resolveRequestMatrix and the
properties scenario.responseFixtures, scenario.authProfiles,
scenario.responseFixture, and scenario.authProfile when implementing the check.
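The half-configured check described above amounts to an exclusive-or guard. A minimal sketch, assuming a trimmed-down Scenario shape (the real type in lib.ts has many more fields):

```typescript
// Illustrative subset of the Scenario shape from the manifest.
interface ScenarioConfig {
  name: string;
  responseFixtures?: string[];
  authProfiles?: string[];
}

// Exactly one of the two plural fields being present means the manifest is
// half-configured for multi-profile mode; both present or both absent is valid.
function assertMatrixConfigComplete(scenario: ScenarioConfig): void {
  const hasFixtures = (scenario.responseFixtures?.length ?? 0) > 0;
  const hasProfiles = (scenario.authProfiles?.length ?? 0) > 0;
  if (hasFixtures !== hasProfiles) {
    throw new Error(
      `scenario "${scenario.name}" is half-configured: ` +
        `responseFixtures and authProfiles must be provided together`,
    );
  }
}
```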
In `@benchmark/scripts/run_suite.ts`:
- Around line 65-87: The argument parsing in parseArgs currently accepts missing
or flag-like values (e.g., "--scenario --all" or non-numeric "--vus foo");
update parseArgs to validate each option's next token before assigning to
scenario, vus, duration, rampUp, and rampDown: ensure args[i+1] exists and does
not startWith("--"), and for vus (and any numeric options) convert to Number and
verify it's a finite number (and positive integer if required); if validation
fails, throw or return a descriptive error immediately instead of continuing so
invalid flags are rejected early.
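The validation can be factored into two small helpers that a parseArgs loop would call before assigning each option. The helper names are hypothetical; only the validation rules (value must exist, must not look like a flag, numeric options must be finite positive integers) come from the comment above.

```typescript
// Validate the token following a flag before consuming it.
function takeValue(args: string[], i: number, flag: string): string {
  const value = args[i + 1];
  if (value === undefined || value.startsWith("--")) {
    throw new Error(`missing value for ${flag}`);
  }
  return value;
}

// Numeric options (e.g. --vus) additionally require a finite positive integer.
function takePositiveInt(args: string[], i: number, flag: string): number {
  const n = Number(takeValue(args, i, flag));
  if (!Number.isFinite(n) || !Number.isInteger(n) || n <= 0) {
    throw new Error(`${flag} expects a positive integer`);
  }
  return n;
}
```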
In `@benchmark/scripts/start_stack.sh`:
- Around line 57-86: The router is still pointed at redis://localhost:6399/2 via
the static benchmark/router-cache.redis.yaml while start_redis.sh can start
Redis on a different port (BENCHMARK_REDIS_PORT); update start_stack.sh so the
spawn_detached call that launches "${RUN_DIR}/router.bin" uses the actual
BENCHMARK_REDIS_PORT — either by exporting a REDIS_URL (or similar) env var
constructed from BENCHMARK_REDIS_PORT before the spawn_detached call, or by
generating a temporary router config from router-cache.redis.yaml with the port
substituted and passing that config to "${RUN_DIR}/router.bin"; modify the code
around spawn_detached in start_stack.sh (and any place that references
router-cache.redis.yaml or RUN_DIR/router.bin) so the router always receives the
correct Redis address.
In `@composition/AGENTS.md`:
- Around line 11-25: The fenced code block in AGENTS.md is missing a language
tag which triggers MD040; update the opening fence from ``` to ```text (and
leave the closing fence as ```), so the block becomes a fenced "text" code
block; no changes needed to the block contents—only change the opening fence to
include "text".
In `@composition/CLAUDE.md`:
- Around line 165-168: The documentation incorrectly references
buildArgumentKeyMappingsV2; update the CLAUDE.md note to reference the actual
helper names used in code: buildArgumentKeyMappings() and getIsFieldValue()
(instead of buildArgumentKeyMappingsV2), and call out that the argument name is
FIELDS (plural) so ensure IS_DEFINITION and IS_DEFINITION_DATA use the FIELDS
constant and that buildArgumentKeyMappings() reads arg.name.value === FIELDS to
extract `@openfed__is` values.
In `@composition/COMPOSITION_CONVENTIONS.md`:
- Around line 16-51: The Markdown fenced code block containing the directory
tree in COMPOSITION_CONVENTIONS.md is missing a language tag (triggers MD040);
update the opening fence from ``` to ```text so the block is labeled as text and
ensure the closing fence remains ```; locate the block by searching for the
directory-tree snippet starting with "composition/src/" and change only the
opening fence to include the language tag.
In `@demo/pkg/injector/latency.go`:
- Around line 22-25: Replace the unconditional time.Sleep in the latency
injection middleware with a context-aware wait so the handler honors client
cancellation: after parsing ms (from strconv.Atoi) use the request context
(r.Context()) and wait with a select between
time.After(time.Duration(ms)*time.Millisecond) and ctx.Done(), returning
immediately (or skipping the sleep) if ctx.Done() fires; keep the existing
log.Printf call but consider logging that the delay was aborted when ctx.Done()
occurs.
In `@demo/pkg/subgraphs/cachegraph_ext/subgraph/data.go`:
- Around line 46-53: The toArticle function currently dereferences ext (type
*articleExtData) without checking for nil, which can panic; add a nil guard at
the start of toArticle: if ext is nil return &model.Article{ID: id} (or populate
only safe defaults) otherwise proceed to use ext.ViewCount, ext.Rating,
ext.ReviewSummary; keep the function name toArticle and the types articleExtData
and model.Article unchanged so callers remain compatible.
In `@demo/pkg/subgraphs/subgraphs.go`:
- Around line 227-228: ViewerHandler and the handler returned by subgraphs.New
currently return viewer.NewHandler() directly and therefore bypass the latency
injector; wrap the viewer handler with the shared latency middleware so
X-Artificial-Latency is respected. Concretely, replace returns of
viewer.NewHandler() in ViewerHandler and in the viewer route inside
subgraphs.New with injector.Latency(viewer.NewHandler()) (or the appropriate
injector.Latency(...) call used elsewhere) so the injected latency middleware is
applied to requests coming through those code paths.
In `@TODO.md`:
- Around line 21-24: Replace the hardcoded absolute paths in the verification
commands (the four shell lines that start with cd /Users/jens/repos/... and
PATH=/Users/jens/go/pkg/mod/...) with portable alternatives: use
repository-relative paths (e.g., cd to the local subfolders like
graphql-go-tools/v2, router, router-tests, playground) and drop the
machine-specific PATH override or replace it with a generic toolchain lookup
(rely on the developer's PATH or use environment variables such as GOMODCACHE or
GOROOT/bin if needed); ensure the commands remain the same otherwise (go test
./pkg/engine/resolve -run
'TestDetectMutationEntityImpact|TestMutationCacheTTLOverride' -count=1, go test
./core ./pkg/entitycache ./pkg/metric -count=1, go test ./entity_caching
-count=1, pnpm test && pnpm build) so they run from repo root on any machine.
---
Duplicate comments:
In `@composition/CLAUDE.md`:
- Around line 103-109: The fenced code blocks showing GraphQL directives (e.g.
`@openfed__entityCache`, `@openfed__queryCache`, `@openfed__cachePopulate`,
`@openfed__cacheInvalidate`, `@openfed__is`) should be labeled with a language tag:
change the directive block fences to use ```graphql and any command/snippet
blocks mentioned elsewhere (like shell examples) to use ```bash; update the
other unlabeled fences referenced around lines 208–225 in the same file to use
the appropriate ```graphql or ```bash tags so markdownlint/MD040 no longer flags
them.
In `@composition/src/v1/normalization/normalization-factory.ts`:
- Around line 3797-3829: processRootFieldCacheDirectives() is using the raw map
key (parentTypeName) when calling extractQueryCacheConfig /
extractCacheInvalidateConfig / extractCachePopulateConfig, which causes renamed
schema roots to be stored under the wrong type; compute the canonical root type
name from the operationType node (e.g. the NamedType node on operationType, such
as operationType.type.name.value) and pass that canonicalRootTypeName to
extractQueryCacheConfig/extractCacheInvalidateConfig/extractCachePopulateConfig
instead of parentTypeName so root-field cache config is stored under the
standard Query/Mutation/Subscription types.
---
Nitpick comments:
In `@benchmark/k6/cache_demo.js`:
- Around line 5-9: The module currently parses BENCHMARK_PAYLOAD at init which
can throw; wrap the JSON.parse of __ENV.BENCHMARK_PAYLOAD in a try-catch so
malformed JSON triggers a clear fail(...) instead of an uncaught exception:
attempt to parse into the payload variable (keeping the "{}" fallback), and in
the catch call fail with a message that includes the parse error and the raw
__ENV.BENCHMARK_PAYLOAD value so it's easy to diagnose; update references to
payload as before.
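A sketch of the guarded parse, keeping the `"{}"` fallback. In the actual k6 script the catch would call k6's fail(...); a thrown Error stands in for it here, and the function name is hypothetical.

```typescript
// Parse the payload env var defensively so malformed JSON produces a clear,
// diagnosable failure instead of an uncaught init-time exception.
function parseBenchmarkPayload(raw: string | undefined): Record<string, unknown> {
  const source = raw ?? "{}"; // keep the existing "{}" fallback
  try {
    return JSON.parse(source);
  } catch (error) {
    // In k6 this would be fail(...) to abort the run with context.
    throw new Error(
      `BENCHMARK_PAYLOAD is not valid JSON (${String(error)}); raw value: ${source}`,
    );
  }
}
```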
In `@benchmark/scripts/lib.ts`:
- Around line 13-103: Replace the exported object-shape type aliases with
exported interfaces: convert Scenario, Manifest, RequestVariant, K6Stage,
ComparableModeSummary, ModeComparison, ScenarioVariantSummary, and SuiteSummary
from "export type X = { ... }" to "export interface X { ... }", preserving all
property names, optional modifiers, nested object structures (e.g., cache,
router, redis), and exported status; ensure any places that import or reference
these remain valid (no runtime changes needed) and adjust only the declaration
syntax for those specific symbols.
- Around line 115-125: MODE_ORDER and COMPARISON_PAIRS are typed too broadly as
string which lets missing/renamed modes slip through; change their types to be
explicit over the Mode union so the compiler enforces exhaustiveness. Replace
Record<string, number> with Record<Mode, number> for MODE_ORDER and change
COMPARISON_PAIRS to Array<readonly [baselineMode: Mode, candidateMode: Mode]>
(or readonly (readonly [Mode, Mode])[]), and ensure the object/array literal
actually contains an entry for every member of the Mode union so the compiler
will error if a Mode is added or renamed; update the declarations of MODE_ORDER
and COMPARISON_PAIRS (and any const assertions) accordingly.
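The exhaustiveness the comment wants comes directly from the annotation. The Mode members below are placeholders (the real union in lib.ts has its own names); the point is that with `Record<Mode, number>`, omitting or misspelling any member is a compile error.

```typescript
// Hypothetical mode union; adding a member here without updating MODE_ORDER
// would fail to compile under Record<Mode, number>.
type Mode = "no-cache" | "l1" | "l1-l2";

const MODE_ORDER: Record<Mode, number> = {
  "no-cache": 0,
  l1: 1,
  "l1-l2": 2,
};

// Pairs are constrained to the union too, so a renamed mode cannot linger
// in the comparison table unnoticed.
const COMPARISON_PAIRS: ReadonlyArray<readonly [Mode, Mode]> = [
  ["no-cache", "l1"],
  ["no-cache", "l1-l2"],
];
```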
In `@composition/src/errors/errors.ts`:
- Around line 1862-1871: The message builder
explicitCompositeAdditionalNonKeyArgumentErrorMessage currently assumes exactly
two mapped arguments (firstArgument, secondArgument) which hides additional
mapped paths; change its signature to accept a list of mapped argument paths
(e.g., mappedArguments: string[]) instead of firstArgument/secondArgument and
build the diagnostic by joining all mappedArguments (comma-separated, with
proper "and" before last item) so the message includes every mapped key path
along with the extraArgument and compositeKey/entityType; update all call sites
that pass firstArgument/secondArgument to instead pass the array of decomposed
per-field mapping paths (recursing into nested input objects where necessary) so
diagnostics correctly report 3+ mapped arguments.
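The "comma-separated, with proper 'and' before the last item" joining can be sketched as a small helper the reworked message builder could call; the function name is illustrative, not the actual code.

```typescript
// Join every mapped argument path for the diagnostic instead of
// hard-coding exactly two (firstArgument/secondArgument).
function joinMappedArguments(paths: string[]): string {
  if (paths.length === 0) return "";
  if (paths.length === 1) return paths[0];
  if (paths.length === 2) return `${paths[0]} and ${paths[1]}`;
  return `${paths.slice(0, -1).join(", ")}, and ${paths[paths.length - 1]}`;
}
```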
In `@TODO.md`:
- Line 167: Replace the phrase "user asked to not defer" with "user asked not to
defer" in the TODO.md text to fix the word-order nitpick; search for the exact
substring "user asked to not defer" (the resolved sentence mentioning "another
PR" and metrics consolidation) and update it to "user asked not to defer" for
smoother reading.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cbaaac0d-212c-48a2-bf71-4c51cd586799
⛔ Files ignored due to path filters (12)
- connect-go/gen/proto/wg/cosmo/node/v1/node.pb.go is excluded by !**/*.pb.go, !**/gen/**
- connect-go/gen/proto/wg/cosmo/platform/v1/platform.pb.go is excluded by !**/*.pb.go, !**/gen/**
- demo/pkg/subgraphs/cachegraph/subgraph/generated/federation.go is excluded by !**/generated/**
- demo/pkg/subgraphs/cachegraph/subgraph/generated/generated.go is excluded by !**/generated/**
- demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.go is excluded by !**/generated/**
- demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.requires.go is excluded by !**/generated/**
- demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/generated.go is excluded by !**/generated/**
- demo/pkg/subgraphs/projects/generated/service.pb.go is excluded by !**/*.pb.go, !**/generated/**
- demo/pkg/subgraphs/projects/generated/service.proto is excluded by !**/generated/**
- demo/pkg/subgraphs/projects/generated/service.proto.lock.json is excluded by !**/generated/**
- demo/pkg/subgraphs/viewer/subgraph/generated/federation.go is excluded by !**/generated/**
- demo/pkg/subgraphs/viewer/subgraph/generated/generated.go is excluded by !**/generated/**
📒 Files selected for processing (93)
- FEEDBACK.md
- Makefile
- TODO.md
- benchmark/.gitignore
- benchmark/README.md
- benchmark/fixtures/article_simple.response.json
- benchmark/fixtures/articles_by_ids_batch.response.json
- benchmark/fixtures/catalogs_partial_load.response.json
- benchmark/fixtures/listing_composite_key.response.json
- benchmark/fixtures/request_scoped_viewer_articles.response.json
- benchmark/fixtures/user_profile_header_sensitive.alice.response.json
- benchmark/fixtures/user_profile_header_sensitive.bob.response.json
- benchmark/fixtures/venue_nested_key.response.json
- benchmark/fixtures/viewer_articles_deep_nested.response.json
- benchmark/k6/cache_demo.js
- benchmark/queries/article_simple.graphql
- benchmark/queries/articles_by_ids_batch.graphql
- benchmark/queries/catalogs_partial_load.graphql
- benchmark/queries/listing_composite_key.graphql
- benchmark/queries/request_scoped_viewer_articles.graphql
- benchmark/queries/user_profile_header_sensitive.graphql
- benchmark/queries/venue_nested_key.graphql
- benchmark/queries/viewer_articles_deep_nested.graphql
- benchmark/router-cache.redis.yaml
- benchmark/scenarios/cache-demo.json
- benchmark/scripts/capture_pprof.sh
- benchmark/scripts/capture_redis_stats.sh
- benchmark/scripts/lib.test.ts
- benchmark/scripts/lib.ts
- benchmark/scripts/run_suite.ts
- benchmark/scripts/scrape_metrics.ts
- benchmark/scripts/start_redis.sh
- benchmark/scripts/start_stack.sh
- benchmark/scripts/stop_redis.sh
- benchmark/scripts/stop_stack.sh
- benchmark/scripts/verify_equivalence.ts
- benchmark/scripts/wait_ready.sh
- composition-go/index.global.js
- composition/AGENTS.md
- composition/CLAUDE.md
- composition/COMPOSITION_CONVENTIONS.md
- composition/src/errors/errors.ts
- composition/src/router-configuration/types.ts
- composition/src/utils/string-constants.ts
- composition/src/v1/constants/directive-definitions.ts
- composition/src/v1/normalization/normalization-factory.ts
- composition/src/v1/warnings/params.ts
- composition/src/v1/warnings/warnings.ts
- composition/tests/v1/directives/entity-cache-fuzz.test.ts
- composition/tests/v1/directives/entity-cache-mapping-rules.test.ts
- composition/tests/v1/directives/entity-caching.test.ts
- connect/src/wg/cosmo/node/v1/node_pb.ts
- demo/Makefile
- demo/cmd/all/main.go
- demo/cmd/cache-demo/main.go
- demo/cmd/cachegraph/main.go
- demo/config-cache-only.json
- demo/go.mod
- demo/graph-cache-only.yaml
- demo/graph.yaml
- demo/pkg/injector/latency.go
- demo/pkg/subgraphs/cachegraph/cachegraph.go
- demo/pkg/subgraphs/cachegraph/generate.go
- demo/pkg/subgraphs/cachegraph/gqlgen.yml
- demo/pkg/subgraphs/cachegraph/subgraph/data.go
- demo/pkg/subgraphs/cachegraph/subgraph/data_test.go
- demo/pkg/subgraphs/cachegraph/subgraph/entity.resolvers.go
- demo/pkg/subgraphs/cachegraph/subgraph/model/models_gen.go
- demo/pkg/subgraphs/cachegraph/subgraph/resolver.go
- demo/pkg/subgraphs/cachegraph/subgraph/schema.graphqls
- demo/pkg/subgraphs/cachegraph/subgraph/schema.resolvers.go
- demo/pkg/subgraphs/cachegraph_ext/cachegraph_ext.go
- demo/pkg/subgraphs/cachegraph_ext/generate.go
- demo/pkg/subgraphs/cachegraph_ext/gqlgen.yml
- demo/pkg/subgraphs/cachegraph_ext/subgraph/data.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/entity.resolvers.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/model/models_gen.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/resolver.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.graphqls
- demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers_test.go
- demo/pkg/subgraphs/subgraphs.go
- demo/pkg/subgraphs/viewer/generate.go
- demo/pkg/subgraphs/viewer/gqlgen.yml
- demo/pkg/subgraphs/viewer/subgraph/context.go
- demo/pkg/subgraphs/viewer/subgraph/data.go
- demo/pkg/subgraphs/viewer/subgraph/entity.resolvers.go
- demo/pkg/subgraphs/viewer/subgraph/model/models_gen.go
- demo/pkg/subgraphs/viewer/subgraph/resolver.go
- demo/pkg/subgraphs/viewer/subgraph/schema.graphqls
- demo/pkg/subgraphs/viewer/subgraph/schema.resolvers.go
- demo/pkg/subgraphs/viewer/viewer.go
- demo/router-cache.yaml
✅ Files skipped from review due to trivial changes (39)
- benchmark/fixtures/venue_nested_key.response.json
- benchmark/fixtures/article_simple.response.json
- benchmark/.gitignore
- benchmark/fixtures/user_profile_header_sensitive.alice.response.json
- benchmark/fixtures/articles_by_ids_batch.response.json
- benchmark/fixtures/catalogs_partial_load.response.json
- benchmark/fixtures/user_profile_header_sensitive.bob.response.json
- demo/pkg/subgraphs/cachegraph_ext/generate.go
- demo/pkg/subgraphs/viewer/generate.go
- benchmark/queries/article_simple.graphql
- benchmark/fixtures/listing_composite_key.response.json
- benchmark/queries/articles_by_ids_batch.graphql
- demo/pkg/subgraphs/cachegraph/generate.go
- benchmark/queries/catalogs_partial_load.graphql
- benchmark/queries/user_profile_header_sensitive.graphql
- demo/pkg/subgraphs/cachegraph_ext/subgraph/resolver.go
- demo/graph.yaml
- demo/pkg/subgraphs/cachegraph/subgraph/resolver.go
- benchmark/fixtures/request_scoped_viewer_articles.response.json
- benchmark/queries/venue_nested_key.graphql
- benchmark/queries/listing_composite_key.graphql
- benchmark/queries/request_scoped_viewer_articles.graphql
- benchmark/router-cache.redis.yaml
- benchmark/scripts/capture_pprof.sh
- demo/graph-cache-only.yaml
- benchmark/fixtures/viewer_articles_deep_nested.response.json
- benchmark/scenarios/cache-demo.json
- benchmark/README.md
- demo/pkg/subgraphs/cachegraph_ext/gqlgen.yml
- demo/pkg/subgraphs/cachegraph_ext/subgraph/entity.resolvers.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers.go
- benchmark/queries/viewer_articles_deep_nested.graphql
- demo/pkg/subgraphs/cachegraph/subgraph/data.go
- demo/pkg/subgraphs/cachegraph/gqlgen.yml
- demo/pkg/subgraphs/viewer/subgraph/data.go
- demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.graphqls
- composition/src/v1/warnings/params.ts
- composition/tests/v1/directives/entity-cache-mapping-rules.test.ts
- composition/tests/v1/directives/entity-caching.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- composition/src/router-configuration/types.ts
- composition/src/v1/warnings/warnings.ts
{
  headers: payload.headers || { "content-type": "application/json" },
},
Merge custom headers with the JSON default instead of replacing it.
buildK6Payload() always includes a headers object, so this fallback never runs. That means benchmark requests drop content-type: application/json unless every caller remembers to add it manually.
Suggested fix
{
- headers: payload.headers || { "content-type": "application/json" },
+ headers: {
+ "content-type": "application/json",
+ ...(payload.headers || {}),
+ },
},
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{
  headers: {
    "content-type": "application/json",
    ...(payload.headers || {}),
  },
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark/k6/cache_demo.js` around lines 70 - 72, The current buildK6Payload
implementation replaces the default JSON header when payload.headers is
provided, causing requests to lose "content-type: application/json"; change the
header assignment in buildK6Payload to merge the defaults with any custom
headers (e.g., combine { "content-type": "application/json" } with
payload.headers using a shallow merge so custom keys override defaults but
default content-type remains unless explicitly overridden).
export async function fetchJsonWithRetry(input: {
  url: string;
  init: RequestInit;
  fetchImpl?: typeof fetch;
  retries?: number;
}): Promise<unknown> {
  const fetchImpl = input.fetchImpl ?? fetch;
  const retries = input.retries ?? 1;

  for (let attempt = 0; ; attempt += 1) {
    try {
      const response = await fetchImpl(input.url, input.init);
      const body = await response.text();

      if (!response.ok) {
        throw new Error(`request failed: ${response.status} ${body}`);
      }

      return JSON.parse(body);
    } catch (error) {
      if (attempt >= retries || !isRetryableFetchError(error)) {
        throw error;
      }

      await delay(100);
    }
  }
}
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and examine the function
fd -t f "lib.ts" | grep benchmark
Repository: wundergraph/cosmo
Length of output: 85
🏁 Script executed:
# Read the specific lines of the file to verify the code snippet
sed -n '651,678p' benchmark/scripts/lib.ts
Repository: wundergraph/cosmo
Length of output: 750
🏁 Script executed:
# Check the broader context around the function to understand its full implementation
sed -n '640,690p' benchmark/scripts/lib.ts
Repository: wundergraph/cosmo
Length of output: 1372
🏁 Script executed:
# Search for usages of fetchJsonWithRetry to understand how it's called
rg "fetchJsonWithRetry" -A 2 -B 2
Repository: wundergraph/cosmo
Length of output: 1770
🏁 Script executed:
# Check if there are any similar timeout/retry patterns in the codebase
rg "AbortController|AbortSignal" benchmark/
Repository: wundergraph/cosmo
Length of output: 43
🏁 Script executed:
# Check the delay function implementation to understand the retry pattern
rg "function delay|const delay" -A 3
Repository: wundergraph/cosmo
Length of output: 1956
🏁 Script executed:
# Look at RequestInit type to understand what signal property is available
rg "RequestInit" -A 5 -B 5 --max-count=5
Repository: wundergraph/cosmo
Length of output: 4572
🏁 Script executed:
# Find the isRetryableFetchError function to see if it handles AbortError
rg "isRetryableFetchError" -A 10 benchmark/scripts/lib.ts
Repository: wundergraph/cosmo
Length of output: 529
🏁 Script executed:
# Check if AbortError is handled anywhere in the error handling logic
rg "AbortError" benchmark/scripts/
Repository: wundergraph/cosmo
Length of output: 43
🏁 Script executed:
# Verify if any calls to fetchJsonWithRetry already pass a signal in init
rg "fetchJsonWithRetry" -A 5 benchmark/ | grep -A 5 "signal"
Repository: wundergraph/cosmo
Length of output: 43
🏁 Script executed:
# Double-check if any test cases might be affected by timeout changes
cat -n benchmark/scripts/lib.test.ts | grep -A 20 "fetchJsonWithRetry retries"
Repository: wundergraph/cosmo
Length of output: 803
Add a timeout to each fetch attempt.
This retry wrapper talks to the router over the network but never attaches an AbortSignal, so a hung connection can stall the entire benchmark run until the OS-level timeout kicks in. Bound each attempt with a timeout and feed that timeout into the retry path.
Consider merging signals instead of overwriting them in the proposed fix. The spread should conditionally add the signal to avoid overriding an existing signal in input.init:
⏱️ Proposed fix
export async function fetchJsonWithRetry(input: {
url: string;
init: RequestInit;
fetchImpl?: typeof fetch;
retries?: number;
+ timeoutMs?: number;
}): Promise<unknown> {
const fetchImpl = input.fetchImpl ?? fetch;
const retries = input.retries ?? 1;
+ const timeoutMs = input.timeoutMs ?? 10_000;
for (let attempt = 0; ; attempt += 1) {
+ const controller = new AbortController();
+ const timeout = setTimeout(() => controller.abort(), timeoutMs);
try {
- const response = await fetchImpl(input.url, input.init);
+ const response = await fetchImpl(input.url, {
+ ...input.init,
+ signal: controller.signal,
+ });
const body = await response.text();
if (!response.ok) {
throw new Error(`request failed: ${response.status} ${body}`);
}
return JSON.parse(body);
} catch (error) {
if (attempt >= retries || !isRetryableFetchError(error)) {
throw error;
}
await delay(100);
+ } finally {
+ clearTimeout(timeout);
}
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark/scripts/lib.ts` around lines 651 - 678, In fetchJsonWithRetry, wrap
each attempt with a per-attempt timeout by creating an AbortController and
passing its signal into the fetch call without discarding any existing
input.init.signal (merge: if input.init.signal exists, race it with the timeout
controller and forward a composite signal or listen to input.init.signal to
abort the timeout controller); ensure the timeout controller is aborted/cleared
after each attempt and that the signal is included in the RequestInit passed to
fetchImpl (do not overwrite input.init directly), so the timeout-triggered
AbortError flows into isRetryableFetchError and the retry logic continues
correctly.
const response = await fetch("http://127.0.0.1:8088/metrics");
if (!response.ok) {
  throw new Error(`failed to fetch router metrics: ${response.status}`);
}
🧩 Analysis chain
🏁 Script executed:
# Check the file exists and get size
wc -l benchmark/scripts/scrape_metrics.ts
Repository: wundergraph/cosmo
Length of output: 99
🏁 Script executed:
# Read the full file to understand context
cat -n benchmark/scripts/scrape_metrics.ts
Repository: wundergraph/cosmo
Length of output: 2532
🏁 Script executed:
# Search for AbortController usage in the codebase to understand patterns
rg "AbortController" --type ts --type tsx -B 2 -A 5
Repository: wundergraph/cosmo
Length of output: 88
🏁 Script executed:
# Search for fetch patterns and timeout handling
rg "fetch\(" --type ts --type tsx -B 1 -A 3 | head -100
Repository: wundergraph/cosmo
Length of output: 88
🏁 Script executed:
# Search for AbortController usage with correct syntax
rg "AbortController" --type ts -B 2 -A 5
Repository: wundergraph/cosmo
Length of output: 1686
🏁 Script executed:
# Search for fetch patterns with timeout/abort handling
rg "fetch\(" --type ts -B 1 -A 5 | head -150
Repository: wundergraph/cosmo
Length of output: 10849
🏁 Script executed:
# Check if there are setTimeout patterns used for request timeouts
rg "setTimeout.*fetch|AbortController|signal:" --type ts -B 1 -A 3
Repository: wundergraph/cosmo
Length of output: 3101
Add timeout to metrics fetch to prevent indefinite hangs.
The fetch call has no timeout, so a wedged router will block the benchmark indefinitely. Add an explicit cancellation path using AbortController so the suite fails fast instead.
Suggested fix

```diff
- const response = await fetch("http://127.0.0.1:8088/metrics");
+ const controller = new AbortController();
+ const timeout = setTimeout(() => controller.abort(), 5_000);
+ let response: Response;
+ try {
+   response = await fetch("http://127.0.0.1:8088/metrics", {
+     signal: controller.signal,
+   });
+ } finally {
+   clearTimeout(timeout);
+ }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 5_000);
let response: Response;
try {
  response = await fetch("http://127.0.0.1:8088/metrics", {
    signal: controller.signal,
  });
} finally {
  clearTimeout(timeout);
}
if (!response.ok) {
  throw new Error(`failed to fetch router metrics: ${response.status}`);
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark/scripts/scrape_metrics.ts` around lines 54 - 57, The fetch for
metrics currently has no timeout and can hang; wrap the fetch in an
AbortController: create a controller, pass controller.signal to
fetch("http://127.0.0.1:8088/metrics"), start a timer (e.g., 5000ms) that calls
controller.abort() on expiry, clear the timer after fetch resolves, and handle
AbortError to throw a clear "metrics fetch timeout" error; update the code
around the response/response.ok checks to use this abort-enabled fetch flow
(reference AbortController, controller.signal, response).
```shell
if kill -0 "${pid}" >/dev/null 2>&1; then
  kill "${pid}" >/dev/null 2>&1 || true
  wait "${pid}" 2>/dev/null || true
fi
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
# First, let's examine the stop_stack.sh file to understand the full context
cat -n benchmark/scripts/stop_stack.sh

# Now let's examine the start_stack.sh file to see how processes are started
cat -n benchmark/scripts/start_stack.sh

# Let's also check what processes are being started and how they're managed
grep -r "\.run" benchmark/scripts/ --include="*.sh" -B2 -A2
```
wait "${pid}" is ineffective for these detached processes and creates a race condition.
In start_stack.sh, processes are spawned with start_new_session=True (line 30), making them independent process groups—not children of the current shell. Bash's wait builtin only works on child processes, so it returns immediately (errors suppressed), and the PID file gets deleted before the actual process finishes shutting down. This can race with the next benchmark start.
Replace with a polling loop that confirms process termination:
Suggested fix

```diff
 if kill -0 "${pid}" >/dev/null 2>&1; then
   kill "${pid}" >/dev/null 2>&1 || true
-  wait "${pid}" 2>/dev/null || true
+  for _ in {1..50}; do
+    if ! kill -0 "${pid}" >/dev/null 2>&1; then
+      break
+    fi
+    sleep 0.1
+  done
+  if kill -0 "${pid}" >/dev/null 2>&1; then
+    kill -9 "${pid}" >/dev/null 2>&1 || true
+  fi
 fi
```

🤖 Prompt for AI Agents
fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark/scripts/stop_stack.sh` around lines 13 - 16, The current shutdown
in stop_stack.sh uses wait "${pid}" which is ineffective for detached processes
started with start_new_session=True and can race; replace the wait call for
"${pid}" with a polling loop that repeatedly checks kill -0 "${pid}" (or
/proc/${pid} existence) until the process no longer exists or a timeout elapses,
sleeping briefly between checks and then proceed to remove the pid file; ensure
the loop uses the same ${pid} variable and enforces a reasonable timeout and
graceful fallback (e.g., log/warn and continue) so stop_stack.sh reliably waits
for the detached process to exit.
```typescript
// directive @openfed__requestScoped(key: String!) on FIELD_DEFINITION
//
// Marks a field as resolving to the same value for the entire request, shared with
// all other fields in the same subgraph that use the same `key`. Purely symmetric:
// every field that participates declares itself with the directive. There is no
// receiver/provider distinction. Whichever field is resolved first populates the
// per-request L1 cache under `{subgraphName}.{key}`; subsequent fields with the
// same key are served from L1 (and can skip the subgraph fetch when possible).
export const REQUEST_SCOPED_DEFINITION: DirectiveDefinitionNode = {
  arguments: [
    {
      kind: Kind.INPUT_VALUE_DEFINITION,
      name: stringToNameNode(KEY),
      type: {
        kind: Kind.NON_NULL_TYPE,
        type: stringToNamedTypeNode(STRING_SCALAR),
      },
    },
  ],
  kind: Kind.DIRECTIVE_DEFINITION,
  locations: stringArrayToNameNodeArray([FIELD_DEFINITION_UPPER]),
  name: stringToNameNode(REQUEST_SCOPED),
  repeatable: false,
};
```
Register REQUEST_SCOPED_DEFINITION in the directive registry.
This definition is added here, but composition/src/v1/constants/constants.ts:1-143 still doesn't import REQUEST_SCOPED / REQUEST_SCOPED_DEFINITION or add it to DIRECTIVE_DEFINITION_BY_NAME. That leaves @openfed__requestScoped undiscoverable during normalization, so request-scoped config never gets emitted.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@composition/src/v1/constants/directive-definitions.ts` around lines 766 -
789, The new REQUEST_SCOPED_DEFINITION and its name symbol REQUEST_SCOPED are
not registered in the directive registry used during normalization; import
REQUEST_SCOPED and REQUEST_SCOPED_DEFINITION into the directive registry module
(where DIRECTIVE_DEFINITION_BY_NAME is defined) and add an entry mapping
REQUEST_SCOPED to REQUEST_SCOPED_DEFINITION in DIRECTIVE_DEFINITION_BY_NAME so
the `@openfed__requestScoped` directive is discoverable and emitted during
normalization.
```typescript
// Check if @openfed__is targets multiple fields (composite key via input object)
const isCompositeIsSpec = isFieldValue.includes(LITERAL_SPACE);

if (isCompositeIsSpec) {
  return this.buildCompositeIsMapping(
    fieldCoords, entityTypeName, keyFieldSets, isListReturn, argInfo,
  );
```
Don't let one composite @openfed__is suppress other satisfiable keys.
Returning from buildExplicitMappings() here means a single input-object mapping prevents any remaining explicit or auto-mappable alternative @key from being emitted for the same field. That breaks multi-key OR semantics.
As per coding guidelines, "In composition entity key mapping, each independent @key directive should produce its own EntityKeyMappingConfig with OR semantics, not be merged into a single composite mapping with AND semantics."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@composition/src/v1/normalization/normalization-factory.ts` around lines 4333
- 4339, The current early return from buildExplicitMappings when
isCompositeIsSpec is true causes a composite `@openfed__is` mapping to suppress
other satisfiable keys; instead of returning the result of
buildCompositeIsMapping, call buildCompositeIsMapping with (fieldCoords,
entityTypeName, keyFieldSets, isListReturn, argInfo) and append its produced
EntityKeyMappingConfig(s) to the mappings collection used by
buildExplicitMappings (do not merge into a single composite that enforces AND),
then continue processing remaining explicit/auto-mappable `@key` directives so
each independent `@key` yields its own EntityKeyMappingConfig (OR semantics).
Ensure you reference and update the same mappings accumulator used later in
buildExplicitMappings and remove the early return path.
```go
func (r *mutationResolver) UpdateArticle(ctx context.Context, id string, title string) (*model.Article, error) {
	for _, a := range articlesData {
		if a.ID == id {
			a.Title = title
			return a, nil
		}
	}
	return nil, nil
}

// CreateArticle is the resolver for the createArticle field.
func (r *mutationResolver) CreateArticle(ctx context.Context, title string, body string, authorName string) (*model.Article, error) {
	a := &model.Article{
		ID:          fmt.Sprintf("%d", len(articlesData)+1),
		Title:       title,
		Body:        body,
		AuthorName:  authorName,
		PublishedAt: time.Now().Format(time.RFC3339),
		Tags:        []string{},
	}
	articlesData = append(articlesData, a)
	return a, nil
}

// DeleteListing is the resolver for the deleteListing field.
func (r *mutationResolver) DeleteListing(ctx context.Context, key model.ListingKey) (*model.Listing, error) {
	k := listingKey{SellerID: key.SellerID, SKU: key.Sku}
	l := listingsData[k]
	delete(listingsData, k)
	return l, nil
```
Protect the shared demo data before mutating it.
These mutations write package-level state that the query resolvers read concurrently. DeleteListing can panic with concurrent map read and map write, and CreateArticle / UpdateArticle race with the article queries under parallel traffic. Please guard these collections with a sync.RWMutex or switch the demo store to copy-on-write snapshots.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@demo/pkg/subgraphs/cachegraph/subgraph/schema.resolvers.go` around lines 17 -
46, The package-level slices/maps (articlesData, listingsData) are mutated
without synchronization causing races and possible "concurrent map read and map
write" panics; protect access with a sync.RWMutex (e.g., declare a package-level
var articlesMu sync.RWMutex and listingsMu sync.RWMutex) and use
articlesMu.Lock()/Unlock() for CreateArticle and UpdateArticle mutations and
articlesMu.RLock()/RUnlock() for query readers, and similarly use
listingsMu.Lock()/Unlock() around DeleteListing (and any listing reads) when
accessing listingsData; alternatively implement copy-on-write snapshots for
writes and atomically replace the store if you prefer immutability.
```makefile
cd demo && go run cmd/all/main.go & \
sleep 2 && cd router && go run cmd/router/main.go --config ../demo/router-cache.yaml
```
Clean up the background subgraph process on exit.
go run cmd/all/main.go is backgrounded here, but nothing tears it down when the router exits or the user interrupts make demo. That leaves the demo subgraphs orphaned and the ports occupied for the next run.
Suggested fix

```diff
 demo:
 	@echo "Composing subgraph schemas..."
 	cd demo && $(MAKE) compose-cache
 	@echo "Starting subgraphs and router..."
 	@echo "Playground will be at http://localhost:3002/"
-	cd demo && go run cmd/all/main.go & \
-	sleep 2 && cd router && go run cmd/router/main.go --config ../demo/router-cache.yaml
+	@set -e; \
+	(cd demo && go run cmd/all/main.go) & \
+	subgraphs_pid=$$!; \
+	trap 'kill $$subgraphs_pid >/dev/null 2>&1 || true' EXIT INT TERM; \
+	sleep 2; \
+	cd router && go run cmd/router/main.go --config ../demo/router-cache.yaml
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 195 - 196, The demo target currently backgrounds the
subgraph process started by "go run cmd/all/main.go" and never cleans it up;
modify the Makefile commands to capture the background PID (from the "go run
cmd/all/main.go &" invocation), install a shell trap for INT/EXIT that kills
that PID (and waits for any children) before exiting, and ensure the router
process ("go run cmd/router/main.go --config ../demo/router-cache.yaml") runs in
the foreground so the trap can fire; use the captured PID (or a temp file) to
reliably kill the background subgraph on normal exit, interrupt, or error.
- Regenerate Go proto bindings with protoc-gen-go v1.36.10 (matches Makefile pin)
- Reset pnpm-lock.yaml to main's resolution so rollup@4.59.0 override applies (was pinned to 4.46.2 with high-severity advisory)
- Run go generate ./... in demo to refresh gqlgen-generated cachegraph/viewer
- Re-run pnpm build:router to refresh embedded graphiql.html
- Format composition docs and modified files (AGENTS.md, CLAUDE.md, COMPOSITION_CONVENTIONS.md, errors.ts, normalization-factory.ts, warnings.ts, entity-cache-fuzz.test.ts, entity-cache-mapping-rules.test.ts) with prettier

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…arnings

Adds to CLAUDE.md:

- Split between router playground (has Cache Explorer) and studio playground (does not, since it uses its own GraphiQL wrapper and does not import @wundergraph/playground).
- End-to-end bring-up recipe for running infra in docker while running controlplane, studio, and router locally — including the keycloak image rebuild workaround, the explicit pnpm migrate step before seed, and the port map.
- gqlgen.yml gotcha: `skip_runtime: true` only applies when the directive key matches the full schema name, including the `openfed__` prefix. Mismatched names silently generate a runtime handler that panics with "directive ... is not implemented" on every subgraph fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nfigs

The subgraph schemas declare `@openfed__entityCache`, `@openfed__queryCache`, `@openfed__cacheInvalidate`, `@openfed__cachePopulate`, `@openfed__is`, and `@openfed__requestScoped`, but gqlgen.yml listed the directives under their short names (`entityCache`, `queryCache`, etc.). Gqlgen matches `directives:` keys against the SDL directive name, so `skip_runtime: true` never applied. The generated code retained runtime handlers on `DirectiveRoot` that panic with "directive ... is not implemented" whenever a federated router fetches from the subgraph.

Fix: rename the keys to their fully prefixed names and regenerate. The regenerated `generated.go` no longer contains `Openfed__*` fields on `DirectiveRoot`.

Also:

- Add `demo/router-cache-cp.yaml` — a control-plane-connected router config for the cache demo, with entity_caching + L2 Redis enabled and no `execution_config.file`.
- Check in `cli/cachegraph-cachedemo/` — `pnpm wgc federated-graph fetch` export for the cache demo graph, useful as a reference for the composed SDL and router config when working against the control plane.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…layground
The router-embedded playground (`playground/` package) already had a Cache Explorer view, a per-request cache-mode dropdown (`X-WG-Disable-Entity-Cache*` headers), and cache-annotated request traces — but the studio playground (`studio/src/components/playground/*` + `studio/src/pages/[...]/playground.tsx`) is a separate implementation that never imported `@wundergraph/playground` and was missing all three.
Port the cache explorer infrastructure into studio so graph owners get the same experience when opening the studio playground, not only the router-served one:
- Copy `cache-explorer-{types,controller,utils,runner,view}.{ts,tsx}` and `view-cache.tsx` from `playground/src/components/playground/` into `studio/src/components/playground/`. Only one import path needed adjustment (`@/lib/use-local-storage` → `@/hooks/use-local-storage`).
- Convert two `for (const … of …)` iterations over `Set`/`Map.entries()` to `.forEach(…)` so the code compiles under studio's non-downlevelIteration tsconfig.
- Extend `studio/src/components/playground/types.ts` with `'cache-explorer'` on `PlaygroundView`, the `CacheMode` type, `cacheMode`/`setCacheMode` on the context, the `CacheTrace`/`CacheStatus` helpers, and `cacheTrace?: CacheTrace` on `ARTFetchNode`.
- In `studio/src/pages/[organizationSlug]/[namespace]/graph/[slug]/playground.tsx`:
- Thread a `cacheMode` React state + ref through the fetcher and inject `X-WG-Disable-Entity-Cache{,-L1,-L2}` headers based on the selected mode.
- Add a `CacheControl` select next to the existing view dropdown, wired to the context.
- Add the Cache Explorer intercept: when `view === 'cache-explorer'` and the play button was clicked, dispatch the config to `cacheExplorerController.start(...)` and return a synthetic response so GraphiQL's response pane stays quiet while the runner takes over.
- Register a `cache-explorer-visualization` wrapper div and portal `CacheExplorerView` into it, mirroring the existing ART / plan wrappers.
- Add `CacheBadge` + `collectCacheSummary` so non-explorer runs still show the hits/total pill on the response toolbar.
- Keep the user's headers JSON in sync with the selected cache mode via `applyCacheModeHeaders`, so the injected `X-WG-Disable-Entity-Cache*` headers are visible in the editor.
Also update `demo/router-cache-cp.yaml` CORS so the studio origin can send the cache-control headers and run the explorer cross-origin.
Verified end-to-end with a 10-iteration cache demo run from the studio playground: 2.73x server-side speedup, 100% cache ratio, 30 eliminated subgraph HTTP calls, 12.7KB -> 0B router-to-subgraph bytes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ce UI into studio playground

## Router CORS

The studio playground's Cache Explorer and per-request cache-mode dropdown need to send these headers: `X-WG-Disable-Entity-Cache`, `X-WG-Disable-Entity-Cache-L1`, `X-WG-Disable-Entity-Cache-L2`, and `X-WG-Cache-Key-Prefix`. When studio is on a different origin than the router (which is the norm in real deployments, and also how the local docker-compose layout works), the browser does a CORS preflight that's rejected because those headers are not in `Access-Control-Allow-Headers`.

Fix it at the router level so the feature works out of the box against any router, not only against ones whose operator happens to remember to open those headers in `router.yaml`: append the four cache-mode headers to the list of defaults that `router/core/router.go` unions into the user's `CORS.AllowHeaders`. Alongside the already-merged WunderGraph ART / feature flag / tracing headers, this list is the set of headers the router's own playground tooling sends — so it's the right place for studio's new cache-explorer tooling to live too.

With that default in place, `demo/router-cache-cp.yaml` no longer needs a full `cors:` override. It keeps only the demo-specific `X-Artificial-Latency` header, which the cachegraph demo subgraphs use to inject synthetic latency so the cache effect is visible.

## Studio playground backport

While porting the cache explorer I noticed the studio playground had also drifted from `playground/` in three files that render the Request Trace:

- `trace-view.tsx` had no `parseCacheTrace`, so `ARTFetchNode.cacheTrace` was always undefined even when the server returned `datasource_load_trace.cache_trace`. Also missing `minimumVisibleDurationNs` and the `forcedTheme` context field.
- `fetch-flow.tsx` was missing the cache-status block on each node card (Cache: L1/L2 HIT / MISS / NO LOOKUP with hit/miss counts), the `ViewCache` modal button, and the `!data.loadSkipped` guard on the status-code badge. The latter caused a stale 200 OK badge to render alongside the red skipped-fetch bolt.
- `fetch-waterfall.tsx` was missing the `BoltIcon` cache-hit indicator next to each subgraph name, plus the full block of cache attribute rows (cache name, TTL, entity count, L1/L2 enabled, L1/L2 hit/miss, cache duration, L2 get/set, and a cache-keys preview).

Copy those three files verbatim from `playground/src/components/playground/` and rewrite the `nsToTime` import from `@/lib/utils` to `@/lib/insights-helpers` (studio's location). Every other dependency was already satisfied by prior commits on this branch (types.ts has the `CacheTrace` helpers, `view-cache.tsx` is present, `@/lib/utils:cn` and `@heroicons/react/24/solid:BoltIcon` exist in studio).

End-to-end verified: running the same query twice in the studio playground now shows the 3/3 Cached badge, lightning-bolt L2-hit markers on every row of the waterfall view, and the full cache statistics on the tree-view node cards — matching the router-embedded playground.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Studio's `pnpm lint` (and `next build`) runs stricter rules than the playground package, so the verbatim copy of `cache-explorer-view.tsx` broke CI with four errors:

- `react-hooks/rules-of-hooks` at lines 1182 and 1186: two `useMemo` calls lived after an `if (!cachedPlan && !uncachedPlan) return null;` early return, which skips them on the empty path. Move both hooks above the early return so hooks run in a stable order on every render (the hook results are cheap even when both inputs are undefined).
- `react/no-unescaped-entities` at line 1380: two `"` characters directly in JSX text. Replace them with `&quot;` entities.

No behavior change — only lint compliance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Studio's CI runs `pnpm lint:fix` and then a `git-dirty-check` step that fails if the tree changed. The verbatim copies from the router playground use slightly different whitespace (double spaces after `;` to align inline comments, `useMemo`/`useCallback` arg layout) that studio's prettier config reformats on touch.

Run `pnpm lint:fix` locally on the ported files and commit the result so CI's dirty-check sees a clean tree. Purely formatting; no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
COMMENTS.md, TODO.md, FEEDBACK.md, DONE.md, and docs/superpowers/ are local-only scratch for review triage and should never be committed. TODO.md and FEEDBACK.md were previously committed by mistake and have been stripped from branch history with git filter-repo; add ignore rules to prevent reintroduction.
Force-pushed from be5e727 to 86c88c6.
Squashed snapshot of jensneuse/entity-caching (e5ac2b9) onto main. The original branch's history was rooted at unsigned "Initial commit" 4d73e22... while wundergraph/cosmo:main is rooted at the GPG-signed equivalent — every commit SHA differs and GitHub treated the two as unrelated histories, which is why PR #2677 was closed and could not be reopened. This commit re-bases the entire feature as a single diff on top of current main so it can be reviewed and merged. Original branch is preserved at jensneuse/entity-caching for history. Supersedes #2677.
Summary
- Entities are cached by their `@key` fields at the router level, so a `Product` cached by one query is reused by every query that needs it — regardless of query shape or subgraph
- L1 (per-request, `sync.Map`) eliminates redundant entity fetches within a single request; L2 (Redis or in-memory ristretto) shares cached entities across requests with per-entity TTLs via `@entityCache(maxAge)`
- `@entityCache` and `@entityCacheTTL` directives validated and propagated through the composition layer

Companion PR: graphql-go-tools #1259
Documentation
All entity caching docs live in `docs/entity-caching/`:

- `engineering-brief.md`
- `directives.md` — `@entityCache` and `@entityCacheTTL` directive definitions and usage
- `configuration.md`
- `analytics.md`
- `analytics-rfc.md`

Start with the engineering brief for a full overview of the design and review focus areas.
Test plan
- Composition directive tests (`composition/tests/v1/directives/entity-caching.test.ts`)
- Router executor cache tests (`router/core/executor_entity_cache_test.go`)
- Router integration tests (`router-tests/entity_caching/`)

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests