Add Policy Engine changes to support alpha2 interfaces and streaming #1488
Krishanx92 merged 14 commits into wso2:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: Replaces gateway policy SDK imports with
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Envoy as Envoy/ext_proc
participant Kernel as Kernel/Translator
participant Executor as PolicyExecutor
participant Policies
Client->>Envoy: HTTP Request (headers)
Envoy->>Kernel: ProcessingRequestHeaders
Kernel->>Executor: ExecuteRequestHeaderPolicies(ctx)
Executor->>Policies: Invoke RequestHeaderPolicy hooks
Policies-->>Executor: RequestHeaderExecutionResult
Executor->>Kernel: return header actions
Kernel->>Envoy: Apply header mutations
alt Body Phase (streamed)
loop For each request chunk
Client->>Envoy: Request body chunk
Envoy->>Kernel: ProcessingRequestBodyChunk
Kernel->>Executor: ExecuteStreamingRequestPolicies(chunk)
Executor->>Policies: Invoke StreamingRequestPolicy hooks
Policies-->>Executor: StreamingRequestExecutionResult
Executor->>Kernel: return chunk action
Kernel->>Envoy: Apply chunk actions
end
else Body Phase (buffered)
Client->>Envoy: Full request body
Envoy->>Kernel: ProcessingRequestBody
Kernel->>Executor: ExecuteRequestBodyPolicies(fullBody)
end
Envoy->>Kernel: Upstream response headers
Kernel->>Executor: ExecuteResponseHeaderPolicies(ctx)
Executor->>Policies: Invoke ResponseHeaderPolicy hooks
Policies-->>Executor: ResponseHeaderExecutionResult
Executor->>Kernel: return header actions
Kernel->>Envoy: Apply response header mutations
alt Response Body (streamed)
loop For each response chunk
Envoy->>Kernel: Response body chunk
Kernel->>Executor: ExecuteStreamingResponsePolicies(chunk)
Executor->>Policies: Invoke StreamingResponsePolicy hooks
Policies-->>Executor: StreamingResponseExecutionResult
Executor->>Kernel: return chunk action
Kernel->>Envoy: Apply chunk actions
end
else Response Body (buffered)
Envoy->>Kernel: Full response body
Kernel->>Executor: ExecuteResponseBodyPolicies(fullBody)
end
Envoy->>Client: Final response
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go (1)
413-413: ⚠️ Potential issue | 🟡 Minor: Minor typo in comment.
Missing space between "metadata" and "even".
📝 Proposed fix
- // Build analytics metadata using route metadataeven when skipping policy processing
+ // Build analytics metadata using route metadata even when skipping policy processing
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go` at line 413, The comment "Build analytics metadata using route metadataeven when skipping policy processing" has a missing space; update the comment in extproc.go so it reads "Build analytics metadata using route metadata even when skipping policy processing" (locate the comment string in the extproc.go file and insert a space between "metadata" and "even").
gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go (1)
234-241: ⚠️ Potential issue | 🟡 Minor: Inconsistent phase value in response benchmark.
The request phase check (line 142) uses "request_body", but here the response phase check uses "response" instead of "response_body". This appears inconsistent with the phase naming convention applied elsewhere in this PR.
Suggested fix
 b.Run("ProcessingPhaseCheck", func(b *testing.B) {
-    expr := `processing.phase == "response"`
+    expr := `processing.phase == "response_body"`
     b.ReportAllocs()
     b.ResetTimer()
     for i := 0; i < b.N; i++ {
         _, _ = evaluator.EvaluateResponseBodyCondition(expr, respCtx)
     }
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go` around lines 234 - 241, The benchmark uses an inconsistent phase string: update the expression used in the "ProcessingPhaseCheck" benchmark from processing.phase == "response" to processing.phase == "response_body" so it matches the request benchmark naming; locate the b.Run("ProcessingPhaseCheck", ...) block where expr is defined and change the literal there (the call to evaluator.EvaluateResponseBodyCondition remains the same).
gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go (1)
347-427: ⚠️ Potential issue | 🟠 Major: Missing RequiresRequestHeader and RequiresResponseHeader computation.
Unlike xds.go, which computes requiresRequestHeader and requiresResponseHeader (though doesn't assign them), this function doesn't compute these flags at all. Both buildPolicyChain implementations should be consistent. Additionally, the PolicyChain struct includes these fields, but they're never populated here.
Suggested fix
 requiresRequestBody := false
 requiresResponseBody := false
 hasExecutionConditions := false
 supportsRequestStreaming := true
 supportsResponseStreaming := true
 hasRequestBodyPolicy := false
 hasResponseBodyPolicy := false
+requiresRequestHeader := false
+requiresResponseHeader := false
 for _, policyConfig := range config.Policies {
     // ... existing code ...
     if mode.ResponseBodyMode == policy.BodyModeBuffer || mode.ResponseBodyMode == policy.BodyModeStream {
         requiresResponseBody = true
         hasResponseBodyPolicy = true
         if _, streaming := impl.(policy.StreamingResponsePolicy); !streaming {
             supportsResponseStreaming = false
         }
     }
+
+    if _, ok := impl.(policy.RequestHeaderPolicy); ok {
+        requiresRequestHeader = true
+    }
+    if _, ok := impl.(policy.ResponseHeaderPolicy); ok {
+        requiresResponseHeader = true
+    }
 }
 // ... existing code ...
 chain := &registry.PolicyChain{
     Policies:                  policyList,
     PolicySpecs:               policySpecs,
     RequiresRequestBody:       requiresRequestBody,
     RequiresResponseBody:      requiresResponseBody,
     HasExecutionConditions:    hasExecutionConditions,
+    RequiresRequestHeader:     requiresRequestHeader,
+    RequiresResponseHeader:    requiresResponseHeader,
     SupportsRequestStreaming:  supportsRequestStreaming,
     SupportsResponseStreaming: supportsResponseStreaming,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go` around lines 347 - 427, buildPolicyChain never computes or assigns RequiresRequestHeader and RequiresResponseHeader on the returned registry.PolicyChain; update the loop that inspects each policy impl (via impl.Mode()) to detect header requirements (e.g., check mode.RequestHeaderMode and mode.ResponseHeaderMode or the equivalent fields on the returned Mode) and set two local booleans requiresRequestHeader and requiresResponseHeader accordingly, then include those booleans in the chain struct initialization so registry.PolicyChain.RequiresRequestHeader and RequiresResponseHeader are populated before returning from buildPolicyChain.
🧹 Nitpick comments (5)
gateway/gateway-runtime/policy-engine/configs/envoy.yaml (1)
84-84: Constrain the /history suffix if only child paths should match.
Line 84's ^/pets/([^/]+)/history.* also matches /pets/123/historyfoo, not just /history or /history/.... If the intent is only to allow descendant paths, add a path boundary so this route and its ext-proc metadata do not leak onto unrelated endpoints.
♻️ Suggested change
- regex: "^/pets/([^/]+)/history.*"
+ regex: "^/pets/([^/]+)/history(?:/.*)?$"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/configs/envoy.yaml` at line 84, The current regex pattern "^/pets/([^/]+)/history.*" is too permissive and matches paths like "/pets/123/historyfoo"; update the pattern used for the route (the string "^/pets/([^/]+)/history.*") so that after "history" it requires either a path separator or end-of-string (i.e., enforce a boundary such as "history" followed by "/" or end) so only "/history" or its descendant paths match, then verify the route's ext-proc metadata still applies as intended.
gateway/gateway-builder/internal/validation/golang.go (1)
165-172: Consider updating error message to mention GetPolicyV2.
The error message still only references GetPolicy() but the validation now accepts both GetPolicy and GetPolicyV2. Consider updating the message for clarity.
Suggested improvement
 if !hasNewPolicy {
     errors = append(errors, types.ValidationError{
         PolicyName:    policy.Name,
         PolicyVersion: policy.Version,
         FilePath:      policy.Path,
-        Message:       "missing required GetPolicy() factory function",
+        Message:       "missing required GetPolicy() or GetPolicyV2() factory function",
     })
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/validation/golang.go` around lines 165 - 172, Update the validation error message that is added when neither factory is found so it mentions both accepted factory names: change the message created in the types.ValidationError append (where hasNewPolicy is checked) to reference both "GetPolicy()" and "GetPolicyV2()" (use the same policy.Name, policy.Version, policy.Path fields), so the validation error clearly indicates that either GetPolicy() or GetPolicyV2() is required.
gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go (1)
345-347: Consider extracting shared buildPolicyChain logic.
The comment acknowledges this is a copy of kernel.ConfigLoader.buildPolicyChain. The current inconsistency (missing header requirement fields) demonstrates the risk of maintaining duplicate implementations. Consider extracting the common logic into a shared helper function.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go` around lines 345 - 347, The buildPolicyChain logic in ResourceHandler.buildPolicyChain is duplicated from kernel.ConfigLoader.buildPolicyChain causing inconsistencies (e.g., missing header requirement fields); refactor by extracting the shared logic into a single helper function used by both callers: move the common transformation from policyenginev1.PolicyChain to registry.PolicyChain into a new package-level/shared function (e.g., ConvertPolicyChain or BuildPolicyChainFromConfig) that accepts the policyenginev1.PolicyChain and api metadata and returns (*registry.PolicyChain, error), update ResourceHandler.buildPolicyChain and kernel.ConfigLoader.buildPolicyChain to call this new helper, and ensure the helper preserves all fields including header requirement fields during the conversion.
gateway/gateway-runtime/policy-engine/internal/kernel/translator.go (2)
656-664: Same misleading comment appears in response translation.
The "deprecated flat field" comments also appear here for the response modification fields and should be updated for consistency.
📝 Suggested comment update
 if policyResult.Action != nil {
     if mods, ok := policyResult.Action.(policy.DownstreamResponseModifications); ok {
-        // Collect SetHeader operations (deprecated flat field)
+        // Collect SetHeader operations
         for key, value := range mods.HeadersToSet {
             headerOps[strings.ToLower(key)] = append(headerOps[strings.ToLower(key)], &headerOp{opType: "set", value: value})
         }
-        // Collect RemoveHeader operations (deprecated flat field)
+        // Collect RemoveHeader operations
         for _, key := range mods.HeadersToRemove {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/kernel/translator.go` around lines 656 - 664, The comments above the response header modification loops incorrectly repeat "deprecated flat field"; update them to accurately describe what the loops do (collect response SetHeader and RemoveHeader operations) and mirror the wording used for request translation; specifically edit the comment lines near the loops that iterate over mods.HeadersToSet and mods.HeadersToRemove (and the corresponding response modification section) so they clearly state these collect response header set/remove operations rather than claiming a deprecated flat field.
107-114: Misleading comment: "deprecated flat field" refers to new API fields.
The comment says "Collect SetHeader operations (deprecated flat field)" but mods.HeadersToSet is the new v1alpha2 field name. The comment appears to be stale from the migration and doesn't match the current code.
📝 Suggested comment update
-// Collect SetHeader operations (deprecated flat field)
+// Collect SetHeader operations
 for key, value := range mods.HeadersToSet {
     headerOps[strings.ToLower(key)] = append(headerOps[strings.ToLower(key)], &headerOp{opType: "set", value: value})
 }
-// Collect RemoveHeader operations (deprecated flat field)
+// Collect RemoveHeader operations
 for _, key := range mods.HeadersToRemove {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/kernel/translator.go` around lines 107 - 114, The comments above the two header-collection loops are stale: update them so they accurately describe the data they handle—change the comment for the loop using mods.HeadersToSet to indicate it collects SetHeader operations from the new v1alpha2 HeadersToSet field, and update/clarify the comment for the loop using mods.HeadersToRemove to indicate it collects RemoveHeader operations from the (deprecated) flat HeadersToRemove field; reference the symbols headerOps, headerOp, mods.HeadersToSet and mods.HeadersToRemove when making the comment changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-builder/templates/plugin_registry.go.tmpl`:
- Line 52: The template now registers policies using {{ .ImportAlias
}}.GetPolicyV2 but the sample policy packages
(gateway/sample-policies/uppercase-body and
gateway/sample-policies/count-letters) only implement GetPolicy (v1alpha)
causing undefined symbol errors; update those packages to add GetPolicyV2
functions that match the v1alpha2 interface and forward or adapt the existing
GetPolicy logic (i.e., implement GetPolicyV2 in the uppercasebody and
countletters packages to return the same policy definition/behavior as their
GetPolicy, conforming to the new v1alpha2 signature used by
registry.GetRegistry().Register(policyDef_{{ .ImportAlias }}, {{ .ImportAlias
}}.GetPolicyV2)).
In `@gateway/gateway-runtime/policy-engine/internal/executor/chain_bench_test.go`:
- Around line 165-171: The OnRequestBody implementation in shortCircuitPolicy
should return a value ImmediateResponse instead of a pointer; locate the method
shortCircuitPolicy.OnRequestBody and change the returned expression from
&policy.ImmediateResponse{...} to policy.ImmediateResponse{...} (so it returns a
policy.RequestAction as a value), matching the other test
(extproc_bench_test.go) and the SDK interface assertion that expects
ImmediateResponse as a value.
In `@gateway/gateway-runtime/policy-engine/internal/executor/chain.go`:
- Around line 101-141: The loop evaluates CEL and clones params before checking
whether the policy implements the phase interface (policy.RequestHeaderPolicy),
which can cause non-applicable policies to be evaluated; move the type assertion
(headerPol, ok := pol.(policy.RequestHeaderPolicy)) to the very top of the
per-policy loop (before checking spec.Enabled, spec.ExecutionCondition,
c.celEvaluator.EvaluateRequestHeaderCondition, and deepCopyParams) and if ok is
false immediately end the span and continue (preserving the current
span.End/continue behavior), then proceed with CEL evaluation and deepCopyParams
only for policies that implement the phase interface; apply the same change to
the other phase loops at the other ranges mentioned.
- Around line 143-171: The header-phase loop currently appends
RequestHeaderPolicyResult and continues without applying header mutations to
ctx, so subsequent header policies see the pre-mutation state; fix this by,
immediately after obtaining action from headerPol.OnRequestHeaders (and before
appending result or continuing the loop), detect if action implements the
request-header mutation interface (e.g., a method like MutateRequestHeaders(ctx)
or a known type that carries header mutations) and call that mutation to update
the execution context (ctx) so the mutated headers/metadata are visible to the
next policy/CEL check; keep the existing short-circuit handling
(ImmediateResponse) but ensure mutations are applied prior to evaluating the
next header policy; apply the same change to the response-header loop (the block
referenced around lines 435-463) using the corresponding response-header
mutation interface.
In `@gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go`:
- Around line 287-379: The current accumulation path in
processStreamingRequestBody wrongly returns an empty StreamedBodyResponse (via
&extprocv3.StreamedBodyResponse{}), which does not suppress Envoy forwarding;
replace that return so it uses the proper ext_proc suppression mutation instead
of StreamedBodyResponse—locate the early-return in processStreamingRequestBody
(the block guarded by !chunk.EndOfStream && !shouldForceFlush &&
ec.anyPolicyNeedsMoreRequestData(...)) and swap the ProcessingResponse to the
repo's canonical "suppress/choke" BodyMutation/CommonResponse pattern used
elsewhere (search for existing suppression examples in the codebase) so Envoy
does not inject a 0-byte chunk; keep the surrounding logic
(requestStreamAccumulator, anyPolicyNeedsMoreRequestData,
TranslateStreamingRequestChunkAction) unchanged and ensure behavior in
FULL_DUPLEX_STREAMED mode now truly withholds the chunk until flush.
In `@gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go`:
- Line 365: Update the typo in the comment near the call to
(*execCtx).buildRequestContexts(req.GetRequestHeaders(), routeMetadata): change
"// Build analytics metadata using route metadataeven when skipping policy
processing" to "// Build analytics metadata using route metadata even when
skipping policy processing" so the comment reads correctly; locate the comment
associated with buildRequestContexts and insert the missing space between
"metadata" and "even".
In `@gateway/gateway-runtime/policy-engine/internal/kernel/translator.go`:
- Around line 384-414: The code is using the wrong namespace constant; replace
constants.ExtProcFilter with constants.ExtProcFilterName wherever dynamic
metadata is being keyed (specifically the extProcNS assignments in the block
that sets execCtx.dynamicMetadata and dynamicMetadata and the similar block
under the execCtx.defaultUpstreamCluster branch). Update the extProcNS variable
assignments so extProcNS := constants.ExtProcFilterName, keeping the rest of the
logic that sets execCtx.dynamicMetadata[extProcNS], dynamicMetadata[extProcNS],
and the request_transformation.target_path entries unchanged (refer to
execCtx.dynamicMetadata, dynamicMetadata, TargetUpstreamClusterKey,
TargetUpstreamNameKey, and upstreamDefinitionPaths to locate the exact spots).
In
`@gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go`:
- Around line 433-448: The benchmark uses inconsistent processing.phase values:
reqExpr uses "request_body" but respExpr is set to "response", which doesn't
match the response phase used by EvaluateResponseBodyCondition; update the
respExpr string from "response" to "response_body" so the response expression
reflects the actual phase used by evaluator.EvaluateResponseBodyCondition (and
ensure the warm-cache call to evaluator.EvaluateResponseBodyCondition(respExpr,
respCtx) and the alternating loop use the corrected respExpr).
---
Outside diff comments:
In `@gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go`:
- Line 413: The comment "Build analytics metadata using route metadataeven when
skipping policy processing" has a missing space; update the comment in
extproc.go so it reads "Build analytics metadata using route metadata even when
skipping policy processing" (locate the comment string in the extproc.go file
and insert a space between "metadata" and "even").
In
`@gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go`:
- Around line 234-241: The benchmark uses an inconsistent phase string: update
the expression used in the "ProcessingPhaseCheck" benchmark from
processing.phase == "response" to processing.phase == "response_body" so it
matches the request benchmark naming; locate the b.Run("ProcessingPhaseCheck",
...) block where expr is defined and change the literal there (the call to
evaluator.EvaluateResponseBodyCondition remains the same).
In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go`:
- Around line 347-427: buildPolicyChain never computes or assigns
RequiresRequestHeader and RequiresResponseHeader on the returned
registry.PolicyChain; update the loop that inspects each policy impl (via
impl.Mode()) to detect header requirements (e.g., check mode.RequestHeaderMode
and mode.ResponseHeaderMode or the equivalent fields on the returned Mode) and
set two local booleans requiresRequestHeader and requiresResponseHeader
accordingly, then include those booleans in the chain struct initialization so
registry.PolicyChain.RequiresRequestHeader and RequiresResponseHeader are
populated before returning from buildPolicyChain.
---
Nitpick comments:
In `@gateway/gateway-builder/internal/validation/golang.go`:
- Around line 165-172: Update the validation error message that is added when
neither factory is found so it mentions both accepted factory names: change the
message created in the types.ValidationError append (where hasNewPolicy is
checked) to reference both "GetPolicy()" and "GetPolicyV2()" (use the same
policy.Name, policy.Version, policy.Path fields), so the validation error
clearly indicates that either GetPolicy() or GetPolicyV2() is required.
In `@gateway/gateway-runtime/policy-engine/configs/envoy.yaml`:
- Line 84: The current regex pattern "^/pets/([^/]+)/history.*" is too
permissive and matches paths like "/pets/123/historyfoo"; update the pattern
used for the route (the string "^/pets/([^/]+)/history.*") so that after
"history" it requires either a path separator or end-of-string (i.e., enforce a
boundary such as "history" followed by "/" or end) so only "/history" or its
descendant paths match, then verify the route's ext-proc metadata still applies
as intended.
In `@gateway/gateway-runtime/policy-engine/internal/kernel/translator.go`:
- Around line 656-664: The comments above the response header modification loops
incorrectly repeat "deprecated flat field"; update them to accurately describe
what the loops do (collect response SetHeader and RemoveHeader operations) and
mirror the wording used for request translation; specifically edit the comment
lines near the loops that iterate over mods.HeadersToSet and
mods.HeadersToRemove (and the corresponding response modification section) so
they clearly state these collect response header set/remove operations rather
than claiming a deprecated flat field.
- Around line 107-114: The comments above the two header-collection loops are
stale: update them so they accurately describe the data they handle—change the
comment for the loop using mods.HeadersToSet to indicate it collects SetHeader
operations from the new v1alpha2 HeadersToSet field, and update/clarify the
comment for the loop using mods.HeadersToRemove to indicate it collects
RemoveHeader operations from the (deprecated) flat HeadersToRemove field;
reference the symbols headerOps, headerOp, mods.HeadersToSet and
mods.HeadersToRemove when making the comment changes.
In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go`:
- Around line 345-347: The buildPolicyChain logic in
ResourceHandler.buildPolicyChain is duplicated from
kernel.ConfigLoader.buildPolicyChain causing inconsistencies (e.g., missing
header requirement fields); refactor by extracting the shared logic into a
single helper function used by both callers: move the common transformation from
policyenginev1.PolicyChain to registry.PolicyChain into a new
package-level/shared function (e.g., ConvertPolicyChain or
BuildPolicyChainFromConfig) that accepts the policyenginev1.PolicyChain and api
metadata and returns (*registry.PolicyChain, error), update
ResourceHandler.buildPolicyChain and kernel.ConfigLoader.buildPolicyChain to
call this new helper, and ensure the helper preserves all fields including
header requirement fields during the conversion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ab2da1ca-a323-43d7-8f35-d401267e29b2
📒 Files selected for processing (74)
gateway/build-lock.yaml
gateway/gateway-builder/internal/validation/golang.go
gateway/gateway-builder/internal/validation/validation_test.go
gateway/gateway-builder/templates/plugin_registry.go.tmpl
gateway/gateway-controller/default-policies/advanced-ratelimit.yaml
gateway/gateway-controller/default-policies/analytics-header-filter.yaml
gateway/gateway-controller/default-policies/api-key-auth.yaml
gateway/gateway-controller/default-policies/aws-bedrock-guardrail.yaml
gateway/gateway-controller/default-policies/azure-content-safety-content-moderation.yaml
gateway/gateway-controller/default-policies/basic-auth.yaml
gateway/gateway-controller/default-policies/basic-ratelimit.yaml
gateway/gateway-controller/default-policies/content-length-guardrail.yaml
gateway/gateway-controller/default-policies/cors.yaml
gateway/gateway-controller/default-policies/dynamic-endpoint.yaml
gateway/gateway-controller/default-policies/json-schema-guardrail.yaml
gateway/gateway-controller/default-policies/json-xml-mediator.yaml
gateway/gateway-controller/default-policies/jwt-auth.yaml
gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml
gateway/gateway-controller/default-policies/llm-cost.yaml
gateway/gateway-controller/default-policies/log-message.yaml
gateway/gateway-controller/default-policies/mcp-acl-list.yaml
gateway/gateway-controller/default-policies/mcp-auth.yaml
gateway/gateway-controller/default-policies/mcp-authz.yaml
gateway/gateway-controller/default-policies/mcp-rewrite.yaml
gateway/gateway-controller/default-policies/model-round-robin.yaml
gateway/gateway-controller/default-policies/model-weighted-round-robin.yaml
gateway/gateway-controller/default-policies/pii-masking-regex.yaml
gateway/gateway-controller/default-policies/prompt-decorator.yaml
gateway/gateway-controller/default-policies/prompt-template.yaml
gateway/gateway-controller/default-policies/regex-guardrail.yaml
gateway/gateway-controller/default-policies/remove-headers.yaml
gateway/gateway-controller/default-policies/request-rewrite.yaml
gateway/gateway-controller/default-policies/respond.yaml
gateway/gateway-controller/default-policies/semantic-cache.yaml
gateway/gateway-controller/default-policies/semantic-prompt-guard.yaml
gateway/gateway-controller/default-policies/sentence-count-guardrail.yaml
gateway/gateway-controller/default-policies/set-headers.yaml
gateway/gateway-controller/default-policies/subscription-validation.yaml
gateway/gateway-controller/default-policies/token-based-ratelimit.yaml
gateway/gateway-controller/default-policies/url-guardrail.yaml
gateway/gateway-controller/default-policies/word-count-guardrail.yaml
gateway/gateway-runtime/policy-engine/configs/envoy.yaml
gateway/gateway-runtime/policy-engine/internal/admin/dumper.go
gateway/gateway-runtime/policy-engine/internal/admin/dumper_test.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go
gateway/gateway-runtime/policy-engine/internal/executor/chain.go
gateway/gateway-runtime/policy-engine/internal/executor/chain_bench_test.go
gateway/gateway-runtime/policy-engine/internal/executor/chain_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/analytics.go
gateway/gateway-runtime/policy-engine/internal/kernel/body_mode.go
gateway/gateway-runtime/policy-engine/internal/kernel/body_mode_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go
gateway/gateway-runtime/policy-engine/internal/kernel/execution_context_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go
gateway/gateway-runtime/policy-engine/internal/kernel/extproc_bench_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/extproc_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/kernel_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/translator.go
gateway/gateway-runtime/policy-engine/internal/kernel/translator_test.go
gateway/gateway-runtime/policy-engine/internal/kernel/xds.go
gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator.go
gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go
gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_test.go
gateway/gateway-runtime/policy-engine/internal/registry/chain.go
gateway/gateway-runtime/policy-engine/internal/registry/config_resolver.go
gateway/gateway-runtime/policy-engine/internal/registry/config_resolver_test.go
gateway/gateway-runtime/policy-engine/internal/registry/registry.go
gateway/gateway-runtime/policy-engine/internal/registry/registry_test.go
gateway/gateway-runtime/policy-engine/internal/testutils/contexts.go
gateway/gateway-runtime/policy-engine/internal/testutils/policies.go
gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go
gateway/gateway-runtime/policy-engine/internal/xdsclient/lazy_resource_handler.go
gateway/gateway-runtime/policy-engine/internal/xdsclient/lazy_resource_handler_test.go
gateway/system-policies/analytics/analytics.go
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go (1)
234-241: ⚠️ Potential issue | 🟡 Minor: Inconsistent phase value in ProcessingPhaseCheck benchmark.
The expression uses "response" but EvaluateResponseBodyCondition sets processing.phase to "response_body". This condition will always evaluate to false, making the benchmark measure an expression that never matches production conditions.
Suggested fix
 b.Run("ProcessingPhaseCheck", func(b *testing.B) {
-    expr := `processing.phase == "response"`
+    expr := `processing.phase == "response_body"`
     b.ReportAllocs()
     b.ResetTimer()
     for i := 0; i < b.N; i++ {
         _, _ = evaluator.EvaluateResponseBodyCondition(expr, respCtx)
     }
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go` around lines 234 - 241, The ProcessingPhaseCheck benchmark uses expr `processing.phase == "response"` but EvaluateResponseBodyCondition sets processing.phase to `"response_body"`, so the condition never matches; fix by making the benchmark expression match the evaluator (change the expr in the ProcessingPhaseCheck b.Run to `processing.phase == "response_body"`) or alternatively adjust EvaluateResponseBodyCondition to set `processing.phase` to `"response"` so both sides agree; update only the ProcessingPhaseCheck benchmark (or the evaluator) so the phase string used by the benchmark and the EvaluateResponseBodyCondition function are consistent.
🧹 Nitpick comments (3)
gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go (1)
861-888: CEL evaluation errors are silently treated as condition-met.

Lines 874-878 treat CEL evaluation errors as "condition met" (conservative). While this prevents policies from being skipped on transient errors, it could mask configuration issues. Consider logging these errors at a higher level.
Add warning log for CEL evaluation failures
```diff
 if ec.policyChain.HasExecutionConditions && spec.ExecutionCondition != nil && *spec.ExecutionCondition != "" {
 	if celEval != nil {
 		conditionMet, err := celEval.EvaluateStreamingRequestCondition(*spec.ExecutionCondition, ec.requestStreamContext)
-		if err == nil && !conditionMet {
+		if err != nil {
+			slog.Warn("[streaming] CEL condition evaluation failed, treating as met",
+				"route", ec.routeKey,
+				"policy", spec.Name,
+				"error", err,
+			)
+		} else if !conditionMet {
 			continue
 		}
-		// On error: fall through and treat as condition met (conservative)
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go` around lines 861 - 888, The CEL evaluation error in anyPolicyNeedsMoreRequestData is currently swallowed and treated as "condition met"; modify the function to log CEL evaluation failures (when EvaluateStreamingRequestCondition returns err != nil) at warning level before falling through, using the runtime's server logger (e.g., ec.server.logger or the existing server logging facility) and include the policy spec/condition and error details; keep the conservative fallback behavior but ensure the warning is emitted so configuration/CELEvaluator errors are visible.

gateway/gateway-runtime/policy-engine/internal/pkg/cel/lru.go (1)
40-46: Consider validating capacity.

If `capacity <= 0` is passed, `put` will never evict entries due to the `>=` check on line 67, leading to unbounded growth. This could be a latent bug if an incorrect configuration value is passed.

Proposed defensive check
```diff
 func newProgramLRUCache(capacity int) *programLRUCache {
+	if capacity <= 0 {
+		capacity = 1 // sensible minimum
+	}
 	return &programLRUCache{
 		capacity: capacity,
 		list:     list.New(),
 		items:    make(map[string]*list.Element, capacity),
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pkg/cel/lru.go` around lines 40 - 46, newProgramLRUCache currently accepts non-positive capacities which breaks eviction logic in programLRUCache.put (the >= capacity check) and allows unbounded growth; add validation in newProgramLRUCache to enforce capacity > 0 (either return an error, panic, or coerce to a sane default like 1) and ensure the created programLRUCache.capacity and items map use that validated value so put evicts as intended. Reference newProgramLRUCache, programLRUCache.capacity, and put when making the change.

gateway/system-policies/analytics/analytics.go (1)
322-329: Potential nil map write if SharedContext.Metadata is nil.

Lines 322-324 initialize `Metadata` if nil, which is defensive. However, this mutates `SharedContext`, which is shared across phases. Ensure this doesn't cause issues if other code expects `Metadata` to remain nil.

The current approach is safe since other code generally expects `Metadata` to be non-nil after context building. This is a minor observation only.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/system-policies/analytics/analytics.go` around lines 322 - 329, The code currently initializes ctx.SharedContext.Metadata when nil which mutates the shared context; change the logic in the block that appends chunk.Chunk so you do not create/assign a new map on ctx.SharedContext.Metadata unless it already existed: read acc via acc, _ := ctx.SharedContext.Metadata[analyticsStreamAccKey].([]byte), if ctx.SharedContext.Metadata == nil allocate a local map variable (e.g., localMeta := make(map[string]interface{})) and append to that local accumulator, otherwise append into the existing ctx.SharedContext.Metadata; only write back to ctx.SharedContext.Metadata when it was non-nil (or if you deliberately want to persist across phases, ensure this is explicit and documented). Ensure references to analyticsStreamAccKey, ctx.SharedContext.Metadata and chunk.Chunk are used to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go`:
- Around line 317-338: The accumulated-chunk suppression is incorrect: code
returns an empty StreamedBodyResponse{} inside a BodyMutation (via
BodyMutation_StreamedResponse) which does not clear the chunk; instead, modify
the return in the block guarded by ec.anyPolicyNeedsMoreRequestData(...) so the
BodyMutation uses the clear_body flag (set the BodyMutation to indicate
ClearBody: true) rather than constructing a StreamedBodyResponse{}, i.e. replace
the BodyMutation_StreamedResponse/StreamedBodyResponse{} path with a
BodyMutation that sets clear_body (ClearBody true) in the
extprocv3.ProcessingResponse returned by this function.
---
Outside diff comments:
In
`@gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go`:
- Around line 234-241: The ProcessingPhaseCheck benchmark uses expr
`processing.phase == "response"` but EvaluateResponseBodyCondition sets
processing.phase to `"response_body"`, so the condition never matches; fix by
making the benchmark expression match the evaluator (change the expr in the
ProcessingPhaseCheck b.Run to `processing.phase == "response_body"`) or
alternatively adjust EvaluateResponseBodyCondition to set `processing.phase` to
`"response"` so both sides agree; update only the ProcessingPhaseCheck benchmark
(or the evaluator) so the phase string used by the benchmark and the
EvaluateResponseBodyCondition function are consistent.
---
Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go`:
- Around line 861-888: The CEL evaluation error in anyPolicyNeedsMoreRequestData
is currently swallowed and treated as "condition met"; modify the function to
log CEL evaluation failures (when EvaluateStreamingRequestCondition returns err
!= nil) at warning level before falling through, using the runtime's server
logger (e.g., ec.server.logger or the existing server logging facility) and
include the policy spec/condition and error details; keep the conservative
fallback behavior but ensure the warning is emitted so
configuration/CELEvaluator errors are visible.
In `@gateway/gateway-runtime/policy-engine/internal/pkg/cel/lru.go`:
- Around line 40-46: newProgramLRUCache currently accepts non-positive
capacities which breaks eviction logic in programLRUCache.put (the >= capacity
check) and allows unbounded growth; add validation in newProgramLRUCache to
enforce capacity > 0 (either return an error, panic, or coerce to a sane default
like 1) and ensure the created programLRUCache.capacity and items map use that
validated value so put evicts as intended. Reference newProgramLRUCache,
programLRUCache.capacity, and put when making the change.
In `@gateway/system-policies/analytics/analytics.go`:
- Around line 322-329: The code currently initializes ctx.SharedContext.Metadata
when nil which mutates the shared context; change the logic in the block that
appends chunk.Chunk so you do not create/assign a new map on
ctx.SharedContext.Metadata unless it already existed: read acc via acc, _ :=
ctx.SharedContext.Metadata[analyticsStreamAccKey].([]byte), if
ctx.SharedContext.Metadata == nil allocate a local map variable (e.g., localMeta
:= make(map[string]interface{})) and append to that local accumulator, otherwise
append into the existing ctx.SharedContext.Metadata; only write back to
ctx.SharedContext.Metadata when it was non-nil (or if you deliberately want to
persist across phases, ensure this is explicit and documented). Ensure
references to analyticsStreamAccKey, ctx.SharedContext.Metadata and chunk.Chunk
are used to locate the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ffe95063-6d66-4dbb-b996-c75e0f3595fe
📒 Files selected for processing (8)
- gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go
- gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator.go
- gateway/gateway-runtime/policy-engine/internal/pkg/cel/evaluator_bench_test.go
- gateway/gateway-runtime/policy-engine/internal/pkg/cel/lru.go
- gateway/gateway-runtime/policy-engine/internal/pkg/cel/lru_test.go
- gateway/system-policies/analytics/analytics.go
- sdk/core/policy/v1alpha2/action.go
- sdk/core/policy/v1alpha2/interface.go
✅ Files skipped from review due to trivial changes (2)
- sdk/core/policy/v1alpha2/action.go
- sdk/core/policy/v1alpha2/interface.go
```diff
 // Consult streaming policies to decide whether to flush now.
 // In FULL_DUPLEX_STREAMED mode an empty BodyResponse passes the chunk through unchanged,
 // so we must explicitly suppress it with an empty StreamedBodyResponse while accumulating.
 if !chunk.EndOfStream && !shouldForceFlush && ec.anyPolicyNeedsMoreRequestData(ec.requestStreamAccumulator) {
 	slog.Debug("[streaming] accumulating — waiting for more request data",
 		"route", ec.routeKey,
 		"accumulated_bytes", len(ec.requestStreamAccumulator),
 	)
 	return &extprocv3.ProcessingResponse{
-		Response: &extprocv3.ProcessingResponse_ResponseHeaders{
-			ResponseHeaders: &extprocv3.HeadersResponse{},
+		Response: &extprocv3.ProcessingResponse_RequestBody{
+			RequestBody: &extprocv3.BodyResponse{
+				Response: &extprocv3.CommonResponse{
+					BodyMutation: &extprocv3.BodyMutation{
+						Mutation: &extprocv3.BodyMutation_StreamedResponse{
+							StreamedResponse: &extprocv3.StreamedBodyResponse{},
+						},
+					},
+				},
+			},
 		},
 	}, nil
 }
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Search for any ext_proc response patterns in the codebase that successfully suppress chunks
rg -n -A5 'StreamedBodyResponse' --type go -g '!*_test.go'
```

Repository: wso2/api-platform
Length of output: 4824
🌐 Web query:
How to suppress forwarding a chunk in Envoy ext_proc FULL_DUPLEX_STREAMED mode without injecting a 0-byte replacement?
💡 Result:
In Envoy's ext_proc filter operating in FULL_DUPLEX_STREAMED mode, suppressing the forwarding of a specific body chunk to the upstream (i.e., dropping it without injecting a 0-byte replacement) is achieved by having the external processor respond to the corresponding ProcessingRequest_Body with a ProcessingResponse containing a BodyMutation where the mutation is set to clear_body: true. This clears the chunk without replacing it with any data, effectively dropping it from the stream forwarded downstream/upstream. This behavior is supported in streamed modes (STREAMED and FULL_DUPLEX_STREAMED), as evidenced by discussions and proposals where clear_body is used precisely for suppressing early chunks without sending replacements. Note that in FULL_DUPLEX_STREAMED mode, Envoy continues streaming chunks independently of processor responses (no blocking/waiting per chunk like in STREAMED mode), allowing asynchronous dropping via clear_body. No 0-byte body is needed; clear_body discards the chunk entirely. Example proto response structure (from external_processor.proto context): BodyMutation { mutation { clear_body: true } } If no mutation is provided or body is empty, the original chunk would forward unchanged, but clear_body explicitly suppresses it.
Citations:
- 1: Proposal: Support for New Processing Mode in Ext-proc Filter for Large Payloads Processing envoyproxy/envoy#32090
- 2: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/ext_proc/v3/ext_proc.proto
- 3: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/ext_proc/v3/processing_mode.proto
- 4: ext_proc: first body chunk not processed when changing mode to FULL_DUPLEX_STREAMED envoyproxy/envoy#40340
Use clear_body: true instead of empty StreamedBodyResponse{} to properly suppress accumulated chunks.
During accumulation, the code returns an empty StreamedBodyResponse{} within a BodyMutation, intending to suppress the current chunk. However, per Envoy ext_proc behavior, an empty body does not suppress the chunk — it passes the original chunk through unchanged. The correct suppression mechanism in FULL_DUPLEX_STREAMED mode is to set clear_body: true in the BodyMutation.
Replace the empty StreamedBodyResponse{} pattern at lines 331 and 533 with a BodyMutation containing clear_body: true to properly drop chunks while accumulating.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-runtime/policy-engine/internal/kernel/execution_context.go`
around lines 317 - 338, The accumulated-chunk suppression is incorrect: code
returns an empty StreamedBodyResponse{} inside a BodyMutation (via
BodyMutation_StreamedResponse) which does not clear the chunk; instead, modify
the return in the block guarded by ec.anyPolicyNeedsMoreRequestData(...) so the
BodyMutation uses the clear_body flag (set the BodyMutation to indicate
ClearBody: true) rather than constructing a StreamedBodyResponse{}, i.e. replace
the BodyMutation_StreamedResponse/StreamedBodyResponse{} path with a
BodyMutation that sets clear_body (ClearBody true) in the
extprocv3.ProcessingResponse returned by this function.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/sample-policies/count-letters/countletters.go (1)
52-56: ⚠️ Potential issue | 🟡 Minor

Unchecked type assertions may panic on malformed parameters.

Lines 52 and 55 perform direct type assertions without guarding against `nil` or unexpected types. If `params["letters"]` is missing or not `[]interface{}`, this will panic at runtime.

Consider adding defensive checks:
🛡️ Suggested defensive check
```diff
-lettersRaw := params["letters"].([]interface{})
-letters := make([]string, len(lettersRaw))
-for i, letterRaw := range lettersRaw {
-	letters[i] = letterRaw.(string)
-}
+lettersRaw, ok := params["letters"].([]interface{})
+if !ok {
+	slog.Warn("[Count Letters]: missing or invalid 'letters' parameter")
+	return p.generateEmptyResponse(params)
+}
+letters := make([]string, 0, len(lettersRaw))
+for _, letterRaw := range lettersRaw {
+	if s, ok := letterRaw.(string); ok {
+		letters = append(letters, s)
+	}
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/sample-policies/count-letters/countletters.go` around lines 52 - 56, The code does unchecked assertions on params["letters"] and on each element which can panic; update the logic around lettersRaw, letters and params["letters"] to (1) check that params["letters"] exists and is non-nil, (2) use the comma-ok pattern to safely assert params["letters"] as []interface{} (e.g. lettersRaw, ok := params["letters"].([]interface{})) and return an error if not ok, and (3) for the loop safely assert each element to string using the comma-ok form (e.g. s, ok := letterRaw.(string)) and handle non-string elements (return an error or skip) before populating letters, so CountLetters/handler will not panic on malformed input.
🧹 Nitpick comments (1)
gateway/gateway-runtime/policy-engine/internal/executor/chain.go (1)
1008-1022: JSON round-trip for deep copy has performance implications for large parameters.

The `deepCopyParams` function uses JSON marshal/unmarshal for deep copying. While correct, this has O(n) allocation overhead and may be slow for policies with large parameter maps. The PR objectives mention a "hardcoded 10MB accumulation cap" concern—this deep copy could contribute to memory pressure under high load.

Consider documenting this limitation or using a more efficient deep-copy library if parameter sizes become an issue.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/executor/chain.go` around lines 1008 - 1022, The deepCopyParams function uses a JSON marshal/unmarshal round-trip which is correct but can cause heavy allocations and CPU overhead for large maps; update deepCopyParams to use a more efficient deep-copy approach (e.g., implement a recursive copy for map[string]interface{} and []interface{} or switch to a well-maintained deep-copy library) to avoid JSON serialization overhead, and add a short comment above deepCopyParams documenting the previous JSON approach and the 10MB accumulation cap concern so future maintainers understand the trade-off.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@gateway/sample-policies/count-letters/countletters.go`:
- Around line 52-56: The code does unchecked assertions on params["letters"] and
on each element which can panic; update the logic around lettersRaw, letters and
params["letters"] to (1) check that params["letters"] exists and is non-nil, (2)
use the comma-ok pattern to safely assert params["letters"] as []interface{}
(e.g. lettersRaw, ok := params["letters"].([]interface{})) and return an error
if not ok, and (3) for the loop safely assert each element to string using the
comma-ok form (e.g. s, ok := letterRaw.(string)) and handle non-string elements
(return an error or skip) before populating letters, so CountLetters/handler
will not panic on malformed input.
---
Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/executor/chain.go`:
- Around line 1008-1022: The deepCopyParams function uses a JSON
marshal/unmarshal round-trip which is correct but can cause heavy allocations
and CPU overhead for large maps; update deepCopyParams to use a more efficient
deep-copy approach (e.g., implement a recursive copy for map[string]interface{}
and []interface{} or switch to a well-maintained deep-copy library) to avoid
JSON serialization overhead, and add a short comment above deepCopyParams
documenting the previous JSON approach and the 10MB accumulation cap concern so
future maintainers understand the trade-off.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0774c1a8-cb2e-4631-a0ba-45d9ac2cec28
📒 Files selected for processing (7)
- gateway/build-lock.yaml
- gateway/gateway-runtime/policy-engine/internal/executor/chain.go
- gateway/gateway-runtime/policy-engine/internal/executor/chain_bench_test.go
- gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go
- gateway/gateway-runtime/policy-engine/internal/kernel/translator.go
- gateway/sample-policies/count-letters/countletters.go
- gateway/sample-policies/uppercase-body/uppercasebody.go
✅ Files skipped from review due to trivial changes (1)
- gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go
🚧 Files skipped from review as they are similar to previous changes (2)
- gateway/build-lock.yaml
- gateway/gateway-runtime/policy-engine/internal/kernel/translator.go
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go (1)
345-425: ⚠️ Potential issue | 🔴 Critical

Missing `RequiresRequestHeader` and `RequiresResponseHeader` detection.

This `buildPolicyChain` implementation is missing the header requirement detection that exists in `kernel/xds.go`. The `ConfigLoader.buildPolicyChain` version includes:

```go
if _, ok := impl.(policy.RequestHeaderPolicy); ok {
	requiresRequestHeader = true
}
if _, ok := impl.(policy.ResponseHeaderPolicy); ok {
	requiresResponseHeader = true
}
```

And populates `RequiresRequestHeader`/`RequiresResponseHeader` in the chain struct. This handler is the production xDS path, so chains built via ADS will have these fields unset (defaulting to `false`), potentially causing incorrect ext_proc mode selection.

🐛 Proposed fix to add missing header requirement detection
```diff
 requiresRequestBody := false
 requiresResponseBody := false
 hasExecutionConditions := false
 supportsRequestStreaming := true
 supportsResponseStreaming := true
 hasRequestBodyPolicy := false
 hasResponseBodyPolicy := false
+requiresRequestHeader := false
+requiresResponseHeader := false
 for _, policyConfig := range config.Policies {
```

Inside the loop, after the body mode checks (around line 407):
```diff
 if mode.ResponseBodyMode == policy.BodyModeBuffer || mode.ResponseBodyMode == policy.BodyModeStream {
 	requiresResponseBody = true
 	hasResponseBodyPolicy = true
 	if _, streaming := impl.(policy.StreamingResponsePolicy); !streaming {
 		supportsResponseStreaming = false
 	}
 }
+
+if _, ok := impl.(policy.RequestHeaderPolicy); ok {
+	requiresRequestHeader = true
+}
+if _, ok := impl.(policy.ResponseHeaderPolicy); ok {
+	requiresResponseHeader = true
+}
 }
```

In the chain construction (around line 417):
```diff
 chain := &registry.PolicyChain{
 	Policies:                  policyList,
 	PolicySpecs:               policySpecs,
 	RequiresRequestBody:       requiresRequestBody,
 	RequiresResponseBody:      requiresResponseBody,
 	HasExecutionConditions:    hasExecutionConditions,
+	RequiresRequestHeader:     requiresRequestHeader,
+	RequiresResponseHeader:    requiresResponseHeader,
 	SupportsRequestStreaming:  supportsRequestStreaming,
 	SupportsResponseStreaming: supportsResponseStreaming,
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go` around lines 345 - 425, The buildPolicyChain function is missing detection for header-requiring policies; inside ResourceHandler.buildPolicyChain, after the body-mode checks for each impl (and before appending/ending the loop), check type assertions for policy.RequestHeaderPolicy and policy.ResponseHeaderPolicy and set local booleans requiresRequestHeader and requiresResponseHeader accordingly; then populate the resulting registry.PolicyChain fields RequiresRequestHeader and RequiresResponseHeader from those booleans when constructing chain so ADS-built chains reflect header requirements.
🧹 Nitpick comments (2)
gateway/gateway-runtime/policy-engine/internal/kernel/kernel_test.go (1)
636-650: Test is redundant and name is misleading.

This test is now identical to `TestBuildAnalyticsStruct_WithNilSharedContext` (lines 619-634):

- Both set `sharedCtx: nil`
- Both pass the same input data
- Both assert the same outcome

The name `WithNilRequestContext` is misleading because `PolicyExecutionContext` no longer has a `requestContext` field—it has separate phase contexts (`requestHeaderCtx`, `requestBodyCtx`, etc.). Since `buildAnalyticsStruct` only uses `sharedCtx`, consider either:
- Removing this redundant test, or
- Repurposing it to test a different nil scenario if needed
🧹 Suggested cleanup: Remove redundant test
```diff
-func TestBuildAnalyticsStruct_WithNilRequestContext(t *testing.T) {
-	execCtx := &PolicyExecutionContext{
-		sharedCtx: nil,
-	}
-
-	data := map[string]any{
-		"key": "value",
-	}
-
-	result, err := buildAnalyticsStruct(data, execCtx)
-
-	require.NoError(t, err)
-	require.NotNil(t, result)
-	assert.Len(t, result.Fields, 1)
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/kernel/kernel_test.go` around lines 636 - 650, The test TestBuildAnalyticsStruct_WithNilRequestContext is redundant with TestBuildAnalyticsStruct_WithNilSharedContext because both set execCtx.sharedCtx = nil and assert the same outcome; remove TestBuildAnalyticsStruct_WithNilRequestContext from kernel_test.go to avoid duplication, or if you prefer to keep coverage, repurpose it to exercise a different nil scenario (e.g., keep execCtx.sharedCtx non-nil and set a specific phase context like requestHeaderCtx or requestBodyCtx to nil) while calling buildAnalyticsStruct so it tests a distinct behavior.

gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go (1)
343-345: Consider extracting shared chain-building logic to avoid drift.

The comment acknowledges this is a copy of `ConfigLoader.buildPolicyChain`. The implementations have already diverged (missing header detection above). Consider extracting the chain-building logic into a shared function in the `registry` package that both `ConfigLoader` and `ResourceHandler` can call, reducing maintenance burden and preventing future inconsistencies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go` around lines 343 - 345, Extract the duplicated policy-chain construction logic out of ResourceHandler.buildPolicyChain and ConfigLoader.buildPolicyChain into a single exported helper in the registry package (e.g., registry.NewPolicyChainFromConfig or similar); move the common logic there, ensure the missing header-detection behavior present in ResourceHandler is incorporated into the shared helper, update both callers (ResourceHandler.buildPolicyChain and ConfigLoader.buildPolicyChain) to call the new registry function and remove duplicated code, and add tests to the registry helper to cover header detection and any other policy-chain edge cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go`:
- Around line 345-425: The buildPolicyChain function is missing detection for
header-requiring policies; inside ResourceHandler.buildPolicyChain, after the
body-mode checks for each impl (and before appending/ending the loop), check
type assertions for policy.RequestHeaderPolicy and policy.ResponseHeaderPolicy
and set local booleans requiresRequestHeader and requiresResponseHeader
accordingly; then populate the resulting registry.PolicyChain fields
RequiresRequestHeader and RequiresResponseHeader from those booleans when
constructing chain so ADS-built chains reflect header requirements.
---
Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/kernel/kernel_test.go`:
- Around line 636-650: The test TestBuildAnalyticsStruct_WithNilRequestContext
is redundant with TestBuildAnalyticsStruct_WithNilSharedContext because both set
execCtx.sharedCtx = nil and assert the same outcome; remove
TestBuildAnalyticsStruct_WithNilRequestContext from kernel_test.go to avoid
duplication, or if you prefer to keep coverage, repurpose it to exercise a
different nil scenario (e.g., keep execCtx.sharedCtx non-nil and set a specific
phase context like requestHeaderCtx or requestBodyCtx to nil) while calling
buildAnalyticsStruct so it tests a distinct behavior.
In `@gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go`:
- Around line 343-345: Extract the duplicated policy-chain construction logic
out of ResourceHandler.buildPolicyChain and ConfigLoader.buildPolicyChain into a
single exported helper in the registry package (e.g.,
registry.NewPolicyChainFromConfig or similar); move the common logic there,
ensure the missing header-detection behavior present in ResourceHandler is
incorporated into the shared helper, update both callers
(ResourceHandler.buildPolicyChain and ConfigLoader.buildPolicyChain) to call the
new registry function and remove duplicated code, and add tests to the registry
helper to cover header detection and any other policy-chain edge cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 62406cee-9da9-4a9c-869c-5dfe301e2f52
📒 Files selected for processing (4)
- gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go
- gateway/gateway-runtime/policy-engine/internal/kernel/kernel_test.go
- gateway/gateway-runtime/policy-engine/internal/kernel/xds.go
- gateway/gateway-runtime/policy-engine/internal/xdsclient/handler.go
🚧 Files skipped from review as they are similar to previous changes (1)
- gateway/gateway-runtime/policy-engine/internal/kernel/extproc.go
```go
	ExecutionTime: executionTime,
})

if _, ok := action.(policy.ImmediateResponse); ok {
```
We can first check ImmediateResponse before UpstreamRequestHeaderModifications.
```go
// maxStreamAccumulatorSize caps the amount of data accumulated before forcing
// a flush, preventing unbounded memory growth from large streaming bodies.
const maxStreamAccumulatorSize = 10 * 1024 * 1024 // 10 MB
```
We can make this a config, we can do this later.
TODO: Need to check if body processing needs to be done looking at the Mode function of the policy as well.
…ultiple policies, enhancing functionality and compatibility.
…alidation logic for v1alpha2 policies
…ment LRU caching for compiled CEL programs.
…s for improved clarity
….9.3 for llm-cost-based-ratelimit
…just existing entries
…istency across various policies
…nfo for improved clarity
…or accurate request handling


Purpose
Issue:
PR Add Policy Engine changes to support alpha2 interfaces and streaming #1488: Policy Engine v1alpha2 + Streaming Support — Architectural Review
1. What Was Done
This PR migrates the WSO2 API Platform's policy engine from v1alpha to v1alpha2, fundamentally restructuring how policies process HTTP traffic. The changes span 74 files (+2,914 -1,007 lines).
Core Changes
A. Multi-Phase Execution Model (from 2 phases → 6 phases)
This separation allows header-only policies (auth, CORS, rate-limiting) to execute before the body is even read, reducing latency and enabling early rejection.
B. First-Class Streaming Support
Policies can now implement StreamingRequestPolicy / StreamingResponsePolicy to process body chunks incrementally. The chain auto-detects capabilities at build time:
C. SDK Contract (sdk/core/policy/v1alpha2)
D. Envoy ext_proc Configuration
E. Policy Version Bump
40+ default policies updated to v0.9.x, signaling they implement the new interfaces.
2. Improvement Suggestions
High Priority
Streaming policy test for TestValidateGoInterface_V1Alpha2StreamingPolicy validates the wrong thing. The test expects an error "missing required OnRequest() method" for a streaming policy that only implements OnResponseBodyChunk. But the validation code treats OnResponseBodyChunk as satisfying hasOnResponse, and there is no OnRequestBody/OnRequestHeaders/OnRequestBodyChunk method — so the error should be about missing a request phase method, not "OnRequest". The error message references the old v1alpha method name, which is confusing for v1alpha2 users. The validation error messages should be updated to reflect the new interface names.
No graceful fallback for streaming-to-buffered downgrade mid-stream. If processResponseBody() starts in streaming mode but encounters an error after chunks have already been sent downstream, there's no recovery mechanism: once chunks are flushed to Envoy, ImmediateResponse is meaningless. Consider:
Medium Priority
CEL evaluator compiles once per unique expression but has no eviction. The map[string]cel.Program cache grows unboundedly. In environments with dynamic per-route conditions, this is a memory leak. Add a bounded LRU cache.
Missing streaming condition evaluation tests. EvaluateStreamingRequestCondition and EvaluateStreamingResponseCondition are added to the interface but I see no test coverage for them in the diff.
NeedsMoreRequestData(accumulated []byte) coupling. The streaming interfaces require each policy to inspect raw accumulated bytes to decide if more data is needed. This forces policies to implement their own framing/parsing. A higher-level abstraction (e.g., SSE frame boundary detection) would reduce boilerplate across LLM-specific policies.
Low Priority
The regex change ^/pets/([^/]+)/history → ^/pets/([^/]+)/history.* in the dev envoy.yaml broadens the match significantly. This appears to be for dev/testing only, but confirm it doesn't leak into production configs.
DropHeaderAction in actions has a string Action field ("allow"/"deny") that should be an enum/const for type safety.
3. Sequence Diagram — Full Request/Response Flow
Overall Assessment
This is a well-architected evolution of the policy engine. The key strengths:
The main risks are around the hard break for v1alpha policies (no adapter/shim), the hardcoded 10MB streaming cap, and the JSON-based param deep-copy performance. I'd recommend addressing the backward compatibility story and making the streaming buffer configurable before merging.
Goals
Approach
User stories
Documentation
Automation tests
Security checks
Samples
Related PRs
Test environment
Summary by CodeRabbit
New Features
Policy Updates
Enhancements
Refactor