📝 Walkthrough

This PR updates Argo CD project wiring, adds an AppProject and RBAC roles, introduces Helm NetworkPolicy/PDB templates and values, adds PostgreSQL/Redis backup and restore scripts and runbooks, adjusts observability (Prometheus, OTEL, Grafana), tweaks docker-compose and the Vite port, and bumps OpenTelemetry Go deps.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
⚠️ Warning: the review ran into problems. 🔥 Git: failed to clone the repository.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
k8s/helm/grainguard/values.yaml (1)
220-266: 🧹 Nitpick | 🔵 Trivial
Consider adding PDB and NetworkPolicy to remaining services.
Services `cdc-transformer`, `dlq-reprocessor`, and `search-indexer` lack `pdb` and `networkPolicy` configurations. While `cdc-transformer` and `dlq-reprocessor` have `replicaCount: 1` (making a PDB less useful), `search-indexer` could benefit from these for consistency. If this is intentional (e.g., these are non-critical background workers), consider adding a comment to document the decision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@k8s/helm/grainguard/values.yaml` around lines 220 - 266, Add explicit pdb and networkPolicy configuration entries for the cdc-transformer, dlq-reprocessor, and search-indexer blocks in values.yaml (or document why omitted); specifically, add a pdb subsection (even if disabled when replicaCount: 1) and a networkPolicy subsection under each service (e.g., cdc-transformer, dlq-reprocessor, search-indexer) with sensible defaults or disabled flags so chart consumers see the options; if you intentionally want them omitted, add a short comment next to the replicaCount or service block stating they are non-critical background workers and why pdb/networkPolicy are not provided.
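A sketch of what those subsections could look like, assuming the same `pdb`/`networkPolicy` key shape the chart's other services use. The replica counts and defaults below are illustrative, not the chart's actual values:

```yaml
# Illustrative only — mirror the key names used by the chart's other services.
cdc-transformer:
  replicaCount: 1        # non-critical background worker
  pdb:
    enabled: false       # no-op at 1 replica; flip on if scaled up
  networkPolicy:
    enabled: false

search-indexer:
  replicaCount: 2        # assumed; check values.yaml
  pdb:
    enabled: true
    minAvailable: 1
  networkPolicy:
    enabled: true
```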
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@argocd/app-of-apps.yaml`:
- Around line 11-13: Two manifests define the same Application identity
(metadata.name: grainguard-apps in namespace: argocd) but point to different
spec.source.path values, causing last-write-wins drift; resolve by choosing one
canonical root App-of-Apps or renaming one Application to a unique metadata.name
and update any scripts/RBAC/policy references that depend on that name, or
consolidate the differing spec.source.path entries into the single canonical
manifest so there is only one grainguard-apps Application in the cluster.
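If the rename route is taken, a sketch of the secondary root app's new identity (the legacy name is illustrative):

```yaml
# Sketch: rename one root Application so the two manifests no longer share
# an identity; "grainguard-apps-legacy" is illustrative.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grainguard-apps-legacy
  namespace: argocd
```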
In `@docs/runbooks/postgres-backup-restore.md`:
- Around line 27-29: Replace the hardcoded absolute path example in
docs/runbooks/postgres-backup-restore.md with an instruction to run the script
from the repo root: remove the "cd /Users/..." line and instead prefix the
script invocation with a comment like "From the repository root" and run
./infra/scripts/backup-postgres.sh (and make the identical change for the second
occurrence later in the file around the restore/other example), so users use
relative paths to their own clones rather than a local absolute path.
- Around line 15-16: Update the Markdown links in
docs/runbooks/postgres-backup-restore.md to avoid machine-specific absolute
paths by using repository-relative paths for the scripts; replace
`/Users/pahuldeep/Developer/grainguard/infra/scripts/backup-postgres.sh` and
`/Users/pahuldeep/Developer/grainguard/infra/scripts/restore-postgres.sh` with
relative links such as `infra/scripts/backup-postgres.sh` and
`infra/scripts/restore-postgres.sh` (or use `./infra/scripts/...`) so
contributors can open the scripts regardless of their local clone location.
In `@docs/runbooks/redis-backup-restore.md`:
- Around line 14-15: Update the markdown to stop linking to machine-specific
absolute paths and to remove hardcoded cd commands: replace the absolute links
to backup-redis.sh and restore-redis.sh with relative repository paths (e.g.,
./infra/scripts/backup-redis.sh and ./infra/scripts/restore-redis.sh or path
relative to the docs folder) and remove or change the hardcoded "cd /Users/..."
occurrences to a generic project-root relative `cd` (or instruct readers to run
commands from the repo root) where the docs currently reference the
user-specific path; target the links for backup-redis.sh and restore-redis.sh
and the two "cd" instances referenced in the file.
In `@infra/docker/docker-compose.yml`:
- Line 660: The port mapping "13133:13133" exposes the collector health endpoint
on all host interfaces; change the mapping so the port is bound to loopback or
internal-only (e.g., replace "13133:13133" with "127.0.0.1:13133:13133" or
remove the host mapping and expose only on the internal network) in the
docker-compose.yml service that currently declares the "13133:13133" mapping to
ensure the health endpoint is not reachable from external hosts.
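The loopback binding suggested above, as a compose fragment (the service name is an assumption):

```yaml
# Sketch: bind the collector health endpoint to loopback only.
services:
  otel-collector:        # service name assumed
    ports:
      - "127.0.0.1:13133:13133"
```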
In `@infra/docker/grafana/provisioning/dashboards/grainguard-overview.json`:
- Around line 10-11: The dashboard panels "Kafka Consumer Lag" and "Publish
Queue Depth" use Prometheus expressions that include "or vector(0)" for metrics
kafka_consumer_lag, publish_queue_depth, worker_queue_depth and
commit_queue_depth which masks missing telemetry; either (A) add/enable the
scrape targets and instrumentation that emit those metric names so the
expressions in the targets (e.g., sum(kafka_consumer_lag) and
sum(publish_queue_depth) / sum(worker_queue_depth) / sum(commit_queue_depth))
return real series, or (B) remove the "or vector(0)" from the target expressions
in those panels (titles "Kafka Consumer Lag" and "Publish Queue Depth") so
Grafana surfaces "no data" instead of showing zeros, or replace the expressions
with existing, backed metrics used elsewhere in the gateway if appropriate.
In `@infra/scripts/backup-redis.sh`:
- Around line 13-14: Replace the blocking redis-cli SAVE call with a
non-blocking BGSAVE and wait loop: invoke docker exec "${REDIS_CONTAINER}"
redis-cli BGSAVE, then poll docker exec "${REDIS_CONTAINER}" redis-cli LASTSAVE
in a loop (with a short sleep and timeout) until LASTSAVE is newer than the
start timestamp; update the echo/log messages accordingly and ensure the script
fails if the BGSAVE does not complete within the timeout to avoid hanging
indefinitely.
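The BGSAVE-then-poll pattern described above can be sketched as below. The polling helper takes the LASTSAVE-reading command as a parameter so the loop can be exercised without Docker; the container variable and the 60s timeout are assumptions mirroring the script:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Poll until the command in $1 prints a timestamp newer than $2 (the LASTSAVE
# value captured before BGSAVE was issued), failing after $3 seconds.
wait_for_lastsave() {
  local read_cmd="$1" start_ts="$2" timeout_s="$3" waited=0
  until [ "$($read_cmd)" -gt "$start_ts" ]; do
    sleep 1
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "ERROR: BGSAVE did not complete within ${timeout_s}s" >&2
      return 1
    fi
  done
}

# Intended usage (requires a live container; not run here):
#   start_ts=$(docker exec "${REDIS_CONTAINER}" redis-cli LASTSAVE)
#   docker exec "${REDIS_CONTAINER}" redis-cli BGSAVE
#   wait_for_lastsave "docker exec ${REDIS_CONTAINER} redis-cli LASTSAVE" "${start_ts}" 60
```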
In `@infra/scripts/restore-postgres.sh`:
- Around line 41-43: The pg_restore invocation in restore-postgres.sh currently
includes the redundant flags --clean and --if-exists when restoring into a
freshly DROP/CREATE'd database; remove these flags from the pg_restore command
that uses "${dump_file}" so the command no longer passes --clean or --if-exists
(leave other flags like --no-owner intact) to avoid unnecessary/no-op options.
- Around line 34-37: The DROP DATABASE command fails when other clients hold
connections; before the docker exec psql call that runs DROP DATABASE for
"${db}" (using "${container}" and "${PGUSER}"), run a psql command that
terminates all other backends for that database (e.g., query pg_stat_activity
and call pg_terminate_backend(pid) for pid != pg_backend_pid()), then proceed to
DROP and CREATE; alternatively update the script to require stopping dependent
services before restore and document that requirement.
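The termination step can be sketched as a query builder so the SQL literal is quoted safely before being passed to psql; the psql invocation is shown only as a comment, mirroring the script's variables:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the backend-termination query for one database; embedded single
# quotes in the name are doubled to keep the literal valid.
terminate_sql() {
  local q="'" db="$1"
  db="${db//$q/$q$q}"
  printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s' AND pid <> pg_backend_pid();" "$db"
}

# Intended usage (requires a live container; not run here):
#   docker exec "${container}" psql -U "${PGUSER}" -d postgres \
#     -v ON_ERROR_STOP=1 -c "$(terminate_sql "${db}")"
```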
- Around line 35-37: The script interpolates ${db} directly into SQL for
DROP/CREATE which is unsafe for identifiers; update the psql commands that use
${db} (the DROP and CREATE statements executed by docker exec ... psql -U
"${PGUSER}" -d postgres -v ON_ERROR_STOP=1) to quote the identifier instead of
plain interpolation (e.g., construct the SQL using double-quoted identifiers so
the database name is wrapped in quotes), and ensure any embedded quotes in ${db}
are handled (e.g., via proper shell quoting/escaping or passing the identifier
as a psql variable) so both DROP DATABASE and CREATE DATABASE use a quoted
identifier.
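The quoting step could look like the sketch below: a helper doubles any embedded double quotes and wraps the name, so the generated SQL never interpolates `${db}` bare. The psql invocation is shown only as a comment, mirroring the script's variables:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Quote a name as a SQL identifier: double embedded quotes, wrap in "".
quote_ident() {
  local name="$1"
  printf '"%s"' "${name//\"/\"\"}"
}

db="grainguard"
drop_sql="DROP DATABASE IF EXISTS $(quote_ident "${db}");"
create_sql="CREATE DATABASE $(quote_ident "${db}");"

# docker exec "${container}" psql -U "${PGUSER}" -d postgres \
#   -v ON_ERROR_STOP=1 -c "${drop_sql}" -c "${create_sql}"
echo "${drop_sql}"
```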
In `@infra/scripts/restore-redis.sh`:
- Around line 19-20: The restore script currently copies RDB into the stopped
container but does not remove existing AOF, so Redis running with --appendonly
yes may replay AOF and override the restore; update restore-redis.sh to ensure
/data/appendonly.aof (and any appendonly.aof.*) is removed from the Redis data
volume before starting Redis: since docker exec won't work on a stopped
container, remove the AOF by either (a) mounting the same volume into a
short-lived helper container and rm /data/appendonly.aof after copying
${RDB_FILE} into the volume, or (b) start the ${REDIS_CONTAINER} in a controlled
way, immediately run docker exec ${REDIS_CONTAINER} rm -f /data/appendonly.aof
&& rm -f /data/appendonly.aof.* before allowing Redis to start; ensure the
script references REDIS_CONTAINER and RDB_FILE and logs the AOF removal step.
- Around line 22-23: After starting the container with docker start
"${REDIS_CONTAINER}", add a retry loop that repeatedly runs docker exec
"${REDIS_CONTAINER}" redis-cli PING until it returns PONG (or until a
configurable timeout/maximum attempts is reached), sleeping briefly between
attempts; if the loop times out, log an error and exit non‑zero. Place this loop
immediately after docker start and before any commands that assume Redis is
ready to ensure the script waits for the redis-cli PING to succeed.
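The readiness loop might look like this sketch; the PING command is injected as a parameter so the retry logic is testable without Docker, and the attempt count is an assumed default:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run the PING command in $1 until it answers PONG, giving up after $2 attempts.
wait_for_ping() {
  local ping_cmd="$1" max_attempts="$2" i=1
  while [ "$i" -le "$max_attempts" ]; do
    if [ "$($ping_cmd 2>/dev/null || true)" = "PONG" ]; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "ERROR: redis did not answer PING after ${max_attempts} attempts" >&2
  return 1
}

# Intended usage (requires a live container; not run here):
#   docker start "${REDIS_CONTAINER}"
#   wait_for_ping "docker exec ${REDIS_CONTAINER} redis-cli PING" 30
```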
In `@k8s/argocd/install.sh`:
- Around line 17-18: The install script applies a conflicting AppProject
(kubectl apply -f k8s/argocd/project.yaml) that can overwrite the role policies
applied by scripts/deploy.sh (argocd/project.yaml); fix by making a single
source of truth: remove the redundant apply from k8s/argocd/install.sh or change
it to reuse the same canonical AppProject (e.g., call the shared deploy script
or apply the single agreed project file), or add a guard that compares the two
project specs and aborts if they differ so the dev-deployer/prod-deployer
policies are never silently dropped.
In `@k8s/helm/grainguard/templates/networkpolicy.yaml`:
- Around line 22-34: The network policy fragment using
$networkPolicy.allowExternal currently emits only a ports list (no from clause),
which permits traffic from any source; update the template logic around
allowExternal in networkpolicy.yaml so that when allowExternal is true you
either (a) explicitly add a from selector limiting sources to your ingress
controller namespace/labels (e.g., match ingress controller
podSelector/namespaceSelector) or (b) document/guard that allowExternal must
only be set for gateway services—locate the block referencing
$networkPolicy.allowExternal and $svc.port/$svc.grpcPort/$svc.healthPort and
implement the chosen change (add a from: with appropriate selector when
restricting ingress, or add validation/commentary preventing accidental use on
internal services).
In `@k8s/helm/grainguard/templates/pdb.yaml`:
- Line 3: The PDB template currently forces replicaCount > 1 and ignores
explicit pdb.enabled: true; update the conditional so a PDB is rendered when the
service is enabled and either pdb.enabled is truthy or replicaCount > 1.
Concretely, modify the if-expression that uses $svc.enabled, $pdb.enabled, and
$svc.replicaCount so $pdb.enabled (with a sensible default) can opt-in
regardless of replicaCount—i.e., test pdb.enabled OR replicaCount > 1 instead of
requiring replicaCount > 1 in all cases.
---
Outside diff comments:
In `@k8s/helm/grainguard/values.yaml`:
- Around line 220-266: Add explicit pdb and networkPolicy configuration entries
for the cdc-transformer, dlq-reprocessor, and search-indexer blocks in
values.yaml (or document why omitted); specifically, add a pdb subsection (even
if disabled when replicaCount: 1) and a networkPolicy subsection under each
service (e.g., cdc-transformer, dlq-reprocessor, search-indexer) with sensible
defaults or disabled flags so chart consumers see the options; if you
intentionally want them omitted, add a short comment next to the replicaCount or
service block stating they are non-critical background workers and why
pdb/networkPolicy are not provided.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c9f0978e-fd4e-45cd-8896-99cb55caf913
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (26)
- `apps/dashboard/vite.config.ts`
- `argocd/app-of-apps.yaml`
- `argocd/environments/dev.yaml`
- `argocd/environments/prod.yaml`
- `argocd/project.yaml`
- `docs/runbooks/postgres-backup-restore.md`
- `docs/runbooks/redis-backup-restore.md`
- `go.mod`
- `infra/docker/docker-compose.yml`
- `infra/docker/grafana/provisioning/dashboards/grainguard-overview.json`
- `infra/docker/grafana/provisioning/datasources/datasources.yaml`
- `infra/docker/otel-collector-config.yaml`
- `infra/docker/prometheus.yml`
- `infra/scripts/backup-postgres.sh`
- `infra/scripts/backup-redis.sh`
- `infra/scripts/restore-postgres.sh`
- `infra/scripts/restore-redis.sh`
- `k8s/argocd/app-of-apps.yaml`
- `k8s/argocd/apps/grainguard-dev.yaml`
- `k8s/argocd/apps/grainguard-prod.yaml`
- `k8s/argocd/install.sh`
- `k8s/argocd/project.yaml`
- `k8s/helm/grainguard/templates/deployment.yaml`
- `k8s/helm/grainguard/templates/networkpolicy.yaml`
- `k8s/helm/grainguard/templates/pdb.yaml`
- `k8s/helm/grainguard/values.yaml`
```sh
echo "==> Applying GrainGuard ArgoCD project..."
kubectl apply -f k8s/argocd/project.yaml
```
This applies an RBAC-incomplete AppProject variant and can remove role policies.
Line 18 applies k8s/argocd/project.yaml, while scripts/deploy.sh applies argocd/project.yaml (same AppProject identity, different spec content). Last apply wins, so dev-deployer/prod-deployer policies can be silently dropped.
🔧 Proposed fix (single source of truth)
```diff
 echo "==> Applying GrainGuard ArgoCD project..."
-kubectl apply -f k8s/argocd/project.yaml
+kubectl apply -f argocd/project.yaml
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/argocd/install.sh` around lines 17 - 18, The install script applies a
conflicting AppProject (kubectl apply -f k8s/argocd/project.yaml) that can
overwrite the role policies applied by scripts/deploy.sh (argocd/project.yaml);
fix by making a single source of truth: remove the redundant apply from
k8s/argocd/install.sh or change it to reuse the same canonical AppProject (e.g.,
call the shared deploy script or apply the single agreed project file), or add a
guard that compares the two project specs and aborts if they differ so the
dev-deployer/prod-deployer policies are never silently dropped.
```yaml
{{- if $networkPolicy.allowExternal }}
- ports:
    - protocol: TCP
      port: {{ $svc.port }}
    {{- if $svc.grpcPort }}
    - protocol: TCP
      port: {{ $svc.grpcPort }}
    {{- end }}
    {{- if and $svc.healthPort (ne $svc.healthPort $svc.port) }}
    - protocol: TCP
      port: {{ $svc.healthPort }}
    {{- end }}
{{- end }}
```
🧹 Nitpick | 🔵 Trivial
Clarify: allowExternal permits traffic from any source.
The ingress rule for allowExternal: true has no from selector, meaning it allows traffic from any source (inside or outside the cluster) on the specified ports. This is correct for the gateway service (LoadBalancer), but ensure this flag isn't accidentally enabled on internal services.
```yaml
# Current behavior when allowExternal: true
- ports:          # No 'from' selector = allow from anywhere
    - protocol: TCP
      port: {{ $svc.port }}
```

If you want to restrict external traffic to only come through an ingress controller, you'd need to add a `from` selector with the ingress namespace/labels.
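A sketch of such a restriction; the ingress-controller namespace and pod labels below are assumptions to be matched against the actual controller deployment:

```yaml
# Sketch only: limit allowExternal ingress to an assumed ingress-nginx
# controller; adjust the namespace/pod labels to your environment.
{{- if $networkPolicy.allowExternal }}
- from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
      podSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
  ports:
    - protocol: TCP
      port: {{ $svc.port }}
{{- end }}
```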
🧰 Tools
🪛 YAMLlint (1.38.0)
[error] 22-22: too many spaces inside braces (braces)
[error] 25-25: too many spaces inside braces (braces)
[error] 25-25: too many spaces inside braces (braces)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/helm/grainguard/templates/networkpolicy.yaml` around lines 22 - 34, The
network policy fragment using $networkPolicy.allowExternal currently emits only
a ports list (no from clause), which permits traffic from any source; update the
template logic around allowExternal in networkpolicy.yaml so that when
allowExternal is true you either (a) explicitly add a from selector limiting
sources to your ingress controller namespace/labels (e.g., match ingress
controller podSelector/namespaceSelector) or (b) document/guard that
allowExternal must only be set for gateway services—locate the block referencing
$networkPolicy.allowExternal and $svc.port/$svc.grpcPort/$svc.healthPort and
implement the chosen change (add a from: with appropriate selector when
restricting ingress, or add validation/commentary preventing accidental use on
internal services).
- `argocd/app-of-apps.yaml`: rename to grainguard-apps-legacy to prevent last-write-wins collision with k8s/argocd/app-of-apps.yaml (same name, different source paths)
- `k8s/argocd/project.yaml`: add dev-deployer/prod-deployer RBAC roles so install.sh no longer silently drops policies vs argocd/project.yaml
- `infra/docker/docker-compose.yml`: bind OTel health port to 127.0.0.1 so 13133 is not reachable from external interfaces
- `infra/scripts/restore-redis.sh`: clear AOF files before starting Redis to prevent appendonly.aof from overwriting restored RDB; add retry loop so PING check doesn't race against Redis startup
- `docs/runbooks`: replace hardcoded /Users/pahuldeep/Developer/grainguard absolute paths with relative paths for portability
- `k8s/helm/grainguard/templates/pdb.yaml`: respect explicit pdb.enabled: true for single-replica services instead of ignoring it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
♻️ Duplicate comments (1)
docs/runbooks/postgres-backup-restore.md (1)
14-16: ⚠️ Potential issue | 🟡 Minor
Markdown links won't resolve correctly.
Since this file is at `docs/runbooks/postgres-backup-restore.md`, the links `infra/scripts/...` resolve to `docs/runbooks/infra/scripts/...`, which doesn't exist. Use relative paths that traverse up to the repo root.

📝 Proposed fix

```diff
 - scripts:
-  - [`backup-postgres.sh`](infra/scripts/backup-postgres.sh)
-  - [`restore-postgres.sh`](infra/scripts/restore-postgres.sh)
+  - [`backup-postgres.sh`](../../infra/scripts/backup-postgres.sh)
+  - [`restore-postgres.sh`](../../infra/scripts/restore-postgres.sh)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/runbooks/postgres-backup-restore.md` around lines 14 - 16, The markdown links to the scripts are resolving relative to docs/runbooks; update the two links for `backup-postgres.sh` and `restore-postgres.sh` so they traverse up to the repo root (replace `infra/scripts/...` with `../../infra/scripts/...`) so the links point to the actual files in the infra/scripts directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/runbooks/redis-backup-restore.md`:
- Around line 13-15: Update the markdown links for the script references so they
resolve from docs/runbooks to the repository root: replace the current relative
links for `backup-redis.sh` and `restore-redis.sh` with links that traverse up
two levels (e.g., `../../infra/scripts/backup-redis.sh` and
`../../infra/scripts/restore-redis.sh`) so the `backup-redis.sh` and
`restore-redis.sh` links point to the correct files.
In `@k8s/helm/grainguard/templates/pdb.yaml`:
- Line 4: The helm template uses the default function so an explicit
pdb.enabled: false is treated as unset and ignored; update the logic that sets
$pdbEnabled (currently using "$pdb.enabled | default (gt $replicaCount 1)") to
first check whether the pdb.enabled key exists using hasKey and only fall back
to (gt $replicaCount 1) when pdb.enabled is not present; locate the pdb.enabled
reference and replace the default-based expression with a hasKey-based
conditional that honors an explicit false while still enabling PDBs by default
when replicaCount > 1.
---
Duplicate comments:
In `@docs/runbooks/postgres-backup-restore.md`:
- Around line 14-16: The markdown links to the scripts are resolving relative to
docs/runbooks; update the two links for `backup-postgres.sh` and
`restore-postgres.sh` so they traverse up to the repo root (replace
`infra/scripts/...` with `../../infra/scripts/...`) so the links point to the
actual files in the infra/scripts directory.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3b34c9b1-d12c-4280-84da-5e094d9389cb
📒 Files selected for processing (7)
- `argocd/app-of-apps.yaml`
- `docs/runbooks/postgres-backup-restore.md`
- `docs/runbooks/redis-backup-restore.md`
- `infra/docker/docker-compose.yml`
- `infra/scripts/restore-redis.sh`
- `k8s/argocd/project.yaml`
- `k8s/helm/grainguard/templates/pdb.yaml`
```markdown
- scripts:
  - [`backup-redis.sh`](infra/scripts/backup-redis.sh)
  - [`restore-redis.sh`](infra/scripts/restore-redis.sh)
```
Markdown links won't resolve correctly.
Same issue as the Postgres runbook—since this file is at docs/runbooks/redis-backup-restore.md, these links need to traverse up to the repo root.
📝 Proposed fix
```diff
 - scripts:
-  - [`backup-redis.sh`](infra/scripts/backup-redis.sh)
-  - [`restore-redis.sh`](infra/scripts/restore-redis.sh)
+  - [`backup-redis.sh`](../../infra/scripts/backup-redis.sh)
+  - [`restore-redis.sh`](../../infra/scripts/restore-redis.sh)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/runbooks/redis-backup-restore.md` around lines 13 - 15, Update the
markdown links for the script references so they resolve from docs/runbooks to
the repository root: replace the current relative links for `backup-redis.sh`
and `restore-redis.sh` with links that traverse up two levels (e.g.,
`../../infra/scripts/backup-redis.sh` and
`../../infra/scripts/restore-redis.sh`) so the `backup-redis.sh` and
`restore-redis.sh` links point to the correct files.
```yaml
{{- if and $svc.enabled (gt ($svc.replicaCount | default 1 | int) 1) }}
{{- $pdb := $svc.pdb | default dict }}
{{- $replicaCount := ($svc.replicaCount | default 1 | int) }}
{{- $pdbEnabled := $pdb.enabled | default (gt $replicaCount 1) }}
```
Explicit pdb.enabled: false is ignored for multi-replica services.
Helm's default function treats false as falsy, so $pdb.enabled | default (gt $replicaCount 1) returns true when replicaCount > 1 even if pdb.enabled is explicitly set to false. This prevents operators from opting out of PDBs for high-replica services.
Use hasKey to distinguish between "not set" and "explicitly false":
🔧 Proposed fix using hasKey
```diff
 {{- $pdb := $svc.pdb | default dict }}
 {{- $replicaCount := ($svc.replicaCount | default 1 | int) }}
-{{- $pdbEnabled := $pdb.enabled | default (gt $replicaCount 1) }}
+{{- $pdbEnabled := ternary $pdb.enabled (gt $replicaCount 1) (hasKey $pdb "enabled") }}
 {{- if and $svc.enabled $pdbEnabled }}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/helm/grainguard/templates/pdb.yaml` at line 4, The helm template uses the
default function so an explicit pdb.enabled: false is treated as unset and
ignored; update the logic that sets $pdbEnabled (currently using "$pdb.enabled |
default (gt $replicaCount 1)") to first check whether the pdb.enabled key exists
using hasKey and only fall back to (gt $replicaCount 1) when pdb.enabled is not
present; locate the pdb.enabled reference and replace the default-based
expression with a hasKey-based conditional that honors an explicit false while
still enabling PDBs by default when replicaCount > 1.
HIGH PRIORITY:
- #9: Fix CSRF timing attack — pad buffers to fixed length before timingSafeEqual
- #10: Upgrade circuit breaker to distributed (Redis-backed state sharing across pods)
- #8: Fix saga recovery infinite loop — mark corrupted payloads as FAILED instead of retrying

MEDIUM PRIORITY:
- #11: Add webhook idempotency check (dedup by endpoint_id + event_type)
- #12: Assert Stripe webhook body is Buffer before signature verification
- #13: Eliminate N+1 query in deviceTelemetry resolver (single JOIN query)
- #15: Add ORDER BY + LIMIT to saga FindByCorrelationID for deterministic results
- #17: Make critical audit events (auth, admin) throw on failure instead of silent swallow

LOW PRIORITY:
- #20: Add 10s query timeout to all saga repository DB operations
- #23: Add IsValidStatus validator for saga status constants
- #24: Set httpServer.timeout (30s) and keepAliveTimeout (65s) on BFF
- #25: Add RabbitMQ heartbeat (30s) and connection error/close handlers
- #7: Fix remaining saga JSON marshal error check (initialErr)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>