feat(staging): scaffold pre-prod environment and harden local/runtime flows #15

pahuldeepp merged 21 commits into master
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the CodeRabbit settings.
📝 Walkthrough

Adds a staging Terraform environment and ArgoCD/Helm staging deployment, introduces multiple k6 load tests, tightens error handling for SSO and account export/deletion (audit-aware), centralizes Docker Compose envs, updates README/runbooks/roadmap, and improves proto resolution for the gRPC device service.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
@@ -113,33 +133,38 @@ accountRouter.get(
  "/account/export",
  authMiddleware,
  async (req: Request, res: Response) => {
    const tenantId = req.user!.tenantId;

    const [tenantResult, usersResult, devicesResult, alertsResult, auditResult, keysResult] =
      await Promise.all([
        pool.query("SELECT id, name, slug, plan, email, created_at FROM tenants WHERE id = $1", [tenantId]),
        pool.query("SELECT id, email, role, created_at FROM tenant_users WHERE tenant_id = $1", [tenantId]),
        pool.query("SELECT id, serial_number, created_at FROM devices WHERE tenant_id = $1", [tenantId]),
        pool.query("SELECT id, name, metric, operator, threshold, enabled, created_at FROM alert_rules WHERE tenant_id = $1", [tenantId]),
        pool.query("SELECT id, event_type, actor_id, resource_type, payload, created_at FROM audit_events WHERE tenant_id = $1 ORDER BY created_at DESC LIMIT 1000", [tenantId]),
        pool.query("SELECT id, name, created_at, expires_at, revoked_at FROM api_keys WHERE tenant_id = $1", [tenantId]),
      ]);

    const exportData = {
      exportedAt: new Date().toISOString(),
      tenant: tenantResult.rows[0] || null,
      users: usersResult.rows,
      devices: devicesResult.rows,
      alertRules: alertsResult.rows,
      auditEvents: auditResult.rows,
      apiKeys: keysResult.rows,
Check failure: Code scanning / CodeQL
Missing rate limiting (High)
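One minimal way to address this finding, sketched here under the assumption that the gateway uses Express and the `express-rate-limit` package (the limiter values and the `exportHandler` name are illustrative, not from this PR):

```typescript
import rateLimit from "express-rate-limit";

// Hypothetical limiter for the export route: at most 5 exports per client
// every 15 minutes. Tune windowMs/max to expected tenant usage.
const exportLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 5,                   // reject the 6th request in the window with 429
  standardHeaders: true,    // emit RateLimit-* response headers
  legacyHeaders: false,
});

accountRouter.get(
  "/account/export",
  exportLimiter,  // runs before auth so rejected requests stay cheap
  authMiddleware,
  exportHandler   // the existing handler shown in the diff above
);
```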
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
apps/gateway/src/routes/sso.ts (2)
129-144: 🛠️ Refactor suggestion | 🟠 Major
Auth0 errors propagate unhandled on mutating routes.
The POST `/saml` route (and similarly `/oidc` and DELETE) calls Auth0 functions (`createSamlConnection`, `enableConnectionOnOrg`) without try/catch. If Auth0 returns an error (e.g., rate limit, invalid cert format, network timeout), it propagates as an unhandled exception. Unlike the GET route, which now gracefully degrades, these mutating routes will return 500 errors with potentially leaky error messages. Consider wrapping Auth0 calls with consistent error handling that returns structured error responses.
Example error handling pattern
try {
  const { connectionId } = await createSamlConnection({
    tenantId,
    name,
    signInUrl,
    signingCert,
    emailDomains,
  });
  await enableConnectionOnOrg(orgId, connectionId);
  // ... DB updates
} catch (error) {
  console.error("[sso] SAML setup failed:", error);
  const message = error instanceof Error ? error.message : "unknown";
  if (message.includes("SSO not configured")) {
    return res.status(503).json({ error: "sso_not_configured" });
  }
  return res.status(502).json({ error: "auth0_error", message: "Failed to configure SAML connection" });
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/routes/sso.ts` around lines 129 - 144, Wrap the Auth0 calls in the POST /saml handler (specifically the createSamlConnection and enableConnectionOnOrg calls and the subsequent DB update using pool.query) in a try/catch block, logging the caught error and returning a structured HTTP error response instead of letting exceptions bubble; map known error cases (e.g., message includes "SSO not configured") to a 503 with error key "sso_not_configured" and return a 502 with error key "auth0_error" and a generic message like "Failed to configure SAML connection" for other Auth0/network errors; apply the same try/catch/error-mapping pattern to the POST /oidc and DELETE SSO routes so all mutating SSO endpoints consistently handle and sanitize Auth0 errors.
84-94: ⚠️ Potential issue | 🟠 Major
Risk of inconsistent state without transaction or compensating action.
If `createOrganization` at line 85 succeeds but the `pool.query` UPDATE at lines 88-91 fails, the Auth0 org exists but `auth0_org_id` is not recorded. Subsequent calls will create another org (the idempotency check at line 79 reads from the DB, not Auth0). Since Auth0 is external, a simple transaction won't fully solve this. Consider:
- Wrap DB operations in a transaction
- On DB failure after Auth0 success, either retry the DB update or delete the Auth0 org
The same pattern applies to the POST `/saml`, POST `/oidc`, and DELETE routes below.
As per coding guidelines: "Verify every mutating route uses a real transaction client instead of pool-level BEGIN/COMMIT."
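A minimal sketch of the suggested pattern, assuming a `pg` pool and an Auth0 cleanup helper (here called `deleteOrganization`, which the current module may not expose):

```typescript
const client = await pool.connect();
let orgId: string | undefined;
try {
  await client.query("BEGIN");
  // External call first, so a later DB failure can still be compensated.
  const created = await createOrganization(tenantId, name);
  orgId = created.orgId;
  await client.query(
    "UPDATE tenants SET auth0_org_id = $1 WHERE id = $2",
    [orgId, tenantId]
  );
  await client.query("COMMIT");
} catch (err) {
  await client.query("ROLLBACK");
  if (orgId) {
    // Compensating action: remove the orphaned Auth0 org so retries stay idempotent.
    await deleteOrganization(orgId).catch((cleanupErr) =>
      console.error("[sso] failed to clean up Auth0 org:", cleanupErr)
    );
  }
  throw err;
} finally {
  client.release();
}
```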
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/routes/sso.ts` around lines 84 - 94, The code may create an Auth0 org via createOrganization(tenantId, rows[0].name) but then fail to persist auth0_org_id with pool.query, leaving an inconsistent external state; update the POST /sso handler to use a real DB transaction client (start a transaction client, use that client for the SELECT/UPDATE/INSERT operations instead of pool), and after createOrganization succeeds ensure either (a) retry the DB UPDATE within the same transaction client and commit only on success, or (b) if the DB update ultimately fails, call the Auth0 cleanup delete (using the orgId returned from createOrganization) as a compensating action before rolling back; apply the same pattern (transaction client + retry or compensating delete) to the POST /saml, POST /oidc and relevant DELETE routes that create external Auth0 resources so external and DB state remain consistent.

apps/gateway/src/services/device.ts (1)
50-56: ⚠️ Potential issue | 🔴 Critical
K8s deployments (staging and production) lack gRPC mTLS certificate volume mounts.
The concern is valid. Verification reveals that:

- Docker Compose correctly mounts `/certs:/certs` for both gateway and telemetry-service
- K8s base deployments (`gateway.yaml`, `telemetry-service.yaml`) have no volume mounts for certs
- The Helm template does not include a volumes/volumeMounts section
- Helm values files for staging and production contain no cert volume configuration

With the current K8s deployments, the `fs.existsSync("/certs/ca.crt")` check will always fail in both the gateway and telemetry-service, falling back to `grpc.credentials.createInsecure()`. This means unencrypted gRPC traffic in staging and production environments. Add volumeMounts and volumes to the K8s base manifests and/or Helm template to mount the TLS certificates, or update the code to source cert paths from environment variables that point to K8s secret-mounted paths.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/services/device.ts` around lines 50 - 56, The code currently checks fs.existsSync("/certs/ca.crt") and falls back to grpc.credentials.createInsecure(), causing unencrypted gRPC in K8s because certs are not mounted; fix by either (A) adding volume and volumeMount entries to the K8s base manifests/Helm chart and values so /certs (containing ca.crt, gateway-client.key, gateway-client.crt) is mounted into the gateway and telemetry-service pods, or (B) update the certificate loading in this module to read cert file paths from environment variables (e.g., CERT_DIR or CA_PATH) instead of hardcoding "/certs", and ensure Helm values and K8s manifests mount the Secret at those paths; ensure the code uses those env vars when calling fs.existsSync and grpc.credentials.createSsl so createSsl is used when certs are present.
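For option (B), a sketch of env-driven certificate loading with `@grpc/grpc-js`; `CERT_DIR` is an assumed variable name that nothing currently sets:

```typescript
import fs from "fs";
import path from "path";
import * as grpc from "@grpc/grpc-js";

function buildChannelCredentials(): grpc.ChannelCredentials {
  // CERT_DIR would point at the K8s secret mount; "/certs" keeps docker-compose working.
  const certDir = process.env.CERT_DIR ?? "/certs";
  const caPath = path.join(certDir, "ca.crt");
  if (!fs.existsSync(caPath)) {
    console.warn(`[grpc] ${caPath} not found, falling back to insecure credentials`);
    return grpc.credentials.createInsecure();
  }
  // mTLS: CA to verify the server, plus this client's key and cert chain.
  return grpc.credentials.createSsl(
    fs.readFileSync(caPath),
    fs.readFileSync(path.join(certDir, "gateway-client.key")),
    fs.readFileSync(path.join(certDir, "gateway-client.crt"))
  );
}
```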
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/dashboard/src/features/settings/SettingsPage.tsx`:
- Around line 106-123: The catch currently shows raw backend error codes (e.g.,
"internal_error") to users; add a small translation layer and use it before
calling toast.error. Implement a helper like translateExportError(errorCode:
string): string (add near SettingsPage or above the export handler) that maps
known codes (e.g., "internal_error" -> "An internal server error occurred.
Please try again later.", "not_authenticated" -> "Please sign in to export
data.", etc.) and returns a friendly default for unknown values; then replace
the toast.error call in the catch block to use
toast.error(translateExportError(e instanceof Error ? e.message : "")) so users
see human-friendly messages instead of raw machine codes.
- Around line 116-118: The regex used to extract the filename in
SettingsPage.tsx (where disposition is matched and filenameMatch is derived)
contains unnecessary escapes for double quotes; update the pattern used in
disposition.match(...) (the expression that sets filenameMatch) to remove the
backslashes so it reads /filename="?([^"]+)"?/i so ESLint's no-useless-escape
rule is satisfied and a.download assignment still falls back to the same default
filename when no match is found.
In `@apps/gateway/src/routes/account.ts`:
- Around line 95-109: The backend now retains immutable audit_events when
deleting a tenant (see the deletion response constructed in account route:
return res.json(...) in the handler that performs DELETE FROM tenants), so
update the dashboard confirmation copy to match: in the SettingsPage component
(SettingsPage.tsx) locate the tenant erasure confirmation/confirmation modal
text (the string that currently says audit logs will be permanently deleted) and
change it to indicate audit logs are retained for compliance (e.g., “Audit logs
will be retained and cannot be deleted for compliance”); update any related help
text or docs strings in the same component to reflect retained-audit behavior so
the UI matches the account.ts response.
- Around line 136-146: This route is returning tenant-wide sensitive data;
enforce RBAC by checking the authenticated user's role (req.user and
req.user.role) before running the tenant-wide queries: if req.user.role !==
'admin' return 403, otherwise continue; alternatively, for non-admins scope each
query to the requesting user by replacing tenant-wide queries for tenant_users,
devices, alert_rules, audit_events, and api_keys with user-scoped queries
(filter by req.user.id or actor_id where appropriate) so only the requester’s
records are exported; update the code around the existing tenantId and
Promise.all block that runs pool.query for tenant_users, devices, alert_rules,
audit_events, and api_keys to enforce the chosen behavior.
In `@apps/gateway/src/routes/sso.ts`:
- Around line 46-58: The catch in the route swallowing every error from
listOrgConnections hides operator configuration problems (e.g. errors from
getManagementToken when AUTH0_* env vars are missing); update error handling so
configuration errors surface: have getManagementToken (or listOrgConnections)
throw a distinct error type or include a recognisable property/message (e.g.
ConfigError or error.code === 'CONFIG_MISSING') and in the sso route catch
distinguish those and rethrow or return a 5xx response indicating
misconfiguration, while still converting transient Auth0 API failures into the
existing configured:true with warning response; reference listOrgConnections and
getManagementToken when adding the new error type check and branching logic.
In `@infra/docker/docker-compose.yml`:
- Around line 560-561: The docker-compose services currently include a global
env_file reference (env_file: - ../../.env) for the gateway, grafana, and
jobs-worker which exposes the entire root .env to containers; replace that broad
injection by scoping environment variables per-service: remove the ../../.env
env_file entry from the gateway, grafana, and jobs-worker service blocks and
either (a) point each service to a minimal service-specific env file (e.g.,
env_file: - ./gateway.env) that contains only the vars it needs, or (b)
enumerate only the required variables in the service's environment section using
explicit ${VAR} mappings; update the gateway, grafana, and jobs-worker service
definitions accordingly to enforce least-privilege.
In `@infra/terraform/environments/staging/.gitignore`:
- Line 1: Add common Terraform ignore entries in addition to .terraform/ to
prevent accidental commits of state and local artifacts: include patterns for
lock and state files (e.g., .terraform.lock.hcl, *.tfstate, *.tfstate.backup),
crash and override artifacts (crash.log, override.tf, override.tf.json), CLI and
variable files (.terraform.tfvars, *.tfvars, *.tfvars.json, *.tfvars.local) and
workspace/local backend files; update the .gitignore in this Terraform
environment to include those entries so local state, credentials, and
machine-generated files are not committed.
In `@infra/terraform/environments/staging/main.tf`:
- Line 5: Extract the hardcoded CIDR by adding a locals block (locals { vpc_cidr
= "10.10.0.0/16" }) and replace all occurrences of the literal "10.10.0.0/16"
with local.vpc_cidr; update the vpc_cidr argument passed into the modules that
reference it (the module blocks for vpc, rds, elasticache, and msk) to use
local.vpc_cidr so future changes only require editing the locals entry.
In `@infra/terraform/environments/staging/outputs.tf`:
- Around line 1-17: Mark the exposed connection endpoints as sensitive to avoid
accidental leakage in logs: add sensitive = true to the output blocks for
"rds_endpoint", "redis_endpoint", "kafka_brokers_tls", and "kafka_brokers_sasl"
(the output blocks referencing module.rds.endpoint,
module.elasticache.primary_endpoint, module.msk.bootstrap_brokers_tls, and
module.msk.bootstrap_brokers_sasl respectively); update those output
declarations accordingly while leaving other outputs unchanged.
In `@infra/terraform/environments/staging/providers.tf`:
- Around line 23-25: The AWS provider block (provider "aws") should include a
default_tags block to apply consistent tags to all resources; add a default_tags
{ tags = { ... } } section inside provider "aws" and populate keys like
"Environment", "Owner", "Project" using variables (e.g. var.environment,
var.owner, var.project) or hardcoded values as appropriate, and create any
missing vars if they don't exist so tags reference valid variables; keep region
= var.aws_region and ensure tag keys follow your org's naming convention.
In `@infra/terraform/environments/staging/variables.tf`:
- Around line 1-14: Add descriptive text for each Terraform input variable to
improve documentation and terraform plan clarity: add a description to variable
"project" explaining it holds the GCP/AWS project or service name, to variable
"aws_region" describing the AWS region (e.g., default region for resources), and
to variable "db_password" clarifying this is the sensitive database password
with no default and must be provided; update the variable blocks for project,
aws_region, and db_password to include these description strings.
In `@k8s/argocd/apps/grainguard-staging.yaml`:
- Around line 13-14: The ArgoCD application currently sets targetRevision: HEAD
which will promote every push on the default branch into the staging cluster;
update the ArgoCD app config by replacing targetRevision: HEAD with a controlled
ref such as a dedicated branch name (e.g., targetRevision: staging) or a
tag/semver release ref to ensure only intended commits are deployed, and
document the chosen strategy so repoURL and targetRevision remain consistent
with your deployment workflow.
In `@k8s/argocd/project.yaml`:
- Around line 36-40: The staging-deployer role (name: staging-deployer) defines
policies but has no groups field, so add a groups array to bind it to the same
subject group(s) used by dev-deployer/prod-deployer (e.g., the CI/service
account or team group) or explicitly note that bindings are handled elsewhere;
update the staging-deployer role to include a groups: [...] entry (matching the
existing dev-deployer/prod-deployer group names) so the policies can be assumed
by the intended users/service accounts.
In `@k8s/helm/grainguard/values-prod.yaml`:
- Around line 1-2: The Helm change added namespace: grainguard-prod but there is
no Terraform prod environment; create a new Terraform environment directory
mirroring infra/terraform/environments/dev and
infra/terraform/environments/staging (e.g., infra/terraform/environments/prod/)
and copy the same IRSA module configuration found in those main.tf files,
updating the module/input that sets environment to "prod" so the IRSA module and
any resources match the grainguard-prod namespace; ensure any variables or
backend configs are adjusted to match prod conventions and that the environment
value is consistently "prod" across the module and Terraform workspace
configuration.
In `@README.md`:
- Around line 190-201: Update the README Testing section to point to the new
load-test locations and filenames: replace references to tests/load/ with
scripts/load-tests/ and list the committed scripts graphql-stress.js,
ingest-stress.js, and mixed-stack-stress.js (e.g., k6 run
scripts/load-tests/graphql-stress.js, etc.); also update the Note bullet that
mentions tests/load/ to reference scripts/load-tests/ so the README and the new
PR-added scripts are in sync while leaving the existing replay path
(./scripts/replay/replay_test.sh) unchanged.
---
Outside diff comments:
In `@apps/gateway/src/routes/sso.ts`:
- Around line 129-144: Wrap the Auth0 calls in the POST /saml handler
(specifically the createSamlConnection and enableConnectionOnOrg calls and the
subsequent DB update using pool.query) in a try/catch block, logging the caught
error and returning a structured HTTP error response instead of letting
exceptions bubble; map known error cases (e.g., message includes "SSO not
configured") to a 503 with error key "sso_not_configured" and return a 502 with
error key "auth0_error" and a generic message like "Failed to configure SAML
connection" for other Auth0/network errors; apply the same
try/catch/error-mapping pattern to the POST /oidc and DELETE SSO routes so all
mutating SSO endpoints consistently handle and sanitize Auth0 errors.
- Around line 84-94: The code may create an Auth0 org via
createOrganization(tenantId, rows[0].name) but then fail to persist auth0_org_id
with pool.query, leaving an inconsistent external state; update the POST /sso
handler to use a real DB transaction client (start a transaction client, use
that client for the SELECT/UPDATE/INSERT operations instead of pool), and after
createOrganization succeeds ensure either (a) retry the DB UPDATE within the
same transaction client and commit only on success, or (b) if the DB update
ultimately fails, call the Auth0 cleanup delete (using the orgId returned from
createOrganization) as a compensating action before rolling back; apply the same
pattern (transaction client + retry or compensating delete) to the POST /saml,
POST /oidc and relevant DELETE routes that create external Auth0 resources so
external and DB state remain consistent.
In `@apps/gateway/src/services/device.ts`:
- Around line 50-56: The code currently checks fs.existsSync("/certs/ca.crt")
and falls back to grpc.credentials.createInsecure(), causing unencrypted gRPC in
K8s because certs are not mounted; fix by either (A) adding volume and
volumeMount entries to the K8s base manifests/Helm chart and values so /certs
(containing ca.crt, gateway-client.key, gateway-client.crt) is mounted into the
gateway and telemetry-service pods, or (B) update the certificate loading in
this module to read cert file paths from environment variables (e.g., CERT_DIR
or CA_PATH) instead of hardcoding "/certs", and ensure Helm values and K8s
manifests mount the Secret at those paths; ensure the code uses those env vars
when calling fs.existsSync and grpc.credentials.createSsl so createSsl is used
when certs are present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7c54fd6d-56b2-41c6-8646-2d95d47c2c19
📒 Files selected for processing (23)
- README.md
- apps/dashboard/src/features/settings/SettingsPage.tsx
- apps/gateway/src/routes/__tests__/account.test.ts
- apps/gateway/src/routes/__tests__/sso.test.ts
- apps/gateway/src/routes/account.ts
- apps/gateway/src/routes/sso.ts
- apps/gateway/src/services/device.ts
- infra/docker/docker-compose.yml
- infra/terraform/environments/dev/main.tf
- infra/terraform/environments/staging/.gitignore
- infra/terraform/environments/staging/.terraform.lock.hcl
- infra/terraform/environments/staging/main.tf
- infra/terraform/environments/staging/outputs.tf
- infra/terraform/environments/staging/providers.tf
- infra/terraform/environments/staging/variables.tf
- k8s/argocd/apps/grainguard-staging.yaml
- k8s/argocd/project.yaml
- k8s/helm/grainguard/values-dev.yaml
- k8s/helm/grainguard/values-prod.yaml
- k8s/helm/grainguard/values-staging.yaml
- scripts/load-tests/graphql-stress.js
- scripts/load-tests/ingest-stress.js
- scripts/load-tests/mixed-stack-stress.js
 if (!res.ok) {
   const body = await res.json().catch(() => ({}));
   throw new Error(
     typeof body?.error === "string" ? body.error : `HTTP ${res.status}`
   );
 }
 const blob = await res.blob();
 const url = URL.createObjectURL(blob);
 const a = document.createElement("a");
 a.href = url;
-a.download = `grainguard-export-${new Date().toISOString().slice(0, 10)}.json`;
+const disposition = res.headers.get("content-disposition") ?? "";
+const filenameMatch = disposition.match(/filename=\"?([^"]+)\"?/i);
+a.download = filenameMatch?.[1] ?? `grainguard-export-${new Date().toISOString().slice(0, 10)}.json`;
 a.click();
 URL.revokeObjectURL(url);
 toast.success("Data exported");
-} catch {
-  toast.error("Export failed");
+} catch (e) {
+  toast.error(e instanceof Error ? e.message : "Export failed");
Translate backend error codes before showing the toast.
The gateway currently returns machine codes like internal_error on export failures, so this branch will show "internal_error" directly to end users. Map known codes to friendly copy and keep raw values out of the toast.
💡 Proposed fix
if (!res.ok) {
const body = await res.json().catch(() => ({}));
- throw new Error(
- typeof body?.error === "string" ? body.error : `HTTP ${res.status}`
- );
+ const code = typeof body?.error === "string" ? body.error : null;
+ throw new Error(
+ code === "internal_error"
+ ? "Export failed. Please try again."
+ : code ?? `HTTP ${res.status}`
+ );
}

📝 Committable suggestion (review carefully before committing)

if (!res.ok) {
  const body = await res.json().catch(() => ({}));
  const code = typeof body?.error === "string" ? body.error : null;
  throw new Error(
    code === "internal_error"
      ? "Export failed. Please try again."
      : code ?? `HTTP ${res.status}`
  );
}
const blob = await res.blob();
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
const disposition = res.headers.get("content-disposition") ?? "";
const filenameMatch = disposition.match(/filename=\"?([^"]+)\"?/i);
a.download = filenameMatch?.[1] ?? `grainguard-export-${new Date().toISOString().slice(0, 10)}.json`;
a.click();
URL.revokeObjectURL(url);
toast.success("Data exported");
} catch (e) {
  toast.error(e instanceof Error ? e.message : "Export failed");
🧰 Tools
🪛 GitHub Actions: CI
[error] 117-117: ESLint (no-useless-escape): Unnecessary escape character: "
🪛 GitHub Check: TS Checks — dashboard
[failure] 117-117: Unnecessary escape character: "
[failure] 117-117: Unnecessary escape character: "
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/dashboard/src/features/settings/SettingsPage.tsx` around lines 106 -
123, The catch currently shows raw backend error codes (e.g., "internal_error")
to users; add a small translation layer and use it before calling toast.error.
Implement a helper like translateExportError(errorCode: string): string (add
near SettingsPage or above the export handler) that maps known codes (e.g.,
"internal_error" -> "An internal server error occurred. Please try again
later.", "not_authenticated" -> "Please sign in to export data.", etc.) and
returns a friendly default for unknown values; then replace the toast.error call
in the catch block to use toast.error(translateExportError(e instanceof Error ?
e.message : "")) so users see human-friendly messages instead of raw machine
codes.
 // Most tenant-linked tables cascade from tenants, so deleting the
 // tenant removes them automatically. Immutable audit_events are
 // intentionally retained for compliance and cannot be deleted.
 await client.query("DELETE FROM tenants WHERE id = $1", [tenantId]);

 await client.query("COMMIT");
-return res.json({ deleted: true, scope: "tenant", message: "Tenant and all data deleted" });
+const immutableAuditEvents = auditEventRows[0]?.count ?? 0;
+return res.json({
+  deleted: true,
+  scope: "tenant",
+  message:
+    immutableAuditEvents > 0
+      ? "Tenant deleted. Immutable audit events were retained for compliance."
+      : "Tenant and all mutable data deleted",
+});
Update the delete UX/docs to match the retained-audit behavior.
This branch now intentionally keeps audit_events, but the dashboard confirmation in apps/dashboard/src/features/settings/SettingsPage.tsx still says audit logs will be permanently deleted. Please update that copy in the same PR so the erasure flow does not promise behavior the backend no longer performs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/routes/account.ts` around lines 95 - 109, The backend now
retains immutable audit_events when deleting a tenant (see the deletion response
constructed in account route: return res.json(...) in the handler that performs
DELETE FROM tenants), so update the dashboard confirmation copy to match: in the
SettingsPage component (SettingsPage.tsx) locate the tenant erasure
confirmation/confirmation modal text (the string that currently says audit logs
will be permanently deleted) and change it to indicate audit logs are retained
for compliance (e.g., “Audit logs will be retained and cannot be deleted for
compliance”); update any related help text or docs strings in the same component
to reflect retained-audit behavior so the UI matches the account.ts response.
try {
  const tenantId = req.user!.tenantId;

  const [tenantResult, usersResult, devicesResult, alertsResult, auditResult, keysResult] =
    await Promise.all([
      pool.query("SELECT id, name, slug, plan, email, created_at FROM tenants WHERE id = $1", [tenantId]),
      pool.query("SELECT id, email, role, created_at FROM tenant_users WHERE tenant_id = $1", [tenantId]),
      pool.query("SELECT id, serial_number, created_at FROM devices WHERE tenant_id = $1", [tenantId]),
      pool.query("SELECT id, name, metric, operator, threshold, enabled, created_at FROM alert_rules WHERE tenant_id = $1", [tenantId]),
      pool.query("SELECT id, event_type, actor_id, resource_type, payload, created_at FROM audit_events WHERE tenant_id = $1 ORDER BY created_at DESC LIMIT 1000", [tenantId]),
      pool.query("SELECT id, name, created_at, expires_at, revoked_at FROM api_keys WHERE tenant_id = $1", [tenantId]),
Restrict this export to admins or narrow it to per-user data.
This handler exports tenant_users, devices, alert_rules, audit_events, and api_keys for the whole tenant, so any authenticated member can download other users’ emails and tenant-wide operational data. Either add an admin check here or scope the export to the requesting user’s own records.
As per coding guidelines, "Verify tenantId, RBAC, and CSRF protections are applied consistently on all protected routes."
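A minimal sketch of the admin gate, assuming `authMiddleware` populates `req.user.role` (the exact claim shape is an assumption, not confirmed by this diff):

```typescript
accountRouter.get(
  "/account/export",
  authMiddleware,
  async (req: Request, res: Response) => {
    // Reject non-admins before any tenant-wide queries run.
    if (req.user?.role !== "admin") {
      return res.status(403).json({ error: "forbidden" });
    }
    const tenantId = req.user.tenantId;
    // ... existing Promise.all export queries ...
  }
);
```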
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/routes/account.ts` around lines 136 - 146, This route is
returning tenant-wide sensitive data; enforce RBAC by checking the authenticated
user's role (req.user and req.user.role) before running the tenant-wide queries:
if req.user.role !== 'admin' return 403, otherwise continue; alternatively, for
non-admins scope each query to the requesting user by replacing tenant-wide
queries for tenant_users, devices, alert_rules, audit_events, and api_keys with
user-scoped queries (filter by req.user.id or actor_id where appropriate) so
only the requester’s records are exported; update the code around the existing
tenantId and Promise.all block that runs pool.query for tenant_users, devices,
alert_rules, audit_events, and api_keys to enforce the chosen behavior.
try {
  const connections = await listOrgConnections(orgId);

  return res.json({ configured: true, orgId, connections });
} catch (error) {
  console.error("[sso] failed to list org connections:", error);
  return res.json({
    configured: true,
    orgId,
    connections: [],
    warning: "Auth0 management API is unavailable for the gateway right now.",
  });
}
Catching all errors masks configuration issues.
The try/catch catches every error from listOrgConnections, including configuration errors thrown by getManagementToken() (e.g., missing AUTH0_DOMAIN, AUTH0_MANAGEMENT_CLIENT_ID, AUTH0_MANAGEMENT_CLIENT_SECRET). These are operator misconfigurations that should surface clearly, not be hidden behind a generic "unavailable" warning.
Consider distinguishing transient API failures from configuration errors:
Proposed fix to differentiate error types
try {
const connections = await listOrgConnections(orgId);
return res.json({ configured: true, orgId, connections });
} catch (error) {
+ const message = error instanceof Error ? error.message : "";
+ // Configuration errors should fail loudly, not be masked
+ if (message.includes("SSO not configured") || message.includes("must be set")) {
+ console.error("[sso] configuration error:", error);
+ return res.status(500).json({ error: "sso_misconfigured" });
+ }
console.error("[sso] failed to list org connections:", error);
return res.json({
configured: true,
orgId,
connections: [],
warning: "Auth0 management API is unavailable for the gateway right now.",
});
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/routes/sso.ts` around lines 46 - 58, The catch in the route
swallowing every error from listOrgConnections hides operator configuration
problems (e.g. errors from getManagementToken when AUTH0_* env vars are
missing); update error handling so configuration errors surface: have
getManagementToken (or listOrgConnections) throw a distinct error type or
include a recognisable property/message (e.g. ConfigError or error.code ===
'CONFIG_MISSING') and in the sso route catch distinguish those and rethrow or
return a 5xx response indicating misconfiguration, while still converting
transient Auth0 API failures into the existing configured:true with warning
response; reference listOrgConnections and getManagementToken when adding the
new error type check and branching logic.
| variable "project" { | ||
| type = string | ||
| default = "grainguard" | ||
| } | ||
|
|
||
| variable "aws_region" { | ||
| type = string | ||
| default = "us-east-1" | ||
| } | ||
|
|
||
| variable "db_password" { | ||
| type = string | ||
| sensitive = true | ||
| } |
🧹 Nitpick | 🔵 Trivial
LGTM with minor suggestion.
Variables are correctly defined. The db_password is appropriately marked sensitive = true with no default, requiring explicit input.
Consider adding description fields for better documentation and terraform plan output clarity:
📝 Add descriptions
variable "project" {
type = string
default = "grainguard"
+ description = "Project name used for resource naming"
}
variable "aws_region" {
type = string
default = "us-east-1"
+ description = "AWS region for staging infrastructure"
}
variable "db_password" {
type = string
sensitive = true
+ description = "RDS database master password"
}

📝 Committable suggestion (review carefully before committing)

variable "project" {
  type        = string
  default     = "grainguard"
  description = "Project name used for resource naming"
}

variable "aws_region" {
  type        = string
  default     = "us-east-1"
  description = "AWS region for staging infrastructure"
}

variable "db_password" {
  type        = string
  sensitive   = true
  description = "RDS database master password"
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@infra/terraform/environments/staging/variables.tf` around lines 1 - 14, Add
descriptive text for each Terraform input variable to improve documentation and
terraform plan clarity: add a description to variable "project" explaining it
holds the GCP/AWS project or service name, to variable "aws_region" describing
the AWS region (e.g., default region for resources), and to variable
"db_password" clarifying this is the sensitive database password with no default
and must be provided; update the variable blocks for project, aws_region, and
db_password to include these description strings.
repoURL: https://github.com/pahuldeepp/GrainGuard-.git
targetRevision: HEAD
targetRevision: HEAD may cause unintended deployments to staging.
Using HEAD means every push to the default branch triggers a staging deployment. For a pre-production environment meant to validate rollouts, consider tracking a dedicated branch (e.g., staging) or using tags/release candidates to control what gets deployed.
♻️ Suggested alternatives
- targetRevision: HEAD
+ targetRevision: staging # or use release tags like v1.2.0-rc1

📝 Committable suggestion (review carefully before committing)

repoURL: https://github.com/pahuldeepp/GrainGuard-.git
targetRevision: staging # or use release tags like v1.2.0-rc1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/argocd/apps/grainguard-staging.yaml` around lines 13 - 14, The ArgoCD
application currently sets targetRevision: HEAD which will promote every push on
the default branch into the staging cluster; update the ArgoCD app config by
replacing targetRevision: HEAD with a controlled ref such as a dedicated branch
name (e.g., targetRevision: staging) or a tag/semver release ref to ensure only
intended commits are deployed, and document the chosen strategy so repoURL and
targetRevision remain consistent with your deployment workflow.
- name: staging-deployer
  description: Can sync staging for pre-production validation
  policies:
    - p, proj:grainguard:staging-deployer, applications, sync, grainguard/grainguard-staging, allow
    - p, proj:grainguard:staging-deployer, applications, get, grainguard/grainguard-staging, allow
🧹 Nitpick | 🔵 Trivial
Consider adding groups to bind the role to subjects.
The staging-deployer role is defined with correct policies, but like the existing dev-deployer and prod-deployer roles, it lacks a groups field. Without group bindings, no users or service accounts can assume this role.
If this is intentional (bindings managed elsewhere via Argo CD RBAC ConfigMap), this is fine. Otherwise, add:
📝 Example group binding
- name: staging-deployer
description: Can sync staging for pre-production validation
policies:
- p, proj:grainguard:staging-deployer, applications, sync, grainguard/grainguard-staging, allow
- p, proj:grainguard:staging-deployer, applications, get, grainguard/grainguard-staging, allow
+ groups:
+ - staging-team
+ # or for CI: system:serviceaccount:argocd:<ci-service-account>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/argocd/project.yaml` around lines 36 - 40, The staging-deployer role
(name: staging-deployer) defines policies but has no groups field, so add a
groups array to bind it to the same subject group(s) used by
dev-deployer/prod-deployer (e.g., the CI/service account or team group) or
explicitly note that bindings are handled elsewhere; update the staging-deployer
role to include a groups: [...] entry (matching the existing
dev-deployer/prod-deployer group names) so the policies can be assumed by the
intended users/service accounts.
namespace: grainguard-prod
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check prod Terraform IRSA configuration
echo "=== Prod IRSA namespace configuration ==="
rg -n 'k8s_namespace' infra/terraform/environments/prod/ 2>/dev/null || echo "No prod environment found or no k8s_namespace set"
echo -e "\n=== Full prod main.tf iam_irsa module block ==="
ast-grep --pattern $'module "iam_irsa" {
$$$
}' | head -50

Repository: pahuldeepp/GrainGuard-
Length of output: 206
🏁 Script executed:
#!/bin/bash
# Check Terraform directory structure
echo "=== Terraform environments structure ==="
find infra/terraform/environments -type d -maxdepth 1 2>/dev/null || echo "No environments dir found"
echo -e "\n=== All Terraform files in infra/terraform ==="
find infra/terraform -name "*.tf" -type f | head -20
echo -e "\n=== Check for any IRSA module definitions ==="
rg -l 'module.*iam_irsa' infra/terraform/ 2>/dev/null || echo "No IRSA modules found"
echo -e "\n=== Check if prod namespace is set anywhere ==="
rg -r 'grainguard-prod' infra/terraform/ 2>/dev/null || echo "No grainguard-prod references found in Terraform"

Repository: pahuldeepp/GrainGuard-
Length of output: 1506
🏁 Script executed:
#!/bin/bash
# Examine dev environment IRSA configuration to see the pattern
echo "=== Dev IRSA module configuration ==="
rg -A 5 'module.*"iam_irsa"' infra/terraform/environments/dev/main.tf 2>/dev/null
echo -e "\n=== Staging IRSA module configuration ==="
rg -A 5 'module.*"iam_irsa"' infra/terraform/environments/staging/main.tf 2>/dev/null
echo -e "\n=== Check dev/staging helm values for namespace ==="
find k8s/helm -name "values*.yaml" -type f | xargs grep -l namespace 2>/dev/null || echo "No namespaces in values files"

Repository: pahuldeepp/GrainGuard-
Length of output: 866
🏁 Script executed:
#!/bin/bash
# Check the iam_irsa module to understand namespace handling
echo "=== iam_irsa module variables/main ==="
cat infra/terraform/modules/iam_irsa/variables.tf 2>/dev/null | head -30
echo -e "\n=== iam_irsa module main logic (relevant parts) ==="
rg -A 2 'namespace|environment' infra/terraform/modules/iam_irsa/main.tf 2>/dev/null | head -20
echo -e "\n=== Check dev values namespace ==="
grep namespace k8s/helm/grainguard/values-dev.yaml 2>/dev/null
echo -e "\n=== Check staging values namespace ==="
grep namespace k8s/helm/grainguard/values-staging.yaml 2>/dev/null
echo -e "\n=== Check prod values namespace ==="
grep namespace k8s/helm/grainguard/values-prod.yaml 2>/dev/null

Repository: pahuldeepp/GrainGuard-
Length of output: 1267
Prod infrastructure setup is incomplete - Terraform environment missing.
Adding namespace: grainguard-prod to the Helm values is correct for consistency, but the prod Terraform environment doesn't exist yet. Dev and staging both have environment configurations (infra/terraform/environments/{dev,staging}/main.tf), but there is no prod environment. Creating prod namespace in Helm values without corresponding prod Terraform infrastructure means pods cannot be deployed to production. Create infra/terraform/environments/prod/ with the same IRSA module configuration before deploying to production, passing environment = "prod" to match the grainguard-prod namespace.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@k8s/helm/grainguard/values-prod.yaml` around lines 1 - 2, The Helm change
added namespace: grainguard-prod but there is no Terraform prod environment;
create a new Terraform environment directory mirroring
infra/terraform/environments/dev and infra/terraform/environments/staging (e.g.,
infra/terraform/environments/prod/) and copy the same IRSA module configuration
found in those main.tf files, updating the module/input that sets environment to
"prod" so the IRSA module and any resources match the grainguard-prod namespace;
ensure any variables or backend configs are adjusted to match prod conventions
and that the environment value is consistently "prod" across the module and
Terraform workspace configuration.
# k6 load tests (requires running stack)
k6 run tests/load/spike.js
k6 run tests/load/soak.js
k6 run tests/load/stress.js

# Chaos tests (requires kubectl + live cluster)
bash tests/chaos/run-all.sh

# Replay + idempotency test
./scripts/replay/replay_test.sh
```

Note:
- The core load-test scripts above are committed in `tests/load/`.
- Cluster-level chaos automation is not currently committed on `master`; add or restore it before relying on README-driven chaos drills.
Load-test paths in Testing section are out of sync with the new committed scripts.
Lines 191-193 and line 200 reference `tests/load/`, but this PR adds load tests under `scripts/load-tests/` (graphql-stress.js, ingest-stress.js, mixed-stack-stress.js). Please align the README so users run the intended scripts.
📌 Proposed README fix
# k6 load tests (requires running stack)
-k6 run tests/load/spike.js
-k6 run tests/load/soak.js
-k6 run tests/load/stress.js
+k6 run scripts/load-tests/graphql-stress.js
+k6 run scripts/load-tests/ingest-stress.js
+k6 run scripts/load-tests/mixed-stack-stress.js
# Replay + idempotency test
./scripts/replay/replay_test.sh
@@
Note:
-- The core load-test scripts above are committed in `tests/load/`.
+- The core load-test scripts above are committed in `scripts/load-tests/`.
- Cluster-level chaos automation is not currently committed on `master`; add or restore it before relying on README-driven chaos drills.

📝 Committable suggestion (review carefully before committing)

# k6 load tests (requires running stack)
k6 run scripts/load-tests/graphql-stress.js
k6 run scripts/load-tests/ingest-stress.js
k6 run scripts/load-tests/mixed-stack-stress.js

# Replay + idempotency test
./scripts/replay/replay_test.sh
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 190 - 201, Update the README Testing section to point
to the new load-test locations and filenames: replace references to tests/load/
with scripts/load-tests/ and list the committed scripts graphql-stress.js,
ingest-stress.js, and mixed-stack-stress.js (e.g., k6 run
scripts/load-tests/graphql-stress.js, etc.); also update the Note bullet that
mentions tests/load/ to reference scripts/load-tests/ so the README and the new
PR-added scripts are in sync while leaving the existing replay path
(./scripts/replay/replay_test.sh) unchanged.
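For readers new to k6, a minimal smoke-test sketch in the style of the committed scripts; the `BASE_URL` variable and `/health` path are assumptions to adjust to the gateway's actual endpoints:

```javascript
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  vus: 10,          // 10 concurrent virtual users
  duration: "30s",  // short smoke run; the committed stress scripts ramp longer
};

export default function () {
  const base = __ENV.BASE_URL || "http://localhost:4000";
  const res = http.get(`${base}/health`);
  check(res, { "status is 200": (r) => r.status === 200 });
  sleep(1);
}
```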
HIGH PRIORITY:
- #9: Fix CSRF timing attack — pad buffers to fixed length before timingSafeEqual
- #10: Upgrade circuit breaker to distributed (Redis-backed state sharing across pods)
- #8: Fix saga recovery infinite loop — mark corrupted payloads as FAILED instead of retrying

MEDIUM PRIORITY:
- #11: Add webhook idempotency check (dedup by endpoint_id + event_type)
- #12: Assert Stripe webhook body is Buffer before signature verification
- #13: Eliminate N+1 query in deviceTelemetry resolver (single JOIN query)
- #15: Add ORDER BY + LIMIT to saga FindByCorrelationID for deterministic results
- #17: Make critical audit events (auth, admin) throw on failure instead of silent swallow

LOW PRIORITY:
- #20: Add 10s query timeout to all saga repository DB operations
- #23: Add IsValidStatus validator for saga status constants
- #24: Set httpServer.timeout (30s) and keepAliveTimeout (65s) on BFF
- #25: Add RabbitMQ heartbeat (30s) and connection error/close handlers
- #7: Fix remaining saga JSON marshal error check (initialErr)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
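On item #9, one common shape for that fix: `crypto.timingSafeEqual` throws when buffer lengths differ, which itself leaks length information, so hashing both tokens to a fixed length first keeps the comparison constant-time. A sketch, assuming the tokens arrive as strings:

```typescript
import { createHash, timingSafeEqual } from "crypto";

// Hashing both sides yields two 32-byte buffers, so timingSafeEqual never
// throws and comparison time is independent of input length or content.
function safeTokenEqual(expected: string, provided: string): boolean {
  const a = createHash("sha256").update(expected).digest();
  const b = createHash("sha256").update(provided).digest();
  return timingSafeEqual(a, b);
}
```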
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/dashboard/src/features/settings/SettingsPage.tsx (1)
100-125: 🧹 Nitpick | 🔵 Trivial
Consider adding a loading state to prevent duplicate export requests.
Unlike `handleDeleteAccount`, which tracks `deleting` state, `handleExport` has no loading indicator. Users can click "Export My Data" multiple times while the request is in-flight, triggering redundant downloads.
♻️ Proposed improvement

 const [deleting, setDeleting] = useState(false);
+const [exporting, setExporting] = useState(false);

 async function handleExport() {
+  if (exporting) return;
+  setExporting(true);
   try {
     // ... existing logic ...
   } catch (e) {
     toast.error(e instanceof Error ? e.message : "Export failed");
+  } finally {
+    setExporting(false);
   }
 }

Then disable the button:

 <button
   onClick={handleExport}
+  disabled={exporting}
   className="px-4 py-2 text-sm bg-gray-100 ..."
 >
-  Export My Data
+  {exporting ? "Exporting..." : "Export My Data"}
 </button>

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/dashboard/src/features/settings/SettingsPage.tsx` around lines 100 - 125, handleExport has no loading guard so users can trigger duplicate export requests; add a boolean state (e.g., exporting) and set it true at start of handleExport and false in finally, bail out early if exporting is already true, and use that state to disable the "Export My Data" button and show a spinner/disabled text; update references to getAccessTokenSilently, fetch, and toast inside handleExport accordingly and ensure URL.revokeObjectURL runs even on errors by placing it in the finally block.
♻️ Duplicate comments (1)
apps/dashboard/src/features/settings/SettingsPage.tsx (1)
106-111: ⚠️ Potential issue | 🟡 Minor
Raw backend error codes are still exposed to end users.
The error handling extracts machine codes (`internal_error`, `missing_token`, `tenant_missing`, `invalid_token`) and displays them directly via `toast.error(e.message)`. Users will see cryptic messages instead of friendly copy. Map known error codes to user-friendly messages before displaying.
💡 Proposed fix
+const ERROR_MESSAGES: Record<string, string> = {
+  internal_error: "Export failed. Please try again later.",
+  missing_token: "Please sign in to export your data.",
+  invalid_token: "Your session has expired. Please sign in again.",
+  tenant_missing: "Account not found. Please contact support.",
+};
+
+function friendlyErrorMessage(code: string): string {
+  return ERROR_MESSAGES[code] ?? code;
+}
+
 async function handleExport() {
   try {
     const token = await getAccessTokenSilently();
     const res = await fetch(`${GW}/account/export`, {
       headers: { Authorization: `Bearer ${token}` },
     });
     if (!res.ok) {
       const body = await res.json().catch(() => ({}));
       throw new Error(
         typeof body?.error === "string" ? body.error : `HTTP ${res.status}`
       );
     }
     // ... blob handling ...
   } catch (e) {
-    toast.error(e instanceof Error ? e.message : "Export failed");
+    toast.error(
+      e instanceof Error ? friendlyErrorMessage(e.message) : "Export failed"
+    );
   }
 }

Also applies to: 122-123
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/dashboard/src/features/settings/SettingsPage.tsx` around lines 106 - 111, The current response error handling throws raw backend error codes (body?.error) which then get shown to users; update the error handling around the fetch response check (the block that reads const body = await res.json().catch(() => ({})); and throws new Error(...)) to map known machine error codes (e.g., "internal_error", "missing_token", "tenant_missing", "invalid_token") to user-friendly messages via a simple map/object, and throw or return a sanitized message instead; ensure the same mapping is applied wherever you consume e.message for toast.error (the other occurrence mentioned), using a default fallback like "An unexpected error occurred" if the code is unknown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@apps/dashboard/src/features/settings/SettingsPage.tsx`:
- Around line 100-125: handleExport has no loading guard so users can trigger
duplicate export requests; add a boolean state (e.g., exporting) and set it true
at start of handleExport and false in finally, bail out early if exporting is
already true, and use that state to disable the "Export My Data" button and show
a spinner/disabled text; update references to getAccessTokenSilently, fetch, and
toast inside handleExport accordingly and ensure URL.revokeObjectURL runs even
on errors by placing it in the finally block.
---
Duplicate comments:
In `@apps/dashboard/src/features/settings/SettingsPage.tsx`:
- Around line 106-111: The current response error handling throws raw backend
error codes (body?.error) which then get shown to users; update the error
handling around the fetch response check (the block that reads const body =
await res.json().catch(() => ({})); and throws new Error(...)) to map known
machine error codes (e.g., "internal_error", "missing_token", "tenant_missing",
"invalid_token") to user-friendly messages via a simple map/object, and throw or
return a sanitized message instead; ensure the same mapping is applied wherever
you consume e.message for toast.error (the other occurrence mentioned), using a
default fallback like "An unexpected error occurred" if the code is unknown.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 69d173e2-4575-4768-bff7-29843ce7e93b
📒 Files selected for processing (1)
apps/dashboard/src/features/settings/SettingsPage.tsx
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/gateway/src/services/device.ts`:
- Around line 11-18: The protoCandidates array contains an unreachable fallback
"/libs/proto/device.proto" for production because the Docker runtime copies libs
to /app/libs; remove that entry from protoCandidates (or replace it with a
documented environment-specific path) and update any comment near
protoCandidates to state that the production image uses /app/libs so only the
valid paths are listed; reference the protoCandidates constant in
apps/gateway/src/services/device.ts when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3fc277f2-5fb5-449e-a899-f3bcbc49e01b
📒 Files selected for processing (1)
apps/gateway/src/services/device.ts
const protoCandidates = [
  path.resolve(process.cwd(), "libs/proto/device.proto"),
  path.resolve(__dirname, "../../libs/proto/device.proto"),
  path.resolve(__dirname, "../../../libs/proto/device.proto"),
  path.resolve(__dirname, "../../../../libs/proto/device.proto"),
  "/app/libs/proto/device.proto",
  "/libs/proto/device.proto",
];
Unreachable fallback path /libs/proto/device.proto.
Per the Dockerfile context, the libs directory is copied to /app/libs in the runtime image, not /libs. Line 17's path will never match in the production container. Consider removing it to avoid confusion, or add a comment explaining which environment it targets.
const protoCandidates = [
path.resolve(process.cwd(), "libs/proto/device.proto"),
path.resolve(__dirname, "../../libs/proto/device.proto"),
path.resolve(__dirname, "../../../libs/proto/device.proto"),
path.resolve(__dirname, "../../../../libs/proto/device.proto"),
"/app/libs/proto/device.proto",
- "/libs/proto/device.proto",
];

📝 Committable suggestion (review carefully before committing)

const protoCandidates = [
  path.resolve(process.cwd(), "libs/proto/device.proto"),
  path.resolve(__dirname, "../../libs/proto/device.proto"),
  path.resolve(__dirname, "../../../libs/proto/device.proto"),
  path.resolve(__dirname, "../../../../libs/proto/device.proto"),
  "/app/libs/proto/device.proto",
];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/services/device.ts` around lines 11 - 18, The
protoCandidates array contains an unreachable fallback
"/libs/proto/device.proto" for production because the Docker runtime copies libs
to /app/libs; remove that entry from protoCandidates (or replace it with a
documented environment-specific path) and update any comment near
protoCandidates to state that the production image uses /app/libs so only the
valid paths are listed; reference the protoCandidates constant in
apps/gateway/src/services/device.ts when making the change.
Actionable comments posted: 2
♻️ Duplicate comments (1)
apps/gateway/src/routes/account.ts (1)
141-149: ⚠️ Potential issue | 🔴 Critical
Gate the tenant-wide export behind admin RBAC.
This still exports tenant_users, devices, alert_rules, audit_events, and api_keys for the whole tenant without any role check. Any authenticated member can download other users’ emails and tenant-wide operational data. Require an admin role before these queries run, or scope the export to the requesting user’s own records. As per coding guidelines, "Verify tenantId, RBAC, and CSRF protections are applied consistently on all protected routes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/routes/account.ts` around lines 141 - 149, This endpoint performs tenant-wide exports (tenantResult, usersResult, devicesResult, alertsResult, auditResult, keysResult via pool.query using tenantId) without RBAC; add an explicit admin role check (e.g., verify req.user.role === "admin" or call hasAdminRole(req.user)) before running these queries and return 403 if not authorized, or alternatively scope queries to the requesting user (filter tenant_users, devices, api_keys, audit_events by actor_id or user_id) so only the caller’s records are exported; implement the check at the top of the export handler (before the Promise.all) and use the existing auth principal from the request to decide.
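A minimal sketch of the admin guard described above (requireAdmin is an illustrative name; it assumes authMiddleware attaches a role claim to req.user):

import { Request, Response, NextFunction } from "express";

// Illustrative guard, not existing code: reject non-admin callers before
// any tenant-wide queries run.
function requireAdmin(req: Request, res: Response, next: NextFunction) {
  if (req.user?.role !== "admin") {
    return res.status(403).json({ error: "forbidden" });
  }
  next();
}

// Hypothetical wiring:
// accountRouter.get("/account/export", authMiddleware, requireAdmin, exportHandler);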
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/gateway/src/routes/account.ts`:
- Around line 161-166: The export response currently sets Content-Type and
Content-Disposition but does not prevent caching; update the handler around
where res.setHeader(...) and return res.json(exportData) are called to add
cache-busting headers (e.g. set "Cache-Control: no-store, no-cache,
must-revalidate", "Pragma: no-cache", and "Expires: 0") before returning; locate
the logic in the account export route (the block that calls res.setHeader(...)
and res.json(exportData)) and insert these headers so the exported tenant file
is not stored by browsers or intermediaries.
- Around line 49-50: The accountRouter currently applies apiRateLimiter globally
via accountRouter.use(), but the DELETE /account/me and GET /account/export
routes still re-apply apiRateLimiter; replace those redundant uses with
route-level stricter limiters: remove apiRateLimiter from the DELETE handler for
the account erasure route and attach strictRateLimiter (100 req/60s) to that
route, and remove apiRateLimiter from the GET export handler and attach
bulkRateLimiter (10 req/60s) to the export route; locate the routes by the
DELETE /account/me and GET /account/export handlers and the global
accountRouter.use() and swap the middleware references accordingly so global
apiRateLimiter remains in place for other endpoints.
---
Duplicate comments:
In `@apps/gateway/src/routes/account.ts`:
- Around line 141-149: This endpoint performs tenant-wide exports (tenantResult,
usersResult, devicesResult, alertsResult, auditResult, keysResult via pool.query
using tenantId) without RBAC; add an explicit admin role check (e.g., verify
req.user.role === "admin" or call hasAdminRole(req.user)) before running these
queries and return 403 if not authorized, or alternatively scope queries to the
requesting user (filter tenant_users, devices, api_keys, audit_events by
actor_id or user_id) so only the caller’s records are exported; implement the
check at the top of the export handler (before the Promise.all) and use the
existing auth principal from the request to decide.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1ab515ac-e3f6-4620-b541-be514fa79ba2
📒 Files selected for processing (1)
apps/gateway/src/routes/account.ts
apiRateLimiter,
authMiddleware,
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the account.ts file to understand the router setup
wc -l apps/gateway/src/routes/account.ts
Repository: pahuldeepp/GrainGuard-
Length of output: 104
🏁 Script executed:
# Read the relevant sections of account.ts
head -20 apps/gateway/src/routes/account.ts
Repository: pahuldeepp/GrainGuard-
Length of output: 823
🏁 Script executed:
# Check lines around line 49-50
sed -n '40,60p' apps/gateway/src/routes/account.ts
Repository: pahuldeepp/GrainGuard-
Length of output: 662
🏁 Script executed:
# Check lines around line 135-136
sed -n '125,145p' apps/gateway/src/routes/account.ts
Repository: pahuldeepp/GrainGuard-
Length of output: 905
🏁 Script executed:
# Check the rateLimiting middleware to see what limiters are available
cat apps/gateway/src/middleware/rateLimiting.ts
Repository: pahuldeepp/GrainGuard-
Length of output: 2116
Replace redundant apiRateLimiter applications with dedicated route-level limiters.
Line 8 already applies apiRateLimiter to all account routes via accountRouter.use(). Reapplying it on the DELETE and GET /export endpoints at lines 49-50 and 135-136 is redundant and prevents these sensitive GDPR endpoints from using tighter rate limits.
Replace with:
- strictRateLimiter (100 req/60s) for DELETE /account/me (account erasure)
- bulkRateLimiter (10 req/60s) for GET /account/export (data portability)
Suggested change
-import { apiRateLimiter } from "../middleware/rateLimiting";
+import { apiRateLimiter, bulkRateLimiter, strictRateLimiter } from "../middleware/rateLimiting";
 ...
 accountRouter.delete(
   "/account/me",
-  apiRateLimiter,
+  strictRateLimiter,
   authMiddleware,
 ...
 accountRouter.get(
   "/account/export",
-  apiRateLimiter,
+  bulkRateLimiter,
   authMiddleware,
🧰 Tools
🪛 GitHub Check: CodeQL
[failure] 49-49: Missing rate limiting
This route handler performs a database access, but is not rate-limited.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/routes/account.ts` around lines 49 - 50, The accountRouter
currently applies apiRateLimiter globally via accountRouter.use(), but the
DELETE /account/me and GET /account/export routes still re-apply apiRateLimiter;
replace those redundant uses with route-level stricter limiters: remove
apiRateLimiter from the DELETE handler for the account erasure route and attach
strictRateLimiter (100 req/60s) to that route, and remove apiRateLimiter from
the GET export handler and attach bulkRateLimiter (10 req/60s) to the export
route; locate the routes by the DELETE /account/me and GET /account/export
handlers and the global accountRouter.use() and swap the middleware references
accordingly so global apiRateLimiter remains in place for other endpoints.
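For reference, the two limiters could be defined with express-rate-limit roughly as follows — a sketch only, since the actual contents of middleware/rateLimiting.ts were fetched above but not shown; the windows and caps are the ones quoted in the review:

import rateLimit from "express-rate-limit";

// Sketch: 100 req/60s for destructive account operations.
export const strictRateLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
});

// Sketch: 10 req/60s for bulk endpoints such as /account/export.
export const bulkRateLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 10,
  standardHeaders: true,
  legacyHeaders: false,
});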
res.setHeader("Content-Type", "application/json");
res.setHeader(
  "Content-Disposition",
  `attachment; filename="grainguard-export-${tenantId}-${new Date().toISOString().slice(0, 10)}.json"`
);
return res.json(exportData);
Mark the export response as non-cacheable.
This attachment contains sensitive tenant data, but the response never sets Cache-Control: no-store. Browsers and intermediaries can retain the file unless caching is explicitly disabled.
🛡️ Suggested header hardening
 res.setHeader("Content-Type", "application/json");
+res.setHeader("Cache-Control", "private, no-store, max-age=0");
+res.setHeader("Pragma", "no-cache");
 res.setHeader(
   "Content-Disposition",
   `attachment; filename="grainguard-export-${tenantId}-${new Date().toISOString().slice(0, 10)}.json"`
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/routes/account.ts` around lines 161 - 166, The export
response currently sets Content-Type and Content-Disposition but does not
prevent caching; update the handler around where res.setHeader(...) and return
res.json(exportData) are called to add cache-busting headers (e.g. set
"Cache-Control: no-store, no-cache, must-revalidate", "Pragma: no-cache", and
"Expires: 0") before returning; locate the logic in the account export route
(the block that calls res.setHeader(...) and res.json(exportData)) and insert
these headers so the exported tenant file is not stored by browsers or
intermediaries.
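If the same headers end up on other sensitive responses, a small helper keeps them consistent — a hypothetical sketch, not code from this PR:

import { Response } from "express";

// Illustrative helper applying the cache-busting headers suggested above.
function setNoStore(res: Response): void {
  res.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");
  res.setHeader("Pragma", "no-cache");
  res.setHeader("Expires", "0");
}

// Usage in the export handler, before res.json(exportData):
// setNoStore(res);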
accountRouter.get(
  "/account/me",
  authMiddleware,
  apiRateLimiter,
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
In general, to fix missing rate limiting on an Express route that performs database access, we should attach a proper rate-limiting middleware (for example, using the well-known express-rate-limit package) to that route or to a router that encloses it. This middleware tracks requests per client (typically per IP or auth identifier) over a sliding window and rejects or slows excessive requests, preventing easy exploitation for denial-of-service.
For this specific file, the simplest, least-disruptive fix that preserves existing behavior is to define a concrete rate limiter using express-rate-limit in apps/gateway/src/routes/account.ts and use it instead of the currently imported apiRateLimiter on this route. Because we cannot modify files other than the one shown, we’ll (1) add an import of express-rate-limit, (2) define a local accountMeRateLimiter configured to allow a reasonable burst (for example, 60 requests per 15 minutes per IP) and (3) replace apiRateLimiter in the /account/me route middleware chain with this new limiter. This guarantees that the /account/me route is clearly and directly protected by a recognized rate-limiting middleware, satisfying the security requirement without changing the functional logic of the handler itself.
Concretely:
- Edit the import section in apps/gateway/src/routes/account.ts to import rateLimit from express-rate-limit.
- Directly below the accountRouter creation, define const accountMeRateLimiter = rateLimit({...}) with a sensible windowMs, max, and standardHeaders/legacyHeaders options.
- Update the accountRouter.get("/account/me", ...) call to use accountMeRateLimiter instead of apiRateLimiter. All other code in the handler stays unchanged.
@@ -2,15 +2,23 @@
 import { authMiddleware } from "../middleware/auth";
 import { apiRateLimiter } from "../middleware/rateLimiting";
 import { writePool as pool } from "../database/db";
+import rateLimit from "express-rate-limit";

 export const accountRouter = Router();

+const accountMeRateLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  standardHeaders: true,
+  legacyHeaders: false,
+});

 // ── GET /account/me ─────────────────────────────────────────────────────────
 // Returns the current user's profile and tenant info.
 accountRouter.get(
   "/account/me",
   authMiddleware,
-  apiRateLimiter,
+  accountMeRateLimiter,
   async (req: Request, res: Response) => {
     const tenantId = req.user!.tenantId;
     const userId = req.user!.sub;
accountRouter.delete(
  "/account/me",
  authMiddleware,
  apiRateLimiter,
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
Copilot could not generate an autofix suggestion
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
  alertRules: alertsResult.rows,
  auditEvents: auditResult.rows,
  apiKeys: keysResult.rows,
};
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
Copilot could not generate an autofix suggestion
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
…haining syntax
- billing.ts: add apiRateLimiter to GET /billing/subscription, POST /billing/checkout, POST /billing/portal to resolve CodeQL js/missing-rate-limiting alerts
- performance-budget.js: replace body?.errors optional chaining with body && body.errors for compatibility with older k6/Babel versions in CI
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
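The k6 change amounts to swapping an ES2020 feature for an explicit guard — a sketch of the pattern (body stands in for the parsed response, not a variable from the repo):

// Before: requires optional-chaining support in the k6/Babel toolchain.
// const hasErrors = body?.errors;

// After: parses on older toolchains and is equivalent for truthiness checks.
declare const body: { errors?: unknown } | null;
const hasErrors = body && body.errors;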
billingRouter.get(
  "/billing/subscription",
  authMiddleware,
  apiRateLimiter,
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
In general, the fix is to wrap this expensive route with a concrete, well-defined rate limiter so that bursts of requests cannot exhaust DB resources or Stripe’s API quota. The typical approach in Express is to use express-rate-limit to define a limiter (with a sensible window and max count) and then apply it as a middleware for this specific route.
Concretely for apps/gateway/src/routes/billing.ts, we will:
- Import rateLimit from the express-rate-limit package at the top of the file.
- Define a subscriptionRateLimiter constant configured with appropriate values (for example, a 15-minute window and a lowish per-IP cap, e.g., 30 or 50 calls).
- Apply this new subscriptionRateLimiter middleware to the /billing/subscription route, in addition to the existing authMiddleware and apiRateLimiter. This keeps all current behavior while adding an explicit and recognizable rate-limiting layer for this specific route.
We will only modify the shown regions: the import section and the billingRouter.get("/billing/subscription", ...) declaration.
@@ -1,6 +1,7 @@
 import { Router, Request, Response } from "express";
 import Stripe from "stripe";
 import amqp from "amqplib";
+import rateLimit from "express-rate-limit";
 import { writePool as pool } from "../lib/db";
 import { authMiddleware } from "../lib/auth";
 import { writeAuditLog } from "../lib/audit";
@@ -9,6 +10,13 @@
 const RABBITMQ_URL = process.env.RABBITMQ_URL || "amqp://grainguard:grainguard@rabbitmq:5672/grainguard";
 const STRIPE_BILLING_QUEUE = "grainguard.stripe.billing";

+const subscriptionRateLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 50, // limit each IP to 50 requests per window
+  standardHeaders: true,
+  legacyHeaders: false,
+});

 async function publishToStripeQueue(payload: object): Promise<void> {
   let conn: Awaited<ReturnType<typeof amqp.connect>> | null = null;
   let ch: amqp.ConfirmChannel | null = null;
@@ -64,6 +72,7 @@
   "/billing/subscription",
   authMiddleware,
   apiRateLimiter,
+  subscriptionRateLimiter,
   async (req: Request, res: Response) => {
     const { rows } = await pool.query(
       `SELECT tb.stripe_customer_id,
billingRouter.post(
  "/billing/checkout",
  authMiddleware,
  apiRateLimiter,
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
In general, to fix missing or insufficient rate limiting on an expensive route, you add a dedicated rate-limiting middleware that enforces an appropriate threshold for that specific endpoint, rather than relying only on a global or generic limiter. In Express, this is typically done using a library such as express-rate-limit, configured with a suitable windowMs and max for the operation in question.
For this specific /billing/checkout route in apps/gateway/src/routes/billing.ts, the best minimal fix without changing existing behavior is:
- Import rateLimit from the express-rate-limit package at the top of the file.
- Define a new limiter instance specifically for billing checkout, e.g. billingCheckoutRateLimiter, with stricter limits (for example, a low number of requests per minute per IP or per user).
- Apply this new middleware in addition to the existing authMiddleware and apiRateLimiter in the /billing/checkout route's middleware array.
Concretely:
- In apps/gateway/src/routes/billing.ts, add an import for express-rate-limit.
- Just after the queue/Stripe constants or near other middleware-related declarations, define const billingCheckoutRateLimiter = rateLimit({ ... }).
- Update the billingRouter.post("/billing/checkout", ...) declaration so that its middleware chain is authMiddleware, apiRateLimiter, billingCheckoutRateLimiter, async (req, res) => { ... }.
This keeps existing functionality intact while adding an extra safeguard specifically tailored to this sensitive billing operation.
@@ -5,10 +5,18 @@
 import { authMiddleware } from "../lib/auth";
 import { writeAuditLog } from "../lib/audit";
 import { apiRateLimiter } from "../middleware/rateLimiting";
+import rateLimit from "express-rate-limit";

 const RABBITMQ_URL = process.env.RABBITMQ_URL || "amqp://grainguard:grainguard@rabbitmq:5672/grainguard";
 const STRIPE_BILLING_QUEUE = "grainguard.stripe.billing";

+const billingCheckoutRateLimiter = rateLimit({
+  windowMs: 1 * 60 * 1000, // 1 minute
+  max: 5, // limit each IP/user to 5 checkout requests per minute
+  standardHeaders: true,
+  legacyHeaders: false,
+});

 async function publishToStripeQueue(payload: object): Promise<void> {
   let conn: Awaited<ReturnType<typeof amqp.connect>> | null = null;
   let ch: amqp.ConfirmChannel | null = null;
@@ -124,6 +128,7 @@
   "/billing/checkout",
   authMiddleware,
   apiRateLimiter,
+  billingCheckoutRateLimiter,
   async (req: Request, res: Response) => {
     const { plan } = req.body as { plan: string };
billingRouter.post(
  "/billing/portal",
  authMiddleware,
  apiRateLimiter,
Check failure
Code scanning / CodeQL
Missing rate limiting High
Copilot Autofix
AI 29 days ago
To fix the problem, the webhook endpoint should be protected by rate limiting just like the other billing routes. The existing apiRateLimiter middleware is already imported and used for /billing/checkout and /billing/portal, so the best fix is to apply the same middleware in front of the Stripe webhook handler without changing its core behavior.
Concretely, instead of exporting stripeWebhookHandler as a bare function, we should expose it as a route on billingRouter with apiRateLimiter in the middleware chain, for example via billingRouter.post("/billing/stripe-webhook", apiRateLimiter, stripeWebhookHandler). Since we cannot see the router setup here, and to avoid altering exports or wiring that may be used elsewhere, an even less intrusive change is to wrap the existing stripeWebhookHandler logic in a new, rate-limited handler and export that wrapper. This preserves the external interface (the exported symbol name remains stripeWebhookHandler) while adding rate limiting in-process.
Implementation steps within apps/gateway/src/routes/billing.ts:
- Leave the existing logic inside stripeWebhookHandler untouched but move it into a new internal function, e.g. async function stripeWebhookHandlerCore(req, res).
- Create a new exported stripeWebhookHandler that is an Express-style handler applying apiRateLimiter manually: invoke apiRateLimiter(req, res, next) and, in its callback, call stripeWebhookHandlerCore. This mimics how Express would apply the middleware without requiring router-level changes.
- Ensure imports remain unchanged; we already import apiRateLimiter, so no new imports are needed.
This adds rate limiting to the webhook handler while keeping its external behavior and signature compatible.
@@ -220,7 +220,7 @@
 }
 );

-export async function stripeWebhookHandler(req: Request, res: Response) {
+async function stripeWebhookHandlerCore(req: Request, res: Response) {
   const sig = req.headers["stripe-signature"] as string;

   let event: Stripe.Event;
@@ -248,3 +248,13 @@
     return res.status(500).json({ error: "queue_publish_failed" });
   }
 }

+export function stripeWebhookHandler(req: Request, res: Response) {
+  apiRateLimiter(req, res, (err?: any) => {
+    if (err) {
+      // If the rate limiter passes an error, delegate to Express-style error handling
+      return res.status(429).json({ error: "rate_limited" });
+    }
+    void stripeWebhookHandlerCore(req, res);
+  });
+}
Summary by CodeRabbit
New Features
Bug Fixes
Documentation