Skip to content

Optimize Costs for Staging Deployment#3238

Merged
arkid15r merged 25 commits intoOWASP:feature/nest-zappa-migrationfrom
rudransh-shrivastava:feature/nest-zappa-migration-cost
Jan 10, 2026
Merged

Optimize Costs for Staging Deployment#3238
arkid15r merged 25 commits intoOWASP:feature/nest-zappa-migrationfrom
rudransh-shrivastava:feature/nest-zappa-migration-cost

Conversation

@rudransh-shrivastava
Copy link
Collaborator

@rudransh-shrivastava rudransh-shrivastava commented Jan 7, 2026

Proposed change

Resolves #3216

Refactor frontend/alb to be a module. Route traffic to lambda and ECS Frontend Tasks based on path.
Add conditions for :

  • Creating VPC endpoints.
  • Using Fargate Spot.

Add a minimal staging configuration.

Checklist

  • Required: I read and followed the contributing guidelines
  • Required: I ran make check-test locally and all tests passed
  • I used AI for code, documentation, or tests in this PR

@rudransh-shrivastava rudransh-shrivastava linked an issue Jan 7, 2026 that may be closed by this pull request
2 tasks
@github-actions github-actions bot added docs Improvements or additions to documentation backend labels Jan 7, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 7, 2026

Summary by CodeRabbit

  • New Features

    • Application Load Balancer with HTTPS, path-based routing, and ALB DNS output.
    • Fargate Spot option for frontend and backend tasks.
    • Toggleable VPC endpoints (CloudWatch, ECR, S3, Secrets Manager, SSM).
  • Infrastructure Improvements

    • Improved ECS networking (public/private subnet support, optional public IPs).
    • More explicit ECR image upload and deployment flow.
  • Configuration Updates

    • Simplified cookie/security parameters to framework defaults.
    • Enhanced allowed origins/hosts and frontend URL/domain handling.
  • Documentation

    • Updated deployment README for ALB/ECR-based workflow.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Removes CSRF/session cookie SSM/settings, adds a reusable ALB module and ALB/staging wiring, makes VPC endpoints conditional, updates ECS to capacity providers and subnet/public-IP options, adjusts frontend/ECR/ECR flows and docs, and adds multiple Terraform variable/outputs changes across modules.

Changes

Cohort / File(s) Summary
Backend config
backend/settings/staging.py
Removed CSRF/session cookie attributes: CSRF_COOKIE_HTTPONLY, CSRF_COOKIE_SAMESITE, CSRF_COOKIE_SECURE, SESSION_COOKIE_SAMESITE.
Zappa & spellcheck
backend/zappa_settings.example.json, cspell/custom-dict.txt
Added apigateway_enabled: false to staging zappa example; added hcl token to cspell dictionary.
Docs
infrastructure/README.md
Rewrote deployment instructions to reflect ALB/ECR workflow, ECR pushes, S3/ECS fixtures, and known issues.
ALB module
infrastructure/modules/alb/...
.../main.tf, .../outputs.tf, .../variables.tf
New ALB module: optional ACM, HTTP/HTTPS listeners, Lambda target support, S3 access logging, path-based routing, validations, and new outputs.
Frontend module
infrastructure/modules/frontend/...
.../main.tf, .../outputs.tf, .../variables.tf
Removed embedded ALB/ACM; frontend ECS now uses var.target_group_arn, exposes use_fargate_spot, and outputs ECS cluster/service info; removed several ALB vars/outputs.
ECS core & task modules
infrastructure/modules/ecs/..., infrastructure/modules/ecs/modules/task/...
Added aws_ecs_cluster_capacity_providers; propagate assign_public_ip, subnet_ids, use_fargate_spot; task module uses capacity_provider_strategy and subnet_ids (replacing private_subnet_ids).
Cache module
infrastructure/modules/cache/main.tf
Narrowed random_password.redis_auth_token.override_special character set.
Networking (top-level)
infrastructure/modules/networking/...
main.tf, variables.tf, outputs.tf
Added per-endpoint flags for CloudWatch/ECR/S3/SecretsManager/SSM; gate vpc_endpoint module by OR of flags; outputs now handle absent endpoints safely.
VPC endpoint module
infrastructure/modules/networking/modules/vpc-endpoint/...
Converted endpoints and SG to count-based conditional resources; added local.any_interface_endpoint and per-endpoint create_* flags; adjusted indexed references and S3 route associations.
Security module
infrastructure/modules/security/...
main.tf, variables.tf
Added create_vpc_endpoint_rules guard and precondition; endpoint-related SG rules are now conditional; vpc_endpoint_sg_id made nullable with default.
Parameters / SSM
infrastructure/modules/parameters/...
main.tf, outputs.tf, variables.tf
Removed SSM params for CSRF/session cookie settings; added allowed_origins and nextauth_url variables; updated allowed_hosts description and several SSM parameter values/outputs.
Staging orchestration & examples
infrastructure/staging/...
main.tf, variables.tf, outputs.tf, terraform.tfvars.example
Wired new alb module and VPC endpoint toggles; propagated assign_public_ip, subnet_ids/ecs_use_public_subnets, use_fargate_spot, lambda inputs; adjusted staging outputs and example vars.
Frontend ALB submodule removals
infrastructure/modules/frontend/modules/alb/...
Removed ALB-specific variables and outputs that previously exposed ALB/ACM/target group details.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • kasya
  • arkid15r
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Optimize Costs for Staging Deployment' clearly summarizes the main objective of the pull request—cost optimization through conditional resource creation and minimal configuration for staging.
Description check ✅ Passed The description is related to the changeset, referencing issue #3216, mentioning refactored ALB module, traffic routing, VPC endpoint conditions, Fargate Spot usage, and minimal staging configuration.
Linked Issues check ✅ Passed The PR successfully implements the objectives from issue #3216: conditional resource creation (VPC endpoints), Fargate Spot options, ALB refactoring, and minimal staging configuration are all present in the code changes.
Out of Scope Changes check ✅ Passed All changes are scoped to the cost optimization objectives: ALB module creation, ECS/networking/security/parameter module updates for conditional resources, Fargate Spot support, staging configuration, and related documentation updates are all in scope.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c3fd81 and fa5b512.

📒 Files selected for processing (1)
  • infrastructure/staging/terraform.tfvars.example
🚧 Files skipped from review as they are similar to previous changes (1)
  • infrastructure/staging/terraform.tfvars.example
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run frontend e2e tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
infrastructure/modules/cache/main.tf (1)

42-48: Coordinate deployment when regenerating Redis auth token.

Changing override_special will trigger Terraform to regenerate the Redis auth token, which:

  1. Updates the SSM parameter with a new password
  2. Changes the ElastiCache auth token on next apply
  3. Requires applications to restart to pick up the new password from SSM

The character set reduction (!$^- instead of !&#$^<>-) is sensible—the removed characters (&, #, <, >) cause escaping issues in the rediss:// URI scheme used by Django's Redis connection string. Ensure all dependent services are redeployed after terraform apply to avoid connection failures.

infrastructure/modules/security/main.tf (3)

116-125: Critical: Potential null reference when VPC endpoint rules are enabled.

The count condition checks var.create_vpc_endpoint_rules, but the rule references var.vpc_endpoint_sg_id which can be null (default = null per variables.tf). If create_vpc_endpoint_rules is true but vpc_endpoint_sg_id is null, Terraform will fail validation because source_security_group_id cannot accept null.

🔧 Proposed fix: Add null check to count condition
 resource "aws_security_group_rule" "ecs_to_vpc_endpoints" {
-  count                    = var.create_vpc_endpoint_rules ? 1 : 0
+  count                    = var.create_vpc_endpoint_rules && var.vpc_endpoint_sg_id != null ? 1 : 0
   description              = "Allow HTTPS to VPC endpoints"
   from_port                = 443
   protocol                 = "tcp"
   security_group_id        = aws_security_group.ecs.id
   source_security_group_id = var.vpc_endpoint_sg_id
   to_port                  = 443
   type                     = "egress"
 }

Alternatively, add a validation constraint in variables.tf to ensure vpc_endpoint_sg_id is non-null when create_vpc_endpoint_rules is true.


147-156: Critical: Potential null reference when VPC endpoint rules are enabled.

Same issue as lines 116-125: if create_vpc_endpoint_rules is true but vpc_endpoint_sg_id is null, this rule will fail validation.

🔧 Proposed fix: Add null check to count condition
 resource "aws_security_group_rule" "frontend_to_vpc_endpoints" {
-  count                    = var.create_vpc_endpoint_rules ? 1 : 0
+  count                    = var.create_vpc_endpoint_rules && var.vpc_endpoint_sg_id != null ? 1 : 0
   description              = "Allow HTTPS to VPC endpoints"
   from_port                = 443
   protocol                 = "tcp"
   security_group_id        = aws_security_group.frontend.id
   source_security_group_id = var.vpc_endpoint_sg_id
   to_port                  = 443
   type                     = "egress"
 }

168-177: Critical: Potential null reference when VPC endpoint rules are enabled.

Same issue as lines 116-125 and 147-156: if create_vpc_endpoint_rules is true but vpc_endpoint_sg_id is null, this rule will fail validation.

🔧 Proposed fix: Add null check to count condition
 resource "aws_security_group_rule" "lambda_to_vpc_endpoints" {
-  count                    = var.create_vpc_endpoint_rules ? 1 : 0
+  count                    = var.create_vpc_endpoint_rules && var.vpc_endpoint_sg_id != null ? 1 : 0
   description              = "Allow HTTPS to VPC endpoints"
   from_port                = 443
   protocol                 = "tcp"
   security_group_id        = aws_security_group.lambda.id
   source_security_group_id = var.vpc_endpoint_sg_id
   to_port                  = 443
   type                     = "egress"
 }
infrastructure/modules/networking/main.tf (1)

168-180: Document the cost-optimized staging configuration when NAT Gateway is disabled.

The dynamic route block correctly implements conditional NAT routing. The staging infrastructure properly supports running ECS tasks in public subnets via the ecs_use_public_subnets variable (defaults to true), with corresponding assign_public_ip configuration. However, the infrastructure README lacks documentation on:

  1. When ecs_use_public_subnets = false, ECS tasks run in private subnets without NAT egress—document this requires VPC endpoints (S3 is enabled by default) or Lambda-based operations only.
  2. The create_nat_gateway variable behavior and cost implications.
  3. When and how to use this cost-optimized staging configuration.
🤖 Fix all issues with AI agents
In @infrastructure/modules/alb/variables.tf:
- Around line 12-22: Add a validation block to the existing enable_https
variable so Terraform enforces that domain_name is set when enable_https is
true: update the variable "enable_https" definition to include a validation with
condition that allow either enable_https is false or var.domain_name != null and
a clear error_message such as "domain_name must be provided when enable_https is
true."; this ensures the ACM certificate logic that depends on variable
"domain_name" (and resources referencing aws_acm_certificate.main) cannot be
triggered without the required domain_name.

In @infrastructure/modules/security/variables.tf:
- Around line 13-17: The variables create_vpc_endpoint_rules and
vpc_endpoint_sg_id have a logical dependency that isn’t enforced; when
create_vpc_endpoint_rules is true vpc_endpoint_sg_id must be non-null. Fix by
adding a validation mechanism: either add a validation block for
vpc_endpoint_sg_id (and document the constraint) and implement a locals-based
cross-variable precondition in the calling module, or change resource/module
count/conditional logic in main.tf to only reference vpc_endpoint_sg_id when
create_vpc_endpoint_rules is true; reference the variable names
create_vpc_endpoint_rules and vpc_endpoint_sg_id to locate where to add the
validation/precondition or conditional guard.
🧹 Nitpick comments (6)
infrastructure/modules/frontend/variables.tf (1)

90-93: Document Fargate Spot implications for frontend availability.

The use_fargate_spot flag enables using Fargate Spot capacity, which can reduce costs by up to 70% but comes with trade-offs: tasks can be interrupted with 2-minute notice when capacity is reclaimed. For the frontend (user-facing service), consider:

  • Whether 2-minute interruption windows are acceptable for your SLA
  • Implementing graceful shutdown handling
  • Setting appropriate task counts to handle potential interruptions
  • Monitoring spot interruption rates

The default of false is appropriate for production-like environments.

infrastructure/README.md (2)

246-268: Clear database setup instructions with good operational detail.

The new section provides explicit step-by-step instructions for uploading fixtures and running ECS tasks through the AWS Console. The level of detail (cluster names, networking configuration, security groups) is helpful for operators unfamiliar with the infrastructure.

Optional: Fix markdown list indentation

Static analysis flagged list indentation issues (MD007). While minor, fixing them improves readability:

-  - Upload the fixture present in `backend/data` to `nest-fixtures` bucket using the following command:
+- Upload the fixture present in `backend/data` to `nest-fixtures` bucket using the following command:

Apply similar fixes to lines 256-268 for consistent 0-space or 2-space indentation throughout.


342-346: Document known Zappa issue with proper link formatting.

The "Known Issues" section is valuable for operators. However, the bare URL on Line 346 should be formatted as a markdown link for better readability and to address the static analysis warning.

Fix bare URL
-Reference: https://github.com/zappa/Zappa/issues/939
+Reference: [Zappa Issue #939](https://github.com/zappa/Zappa/issues/939)
infrastructure/modules/ecs/main.tf (1)

19-28: LGTM! FARGATE_SPOT reduces compute costs for staging.

The new capacity provider configuration enables FARGATE_SPOT when var.use_fargate_spot is true, which can reduce costs by approximately 70% compared to on-demand FARGATE. The configuration appropriately sets base = 0 and weight = 1, making SPOT the only capacity provider when enabled.

Note that FARGATE_SPOT instances can be interrupted with 2 minutes notice, so ensure all ECS tasks are designed to handle interruptions gracefully (e.g., idempotent operations, checkpointing for long-running tasks).

infrastructure/staging/variables.tf (1)

13-17: Consider aligning NAT gateway default with ECS public subnet configuration.

With ecs_use_public_subnets defaulting to true, ECS tasks will have direct internet access via public subnets. If no other resources require NAT gateway access (e.g., Lambda in private subnets without VPC endpoints), consider defaulting create_nat_gateway to false for additional cost savings.

This may be intentional for flexibility, but worth verifying the intended staging configuration.

Also applies to: 160-164

infrastructure/modules/alb/main.tf (1)

66-83: Consider enabling drop_invalid_header_fields for security hardening.

Adding drop_invalid_header_fields = true prevents HTTP request smuggling and header injection attacks by dropping requests with invalid HTTP headers.

🔒 Proposed security enhancement
 resource "aws_lb" "main" {
   depends_on                 = [aws_s3_bucket_policy.alb_logs]
+  drop_invalid_header_fields = true
   enable_deletion_protection = false
   internal                   = false
   load_balancer_type         = "application"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3170ed and a101448.

📒 Files selected for processing (30)
  • backend/settings/staging.py
  • backend/zappa_settings.example.json
  • cspell/custom-dict.txt
  • infrastructure/README.md
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/alb/outputs.tf
  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/cache/main.tf
  • infrastructure/modules/ecs/main.tf
  • infrastructure/modules/ecs/modules/task/main.tf
  • infrastructure/modules/ecs/modules/task/variables.tf
  • infrastructure/modules/ecs/variables.tf
  • infrastructure/modules/frontend/main.tf
  • infrastructure/modules/frontend/outputs.tf
  • infrastructure/modules/frontend/variables.tf
  • infrastructure/modules/networking/main.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/main.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/outputs.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/variables.tf
  • infrastructure/modules/networking/outputs.tf
  • infrastructure/modules/networking/variables.tf
  • infrastructure/modules/parameters/main.tf
  • infrastructure/modules/parameters/outputs.tf
  • infrastructure/modules/parameters/variables.tf
  • infrastructure/modules/security/main.tf
  • infrastructure/modules/security/variables.tf
  • infrastructure/staging/main.tf
  • infrastructure/staging/outputs.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
💤 Files with no reviewable changes (2)
  • backend/settings/staging.py
  • infrastructure/modules/frontend/outputs.tf
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • backend/zappa_settings.example.json
  • infrastructure/modules/parameters/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/modules/parameters/outputs.tf
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • backend/zappa_settings.example.json
  • infrastructure/modules/parameters/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
  • infrastructure/staging/main.tf
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/README.md
  • infrastructure/staging/variables.tf
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/README.md
📚 Learning: 2025-12-26T06:09:08.868Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 3041
File: .github/workflows/run-ci-cd.yaml:233-243
Timestamp: 2025-12-26T06:09:08.868Z
Learning: For the OWASP/Nest repository, Redis image versions should remain consistent across all environments (production, staging, local, E2E, and CI/CD E2E tests). When upgrading Redis, update all docker-compose files and CI/CD workflow configurations together to maintain environment parity.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-10-17T15:25:55.624Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/providers.tf:1-3
Timestamp: 2025-10-17T15:25:55.624Z
Learning: The infrastructure code in the OWASP/Nest repository (infrastructure/ directory) is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/README.md
🪛 Checkov (3.2.334)
infrastructure/modules/alb/main.tf

[medium] 66-83: Ensure that ALB drops HTTP headers

(CKV_AWS_131)


[medium] 85-96: Ensure ALB protocol is HTTPS

(CKV_AWS_2)


[high] 85-96: Ensure that load balancer is using at least TLS 1.2

(CKV_AWS_103)


[medium] 131-152: Ensure AWS Load Balancer doesn't use HTTP protocol

(CKV_AWS_378)

🪛 markdownlint-cli2 (0.18.1)
infrastructure/README.md

155-155: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


156-156: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


163-163: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


189-189: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


215-215: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


248-248: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


256-256: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


257-257: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


258-258: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


259-259: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


260-260: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


261-261: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


262-262: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


263-263: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


264-264: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


265-265: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


266-266: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


267-267: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


268-268: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


274-274: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


288-288: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


292-292: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


346-346: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (53)
cspell/custom-dict.txt (1)

76-76: Addition of "hcl" is appropriate and well-placed.

The addition of "hcl" (HashiCorp Configuration Language) to the custom dictionary is contextually appropriate for this infrastructure-focused PR. The alphabetical ordering is maintained correctly between "gunicorn" and "heroui".

infrastructure/modules/ecs/modules/task/variables.tf (3)

1-5: LGTM! Cost optimization variable added.

The assign_public_ip variable enables ECS tasks to run in public subnets without requiring a NAT gateway, which is a direct cost optimization for staging environments.


103-107: Fargate Spot variable enables key cost savings.

The use_fargate_spot variable is central to the cost optimization goal. Fargate Spot provides significant savings but with the trade-off of potential task interruptions. The default of false is appropriate for safety.


71-74: Variable rename successfully updated across all module invocations.

The refactor from private_subnet_ids to subnet_ids is complete. All task module invocations in staging/main.tf (lines 48, 65, 86) have been properly updated to use the new variable name. The networking module's private_subnet_ids outputs are correctly referenced as value sources for the renamed parameter.

infrastructure/modules/alb/outputs.tf (4)

1-14: ACM certificate outputs properly conditionalized.

The conditional logic correctly returns ACM certificate details only when HTTPS is enabled and a domain name is provided, otherwise returning appropriate null/empty defaults.


16-29: ALB outputs look correct.

The ALB outputs correctly expose the core ALB attributes without unnecessary conditionals, as the ALB itself is always created.


31-44: Listener outputs correctly handle HTTP/HTTPS scenarios.

The listener outputs properly differentiate between:

  • HTTPS enabled: HTTP listener redirects to HTTPS, HTTPS listener serves traffic
  • HTTPS disabled: HTTP listener serves traffic directly, no HTTPS listener

46-49: Lambda target group output appropriately optional.

The output correctly returns the Lambda target group ARN only when Lambda integration is configured, otherwise null.

infrastructure/modules/ecs/variables.tf (3)

1-5: LGTM! Enhanced description provides helpful context.

The assign_public_ip variable at the ECS module level correctly notes that public IPs are required for public subnets, which clarifies when this flag should be enabled.


86-89: Variable rename is consistent across modules.

The subnet_ids variable rename matches the task submodule and supports the flexible subnet deployment model.


108-112: Excellent description for cost optimization flag.

The use_fargate_spot variable description explicitly mentions "for cost savings," making the purpose clear for users of this module.

infrastructure/modules/ecs/modules/task/main.tf (3)

84-84: Subnet reference updated consistently.

The subnet reference correctly uses the renamed var.subnet_ids variable, supporting both public and private subnet deployments.


86-86: Public IP assignment now configurable.

The change from hardcoded false to var.assign_public_ip enables flexible deployment to public subnets without requiring a NAT gateway, supporting the cost optimization goals.


75-81: Capacity provider strategy correctly implements Fargate Spot with proper cluster configuration.

The change from launch_type to capacity_provider_strategy is correct. Setting launch_type = null is required when using capacity providers. The conditional logic properly selects FARGATE_SPOT for cost savings or standard FARGATE based on the flag.

The ECS cluster has both FARGATE and FARGATE_SPOT capacity providers registered in infrastructure/modules/ecs/main.tf (lines 19-28), and the cluster-level default strategy uses the same conditional logic, confirming this task module configuration is properly supported.

backend/zappa_settings.example.json (1)

3-3: ALB integration for Lambda invocation is properly configured.

Disabling API Gateway for cost optimization in staging is sound—the infrastructure correctly configures ALB with a dedicated Lambda target group. Listener rules route backend paths to Lambda, and the necessary permissions are in place for ALB invocation. The alternative routing path is fully wired.

infrastructure/staging/terraform.tfvars.example (1)

1-12: Verify the frontend domain name for staging environment.

The configuration shows cost optimization flags that align well with the PR objectives (Fargate Spot, public subnets, disabled RDS proxy). However, the frontend_domain_name value "https://nest.owasp.dev" appears to be a production domain rather than a staging-specific domain. Verify this is intentional.

infrastructure/modules/networking/outputs.tf (1)

16-19: LGTM: Null-safe output implementation.

The output correctly handles the case where VPC endpoints are conditionally disabled by checking the module's existence before accessing its attributes. This prevents errors when module.vpc_endpoint is an empty list.

infrastructure/modules/networking/modules/vpc-endpoint/outputs.tf (1)

1-4: LGTM: Null-safe output implementation.

The output correctly handles the case where VPC endpoint security groups are conditionally disabled by checking for existence before accessing the resource ID. This aligns with the count-based resource creation pattern.

infrastructure/modules/networking/modules/vpc-endpoint/variables.tf (1)

11-45: LGTM! Well-structured cost optimization flags.

The new boolean flags enable conditional VPC endpoint creation as intended for cost optimization. All default to true, maintaining backwards compatibility. The S3 endpoint description correctly notes it's a Gateway endpoint (which is free), distinguishing it from the interface endpoints that incur hourly charges.

infrastructure/modules/parameters/variables.tf (2)

1-9: Clear distinction between hostname and URL formats.

The updated descriptions appropriately clarify that allowed_hosts expects hostname-only (no protocol), while allowed_origins expects full URLs with protocol. This distinction helps prevent configuration errors.


54-57: All module consumers have been properly updated with the required nextauth_url variable.

The sole consumer of the parameters module (infrastructure/staging/main.tf, line 140) correctly provides nextauth_url with an appropriate fallback value.

infrastructure/modules/frontend/main.tf (2)

106-110: LGTM! Proper capacity provider strategy implementation.

The switch from launch_type to capacity_provider_strategy is the recommended approach. The conditional logic correctly selects between FARGATE_SPOT and FARGATE based on the use_fargate_spot flag. The configuration with base = 0 and weight = 1 ensures all tasks use the selected capacity provider.


115-115: LGTM! Target group ARN correctly externalized.

The change from module.alb.target_group_arn to var.target_group_arn aligns with the frontend module's interface changes, properly externalizing ALB management to a separate module.

infrastructure/modules/frontend/variables.tf (1)

85-88: The frontend module's breaking change has been properly handled.

The target_group_arn variable is now required (no default value), reflecting the shift of ALB management to a separate module. The only consumer at infrastructure/staging/main.tf has been updated and correctly passes target_group_arn = module.alb.frontend_target_group_arn.

infrastructure/modules/parameters/outputs.tf (1)

4-21: This review comment is based on incorrect assumptions and should be dismissed.

The three CSRF-related parameter ARNs mentioned (django_csrf_cookie_httponly, django_csrf_cookie_samesite, django_session_cookie_samesite) were never defined in the infrastructure/modules/parameters module. A comprehensive search of the codebase confirms:

  • No aws_ssm_parameter resources for these Django CSRF parameters exist in main.tf
  • No variable definitions for these parameters exist in variables.tf
  • No references to these parameters exist in any Python backend files
  • Only CSRF-related infrastructure parameter is next_server_csrf_url (a frontend parameter for Next.js)

These parameters could not have been removed from this file because they were never there in the first place.

Likely an incorrect or invalid review comment.

infrastructure/modules/parameters/main.tf (4)

40-50: LGTM! Clear parameter description.

The updated description clarifies that DJANGO_ALLOWED_HOSTS should contain hostnames without the protocol, which helps prevent configuration errors.


254-264: Variable nextauth_url is properly defined. The variable exists in infrastructure/modules/parameters/variables.tf at line 54 with type string and appropriate description. The resource correctly references this variable and the lifecycle configuration ensures it won't be overwritten by external changes.


52-62: No action required.

The var.allowed_origins variable is properly defined in infrastructure/modules/parameters/variables.tf with type string and a clear description matching the expected format (comma-separated URLs with protocol). The implementation is correct.


1-274: The removed SSM parameters have been relocated to Django settings—verify CSRF cookie defaults align with security requirements.

The three removed SSM parameters—django_csrf_cookie_httponly, django_csrf_cookie_samesite, and django_session_cookie_samesite—are now configured directly in backend/settings/base.py:

  • SESSION_COOKIE_SAMESITE = "Lax"
  • SESSION_COOKIE_HTTPONLY = True
  • SESSION_COOKIE_SECURE = True

The CsrfViewMiddleware is active. However, CSRF cookie settings (CSRF_COOKIE_HTTPONLY and CSRF_COOKIE_SAMESITE) are not explicitly configured in Django settings, meaning they rely on Django's defaults. Confirm that Django's default behavior (CSRF_COOKIE_SAMESITE='Strict' in Django 3.1+) meets your security requirements across all environments.

infrastructure/modules/alb/variables.tf (2)

1-72: LGTM! Well-structured variable definitions for the ALB module.

The variables provide good flexibility for cost optimization:

  • Optional HTTPS via enable_https (defaults to false for staging)
  • Optional Lambda backend routing via nullable lambda_arn and lambda_function_name
  • Sensible defaults for health checks, ports, and log retention

29-39: Lambda null handling is properly implemented through conditional resource creation.

The main.tf correctly handles null values using Terraform's count and for_each conditionals. Lambda-related resources (aws_lb_target_group, aws_lb_target_group_attachment, aws_lambda_permission, and listener rules) are only created when var.lambda_arn != null, ensuring optional Lambda backend routing works as intended. The lambda_function_name variable is safely used within the conditionally created aws_lambda_permission block, preventing any null reference issues.

infrastructure/modules/networking/main.tf (2)

45-64: LGTM! Conditional VPC endpoint creation reduces staging costs.

The vpc_endpoint module now uses count-based conditional creation, only instantiating when at least one endpoint type is enabled. This aligns with the PR objective to reduce staging costs by making VPC endpoints optional. The individual endpoint creation flags provide fine-grained control over which endpoints to provision.


138-155: LGTM! Conditional NAT Gateway creation reduces staging costs.

The NAT Gateway and associated Elastic IP now use count to enable conditional creation via var.create_nat_gateway. This is a significant cost optimization, as NAT Gateways incur both hourly charges (~$0.045/hour) and data transfer fees. The indexing [0] on Line 149 is safe because count ensures exactly one resource when the condition is true.

infrastructure/README.md (2)

154-169: LGTM! Clear instructions for ALB-Lambda integration.

The new section provides step-by-step guidance for configuring ALB routing after Zappa deployment, including extracting Lambda details and updating Terraform variables. This reflects the architectural change to route through ALB instead of API Gateway directly.


192-192: The Docker build paths in the documentation are correct. The Dockerfile locations (docker/backend/Dockerfile, docker/frontend/Dockerfile) and build contexts (backend/, frontend/) all exist in the repository structure as documented.

infrastructure/modules/ecs/main.tf (1)

219-240: LGTM! Consistent updates enable public subnet deployment and FARGATE_SPOT.

All six task modules (sync_data_task, owasp_update_project_health_metrics_task, owasp_update_project_health_scores_task, migrate_task, load_data_task, index_data_task) have been consistently updated with:

  1. assign_public_ip = var.assign_public_ip - Enables public IP assignment for tasks in public subnets (avoids NAT Gateway costs)
  2. subnet_ids = var.subnet_ids - Replaces private_subnet_ids, providing flexibility to use public or private subnets
  3. use_fargate_spot = var.use_fargate_spot - Enables FARGATE_SPOT for cost reduction

This uniform approach ensures all tasks can operate in the cost-optimized staging configuration (public subnets without NAT Gateway).

Also applies to: 242-271, 273-294, 296-315, 317-349, 351-370

infrastructure/staging/outputs.tf (3)

1-19: LGTM! Clean output restructuring aligned with ALB module extraction.

The new ACM and ALB outputs properly expose the centralized ALB module's values, and renaming backend_ecr_repository_url to reference module.ecs.ecr_repository_url correctly reflects the module restructuring.


26-29: Simplified frontend_url derivation looks correct.

The logic now derives HTTPS/HTTP scheme directly from var.frontend_domain_name != null, which aligns with the removal of the separate frontend_enable_https variable and the updated description indicating HTTPS is auto-enabled via ACM when domain is set.


36-40: New private_subnet_ids output is useful for downstream consumers.

This output enables other modules or external consumers to reference private subnet IDs without re-querying the networking module.

infrastructure/modules/networking/variables.tf (1)

17-57: Well-structured conditional creation variables for cost optimization.

The new variables provide granular control over VPC endpoint and NAT gateway creation. The conservative defaults (false for paid endpoints) at the module level are appropriate, while the staging configuration can override as needed.

infrastructure/staging/main.tf (3)

20-35: ALB module extraction looks well-structured.

The new centralized ALB module consolidates load balancer, ACM certificate, and Lambda integration logic. The conditional HTTPS enablement (enable_https = var.frontend_domain_name != null) aligns with the simplified configuration approach.


74-88: ECS public subnet and Fargate Spot configuration enables cost optimization.

The conditional subnet assignment (var.ecs_use_public_subnets ? module.networking.public_subnet_ids : module.networking.private_subnet_ids) combined with assign_public_ip allows ECS tasks to run in public subnets without NAT gateway costs when appropriate.


146-158: Ensure create_vpc_endpoint_rules is logically consistent with actual VPC endpoint module creation.

The networking module output properly returns null when no VPC endpoints exist, and the security module's vpc_endpoint_sg_id variable accepts null as a default. However, the security group rules referencing vpc_endpoint_sg_id are only created when create_vpc_endpoint_rules is true. If this flag is true but no VPC endpoint modules actually exist (causing vpc_endpoint_sg_id to be null), the rules will fail. Verify that the logic determining create_vpc_endpoint_rules always aligns with whether vpc_endpoint modules will be created.

infrastructure/staging/variables.tf (2)

37-71: VPC endpoint defaults are well-configured for cost optimization.

The paid interface endpoints default to false while the free S3 gateway endpoint defaults to true. This provides a sensible cost-optimized baseline while allowing production deployments to enable endpoints as needed.


202-206: Fargate Spot defaults are appropriate for staging cost optimization.

Both frontend_use_fargate_spot and use_fargate_spot default to true, leveraging up to 70% cost savings for staging workloads where brief interruptions are acceptable.

Also applies to: 280-284

infrastructure/modules/alb/main.tf (4)

85-96: HTTP listener configuration is appropriately conditional.

The HTTP listener (lines 85-96) is only created when HTTPS is disabled (count = var.enable_https ? 0 : 1), while the redirect listener (lines 98-114) handles HTTP-to-HTTPS redirection when HTTPS is enabled. This design correctly addresses the static analysis concerns about unencrypted HTTP traffic.

Also applies to: 98-114


116-129: HTTPS listener uses a strong TLS policy.

The ELBSecurityPolicy-TLS13-1-2-2021-06 policy enforces TLS 1.2+ with TLS 1.3 support, meeting current security best practices.


179-213: Backend routing uses chunked path patterns to handle AWS listener rule limits.

The for_each with chunklist(local.backend_paths, 5) correctly addresses the AWS limit of 5 path patterns per listener rule condition. The priority calculation (100 + each.key) ensures non-conflicting rule ordering.


215-275: ALB logs bucket has comprehensive security configuration.

The S3 bucket includes:

  • Server-side encryption (AES256)
  • Public access blocking
  • Versioning enabled
  • Lifecycle configuration for log expiration
  • prevent_destroy lifecycle rule
infrastructure/modules/networking/modules/vpc-endpoint/main.tf (4)

12-14: Clean consolidation of interface endpoint flags.

The any_interface_endpoint local correctly aggregates all interface endpoint creation flags, ensuring the shared security group is only created when at least one interface endpoint is needed.


16-35: Shared security group for interface endpoints is efficient.

Creating a single security group for all interface endpoints reduces resource count and simplifies security rule management. The conditional creation (count = local.any_interface_endpoint ? 1 : 0) properly gates the resource.


37-110: Interface endpoint security group references are correctly indexed.

Since each interface endpoint's count condition is a subset of any_interface_endpoint, the aws_security_group.vpc_endpoints[0].id reference is guaranteed to exist when any individual endpoint is created.


76-84: S3 Gateway endpoint correctly omits security group.

Gateway endpoints don't use security groups, and the route table associations properly reference aws_vpc_endpoint.s3[0].id with matching count conditions.

Also applies to: 112-122

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @infrastructure/staging/terraform.tfvars.example:
- Around line 3-5: The variable create_nat_gateway is set to true while
ecs_use_public_subnets is true, which will unnecessarily create NAT gateways;
change create_nat_gateway to false in the terraform.tfvars.example (or make its
default false) for staging, and add/update a short comment or README note
explaining that NAT gateways should only be enabled when using private subnets
or when specific resources require them; also run a quick Terraform plan to
verify no other resources (e.g., route tables, rds_proxy settings) depend on
create_nat_gateway being true before committing.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a101448 and 5460813.

📒 Files selected for processing (1)
  • infrastructure/staging/terraform.tfvars.example
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
🔇 Additional comments (3)
infrastructure/staging/terraform.tfvars.example (3)

4-4: Good cost optimization: RDS Proxy disabled for staging.

Disabling RDS Proxy eliminates ~$15+/month in compute costs while maintaining database connectivity for staging workloads. This aligns well with the PR's cost optimization objectives.


8-8: Excellent cost optimization: Fargate Spot enabled for staging.

Enabling Fargate Spot for both frontend and backend workloads can reduce compute costs by up to 70% compared to on-demand pricing. This is an appropriate trade-off for staging environments where occasional spot interruptions are acceptable.

Also applies to: 12-12


9-10: No changes needed. The lambda_arn and lambda_function_name variables are correctly set to null. Both variables are optional with documented defaults of null and are designed to skip Lambda routing when not provided. This is the intended configuration for staging deployments.

@rudransh-shrivastava rudransh-shrivastava force-pushed the feature/nest-zappa-migration-cost branch from 33beb85 to 7c9e997 Compare January 7, 2026 16:16
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @infrastructure/modules/alb/main.tf:
- Around line 16-35: The ALB routing local.backend_paths list is missing routes
used by Django (/django-rq/* and /owasp/*), causing those requests to hit the
frontend default instead of the backend; update the locals block that defines
local.backend_paths (and thus local.backend_path_chunks) to include "/django-rq"
and "/django-rq/*" and "/owasp" and "/owasp/*" (or document intentional
exclusion), then re-run chunklist(local.backend_paths, 5) to regenerate
backend_path_chunks so the new paths are included in ALB routing.
🧹 Nitpick comments (4)
infrastructure/staging/terraform.tfvars.example (1)

3-11: Document the cost vs security trade-off for staging configuration.

The example shows ecs_use_public_subnets = false (private subnets) but doesn't explicitly set create_nat_gateway, which defaults to true in variables.tf. This configuration is secure but incurs NAT Gateway costs (~$32+/month per AZ).

For maximum cost savings in staging, consider:

  • ecs_use_public_subnets = true + create_nat_gateway = false (cheapest, tasks get public IPs)
  • Current config: ecs_use_public_subnets = false + create_nat_gateway = true (secure, more expensive)

Consider adding a comment in the example documenting this trade-off, or explicitly showing create_nat_gateway = true to make the cost implications clear.

📝 Suggested documentation addition
+# Cost optimization: Set to true to avoid NAT Gateway costs in staging
+# Security: Set to false to run tasks in private subnets (requires create_nat_gateway = true)
 ecs_use_public_subnets        = false
+# Required for private subnet internet access (ECS image pulls, etc.)
+# Can be set to false if ecs_use_public_subnets = true
+create_nat_gateway            = true
infrastructure/modules/alb/main.tf (2)

82-99: Add drop_invalid_header_fields to enhance ALB security.

The ALB should drop invalid HTTP headers to prevent header-based attacks. This is a security best practice for application load balancers.

🔒 Proposed security enhancement
 resource "aws_lb" "main" {
   depends_on                 = [aws_s3_bucket_policy.alb_logs]
+  drop_invalid_header_fields = true
   enable_deletion_protection = false
   internal                   = false
   load_balancer_type         = "application"

222-231: Reconsider prevent_destroy lifecycle for testing infrastructure.

The S3 bucket has prevent_destroy = true, which can make it difficult to tear down test environments. This may cause issues when destroying staging infrastructure for cleanup or recreation.

Consider making this conditional based on environment or removing it entirely for staging.

Based on learnings, this infrastructure is intended for testing purposes.

♻️ Suggested conditional prevent_destroy
   lifecycle {
-    prevent_destroy = true
+    prevent_destroy = var.environment == "production" ? true : false
   }

Note: This requires adding environment to the module's input variables.

infrastructure/staging/variables.tf (1)

138-142: Document security implications of public subnet default.

The default ecs_use_public_subnets = true optimizes for cost (avoids NAT Gateway charges) but assigns public IP addresses to ECS tasks, exposing them directly to the internet. This is appropriate for staging cost optimization but represents a security trade-off.

Consider enhancing the description to explicitly mention this trade-off.

📝 Enhanced description
 variable "ecs_use_public_subnets" {
-  description = "Whether to run ECS tasks in public subnets (requires assign_public_ip)."
+  description = "Whether to run ECS tasks in public subnets with public IPs (cost-optimized, avoids NAT). Set to false for private subnets (more secure, requires create_nat_gateway=true)."
   type        = bool
   default     = true
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33beb85 and 49c4999.

📒 Files selected for processing (6)
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/ecs/modules/task/main.tf
  • infrastructure/modules/security/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
🚧 Files skipped from review as they are similar to previous changes (1)
  • infrastructure/modules/alb/variables.tf
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
📚 Learning: 2026-01-07T16:10:40.068Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/staging/terraform.tfvars.example:3-5
Timestamp: 2026-01-07T16:10:40.068Z
Learning: In the OWASP/Nest infrastructure, Lambda functions deployed in VPCs require NAT gateways for internet access. The create_nat_gateway flag should remain true when Lambda functions are part of the deployment, regardless of ECS subnet configuration.

Applied to files:

  • infrastructure/staging/variables.tf
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/staging/variables.tf
  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • infrastructure/staging/variables.tf
  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2026-01-07T16:10:40.068Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/staging/terraform.tfvars.example:3-5
Timestamp: 2026-01-07T16:10:40.068Z
Learning: In Terraform variable files for AWS Lambda deployments in this repository, ensure that NAT access is enabled by keeping create_nat_gateway = true whenever Lambda functions are deployed in a VPC, regardless of ECS subnet configuration. Apply this guideline to all relevant tfvars.example files that configure Lambda/VPC infrastructure.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
🪛 Checkov (3.2.334)
infrastructure/modules/alb/main.tf

[medium] 82-99: Ensure that ALB drops HTTP headers

(CKV_AWS_131)


[medium] 101-112: Ensure ALB protocol is HTTPS

(CKV_AWS_2)


[high] 101-112: Ensure that load balancer is using at least TLS 1.2

(CKV_AWS_103)


[medium] 183-204: Ensure AWS Load Balancer doesn't use HTTP protocol

(CKV_AWS_378)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
🔇 Additional comments (10)
infrastructure/modules/security/main.tf (4)

12-17: Excellent validation using check blocks.

The check block provides early detection of misconfiguration at plan time, ensuring vpc_endpoint_sg_id is provided whenever VPC endpoint rules are enabled. This prevents downstream errors during apply.


123-132: Well-implemented cost optimization pattern.

The conditional creation of the VPC endpoint egress rule allows staging environments to skip VPC endpoints entirely, directly supporting the cost reduction objectives outlined in issue #3216.


154-163: Consistent conditional pattern applied to frontend.

The implementation mirrors the ECS rule pattern, ensuring consistency across all compute resources that interact with VPC endpoints.


175-184: VPC endpoint ingress rules are already in place.

The VPC endpoint security group includes an ingress rule (in infrastructure/modules/networking/modules/vpc-endpoint/main.tf lines 26-35) that allows HTTPS traffic from the entire VPC via cidr_blocks = [var.vpc_cidr]. This ingress rule will accept HTTPS traffic from the ECS, frontend, and Lambda security groups, though it uses a CIDR-based approach rather than source security group references. The egress rules from the three compute resources will function correctly with this configuration.

infrastructure/modules/ecs/modules/task/main.tf (2)

86-86: Good refactor to enable flexible subnet selection.

Renaming from private_subnet_ids to subnet_ids provides the flexibility needed for cost optimization by allowing either private or public subnets. This aligns well with the PR objectives to conditionally create expensive resources like NAT gateways.


84-84: The coordination between subnet type and public IP assignment is already implemented at the infrastructure level through the ecs_use_public_subnets variable in staging/main.tf (lines 77 and 86), which ties both assign_public_ip and subnet_ids to the same configuration decision. The staging configuration correctly demonstrates this: ecs_use_public_subnets = false results in private subnets with assign_public_ip = false, and create_nat_gateway defaults to true for internet access. No additional verification is needed.

Likely an incorrect or invalid review comment.

infrastructure/modules/alb/main.tf (1)

101-112: HTTP listener is intentional for cost optimization.

The Checkov warnings about HTTP protocol and missing TLS are expected here. This HTTP listener is only created when enable_https = false (line 102), which is by design for staging cost optimization (avoiding ACM certificate and domain costs).

The HTTPS listener (lines 132-145) uses TLS 1.3, which exceeds security requirements when HTTPS is enabled.

infrastructure/staging/variables.tf (3)

190-194: Fargate Spot configuration aligns with cost optimization goals.

The frontend_use_fargate_spot and use_fargate_spot variables with default = true effectively support the PR's cost optimization objectives. Fargate Spot can provide up to 70% cost savings for fault-tolerant workloads, making it ideal for staging environments.

Also applies to: 280-284


196-206: Lambda variable documentation is clear.

The lambda_arn and lambda_function_name variables correctly reference Zappa Lambda integration. The descriptions are clear, and defaults of null appropriately indicate optional Lambda integration.

Based on learnings, Lambda functions are managed by Zappa separately and these variables enable wiring to the ALB.


25-59: VPC endpoint flags provide effective cost controls.

The new VPC endpoint feature flags (create_vpc_*_endpoint) with defaults set to false (except S3 Gateway which is free) provide granular cost control. The networking module properly implements these conditionals with appropriate security group and route table associations, ensuring that resources are only created when needed.

When endpoints are disabled, services can still be reached via the NAT Gateway configured in the VPC.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @infrastructure/modules/alb/main.tf:
- Around line 75-103: The ALB should be configured to drop invalid HTTP headers
to mitigate header-based attacks; add the attribute drop_invalid_header_fields =
true to the aws_lb resource "main" (inside the aws_lb.main block) so the load
balancer will discard malformed headers, then run terraform plan/apply to apply
the change.
- Around line 53-64: The ACM cert resource (aws_acm_certificate.main) uses DNS
validation but there is no aws_acm_certificate_validation resource to complete
validation; add an aws_acm_certificate_validation that uses the certificate_arn
from aws_acm_certificate.main and creates/points the required DNS validation
record(s) in Route53 (via a aws_route53_record resource or by referencing a
route53_zone data source) so Terraform waits for validation to complete, and
update any HTTPS listener/resource (e.g., aws_lb_listener) to depend_on the
aws_acm_certificate_validation to ensure the listener is created only after
validation succeeds.
🧹 Nitpick comments (2)
infrastructure/modules/security/main.tf (1)

19-25: LGTM: Precondition logic is correct.

The precondition ensures that vpc_endpoint_sg_id is provided when VPC endpoint rules need to be created, preventing misconfiguration. The condition logic and error message are clear.

Minor note: The precondition is placed on the ALB security group but validates a dependency for ECS, frontend, and lambda rules. While this works, consider placing it on one of the affected rules or using variable validation for better semantic alignment.

infrastructure/modules/alb/main.tf (1)

34-34: Document the rationale for chunking backend paths.

The paths are chunked into groups of 5, which relates to AWS's limit of 5 path patterns per listener rule condition. Consider adding a comment explaining this constraint for future maintainers.

📝 Suggested documentation
+  # AWS ALB listener rules support max 5 path patterns per condition
+  # Chunk the backend paths to comply with this limit
   backend_path_chunks = chunklist(local.backend_paths, 5)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49c4999 and df7ff5c.

📒 Files selected for processing (2)
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/security/main.tf
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
🪛 Checkov (3.2.334)
infrastructure/modules/alb/main.tf

[medium] 75-103: Ensure that ALB drops HTTP headers

(CKV_AWS_131)


[medium] 105-116: Ensure ALB protocol is HTTPS

(CKV_AWS_2)


[high] 105-116: Ensure that load balancer is using at least TLS 1.2

(CKV_AWS_103)


[medium] 187-208: Ensure AWS Load Balancer doesn't use HTTP protocol

(CKV_AWS_378)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (9)
infrastructure/modules/security/main.tf (1)

124-124: No external references found. These security group rules are not referenced elsewhere in the codebase, so the count-based conditional approach is safe.

infrastructure/modules/alb/main.tf (8)

1-14: LGTM! Clean provider configuration.

The Terraform and provider version pinning follows best practices for reproducible infrastructure.


37-51: LGTM! Standard ALB logging setup.

The data sources correctly configure IAM permissions for ALB to write access logs to S3.


66-73: LGTM! Correct Lambda permission setup.

The permission correctly allows the ALB to invoke the Lambda function, with proper source ARN restriction for security.


105-149: LGTM! Well-designed conditional listener configuration.

The listeners are properly structured:

  • HTTP-only mode for development/staging (lines 105-116)
  • HTTP→HTTPS redirect when HTTPS is enabled (lines 118-134)
  • HTTPS with strong TLS 1.3 policy (lines 136-149)

The #NOSONAR comments appropriately acknowledge the intentional HTTP usage for non-production environments. The TLS policy ELBSecurityPolicy-TLS13-1-2-2021-06 is excellent for security.


151-185: LGTM! Routing rules correctly implement path-based splitting.

The listener rules properly route backend API paths to Lambda while allowing frontend paths to go to the default ECS target group. The priority calculation (100 + each.key) ensures non-conflicting rule priorities across the 3 path chunks.


187-224: LGTM! Target groups properly configured for both ECS and Lambda.

The implementation correctly handles:

  • Frontend TG with IP targets for ECS Fargate (lines 187-208)
  • Health checks with appropriate intervals and thresholds
  • Lambda TG without health checks (correct for Lambda targets, lines 210-217)
  • Proper dependency ordering for Lambda attachment (line 223)

The 30-second deregistration delay provides a good balance between fast deployments and graceful connection draining.


226-286: LGTM! S3 logging bucket follows security best practices.

Excellent implementation with:

  • prevent_destroy lifecycle to protect log retention (line 233)
  • Automated lifecycle management for cost control (lines 237-251)
  • Full public access block (lines 258-264)
  • Server-side encryption (lines 266-274)
  • Versioning enabled (lines 276-282)
  • Random suffix for globally unique bucket names (lines 284-286)

This meets security and compliance requirements for ALB access logs.


17-33: No action needed. CSRF paths are required.

Backend code confirms active CSRF functionality: the /csrf endpoint exists at backend/apps/core/api/internal/csrf.py and is routed in backend/settings/urls.py. The PR's removal of CSRF/session cookie settings refers to configuration changes, not elimination of the CSRF endpoint itself. The ALB backend paths list correctly includes /csrf, /csrf/*, /idx, and /graphql/* since they implement CSRF protection and token handling.

Likely an incorrect or invalid review comment.

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 8, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @infrastructure/modules/alb/main.tf:
- Around line 53-64: The ACM certificate aws_acm_certificate.main is created
with validation_method = "DNS" but no validation resources are present, so the
cert will remain pending; add DNS validation by creating
aws_route53_record.cert_validation that iterates over
aws_acm_certificate.main[0].domain_validation_options to create the necessary
records (and introduce var.route53_zone_id if using Route 53), then add
aws_acm_certificate_validation.main referencing aws_acm_certificate.main[0].arn
and the fqdn(s) of the route53 records so the certificate is validated
automatically; if you don’t manage Route 53, instead output the
domain_validation_options for manual DNS entry.
🧹 Nitpick comments (1)
infrastructure/staging/variables.tf (1)

37-41: Default value inconsistency for S3 endpoint between module and staging.

The create_vpc_s3_endpoint defaults to true here (line 40), but defaults to false in infrastructure/modules/networking/variables.tf (line 38). While the staging value will override the module default, this inconsistency could cause confusion. Consider aligning the module default to true since S3 Gateway endpoints are free.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df7ff5c and fed6fe0.

📒 Files selected for processing (8)
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/ecs/main.tf
  • infrastructure/modules/frontend/modules/alb/outputs.tf
  • infrastructure/modules/frontend/modules/alb/variables.tf
  • infrastructure/modules/networking/main.tf
  • infrastructure/modules/networking/variables.tf
  • infrastructure/staging/main.tf
  • infrastructure/staging/variables.tf
💤 Files with no reviewable changes (2)
  • infrastructure/modules/frontend/modules/alb/outputs.tf
  • infrastructure/modules/frontend/modules/alb/variables.tf
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/staging/variables.tf
📚 Learning: 2026-01-07T16:10:40.068Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/staging/terraform.tfvars.example:3-5
Timestamp: 2026-01-07T16:10:40.068Z
Learning: In the OWASP/Nest infrastructure, Lambda functions deployed in VPCs require NAT gateways for internet access. The create_nat_gateway flag should remain true when Lambda functions are part of the deployment, regardless of ECS subnet configuration.

Applied to files:

  • infrastructure/staging/variables.tf
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • infrastructure/staging/variables.tf
  • infrastructure/staging/main.tf
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/modules/alb/main.tf
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/modules/alb/main.tf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (16)
infrastructure/modules/ecs/main.tf (3)

19-28: LGTM! Capacity provider strategy correctly implemented.

The ECS cluster capacity providers setup properly configures both FARGATE and FARGATE_SPOT, with the default strategy conditionally selecting based on var.use_fargate_spot. Setting base = 0 and weight = 1 ensures all tasks use the selected provider.


296-315: Good safeguard: migrate task excluded from Fargate Spot.

Hardcoding use_fargate_spot = false for the migrate task is a sensible choice. Database migrations are critical operations that should not be interrupted by Spot instance reclamation, which aligns with the commit message "never use Fargate SPOT for migrate task."


222-240: Consistent parameter wiring across task modules.

All task modules properly receive the new assign_public_ip, subnet_ids, and use_fargate_spot parameters. The transition from private_subnet_ids to subnet_ids provides flexibility for public/private subnet deployment based on the staging configuration.

Also applies to: 242-271, 273-294, 317-349, 351-369

infrastructure/modules/networking/main.tf (2)

45-64: LGTM! VPC endpoint conditional creation properly implemented.

The count-based gating ensures the vpc_endpoint module is only instantiated when at least one endpoint is requested. Individual endpoint flags are correctly passed through for granular control.


138-153: NAT gateway remains unconditionally created despite cost optimization objectives.

The PR objectives mention conditional creation of NAT gateways to reduce staging costs. However, the NAT gateway (aws_nat_gateway.main) and its EIP (aws_eip.nat) are still always created. The private route table also unconditionally references the NAT gateway.

Based on learnings, Lambda functions deployed in VPCs require NAT gateways for internet access. If Lambda integration is retained, the NAT gateway may indeed be necessary. However, if staging uses ECS tasks in public subnets with public IPs (as the new ecs_use_public_subnets flag suggests), those tasks wouldn't need NAT for outbound traffic.

Is the unconditional NAT gateway creation intentional given the Lambda requirements, or should it be conditionally created similar to VPC endpoints?

Also applies to: 166-175

infrastructure/staging/variables.tf (2)

19-53: Well-structured VPC endpoint toggles for cost optimization.

The VPC endpoint variables provide granular control. Defaulting Interface endpoints (CloudWatch, ECR, SSM, Secrets Manager) to false while keeping the free S3 Gateway endpoint at true is a sensible cost-saving default for staging.


132-136: Cost optimization defaults properly configured.

  • ecs_use_public_subnets = true allows ECS tasks to run with public IPs, avoiding NAT gateway costs
  • frontend_use_fargate_spot = true and use_fargate_spot = true enable Spot pricing for cost savings

These defaults align well with the PR objectives for staging cost reduction.

Also applies to: 184-188, 274-278

infrastructure/modules/networking/variables.tf (1)

17-51: LGTM! Clean variable definitions for VPC endpoint toggles.

The six endpoint creation variables follow consistent naming conventions and have appropriate boolean types with false defaults. This provides the expected flexibility for conditional endpoint creation.

infrastructure/staging/main.tf (4)

20-35: LGTM! ALB module properly wired with Lambda integration.

The ALB module configuration correctly:

  • Conditionally enables HTTPS based on frontend_domain_name presence
  • Passes through Lambda ARN and function name for backend routing
  • Uses appropriate security group and subnet references

74-88: Correct conditional subnet and public IP assignment for ECS.

The logic properly selects public or private subnets based on ecs_use_public_subnets, and assigns public IPs accordingly. This enables the cost optimization of running ECS tasks in public subnets without NAT dependency.


130-134: Verify the "zappa" literal in allowed_hosts.

The allowed_hosts includes a hardcoded "zappa" string. This appears intentional for Zappa Lambda deployments (Django's ALLOWED_HOSTS needs the Lambda handler hostname), but please confirm this is the expected value for your Zappa configuration.


148-160: Correct exclusion of S3 from VPC endpoint security rules.

Line 153 correctly excludes create_vpc_s3_endpoint from the create_vpc_endpoint_rules condition. S3 uses a Gateway endpoint (not Interface), so it doesn't require security group rules.

infrastructure/modules/alb/main.tf (4)

16-35: Good approach: Path chunking for ALB listener rules.

ALB listener rules are limited to 5 path patterns per condition. Using chunklist(local.backend_paths, 5) correctly splits the 14 backend paths into multiple rules while maintaining proper priority ordering.


75-104: Good: Preconditions provide early validation and security header handling.

The lifecycle preconditions ensure configuration consistency:

  • HTTPS requires domain_name
  • Lambda ARN requires function name

The drop_invalid_header_fields = true setting (line 77) improves security by preventing request smuggling attacks via malformed headers.


152-186: LGTM! Backend routing rules correctly implemented for HTTP/HTTPS.

The for_each iteration over path chunks with priority offset (100 + each.key) ensures proper rule ordering. Conditional creation based on lambda_arn and enable_https appropriately gates the rules.


211-225: LGTM! Lambda target group setup is correct.

The Lambda target group and attachment are properly gated on lambda_arn != null, and the attachment correctly depends on the Lambda permission to ensure proper creation order.

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 8, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
infrastructure/staging/variables.tf (2)

132-142: Consider adding a validation or documentation note about NAT gateway requirement.

When ecs_use_public_subnets = false, ECS tasks run in private subnets and need a NAT gateway to pull container images (unless VPC endpoints for ECR are enabled). Based on learnings, Lambda in VPC also requires NAT. Consider adding a validation or a comment clarifying the NAT/VPC endpoint dependency.


196-206: Consider adding ARN format validation for lambda_arn.

When lambda_arn is provided, validating the ARN format could help catch configuration errors early.

♻️ Optional validation to add
 variable "lambda_arn" {
   description = "The ARN of the Zappa Lambda function for backend routing."
   type        = string
   default     = null
+
+  validation {
+    condition     = var.lambda_arn == null || can(regex("^arn:aws:lambda:[a-z0-9-]+:[0-9]+:function:.+$", var.lambda_arn))
+    error_message = "Lambda ARN must be a valid AWS Lambda function ARN."
+  }
 }
infrastructure/staging/terraform.tfvars.example (1)

1-11: Consider documenting the networking dependencies in comments.

This example sets ecs_use_public_subnets = false (running ECS in private subnets), but doesn't show VPC endpoint or NAT gateway configuration. For ECS tasks in private subnets to pull container images, either NAT gateway or ECR VPC endpoints must be enabled.

Consider adding inline comments to clarify the dependencies:

📝 Suggested documentation
 availability_zones            = ["us-east-2a", "us-east-2b", "us-east-2c"]
 aws_region                    = "us-east-2"
 create_rds_proxy              = false
 ecs_use_fargate_spot          = true
+# When false, tasks run in private subnets and require either:
+# - NAT gateway (create_nat_gateway = true), OR
+# - VPC endpoints (create_vpc_ecr_api_endpoint = true, create_vpc_ecr_dkr_endpoint = true)
 ecs_use_public_subnets        = false
 environment                   = "staging"
 frontend_domain_name          = "nest.owasp.dev"
 frontend_use_fargate_spot     = true
+# Set these when using Zappa Lambda backend routing
 lambda_arn                    = null
 lambda_function_name          = null
 project_name                  = "owasp-nest"
infrastructure/staging/main.tf (1)

130-134: Minor: Consider extracting the "zappa" allowed host to a variable.

The hardcoded "zappa" string in allowed_hosts (line 132) appears to be for Zappa health checks. For consistency and documentation purposes, consider extracting this to a local or adding a brief comment explaining its purpose.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fed6fe0 and ddad92d.

📒 Files selected for processing (3)
  • infrastructure/staging/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2026-01-07T16:10:40.068Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/staging/terraform.tfvars.example:3-5
Timestamp: 2026-01-07T16:10:40.068Z
Learning: In Terraform variable files for AWS Lambda deployments in this repository, ensure that NAT access is enabled by keeping create_nat_gateway = true whenever Lambda functions are deployed in a VPC, regardless of ECS subnet configuration. Apply this guideline to all relevant tfvars.example files that configure Lambda/VPC infrastructure.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/main.tf
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
📚 Learning: 2026-01-08T14:29:20.530Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/modules/alb/main.tf:53-64
Timestamp: 2026-01-08T14:29:20.530Z
Learning: In the OWASP/Nest repository's infrastructure/modules/alb module, the ACM certificate is intentionally created without automatic validation resources (aws_acm_certificate_validation or aws_route53_record for validation). Instead, the module outputs acm_certificate_domain_validation_options for manual DNS configuration. This is a deliberate design decision to allow DNS management outside of this Terraform module.

Applied to files:

  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/main.tf
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/staging/variables.tf
📚 Learning: 2026-01-07T16:10:40.068Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3238
File: infrastructure/staging/terraform.tfvars.example:3-5
Timestamp: 2026-01-07T16:10:40.068Z
Learning: In the OWASP/Nest infrastructure, Lambda functions deployed in VPCs require NAT gateways for internet access. The create_nat_gateway flag should remain true when Lambda functions are part of the deployment, regardless of ECS subnet configuration.

Applied to files:

  • infrastructure/staging/variables.tf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run frontend e2e tests
🔇 Additional comments (6)
infrastructure/staging/variables.tf (2)

7-11: LGTM!

The aws_region variable is properly defined with a sensible default matching the availability zones.


19-53: Good approach to cost optimization with VPC endpoint toggles.

The defaults correctly reflect cost considerations: interface endpoints (CloudWatch Logs, ECR, Secrets Manager, SSM) default to false as they incur hourly charges, while the S3 gateway endpoint defaults to true since it's free.

infrastructure/staging/main.tf (4)

20-35: LGTM - ALB module wiring looks correct.

The conditional HTTPS enablement based on frontend_domain_name != null aligns with the variable description stating "HTTPS is auto-enabled via ACM when set."


74-88: LGTM - ECS module conditional networking is well-implemented.

The subnet selection and public IP assignment logic correctly maps ecs_use_public_subnets to the appropriate configuration.


148-160: LGTM - Security module VPC endpoint rules condition is correct.

The condition correctly excludes create_vpc_s3_endpoint since S3 gateway endpoints use route table entries rather than security group rules.


130-146: The database module already implements proper fallback logic for db_proxy_endpoint. The output is defined as a conditional expression: when create_rds_proxy is false, it returns aws_db_instance.main.address (the direct RDS endpoint) instead of null or empty. No action needed.

Likely an incorrect or invalid review comment.

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 8, 2026
coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 8, 2026
@sonarqubecloud
Copy link

sonarqubecloud bot commented Jan 8, 2026

@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review January 8, 2026 16:28
@arkid15r arkid15r merged commit 97e36fc into OWASP:feature/nest-zappa-migration Jan 10, 2026
27 checks passed
@rudransh-shrivastava rudransh-shrivastava deleted the feature/nest-zappa-migration-cost branch January 26, 2026 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize Costs for Staging Deployment

2 participants