Fix infrastructure/update data manager Docker definition#826
Fix infrastructure/update data manager Docker definition#826forstmeier merged 4 commits intomasterfrom
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 41 minutes and 54 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughDocker build flags changed for data_manager; Pulumi notifications refactored to use SNS + typed AnomalySubscription args and topic policy; ECS log group names, task family, container logging, env var, and ensemble service health-check grace updated. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR resolves two previously flagged infrastructure bugs — the Dockerfile opt-level cache-invalidation issue and the missing SNS topic policy for Cost Anomaly Detection — and makes several related improvements: log group name alignment, a new Confidence Score: 5/5Safe to merge — both previously flagged P1 issues are resolved and no new issues were found. All findings from the prior review round are addressed: CARGO_PROFILE_RELEASE_OPT_LEVEL is now an ENV directive consistent across cook and build, and the SNS topic policy with the correct costalerts.amazonaws.com principal and depends_on guard is in place. No new P0/P1 issues were identified in this pass. No files require special attention.
|
| Filename | Overview |
|---|---|
| applications/data_manager/Dockerfile | Reduces parallel build jobs to 1 and promotes CARGO_PROFILE_RELEASE_OPT_LEVEL=1 to an ENV directive, fixing the previously flagged cache invalidation issue between cook and build steps. |
| infrastructure/notifications.py | Adds SNS topic policy granting costalerts.amazonaws.com publish rights, migrates anomaly subscription from raw JSON + email to typed args + SNS, with correct depends_on ordering. |
| infrastructure/compute.py | Renames CloudWatch log groups to include the applications prefix, adds AWS_S3_MODEL_ARTIFACT_PATH env var to the ensemble manager task definition, and extends the ECS health-check grace period to 180s. |
| infrastructure/training.py | Renames task family to tide-runner, replaces hardcoded log group name and region strings with dynamic Pulumi outputs, adds sort_keys for deterministic JSON serialization. |
| infrastructure/pyproject.toml | Bumps pulumi-aws minimum version to 7.7.0, required for the typed AnomalySubscriptionThresholdExpressionArgs used in notifications.py. |
| uv.lock | Lock file updated to reflect the pulumi-aws>=7.7.0 constraint change. |
Reviews (4): Last reviewed commit: "Address pull request #826 feedback: use ..." | Re-trigger Greptile
There was a problem hiding this comment.
Pull request overview
Updates infrastructure and build configuration to address broken billing alert resource definition and adjust the data manager container build behavior.
Changes:
- Fix Cost Explorer anomaly subscription
threshold_expressiondefinition to use Pulumi AWS typed args instead of JSON. - Adjust
applications/data_managerDocker build settings (cargo build parallelism and release opt-level duringcargo chef cook).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| infrastructure/notifications.py | Updates anomaly subscription threshold expression definition for AWS Cost Explorer. |
| applications/data_manager/Dockerfile | Tweaks Rust build configuration in the builder stage (jobs + release opt-level for dependency cook). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/notifications.py`:
- Around line 50-55: The project uses typed Pulumi classes like
AnomalySubscriptionThresholdExpressionArgs and
AnomalySubscriptionThresholdExpressionDimensionArgs (used in
threshold_expression) which require pulumi-aws>=7.7.0; update the dependency
constraint in pyproject.toml from pulumi-aws>=7.4.0 to pulumi-aws>=7.7.0 and
then reinstall or lock dependencies (poetry install/poetry lock or pip tooling)
so the runtime has the Pulumi AWS v7.7.0+ classes available.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4b17d2a9-be0b-4290-9295-137289811fc3
📒 Files selected for processing (2)
applications/data_manager/Dockerfileinfrastructure/notifications.py
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/notifications.py`:
- Around line 56-61: Create and attach an SNS topic policy resource that grants
"sns:Publish" to the service principal "costalerts.amazonaws.com" for the
infrastructure_alerts_topic ARN, and ensure the cost anomaly subscription
depends on that policy; specifically, add an aws.sns.TopicPolicy (or equivalent)
referencing infrastructure_alerts_topic.arn with a statement allowing Action
"sns:Publish" for Principal {"Service":"costalerts.amazonaws.com"}, then add a
depends_on linking the aws.costexplorer.AnomalySubscriptionSubscriberArgs (the
subscribers creation) to that TopicPolicy so the policy exists before the
subscription is created.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 26da53a6-c0cc-4ce8-9f0a-2c327a2d7a7d
📒 Files selected for processing (3)
infrastructure/compute.pyinfrastructure/notifications.pyinfrastructure/training.py
…and SNS topic policy - Move CARGO_PROFILE_RELEASE_OPT_LEVEL=1 to an ENV directive in the builder stage so both cargo chef cook and cargo build --release use the same opt-level. Previously the inline env var only applied to the cook step, causing Cargo to invalidate the chef cache and recompile dependencies at opt-level 3 during the final build. - Add aws.sns.TopicPolicy granting costalerts.amazonaws.com SNS:Publish on the infrastructure_alerts_topic. Without this policy, Cost Anomaly Detection deployments succeed but alerts fail to deliver at runtime. - Update pulumi-aws minimum version from 7.4.0 to 7.7.0 to match the AnomalySubscriptionThresholdExpressionArgs and AnomalySubscriptionThresholdExpressionDimensionArgs classes introduced in that release. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/data_manager/Dockerfile`:
- Line 42: The Dockerfile currently sets CARGO_PROFILE_RELEASE_OPT_LEVEL=1 which
lowers Rust release optimizations; decide whether to restore the default
(opt-level 3) for production or explicitly document and gate the opt-level
change. Either remove or change the CARGO_PROFILE_RELEASE_OPT_LEVEL assignment
to the higher optimization (or make it configurable via build-time ARG) to
preserve runtime performance, or add a clear comment and a README note
describing the trade-off and monitoring expectations so maintainers know why
opt-level=1 was chosen; reference the CARGO_PROFILE_RELEASE_OPT_LEVEL
environment var in the Dockerfile and any build scripts that consume it.
- Line 40: Add a short explanatory comment immediately above or on the same line
as the ENV CARGO_BUILD_JOBS=1 declaration in the Dockerfile explaining why the
build jobs are limited to 1 (e.g., to avoid high parallel memory/CPU usage in CI
or on constrained builder nodes, to ensure deterministic build ordering, or
because of known cargo/rustc concurrency issues), and state whether it’s for
resource constraints, build stability, or compatibility with the build
environment so reviewers understand the intent and can reevaluate later.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 04670c0b-df2c-468e-8257-cfd67249946a
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
applications/data_manager/Dockerfileinfrastructure/notifications.pyinfrastructure/pyproject.toml
…ckerfile build vars - infrastructure/training.py: import region from config and use it for awslogs-region instead of hardcoded "us-east-1", consistent with all other ECS task definitions in compute.py - applications/data_manager/Dockerfile: add comment explaining CARGO_BUILD_JOBS=1 (OOM prevention on constrained builder nodes) and CARGO_PROFILE_RELEASE_OPT_LEVEL=1 (cache consistency trade-off) Co-Authored-By: Claude <noreply@anthropic.com>
Overview
Changes
Context
Some broken stuff.
Summary by CodeRabbit
Chores
Refactor
New Features