Conversation
|
Too many files changed for review. ( |
📝 WalkthroughWalkthroughRenames services (datamanager → data_manager, portfoliomanager → portfolio_manager, equitypricemodel → ensemble_manager), adds a Tide model project, and replaces a monolithic Pulumi infra entrypoint with modular infra modules (compute, config, iam, networking, storage, secrets, notifications); updates Dockerfiles, CI/workflows, docker-compose, and many test import paths. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR reorganizes the repository structure and renames services/packages to improve readability, while extracting the TiDE training workflow into a dedicated models/tide Python package and renaming the equity price model service to ensemble_manager. It also refactors Pulumi infrastructure into multiple modules and updates automation/scripts to match the new naming.
Changes:
- Introduces
models/tideas a standalone Python package with workflow/run/deploy code and its own trainer Docker image. - Renames/realigns application packages (e.g.,
portfolio_manager,ensemble_manager,data_manager) and updates related tests, Dockerfiles, and workspace configuration. - Refactors Pulumi
infrastructure/into modular files (networking/storage/iam/secrets/notifications/etc.) and updatesmaskfile.md+ GitHub workflow automation.
Reviewed changes
Copilot reviewed 70 out of 106 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/pyproject.toml | Removes Prefect dependency from tools package. |
| tools/Dockerfile | Updates build context and PYTHONPATH; still runs Prefect worker. |
| pyproject.toml | Updates uv workspace members; adds models/tide and expands pytest testpaths. |
| models/tide/tests/test_workflow.py | Updates imports/patch targets to tide.workflow. |
| models/tide/tests/test_trainer.py | Updates imports to tide.trainer. |
| models/tide/tests/test_tide_model.py | Updates imports to tide.tide_model. |
| models/tide/tests/test_tide_data.py | Updates imports to tide.tide_data. |
| models/tide/tests/test_tasks.py | Updates imports/patch targets to tide.tasks. |
| models/tide/tests/test_run.py | Updates imports/patch targets to tide.run; updates expected artifact naming. |
| models/tide/tests/test_notifications.py | Updates imports to tide.notifications; renames env vars to FUND_*. |
| models/tide/tests/test_deploy.py | Updates imports/patch targets to tide.deploy. |
| models/tide/tests/conftest.py | Adds shared test fixture for generating raw Polars data. |
| models/tide/tests/init.py | Package marker for tests. |
| models/tide/src/tide/workflow.py | New Prefect flow/tasks for sync/prepare/train pipeline; updates env var names + artifact path layout. |
| models/tide/src/tide/trainer.py | Moves trainer into tide package; removes old CLI __main__ runner. |
| models/tide/src/tide/tide_model.py | Adds TiDE model implementation (tinygrad) including training/validation utilities. |
| models/tide/src/tide/tasks.py | Moves training-data preparation utilities into tide.tasks; removes old CLI __main__. |
| models/tide/src/tide/run.py | Adds CLI entry (python -m tide.run) to execute the training pipeline. |
| models/tide/src/tide/notifications.py | Updates notification env vars to FUND_TRAINING_NOTIFICATION_*. |
| models/tide/src/tide/deploy.py | Adds deployment helper using Prefect EntrypointType; updates env var names. |
| models/tide/src/tide/init.py | Package marker for tide. |
| models/tide/pyproject.toml | Declares tide package and dependencies (prefect/tinygrad/etc.) within the workspace. |
| models/tide/init.py | Package marker at model root. |
| models/tide/Dockerfile | Adds CUDA-based trainer image that runs python -m tide.workflow. |
| models/init.py | Package marker for models namespace. |
| maskfile.md | Updates image build logic, Pulumi imports, ECS service naming, and model train/deploy commands. |
| libraries/python/tests/test_infrastructure_storage.py | Adds tests asserting storage resources exist in infrastructure/storage.py. |
| libraries/python/tests/test_infrastructure_networking.py | Adds tests asserting NAT alarm configuration in infrastructure/networking.py. |
| libraries/python/tests/test_infrastructure_iam.py | Adds tests asserting IAM policy constraints in infrastructure/iam.py. |
| libraries/python/tests/test_infrastructure_configuration.py | Removes monolithic infra entrypoint tests targeting old infrastructure/__main__.py layout. |
| libraries/python/tests/test_infrastructure_config.py | Adds tests targeting new infrastructure/config.py. |
| libraries/python/tests/test_equity_details_schema.py | Updates import path to internal.equity_details_schema. |
| libraries/python/src/internal/equity_details_schema.py | Adds shared Pandera schema for equity details. |
| libraries/python/src/internal/combine_data.py | Renames combine function and removes script-style __main__. |
| libraries/python/pyproject.toml | Adds structlog dependency to shared internal library package. |
| infrastructure/storage.py | New storage/ECR definitions; migrates S3 bucket resources to BucketV2; creates ECR repos + image URIs. |
| infrastructure/secrets.py | New Secrets Manager definitions and versions using serialized secret objects. |
| infrastructure/parameters.py | Renames SSM parameter paths (e.g., portfolio-manager, ensemble-manager). |
| infrastructure/notifications.py | Adds SNS topic subscriptions and AWS budget notifications. |
| infrastructure/networking.py | New VPC/subnets/NAT/endpoints/security groups + NAT traffic alarm. |
| infrastructure/iam.py | New IAM roles/policies (GitHub Actions + ECS execution/task roles) + SES + SSM params for training notifications. |
| infrastructure/docker-compose.yaml | Renames local services (data-manager/portfolio-manager/ensemble-manager) and updates env wiring. |
| infrastructure/config.py | New Pulumi config loader with validations; centralizes constants and secret config helpers. |
| applications/portfolio_manager/tests/test_statistical_arbitrage.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_risk_management.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_report.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_regime.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_portfolio_server.py | Updates imports/patch targets to portfolio_manager.server.*. |
| applications/portfolio_manager/tests/test_portfolio_schema.py | Updates imports to portfolio_manager.portfolio_schema. |
| applications/portfolio_manager/tests/test_data_client.py | Updates imports/patch targets to portfolio_manager.data_client.*. |
| applications/portfolio_manager/tests/test_consolidation.py | Updates imports to portfolio_manager.consolidation. |
| applications/portfolio_manager/tests/test_beta.py | Updates imports to portfolio_manager.beta. |
| applications/portfolio_manager/tests/test_alpaca_client.py | Updates imports/patch targets to portfolio_manager.alpaca_client.*. |
| applications/portfolio_manager/src/portfolio_manager/statistical_arbitrage.py | Adds pair selection logic using correlations/z-scores + ranking/greedy selection. |
| applications/portfolio_manager/src/portfolio_manager/server.py | Updates env var names and switches predictions source to ensemble manager endpoint. |
| applications/portfolio_manager/src/portfolio_manager/risk_management.py | Adds volatility-parity sizing and beta-neutral optimization for pair weights. |
| applications/portfolio_manager/src/portfolio_manager/report.py | Adds formatted text reports for regime/beta/consolidation/pairs/portfolio. |
| applications/portfolio_manager/src/portfolio_manager/regime.py | Adds basic regime classifier (mean reversion vs trending) based on SPY stats. |
| applications/portfolio_manager/src/portfolio_manager/portfolio_schema.py | Adds Pandera schemas and validation helpers for pairs/portfolio outputs. |
| applications/portfolio_manager/src/portfolio_manager/exceptions.py | Adds domain exception types. |
| applications/portfolio_manager/src/portfolio_manager/enums.py | Adds enums for position/trade actions/sides. |
| applications/portfolio_manager/src/portfolio_manager/data_client.py | Adds HTTP client helpers to fetch data/equity details/SPY series. |
| applications/portfolio_manager/src/portfolio_manager/consolidation.py | Adds ensemble signal blending + realized volatility computation. |
| applications/portfolio_manager/src/portfolio_manager/beta.py | Adds market beta computation against SPY and portfolio beta calculation. |
| applications/portfolio_manager/src/portfolio_manager/alpaca_client.py | Adds Alpaca trading client wrapper + error mapping + rate-limit sleeps. |
| applications/portfolio_manager/pyproject.toml | Renames package to portfolio_manager. |
| applications/portfolio_manager/Dockerfile | Updates build paths, PYTHONPATH, and entrypoint package/module names. |
| applications/equitypricemodel/Dockerfile | Removes old equity price model Dockerfile (replaced by ensemble manager + tide trainer). |
| applications/ensemble_manager/tests/test_server.py | Updates imports/patch targets to ensemble_manager.server. |
| applications/ensemble_manager/tests/test_preprocess.py | Updates imports to ensemble_manager.preprocess. |
| applications/ensemble_manager/tests/test_predictions_schema.py | Updates imports to ensemble_manager.predictions_schema. |
| applications/ensemble_manager/tests/conftest.py | Adds shared Polars raw-data fixture for tests. |
| applications/ensemble_manager/src/ensemble_manager/server.py | Switches to shared internal schema + tide model/data; updates env var names; defines SSM parameter constant. |
| applications/ensemble_manager/src/ensemble_manager/preprocess.py | Adds equity bar filtering by average price/volume thresholds. |
| applications/ensemble_manager/src/ensemble_manager/predictions_schema.py | Adds Pandera schema and checks for predictions outputs. |
| applications/ensemble_manager/pyproject.toml | Renames package to ensemble_manager; depends on tide; removes tinygrad/numpy direct deps. |
| applications/ensemble_manager/Dockerfile | Adds Docker build for ensemble manager service including models/tide. |
| applications/data_manager/tests/test_storage.rs | Updates crate path to data_manager. |
| applications/data_manager/tests/test_state_and_health.rs | Updates crate path to data_manager. |
| applications/data_manager/tests/test_handlers.rs | Updates crate path to data_manager. |
| applications/data_manager/tests/test_errors.rs | Updates crate path to data_manager. |
| applications/data_manager/tests/test_data.rs | Updates crate path to data_manager. |
| applications/data_manager/tests/common/mod.rs | Adds shared LocalStack + server harness utilities for integration tests. |
| applications/data_manager/src/storage.rs | Implements S3 read/write and DuckDB S3 querying utilities with sanitization helpers. |
| applications/data_manager/src/state.rs | Adds env-based state initialization (AWS clients + Massive API secrets). |
| applications/data_manager/src/startup.rs | Updates env var naming; updates service log names. |
| applications/data_manager/src/router.rs | Adds Axum router module for all endpoints + tracing/Sentry layers. |
| applications/data_manager/src/predictions.rs | Adds endpoints for saving/querying predictions (URL-encoded JSON query param). |
| applications/data_manager/src/portfolios.rs | Adds endpoints for saving/querying portfolios and serializing to JSON. |
| applications/data_manager/src/main.rs | Updates crate name imports and env var naming in tests. |
| applications/data_manager/src/lib.rs | Exposes module structure for the crate. |
| applications/data_manager/src/health.rs | Adds health endpoint implementation. |
| applications/data_manager/src/errors.rs | Adds unified error enum for DuckDB/credentials/Polars/etc. |
| applications/data_manager/src/equity_details.rs | Adds GET equity details CSV and stub sync endpoint (501). |
| applications/data_manager/src/equity_bars.rs | Adds sync/query endpoints for equity bars, including Massive API integration. |
| applications/data_manager/src/data.rs | Adds DataFrame builders and normalization for bars/predictions/portfolios/details. |
| applications/data_manager/bacon.toml | Adds Rust dev tooling config for bacon. |
| applications/data_manager/Dockerfile | Updates build paths and binary naming to data_manager. |
| applications/data_manager/Cargo.toml | Renames crate/lib/bin from datamanager to data_manager. |
| Cargo.toml | Updates Rust workspace member path to applications/data_manager. |
| Cargo.lock | Updates crate name entry to data_manager. |
| .github/workflows/launch_infrastructure.yaml | Updates service matrix to new app names/paths and continues to call mask build/push. |
Comments suppressed due to low confidence (2)
tools/pyproject.toml:11
tools/Dockerfilerunsprefect worker startviauv run --package tools ..., but thetoolspackage no longer declares Prefect as a dependency. This will make the training worker image fail at runtime because theprefectCLI won't be installed. Addprefect>=3,<4back totools/pyproject.toml, or update the Dockerfile/entrypoint to run Prefect from a package that depends on it.
applications/ensemble_manager/src/ensemble_manager/server.py:64MODEL_VERSION_SSM_PARAMETERis set to/fund/ensemble_manager/model_version, but the infrastructure SSM parameter is created at/fund/ensemble-manager/model_version(dash, not underscore). With the current value, the service will never find the parameter and will always fall back tolatestdiscovery. Update this constant to match the actual parameter name.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
applications/portfolio_manager/Dockerfile (1)
9-31:⚠️ Potential issue | 🟠 MajorDrop root before starting the Python server.
The final stage never switches away from root, so
uvicornand the application keep full container privileges at runtime. There isn't anything afterCOPY --from=builder /app /appthat still needs root.🔒 Proposed hardening
COPY --from=builder /app /app +RUN useradd --system --uid 10001 --create-home appuser && \ + chown -R appuser:appuser /app +USER appuser EXPOSE 8080 ENTRYPOINT ["uv", "run", "--package", "portfolio_manager", "uvicorn", "portfolio_manager.server:application", "--host", "0.0.0.0", "--port", "8080", "--app-dir", "applications/portfolio_manager/src"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/portfolio_manager/Dockerfile` around lines 9 - 31, The final stage leaves the container running as root; after COPY --from=builder /app /app create a non-root user (e.g., appuser), set appropriate ownership on /app (chown /app and any runtime dirs), and switch to that user with USER before the ENTRYPOINT so uv and the portfolio_manager app run unprivileged; update ENV/PATH only if needed and ensure no subsequent steps require root privileges after the switch.applications/data_manager/Dockerfile (1)
34-84:⚠️ Potential issue | 🟠 MajorThe path rename is fine, but the runtime image still runs as root.
The final stage never sets
USER, sodata_managerkeeps full container privileges at runtime even though it only binds to port 8080. That leaves avoidable blast radius if the process or one of its native dependencies is compromised.🔒 Proposed hardening
COPY --from=builder /tmp/data_manager /usr/local/bin/data_manager +RUN useradd --system --uid 10001 --create-home appuser && \ + chown appuser:appuser /usr/local/bin/data_manager /app +USER appuser EXPOSE 8080 ENTRYPOINT ["/usr/local/bin/data_manager"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/data_manager/Dockerfile` around lines 34 - 84, The final Dockerfile stage leaves the process running as root; create a non-root user and group (e.g. data_manager user), chown the installed binary and any required runtime dirs to that UID/GID, and add a USER instruction before ENTRYPOINT so /usr/local/bin/data_manager runs unprivileged; update any file permissions (binary and /app if necessary) and ensure the chosen user has no sudo/root privileges and a locked shell.models/tide/src/tide/notifications.py (1)
19-23:⚠️ Potential issue | 🟠 MajorAvoid logging raw sender/recipient emails.
This logs email addresses directly in warning/exception paths, which is a privacy/compliance risk. Log booleans/counts instead of PII.
🔧 Suggested fix
if not sender_email or not recipient_emails_raw: logger.warning( "Notification emails not configured, skipping", - sender_email=sender_email, - recipient_emails=recipient_emails_raw, + has_sender_email=bool(sender_email), + has_recipient_emails=bool(recipient_emails_raw), ) return @@ except Exception: logger.exception( "Failed to send training notification", - sender_email=sender_email, - recipients=recipient_emails, + has_sender_email=bool(sender_email), + recipient_count=len(recipient_emails), state=state_name, )Also applies to: 81-85
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@models/tide/src/tide/notifications.py` around lines 19 - 23, The warning currently logs raw PII (sender_email and recipient_emails_raw); change the logger.warning call(s) (the one shown and the similar call around lines 81-85) to avoid printing email addresses and instead log non-PII indicators such as whether sender_email is set (e.g., bool(sender_email)) and the recipient count (e.g., len(recipient_emails_raw)) or a masked placeholder; update the logger call signatures to pass these booleans/counts (and a short descriptive message) rather than the raw sender_email/recipient_emails_raw values.models/tide/tests/test_notifications.py (1)
41-50:⚠️ Potential issue | 🟡 MinorClear
os.environin these notification tests.
patch.dictkeeps the rest of the process environment by default. That can mask a broken env-var rename if the runner already has notification variables set. Passclear=Trueso each case exercises only theFUND_TRAINING_NOTIFICATION_*keys it declares.🧪 Representative fix
patch.dict( "os.environ", { "FUND_TRAINING_NOTIFICATION_SENDER_EMAIL": "sender@example.com", "FUND_TRAINING_NOTIFICATION_RECIPIENT_EMAILS": "recipient@example.com", }, + clear=True, ),Apply the same change to the other
patch.dict("os.environ", ...)blocks in this file.Also applies to: 76-85, 104-115, 133-142, 153-162, 176-186
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@models/tide/tests/test_notifications.py` around lines 41 - 50, The tests in test_notifications.py use patch.dict("os.environ", {...}) without clearing the rest of the process env which can hide broken env-var renames; update each patch.dict(...) call that sets FUND_TRAINING_NOTIFICATION_SENDER_EMAIL and FUND_TRAINING_NOTIFICATION_RECIPIENT_EMAILS (and the other patches at the locations that set FUND_TRAINING_NOTIFICATION_* keys) to pass clear=True so only the supplied keys exist during the test — change patch.dict("os.environ", {...}) to patch.dict("os.environ", {...}, clear=True) for every such block in the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/ensemble_manager/Dockerfile`:
- Around line 1-33: The container currently runs as root; add an unprivileged
runtime user and switch to it before starting the app: in the Dockerfile (server
stage) create a non-root user (e.g., "app" or "uvuser") and group, chown /app
and any runtime dirs (referenced by WORKDIR /app) to that user, set HOME if
needed, and add a USER instruction so the ENTRYPOINT (uv run ... defined in
ENTRYPOINT) runs unprivileged; ensure the created user has no shell or sudo and
that file ownership permits reading/writing by that user.
In `@applications/ensemble_manager/src/ensemble_manager/server.py`:
- Around line 62-63: The default DATAMANAGER_BASE_URL environment fallback is
using the wrong hostname; update the default string used by DATAMANAGER_BASE_URL
(and the related env var name FUND_DATAMANAGER_BASE_URL) from
"http://datamanager:8080" to "http://data-manager:8080" so it matches the
docker-compose service name (data-manager) used by the infrastructure; locate
the DATAMANAGER_BASE_URL constant in ensemble_manager/server.py and change only
the default URL string.
In `@infrastructure/compute.py`:
- Around line 1-2: The current import "from secrets import data_manager_secret,
portfolio_manager_secret, shared_secret" shadows Python's stdlib secrets module;
rename the local module (e.g., to secrets_manager or aws_secrets) and update
this import to "from secrets_manager import data_manager_secret,
portfolio_manager_secret, shared_secret" (or the chosen name) and adjust any
other references to that module across the repo so stdlib secrets remains
available for functions like secrets.token_hex().
- Around line 127-128: The hardcoded acm_certificate_arn = None should be
replaced with a configurable value read from your config module; update
infrastructure/compute.py to import and use the config entry (e.g.,
acm_certificate_arn = config.get("acm_certificate_arn") or
config.acm_certificate_arn) instead of the literal None and ensure the config.py
exposes an optional acm_certificate_arn setting with a default of None so HTTPS
can be toggled without code changes; reference the acm_certificate_arn symbol in
compute.py and add the new optional acm_certificate_arn key in config.py with
default None.
In `@infrastructure/config.py`:
- Around line 72-75: The config fetch uses stack_config.require_object(...) to
read budgetAlertEmailAddresses which loses Pulumi secret tracking; change the
call that populates budget_alert_email_addresses to use
stack_config.require_secret_object(...) so the returned value remains a secret
Output and is redacted in state. Locate the assignment to
budget_alert_email_addresses and replace require_object with
require_secret_object while keeping the existing cast and variable name.
In `@infrastructure/docker-compose.yaml`:
- Around line 40-41: The docker-compose service "ensemble-manager" is using the
wrong environment variable name; update the image reference to use the renamed
ARN variable AWS_ECR_ENSEMBLE_MANAGER_SERVER_IMAGE_ARN instead of
AWS_ECR_EQUITY_PRICE_MODEL_SERVER_IMAGE_ARN so it follows the established
AWS_ECR_{SERVICE_NAME}_SERVER_IMAGE_ARN pattern; locate the "ensemble-manager"
service block and replace the image value accordingly.
In `@infrastructure/iam.py`:
- Around line 2-9: The local module name "secrets" shadows Python's stdlib
module; update imports used in this file (data_manager_secret,
data_manager_secret_name, portfolio_manager_secret,
portfolio_manager_secret_name, shared_secret, shared_secret_name) to avoid the
conflict — either rename the local module (e.g., infrastructure_secrets.py) and
import its symbols, or switch to an explicit relative import (use the
package-local module via a relative import or alias the local module) so stdlib
"secrets" is not shadowed and the references to those symbols
(data_manager_secret, portfolio_manager_secret, shared_secret, etc.) still
resolve correctly.
- Around line 92-103: The IAM policy statement with "Sid":
"ManageEC2ECSELBBudgetsAndServiceDiscovery" currently uses broad service
wildcards (ec2:*, ecs:*, elasticloadbalancing:*, budgets:*, servicediscovery:*)
and Resource:"*", so replace the wildcards with a minimal set of required
actions (enumerate create/update/delete and necessary describe/list actions) for
each service (e.g., explicit ElasticLoadBalancing actions instead of
elasticloadbalancing:*) and where possible constrain "Resource" to specific ARN
patterns (load balancer, target group, ECS cluster/service ARNs, budget ARNs,
servicediscovery namespaces) while retaining global/wildcard only for actions
that truly require it; update the policy block identified by the Sid
"ManageEC2ECSELBBudgetsAndServiceDiscovery" in infrastructure/iam.py accordingly
and verify against the AWS service authorization reference for supported
resource-level permissions.
In `@infrastructure/networking.py`:
- Around line 95-101: The NAT Gateway is created only as nat using
subnet_id=public_subnet_1.id which creates a single-AZ availability risk; either
create a second NAT Gateway (e.g., nat_ha) in public_subnet_2 and update route
tables for private subnets to use the AZ-local NAT, or explicitly document and
accept the single-NAT cost trade-off—modify the resource creation to add a
second aws.ec2.NatGateway (reference public_subnet_2.id and a new eip) and
ensure route table entries point private subnets to the appropriate NAT gateway
for HA.
In `@infrastructure/notifications.py`:
- Around line 1-50: The budget notifications are configured to send emails
directly via subscriber_email_addresses on the aws.budgets.Budget resource
instead of using the infrastructure_alerts_topic SNS topic; update the Budget
resource to use subscriber_sns_topic_arns=[infrastructure_alerts_topic.arn] and
add an SNS TopicPolicy on infrastructure_alerts_topic that grants Principal
"budgets.amazonaws.com" permission to Publish so AWS Budgets can post to the
topic; alternatively remove the infrastructure_alerts_topic and its
TopicSubscription loop if you prefer direct-email notifications instead of
SNS-based delivery.
In `@infrastructure/secrets.py`:
- Around line 34-68: Update the secret config key names passed into
serialize_secret_config_object to match the new service naming convention:
replace "datamanagerSecretValue" with a consistent key like
"dataManagerSecretValue" or "data_manager_secret_value" in the
aws.secretsmanager.SecretVersion call that constructs
"data_manager_secret_version", and replace "portfoliomanagerSecretValue" with
"portfolioManagerSecretValue" or "portfolio_manager_secret_value" in the
aws.secretsmanager.SecretVersion call for "portfolio_manager_secret_version";
keep the "sharedSecretValue" key decision consistent as well and then update the
corresponding Pulumi stack configuration entries so the code in
serialize_secret_config_object and any consumers read the new keys.
- Around line 13-32: The three Secret resources data_manager_secret,
portfolio_manager_secret, and shared_secret currently set
recovery_window_in_days=0 which prevents recovery after deletion; update
secrets.py to make recovery_window_in_days a configurable non-zero value for
production (e.g., 7–30 days) by introducing a variable (e.g.,
recovery_window_days or secrets_recovery_window) read from configuration or
environment and use it for these aws.secretsmanager.Secret creations so
production stacks get a safe recovery window while allowing 0 for local/dev via
config.
In `@infrastructure/storage.py`:
- Around line 97-161: Add ECR lifecycle policies to each repository to prevent
unbounded image accumulation; for each repository resource
(data_manager_repository, portfolio_manager_repository,
ensemble_manager_repository, tide_trainer_repository,
training_server_repository, training_worker_repository) create a corresponding
aws.ecr.LifecyclePolicy resource (e.g., data_manager_repository_lifecycle) that
references repository=<repository>.name and uses a JSON policy (via json.dumps)
to expire old images (for example keep last 10 images with rulePriority 1,
selection tagStatus "any", countType "imageCountMoreThan", countNumber 10, and
action type "expire"); repeat this pattern after each repository definition to
manage storage costs.
In `@libraries/python/tests/test_infrastructure_iam.py`:
- Around line 20-24: The assertions in tests/test_infrastructure_iam.py are
brittle because they check exact substrings in the rendered Terraform/JSON
(variable infrastructure_iam) which breaks on harmless formatting changes;
update the two assertions that reference "github_actions_infrastructure_policy"
and the managed_policy_arns check to use whitespace-tolerant regex matching
instead of direct substring checks (e.g. use re.search on infrastructure_iam
with a pattern that matches the quoted policy name and a separate pattern like
r"managed_policy_arns\s*=\s*\[.*github_actions_infrastructure_policy\.arn.*\]"
to allow arbitrary whitespace/formatting), and apply the same change to the
similar assertions at lines 30-32 so the tests validate intent not formatting.
In `@libraries/python/tests/test_infrastructure_storage.py`:
- Around line 7-22: The tests currently grep the raw file text via
load_infrastructure_storage/INFRASTRUCTURE_STORAGE_PATH which is brittle;
instead parse or import the module and assert on actual exported resource
definitions: update test_storage_contains_s3_bucket_encryption_resources and
test_storage_contains_s3_public_access_block_resources to load the module (or
ast.parse INFRASTRUCTURE_STORAGE_PATH.read_text()) and inspect the AST or the
module object for the presence of the resource definitions (e.g. names
"data_bucket_encryption", "model_artifacts_bucket_encryption",
"data_bucket_public_access_block", "model_artifacts_bucket_public_access_block")
rather than asserting on raw quoted strings — use ast to find assignments/dict
keys or importlib to get the exported dict/object and assert those keys exist.
In `@maskfile.md`:
- Around line 71-75: The build chooses dockerfile paths using
applications/${application_name}/Dockerfile which fails for hyphenated option
names (e.g., portfolio-manager) because directories are underscore-based
(portfolio_manager); update the logic that builds dockerfile to normalize
application_name (replace '-' with '_') before composing the path (use the
normalized var wherever dockerfile path is constructed, e.g., the dockerfile
variable assignment and any other uses that reference application_name).
In `@models/tide/Dockerfile`:
- Around line 17-43: The trainer stage in the Dockerfile runs as root; create an
unprivileged runtime user (e.g., "trainer" or "tideuser") inside the trainer
stage, give it a non-zero UID/GID, chown the application files (the /app tree
copied from the builder) and any runtime directories to that user, and add a
USER instruction to switch to that account before the ENTRYPOINT so the
container runs unprivileged; ensure the created user has a home or workable
WORKDIR and that ENTRYPOINT ("uv run --package tide python -m tide.workflow")
remains executable by that user.
- Around line 1-43: The Dockerfile copies a virtual environment built in the
builder stage into the trainer stage which is brittle because uv creates a .venv
with absolute paths; instead rebuild the Python environment in the final trainer
stage: after copying /bin/uv and the source (the COPY --from=builder /app /app
step) run uv sync --no-dev --package tide in the trainer stage (or switch the
trainer base to the identical python:3.12.10-slim image) so the runtime .venv is
created against the trainer's Python install; ensure this happens before
ENTRYPOINT uses uv to run tide.workflow.
---
Outside diff comments:
In `@applications/data_manager/Dockerfile`:
- Around line 34-84: The final Dockerfile stage leaves the process running as
root; create a non-root user and group (e.g. data_manager user), chown the
installed binary and any required runtime dirs to that UID/GID, and add a USER
instruction before ENTRYPOINT so /usr/local/bin/data_manager runs unprivileged;
update any file permissions (binary and /app if necessary) and ensure the chosen
user has no sudo/root privileges and a locked shell.
In `@applications/portfolio_manager/Dockerfile`:
- Around line 9-31: The final stage leaves the container running as root; after
COPY --from=builder /app /app create a non-root user (e.g., appuser), set
appropriate ownership on /app (chown /app and any runtime dirs), and switch to
that user with USER before the ENTRYPOINT so uv and the portfolio_manager app
run unprivileged; update ENV/PATH only if needed and ensure no subsequent steps
require root privileges after the switch.
In `@models/tide/src/tide/notifications.py`:
- Around line 19-23: The warning currently logs raw PII (sender_email and
recipient_emails_raw); change the logger.warning call(s) (the one shown and the
similar call around lines 81-85) to avoid printing email addresses and instead
log non-PII indicators such as whether sender_email is set (e.g.,
bool(sender_email)) and the recipient count (e.g., len(recipient_emails_raw)) or
a masked placeholder; update the logger call signatures to pass these
booleans/counts (and a short descriptive message) rather than the raw
sender_email/recipient_emails_raw values.
In `@models/tide/tests/test_notifications.py`:
- Around line 41-50: The tests in test_notifications.py use
patch.dict("os.environ", {...}) without clearing the rest of the process env
which can hide broken env-var renames; update each patch.dict(...) call that
sets FUND_TRAINING_NOTIFICATION_SENDER_EMAIL and
FUND_TRAINING_NOTIFICATION_RECIPIENT_EMAILS (and the other patches at the
locations that set FUND_TRAINING_NOTIFICATION_* keys) to pass clear=True so only
the supplied keys exist during the test — change patch.dict("os.environ", {...})
to patch.dict("os.environ", {...}, clear=True) for every such block in the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8a55260a-86c9-42fc-8aa0-499713b9722a
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.lockuv.lockis excluded by!**/*.lock
📒 Files selected for processing (104)
.github/workflows/launch_infrastructure.yamlCargo.tomlapplications/data_manager/Cargo.tomlapplications/data_manager/Dockerfileapplications/data_manager/bacon.tomlapplications/data_manager/src/data.rsapplications/data_manager/src/equity_bars.rsapplications/data_manager/src/equity_details.rsapplications/data_manager/src/errors.rsapplications/data_manager/src/health.rsapplications/data_manager/src/lib.rsapplications/data_manager/src/main.rsapplications/data_manager/src/portfolios.rsapplications/data_manager/src/predictions.rsapplications/data_manager/src/router.rsapplications/data_manager/src/startup.rsapplications/data_manager/src/state.rsapplications/data_manager/src/storage.rsapplications/data_manager/tests/common/mod.rsapplications/data_manager/tests/test_data.rsapplications/data_manager/tests/test_errors.rsapplications/data_manager/tests/test_handlers.rsapplications/data_manager/tests/test_state_and_health.rsapplications/data_manager/tests/test_storage.rsapplications/ensemble_manager/Dockerfileapplications/ensemble_manager/pyproject.tomlapplications/ensemble_manager/src/ensemble_manager/predictions_schema.pyapplications/ensemble_manager/src/ensemble_manager/preprocess.pyapplications/ensemble_manager/src/ensemble_manager/server.pyapplications/ensemble_manager/tests/conftest.pyapplications/ensemble_manager/tests/test_predictions_schema.pyapplications/ensemble_manager/tests/test_preprocess.pyapplications/ensemble_manager/tests/test_server.pyapplications/equitypricemodel/Dockerfileapplications/portfolio_manager/Dockerfileapplications/portfolio_manager/pyproject.tomlapplications/portfolio_manager/src/portfolio_manager/alpaca_client.pyapplications/portfolio_manager/src/portfolio_manager/beta.pyapplications/portfolio_manager/src/portfolio_manager/consolidation.pyapplications/portfolio_manager/src/portfolio_manager/data_client.pyapplications/portfolio_manager/src/portfolio_manager/enums.pyapplications/portfolio_manager/src/portfolio_manager/exceptions.pyapplications/portfolio_manager/src/portfolio_manager/portfolio_schema.pyapplications/portfolio_manager/src/portfolio_manager/regime.pyapplications/portfolio_manager/src/portfolio_manager/report.pyapplications/portfolio_manager/src/portfolio_manager/risk_management.pyapplications/portfolio_manager/src/portfolio_manager/server.pyapplications/portfolio_manager/src/portfolio_manager/statistical_arbitrage.pyapplications/portfolio_manager/tests/test_alpaca_client.pyapplications/portfolio_manager/tests/test_beta.pyapplications/portfolio_manager/tests/test_consolidation.pyapplications/portfolio_manager/tests/test_data_client.pyapplications/portfolio_manager/tests/test_portfolio_schema.pyapplications/portfolio_manager/tests/test_portfolio_server.pyapplications/portfolio_manager/tests/test_regime.pyapplications/portfolio_manager/tests/test_report.pyapplications/portfolio_manager/tests/test_risk_management.pyapplications/portfolio_manager/tests/test_statistical_arbitrage.pyinfrastructure/__main__.pyinfrastructure/compute.pyinfrastructure/config.pyinfrastructure/docker-compose.yamlinfrastructure/iam.pyinfrastructure/networking.pyinfrastructure/notifications.pyinfrastructure/parameters.pyinfrastructure/secrets.pyinfrastructure/storage.pylibraries/python/pyproject.tomllibraries/python/src/internal/combine_data.pylibraries/python/src/internal/equity_details_schema.pylibraries/python/tests/test_equity_details_schema.pylibraries/python/tests/test_infrastructure_config.pylibraries/python/tests/test_infrastructure_configuration.pylibraries/python/tests/test_infrastructure_iam.pylibraries/python/tests/test_infrastructure_networking.pylibraries/python/tests/test_infrastructure_storage.pymaskfile.mdmodels/__init__.pymodels/tide/Dockerfilemodels/tide/__init__.pymodels/tide/pyproject.tomlmodels/tide/src/tide/__init__.pymodels/tide/src/tide/deploy.pymodels/tide/src/tide/notifications.pymodels/tide/src/tide/run.pymodels/tide/src/tide/tasks.pymodels/tide/src/tide/tide_data.pymodels/tide/src/tide/tide_model.pymodels/tide/src/tide/trainer.pymodels/tide/src/tide/workflow.pymodels/tide/tests/__init__.pymodels/tide/tests/conftest.pymodels/tide/tests/test_deploy.pymodels/tide/tests/test_notifications.pymodels/tide/tests/test_run.pymodels/tide/tests/test_tasks.pymodels/tide/tests/test_tide_data.pymodels/tide/tests/test_tide_model.pymodels/tide/tests/test_trainer.pymodels/tide/tests/test_workflow.pypyproject.tomltools/Dockerfiletools/pyproject.toml
💤 Files with no reviewable changes (3)
- tools/pyproject.toml
- applications/equitypricemodel/Dockerfile
- libraries/python/tests/test_infrastructure_configuration.py
…t container users
- Fix default DATAMANAGER_BASE_URL fallback in ensemble_manager/server.py from
"http://datamanager:8080" to "http://data-manager:8080" to match the docker-compose
service name after the renaming in this PR
- Fix ECR repository names in infrastructure/storage.py to use underscore-prefixed
application names (fund/data_manager-server, fund/portfolio_manager-server,
fund/ensemble_manager-server) to match the CI matrix application_name values used
in the mask push command (fund/${application_name}-${stage_name})
- Fix ensemble-manager docker-compose service to reference AWS_ECR_ENSEMBLE_MANAGER_SERVER_IMAGE_ARN
instead of the stale AWS_ECR_EQUITY_PRICE_MODEL_SERVER_IMAGE_ARN
- Add non-root appuser (uid 10001) to applications/ensemble_manager/Dockerfile server stage
- Add non-root appuser (uid 10001) to models/tide/Dockerfile trainer stage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
models/tide/Dockerfile (1)
39-48:⚠️ Potential issue | 🟠 MajorRebuild
.venvin trainer instead of copying it from builder.Line 39 copies
/app(including the builder-created environment), and Line 48 runs it viauv runin a different base image. This is brittle and can break due to interpreter/path/ABI mismatch between stages. Recreate the environment in the trainer stage.Suggested fix
COPY --from=builder /app /app +RUN rm -rf /app/.venv && uv sync --no-dev --package tide ENV PYTHONPATH=/app/models/tide/src🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@models/tide/Dockerfile` around lines 39 - 48, The Dockerfile currently copies the entire /app from the builder (COPY --from=builder /app /app) which includes the builder-created virtualenv and can cause interpreter/ABI mismatches at runtime; instead, remove copying of the builder .venv and rebuild the virtual environment in the trainer stage: ensure the trainer stage installs dependencies and creates a fresh venv (or uses system site-packages) prior to setting ENV PYTHONPATH=/app/models/tide/src and before setting ENTRYPOINT ["uv","run","--package","tide","python","-m","tide.workflow"]; update any USER/chown steps (useradd appuser) to run after venv creation so the environment is owned by appuser.applications/ensemble_manager/Dockerfile (1)
36-38: 🧹 Nitpick | 🔵 TrivialAdd a Docker
HEALTHCHECKfor liveness/readiness.This is still missing, so supervisors cannot detect a hung-but-running process.
💡 Proposed patch
USER appuser + +HEALTHCHECK --interval=30s --timeout=3s --start-period=20s --retries=3 \ + CMD python -c "import socket; s = socket.create_connection(('127.0.0.1', 8080), 2); s.close()" || exit 1 + EXPOSE 8080🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/ensemble_manager/Dockerfile` around lines 36 - 38, Add a Docker HEALTHCHECK to the Dockerfile so supervisors can detect hung-but-running processes; add a HEALTHCHECK instruction after EXPOSE/ENTRYPOINT that periodically probes the running uvicorn app started by the ENTRYPOINT (ensemble_manager.server:application) — prefer an HTTP GET against a health endpoint (e.g., /health or /) on port 8080 and return success only on HTTP 200, with sensible interval, timeout and retries; if no HTTP health endpoint exists, use a TCP check against port 8080 instead. Ensure the HEALTHCHECK command is idempotent and exits with 0 on healthy and non-0 on unhealthy so orchestration can mark the container accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/ensemble_manager/src/ensemble_manager/server.py`:
- Line 65: The SSM parameter constant MODEL_VERSION_SSM_PARAMETER uses an
underscore but the infra defines a hyphen; update the constant value from
"/fund/ensemble_manager/model_version" to "/fund/ensemble-manager/model_version"
so the SSM lookup (the code that reads MODEL_VERSION_SSM_PARAMETER) will find
the correct parameter and avoid falling back to "latest".
In `@infrastructure/storage.py`:
- Around line 97-160: The ECR repositories (data_manager_repository,
portfolio_manager_repository, ensemble_manager_repository,
tide_trainer_repository, training_server_repository, training_worker_repository)
are configured with mutable tags which allows downstream exports to reference
unstable ":latest" semantics; change the exports to return only the repository
URI (not a full image:tag), stop exporting any deployable image URIs with
mutable tags, and ensure CI injects a concrete image tag or digest at deploy
time; additionally set image_tag_mutability="IMMUTABLE" on these
aws.ecr.Repository resources (and the similar repositories referenced later) so
pushed tags cannot be overwritten.
---
Duplicate comments:
In `@applications/ensemble_manager/Dockerfile`:
- Around line 36-38: Add a Docker HEALTHCHECK to the Dockerfile so supervisors
can detect hung-but-running processes; add a HEALTHCHECK instruction after
EXPOSE/ENTRYPOINT that periodically probes the running uvicorn app started by
the ENTRYPOINT (ensemble_manager.server:application) — prefer an HTTP GET
against a health endpoint (e.g., /health or /) on port 8080 and return success
only on HTTP 200, with sensible interval, timeout and retries; if no HTTP health
endpoint exists, use a TCP check against port 8080 instead. Ensure the
HEALTHCHECK command is idempotent and exits with 0 on healthy and non-0 on
unhealthy so orchestration can mark the container accordingly.
In `@models/tide/Dockerfile`:
- Around line 39-48: The Dockerfile currently copies the entire /app from the
builder (COPY --from=builder /app /app) which includes the builder-created
virtualenv and can cause interpreter/ABI mismatches at runtime; instead, remove
copying of the builder .venv and rebuild the virtual environment in the trainer
stage: ensure the trainer stage installs dependencies and creates a fresh venv
(or uses system site-packages) prior to setting ENV
PYTHONPATH=/app/models/tide/src and before setting ENTRYPOINT
["uv","run","--package","tide","python","-m","tide.workflow"]; update any
USER/chown steps (useradd appuser) to run after venv creation so the environment
is owned by appuser.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 89835f82-e8ef-41fe-9613-c4faac099273
📒 Files selected for processing (5)
applications/ensemble_manager/Dockerfileapplications/ensemble_manager/src/ensemble_manager/server.pyinfrastructure/docker-compose.yamlinfrastructure/storage.pymodels/tide/Dockerfile
There was a problem hiding this comment.
Pull request overview
Large-scale reorganization to improve naming consistency and separate concerns across infrastructure, applications, and model training (introducing the new models/tide package/workflow and renaming services/packages for readability).
Changes:
- Restructures Pulumi
infrastructure/into modular files and updates resource/service naming. - Introduces
models/tideas a first-class workspace package with its own Docker image, workflow, and tests. - Renames/realigns application packages (e.g.,
portfolio_manager,ensemble_manager,data_manager) and updates tests/imports accordingly.
Reviewed changes
Copilot reviewed 72 out of 108 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/pyproject.toml | Removes Prefect dependency from tools package. |
| tools/Dockerfile | Updates build context/PYTHONPATH for tooling image; still defines Prefect worker stage. |
| pyproject.toml | Updates uv workspace members; adds Tide tests path. |
| models/tide/tests/test_workflow.py | Updates workflow import/patch targets to tide.workflow. |
| models/tide/tests/test_trainer.py | Switches trainer import to tide.trainer. |
| models/tide/tests/test_tide_model.py | Switches model import to tide.tide_model. |
| models/tide/tests/test_tide_data.py | Switches data import to tide.tide_data. |
| models/tide/tests/test_tasks.py | Switches task import/patch targets to tide.tasks. |
| models/tide/tests/test_run.py | Switches run import/patch targets to tide.run; updates expected artifact naming. |
| models/tide/tests/test_notifications.py | Switches notifications import and env var names to FUND_*. |
| models/tide/tests/test_deploy.py | Switches deploy import/patch targets to tide.deploy. |
| models/tide/tests/conftest.py | Adds shared test fixture for generating raw data. |
| models/tide/tests/init.py | Marks Tide tests package. |
| models/tide/src/tide/workflow.py | Implements Prefect flow/tasks for Tide training pipeline; updates env var names and artifact paths. |
| models/tide/src/tide/trainer.py | Refactors trainer module; removes __main__ script behavior. |
| models/tide/src/tide/tide_model.py | Adds TiDE model implementation, training utilities, and serialization. |
| models/tide/src/tide/tasks.py | Refactors training-data prep tasks; removes __main__ script behavior. |
| models/tide/src/tide/run.py | Updates to call tide.workflow.training_pipeline; renames env vars. |
| models/tide/src/tide/notifications.py | Renames notification env vars to FUND_*. |
| models/tide/src/tide/deploy.py | Adds flow deployment helper using module-path entrypoint. |
| models/tide/src/tide/init.py | Marks Tide package. |
| models/tide/pyproject.toml | New Tide package definition and dependencies. |
| models/tide/init.py | Marks models/tide as a module. |
| models/tide/Dockerfile | Adds CUDA-based trainer image for Tide workflow. |
| models/init.py | Marks models as a Python package. |
| maskfile.md | Updates infra/image commands, imports, service naming, and model commands for Tide. |
| libraries/python/tests/test_infrastructure_storage.py | Adds tests asserting infra storage module contains key resources. |
| libraries/python/tests/test_infrastructure_networking.py | Adds tests asserting networking module contains NAT alarm configuration. |
| libraries/python/tests/test_infrastructure_iam.py | Adds tests asserting IAM module constraints/policy wiring. |
| libraries/python/tests/test_infrastructure_configuration.py | Removes monolithic infra entrypoint tests (superseded by modular tests). |
| libraries/python/tests/test_infrastructure_config.py | Adds config-focused infra tests (Pulumi config + OIDC claim constants). |
| libraries/python/tests/test_equity_details_schema.py | Updates import path to internal.equity_details_schema. |
| libraries/python/src/internal/equity_details_schema.py | Introduces shared Pandera schema for equity details. |
| libraries/python/src/internal/combine_data.py | Renames combine function and removes CLI entrypoint. |
| libraries/python/pyproject.toml | Adds structlog dependency to shared Python library. |
| infrastructure/storage.py | Adds storage module (S3 + ECR repos + lifecycle policies). |
| infrastructure/secrets.py | Adds secrets module (Secrets Manager + versions). |
| infrastructure/parameters.py | Renames parameters and scopes paths by stack name. |
| infrastructure/notifications.py | Adds budget/SNS alerting resources. |
| infrastructure/networking.py | Adds VPC/subnets/SGs/endpoints + NAT alarm. |
| infrastructure/iam.py | Adds IAM/OIDC/roles/policies + SES identity/SSM params for notifications. |
| infrastructure/docker-compose.yaml | Renames compose services and updates base URL env vars. |
| infrastructure/config.py | Centralizes Pulumi config loading/validation and shared constants. |
| infrastructure/Pulumi.production.yaml | Updates encrypted config values (secret rotations/renames). |
| applications/portfolio_manager/tests/test_statistical_arbitrage.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_risk_management.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_report.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_regime.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_portfolio_server.py | Updates imports/patch targets to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_portfolio_schema.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_data_client.py | Updates imports/patch targets to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_consolidation.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_beta.py | Updates imports to portfolio_manager.*. |
| applications/portfolio_manager/tests/test_alpaca_client.py | Updates imports/patch targets to portfolio_manager.*. |
| applications/portfolio_manager/src/portfolio_manager/statistical_arbitrage.py | Adds pair selection logic for stat arb based on correlations/z-scores/signals. |
| applications/portfolio_manager/src/portfolio_manager/server.py | Renames env vars and updates predictions base URL to ensemble manager. |
| applications/portfolio_manager/src/portfolio_manager/risk_management.py | Adds vol-parity sizing + beta-neutral optimization. |
| applications/portfolio_manager/src/portfolio_manager/report.py | Adds formatted reporting for regime/beta/signals/pairs/portfolio sizing. |
| applications/portfolio_manager/src/portfolio_manager/regime.py | Adds regime classifier based on SPY volatility/autocorrelation. |
| applications/portfolio_manager/src/portfolio_manager/portfolio_schema.py | Adds Pandera schemas and custom checks for pairs/portfolio outputs. |
| applications/portfolio_manager/src/portfolio_manager/exceptions.py | Adds domain-specific exceptions. |
| applications/portfolio_manager/src/portfolio_manager/enums.py | Adds enums for sides/actions. |
| applications/portfolio_manager/src/portfolio_manager/data_client.py | Adds data-manager HTTP client helpers. |
| applications/portfolio_manager/src/portfolio_manager/consolidation.py | Adds prediction consolidation + realized volatility computation. |
| applications/portfolio_manager/src/portfolio_manager/beta.py | Adds market beta estimation and portfolio beta computation. |
| applications/portfolio_manager/src/portfolio_manager/alpaca_client.py | Adds Alpaca client wrapper with error mapping and rate limiting. |
| applications/portfolio_manager/pyproject.toml | Renames package to portfolio_manager. |
| applications/portfolio_manager/Dockerfile | Updates COPY paths, PYTHONPATH, and uv package entrypoint. |
| applications/equitypricemodel/Dockerfile | Removes legacy equity price model Dockerfile. |
| applications/ensemble_manager/tests/test_server.py | Updates imports/patch targets to ensemble_manager.*. |
| applications/ensemble_manager/tests/test_preprocess.py | Updates import to ensemble_manager.preprocess. |
| applications/ensemble_manager/tests/test_predictions_schema.py | Updates import to ensemble_manager.predictions_schema. |
| applications/ensemble_manager/tests/conftest.py | Adds shared test fixture for generating raw data. |
| applications/ensemble_manager/src/ensemble_manager/server.py | Refactors server to use shared schema + Tide model/data; updates env var naming and SSM path scheme. |
| applications/ensemble_manager/src/ensemble_manager/preprocess.py | Adds equity bar filtering helper. |
| applications/ensemble_manager/src/ensemble_manager/predictions_schema.py | Adds Pandera schema + validation checks for predictions. |
| applications/ensemble_manager/pyproject.toml | Renames package to ensemble_manager and depends on tide. |
| applications/ensemble_manager/Dockerfile | Adds Docker build for ensemble manager service. |
| applications/data_manager/tests/test_storage.rs | Updates crate name imports to data_manager. |
| applications/data_manager/tests/test_state_and_health.rs | Updates crate name imports to data_manager. |
| applications/data_manager/tests/test_handlers.rs | Updates crate name imports to data_manager. |
| applications/data_manager/tests/test_errors.rs | Updates crate name imports to data_manager. |
| applications/data_manager/tests/test_data.rs | Updates crate name imports to data_manager. |
| applications/data_manager/tests/common/mod.rs | Adds shared LocalStack/DuckDB env + server spawning helpers for tests. |
| applications/data_manager/src/storage.rs | Adds/extends S3 IO + DuckDB query layer for bars/predictions/portfolios/details. |
| applications/data_manager/src/state.rs | Adds state construction from env (AWS + Massive + S3). |
| applications/data_manager/src/startup.rs | Updates env var names; updates logging identifiers. |
| applications/data_manager/src/router.rs | Adds axum router wiring for endpoints. |
| applications/data_manager/src/predictions.rs | Adds predictions save/query handlers. |
| applications/data_manager/src/portfolios.rs | Adds portfolios save/get handlers. |
| applications/data_manager/src/main.rs | Updates crate import name and env var names in tests. |
| applications/data_manager/src/lib.rs | Exposes modules for the Rust crate. |
| applications/data_manager/src/health.rs | Adds health endpoint. |
| applications/data_manager/src/errors.rs | Adds unified error type. |
| applications/data_manager/src/equity_details.rs | Adds equity details get/sync handlers (sync stub). |
| applications/data_manager/src/equity_bars.rs | Adds equity bars sync/query handlers. |
| applications/data_manager/src/data.rs | Adds DataFrame construction helpers for all domain types. |
| applications/data_manager/bacon.toml | Adds local Rust dev task config. |
| applications/data_manager/Dockerfile | Updates paths/bin name to data_manager. |
| applications/data_manager/Cargo.toml | Renames crate/bin to data_manager. |
| Cargo.toml | Updates workspace member path. |
| Cargo.lock | Updates crate name entry. |
| CLAUDE.md | Documents new models/ folder purpose. |
| .github/workflows/launch_infrastructure.yaml | Updates service names/paths to new package directories. |
Comments suppressed due to low confidence (2)
tools/pyproject.toml:11
toolsno longer declaresprefectas a dependency, but the tooling image still starts a Prefect worker (prefect worker startin tools/Dockerfile). As-is, the container will fail at runtime because theprefectCLI/module won’t be installed. Re-addprefect>=3,<4totoolsdependencies (or switch the worker image to install Prefect via another package and ensure it’s present).
tools/Dockerfile:19- The Prefect worker image doesn’t include the
models/tidepackage, but deployments are registered withEntrypointType.MODULE_PATHpointing attide.workflow(see models/tide/src/tide/deploy.py). The worker won’t be able to import/execute the flow code withouttideinstalled. Copymodels/tide/into this image and ensureuv syncinstalls thetidepackage (which will also pull in Prefect).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/ensemble_manager/src/ensemble_manager/server.py`:
- Around line 65-66: The code currently defaults _environment to "development"
which causes MODEL_VERSION_SSM_PARAMETER to point to the wrong SSM namespace;
change behavior so FUND_ENVIRONMENT is required instead of silently defaulting:
replace the defaulting logic for _environment (the variable named _environment
and the derived MODEL_VERSION_SSM_PARAMETER) with a check that reads
FUND_ENVIRONMENT and raises a clear error (or logs and aborts) if it is unset,
so SSM lookups fail fast and never silently use the wrong namespace.
In `@infrastructure/storage.py`:
- Around line 117-166: Repository names use a mixed underscore/hyphen convention
(e.g., resource data_manager_repository uses name "fund/data_manager-server"
while other repos use hyphens like "fund/tide-trainer"); standardize naming by
updating the name fields on the ECR resources (data_manager_repository,
portfolio_manager_repository, ensemble_manager_repository and any other
aws.ecr.Repository instances) to use a single convention (preferably hyphen-only
or underscore-only) and ensure lifecycle resources reference the updated
repository.name values consistently.
- Around line 185-200: The training_server ECR resources are created but not
imported into the stack entrypoint or exported as outputs; import the symbol
training_server_repository in the module that aggregates repositories (same
place other repos are imported) and add a stack export for
training_server_repository (the repository name or id) and
training_server_image_uri (the full image URI) alongside the existing exports
for data_manager, portfolio_manager, ensemble_manager, tide_trainer, and
training_worker so the resource is accessible to other stacks and consumers;
locate where other repositories are imported and exported and mirror their
pattern using the names training_server_repository and
training_server_image_uri.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 713235a0-88ef-48a6-b959-f7c61e4be03d
📒 Files selected for processing (8)
CLAUDE.mdapplications/ensemble_manager/src/ensemble_manager/server.pyinfrastructure/Pulumi.production.yamlinfrastructure/iam.pyinfrastructure/parameters.pyinfrastructure/storage.pylibraries/python/tests/test_infrastructure_storage.pymaskfile.md
chrisaddy
left a comment
There was a problem hiding this comment.
Overall this is a clean reorganization. A few items worth addressing below.
Overview
Changes
infrastructure/into separate filesapplications/sub-packages for readabilityContext
There were some redundancies in model training workflows and I had a backlog of tweaks I wanted to make so this is all of that. Open to any ideas or adjustments you want.
NOTE: I'll need to do an
upon the infrastructure to update everything and then I'll either update this pull request or include the removal of the "alias" attributes from theinfrastructure/resources in a subsequent pull request. Additionally, I'll need to rebuild and push images to ECR once theuphas been run.Summary by CodeRabbit
New Features
Infrastructure Updates