Build initial risk management resources by forstmeier · Pull Request #625 · oscmcompany/fund

forstmeier · 2025-09-07T01:53:04Z

Overview

Changes

build initial risk management functionality
add risk management unit tests
build pandera schemas for predictions and portfolios
general cleanup

Comments

This still likely needs some tweaking but the bones are there.

Summary by CodeRabbit

New Features
- Portfolio risk tools: returns/volatility, position actions, z‑score ranking, and optimal portfolio construction.
- Alpaca account integration returning normalized positions.
- Model workflow: daily-return target and prediction postprocessing.
Refactor
- Centralized/modernized validation schemas; timestamps as floats.
- Dataset now uses explicit load-and-preprocess API.
- Removed legacy market-data and S3 access modules.
Documentation
- Updated CLI/prerequisites guidance and deployment prompts.
Tests
- Added tests for risk management, portfolio, prediction, and dataset flows.
Bug Fixes
- Documentation changes include a regression in endpoint test placeholders and URLs.

coderabbitai · 2025-09-07T01:53:12Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Removes datamanager Alpaca and S3 clients and their tests; adds Alpaca client and risk-management logic to portfoliomanager with tests; introduces portfolio and prediction schemas and tests; revises equity_bar and company_information schemas and tests; refactors TFTDataset API and tests; updates .flox manifest, pyproject deps, maskfile, and minor infra/lint tweaks.

Changes

Cohort / File(s)	Summary
Env manifest `.flox/env/manifest.toml`	Add `mask.pkg-path = "mask"` under `[install]` and compact `systems` into a single-line array.
Datamanager removals `applications/datamanager/src/datamanager/alpaca_client.py`, `applications/datamanager/src/datamanager/s3_client.py`, `applications/datamanager/tests/*`	Delete AlpacaClient and S3Client modules and their unit tests.
Portfoliomanager: deps & client `applications/portfoliomanager/pyproject.toml`, `applications/portfoliomanager/src/portfoliomanager/alpaca_client.py`	Add workspace/internal, pandera[polars,pandas], alpaca-py deps; add new AlpacaAccount and AlpacaClient with Pandera-polars validation and rate limiting.
Portfoliomanager: risk management `applications/portfoliomanager/src/portfoliomanager/risk_management.py`, `applications/portfoliomanager/tests/test_risk_management.py`	Add risk-management utilities (actions, returns/volatility, performance, ranking, portfolio construction) and comprehensive tests.
Schemas — company info & equity bars `libraries/python/src/internal/company_information.py`, `libraries/python/src/internal/equity_bar.py`, `libraries/python/tests/test_company_information.py`, `libraries/python/tests/test_equity_bar.py`	Move to schema-level checks, add defaults for sector/industry, change equity timestamp to Float64, simplify checks, add uniqueness on (ticker,timestamp); update tests accordingly.
New schemas — portfolio & prediction `libraries/python/src/internal/portfolio.py`, `libraries/python/src/internal/prediction.py`, `libraries/python/tests/test_portfolio.py`, `libraries/python/tests/test_prediction.py`	Add portfolio_schema with balance/count checks and prediction_schema with per-ticker date-count, same-dates, and quantile monotonicity checks; add tests.
TFT dataset & tests `libraries/python/src/internal/tft_dataset.py`, `libraries/python/tests/test_tft_dataset.py`, `libraries/python/src/internal/tft_model.py`	Refactor TFTDataset to no-arg constructor, add preprocess_and_set_data and postprocess_predictions, introduce daily_return feature/target, update schema and tests; minor model comment tweak.
Misc lint & infra `libraries/python/src/internal/mhsa_network.py`, `infrastructure/stack.yml`	Add noqa comment; add trailing newline to stack.yml.
Maskfile/docs `maskfile.md`	Reformat messages, change redirections, expand prerequisites guidance, update SSH config and test placeholders (some regressions introduced in endpoint placeholders).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor PM as PortfolioManager
  participant AC as AlpacaClient (new)
  participant AT as TradingClient (alpaca.trading)
  participant VS as position_schema

  PM->>AC: get_account()
  AC->>AT: get_account()
  AT-->>AC: TradeAccount
  AC->>AT: get_all_positions()
  AT-->>AC: [Position...]
  AC->>AC: map -> Polars DataFrame
  AC->>VS: validate(positions)
  VS-->>AC: validated positions
  AC-->>PM: AlpacaAccount(cash, positions)
  note right of AC: rate-limited via time.sleep

sequenceDiagram
  autonumber
  actor PM as PortfolioManager
  participant RM as risk_management
  participant ES as equity_bar_schema
  participant PS as prediction_schema
  participant PortS as portfolio_schema

  PM->>RM: add_equity_bars_returns_and_realized_volatility_columns(equity_bars)
  RM->>ES: validate(equity_bars)
  ES-->>RM: validated + computed returns/vol
  PM->>RM: add_predictions_zscore_ranked_columns(predictions)
  RM->>PS: validate(predictions)
  PS-->>RM: ranked predictions
  PM->>RM: create_optimal_portfolio(preds, positions, max_capital, now)
  RM->>PortS: validate(portfolio)
  PortS-->>PM: validated optimal portfolio

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

data manager port to rust #623 — related replacement/port of datamanager (Rust) touching the same subsystem as the removed datamanager modules.
Implement DataFrame validation with pandera #619 — related changes to pandera-based validation and datamanager client usage.
got 2 services talking to each other #622 — related deployment/manifest and microservice launch changes overlapping manifest and infra edits.

Suggested reviewers

chrisaddy

Poem

I thump my paw on polished logs,
Old clients hop away like frogs.
New schemas sprout in tidy rows,
Z-scores hum where data flows.
I nibble code and watch it grow — a balanced book, ready to go! 🐇📈

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The PR title "Build initial risk management resources" is concise, focused, and accurately reflects the primary change (adding risk management functionality, tests, and related schemas), making it clear to reviewers what the main intent of the changeset is.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch 09-06-build_initial_risk_management_resources

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ec02bd and e91c4e7.

⛔ Files ignored due to path filters (2)

.flox/env/manifest.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

.flox/env/manifest.toml (1 hunks)
libraries/python/src/internal/tft_model.py (1 hunks)

✅ Files skipped from review due to trivial changes (1)

libraries/python/src/internal/tft_model.py

🚧 Files skipped from review as they are similar to previous changes (1)

.flox/env/manifest.toml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run Python tests
GitHub Check: Run Python quality checks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

forstmeier · 2025-09-07T01:53:23Z

Build initial risk management resources #625 👈 (View in Graphite)
master

This stack of pull requests is managed by Graphite. Learn more about stacking.

graphite-app · 2025-09-07T01:55:15Z

Graphite Automations

"Assign author to pull request" took an action on this PR • (09/07/25)

1 assignee was added to this PR based on John Forstmeier's automation.

Copilot

Pull Request Overview

This PR builds initial risk management resources for PocketSizeFund by adding portfolio optimization functionality and refactoring data schemas. The changes enable the system to generate optimal portfolios based on predictions while managing position risks through PDT rules and performance thresholds.

Key changes:

Adds comprehensive risk management functions for portfolio optimization
Introduces new data validation schemas for predictions and portfolios
Updates TFT dataset to predict daily returns instead of closing prices

Reviewed Changes

Copilot reviewed 21 out of 24 changed files in this pull request and generated 21 comments.

Show a summary per file

File	Description
maskfile.md	Removes emoji characters from script output messages
libraries/python/src/internal/tft_dataset.py	Adds daily return calculation and postprocessing functionality
libraries/python/src/internal/prediction.py	New schema for validating prediction data with quantiles
libraries/python/src/internal/portfolio.py	New schema for validating portfolio position data
applications/portfoliomanager/src/portfoliomanager/risk_management.py	Core risk management functions for portfolio optimization
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py	Client for interacting with Alpaca trading API

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

coderabbitai

Actionable comments posted: 18

🧹 Nitpick comments (24)

infrastructure/stack.yml (1)

143-143: Route Grafana over TLS (websecure) and keep HTTP->HTTPS redirect.

Networks look correct for Traefik (public) and Prometheus (internal). Recommend switching the Grafana router to websecure and enabling TLS, plus an HTTP redirect to harden access.

Example labels to add (no diff since lines are outside the changed hunk):
yaml
labels:
- traefik.enable=true
- traefik.http.routers.grafana.rule=Host(grafana.example.com)
- traefik.http.routers.grafana.entrypoints=websecure
- traefik.http.routers.grafana.tls=true
- traefik.http.routers.grafana-web.rule=Host(grafana.example.com)
- traefik.http.routers.grafana-web.entrypoints=web
- traefik.http.routers.grafana-web.middlewares=redirect-to-https
- traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
libraries/python/src/internal/tft_model.py (1)
124-124: Remove unused noqa / dead comment line.

Ruff flags this (RUF100). Delete the commented-out line rather than carrying a suppressed noqa.
-        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)  # noqa: E501, ERA001
libraries/python/src/internal/company_information.py (1)

15-26: Consider vectorized checks for performance and clearer errors.

Element-wise Python lambdas can be slower. If feasible, use column-scoped checks or vectorized predicates for each column to produce column-specific errors.
Example (conceptual, adjust to pandera-polars API):
python
checks=[
pa.Check(lambda df: df["sector"].str.to_uppercase() == df["sector"], element_wise=True, error="sector must be uppercase"),
pa.Check(lambda df: df["industry"].str.to_uppercase() == df["industry"], element_wise=True, error="industry must be uppercase"),
pa.Check(lambda df: df["sector"].str.strip_chars() == df["sector"], element_wise=True, error="sector must be stripped"),
pa.Check(lambda df: df["industry"].str.strip_chars() == df["industry"], element_wise=True, error="industry must be stripped"),
]

If you prefer to keep top-level checks, add a test covering missing values to ensure the new uppercase defaults behave as intended.
libraries/python/src/internal/prediction.py (2)
16-18: Remove unused noqa comments flagged by Ruff (RUF100).

The # noqa: E501 markers are unnecessary here.

Apply:
-        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"  # noqa: E501
+        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"
@@
-                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"  # noqa: E501
+                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"
Also applies to: 34-35

40-75: Optional: enforce non-empty frame if required by contract.

Empty inputs currently pass both cross-record checks. If at least one ticker/week is required, add a schema-level check for non-empty.
libraries/python/tests/test_prediction.py (2)
19-19: Ruff S101 in tests — OK to ignore or disable for test paths.

If Ruff gates CI, add a per-file or dir-level ignore for S101 in tests.

Example pyproject fragment (outside this file):
[tool.ruff]
extend-exclude = []

[tool.ruff.per-file-ignores]
"**/tests/**/*.py" = ["S101"]
Also applies to: 79-79, 143-143

7-17: Optional: add coercion tests.

Since prediction_schema has coerce=True, add a case with string timestamps/quantiles to assert dtype coercion occurs.

Also applies to: 131-144
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (4)
60-61: Avoid double validation of positions.

You validate in get_account and again in AlpacaAccount.__init__. Keep one (prefer the entity constructor to enforce invariants).

Apply:
-        position_schema.validate(positions)
+        # rely on AlpacaAccount to validate on construction
Also applies to: 17-19

27-27: Remove unused noqa (RUF100).

# noqa: FBT001 is unnecessary.

Apply:
-        is_paper: bool,  # noqa: FBT001
+        is_paper: bool,
62-65: Optional: use Decimal for cash to avoid float drift.

Financial amounts benefit from exact arithmetic. Consider Decimal in AlpacaAccount and callers.

74-96: Schema consistency with existing helper.

Nice reuse of is_uppercase shape. Consider importing the shared helper from internal.portfolio to avoid duplication if both exist.
applications/portfoliomanager/tests/test_risk_management.py (2)
31-31: Remove unused noqa directives

These noqa directives are not needed as the corresponding rules are not enabled in your linting configuration.
-    assert len(result) == 2  # noqa: PLR2004
+    assert len(result) == 2
Apply similar changes to lines 51, 107, 430-432, 436, 467, 498, 525, 561, 571, 575, 611, and 625.

Also applies to: 51-51, 107-107, 430-432, 436-436, 467-467, 498-498, 525-525, 561-561, 571-571, 575-575, 611-611, 625-625

427-427: Consider using UTC timezone consistently

While datetime.now(tz=UTC) works, consider using a fixed timestamp for test reproducibility.
-        predictions, positions, 20000.0, datetime.now(tz=UTC)
+        predictions, positions, 20000.0, datetime(2024, 1, 15, tzinfo=UTC)
This suggestion applies to all occurrences of datetime.now(tz=UTC) in the test file.
libraries/python/src/internal/tft_dataset.py (1)
29-31: Consider documenting the API change rationale

The constructor signature change from __init__(self, data: pl.DataFrame) to __init__(self) followed by preprocess_and_set_data(data: pl.DataFrame) is a significant API change. Consider adding a docstring explaining the two-step initialization pattern.
    def __init__(self) -> None:
+        """Initialize TFTDataset without data.
+        
+        Data should be provided via preprocess_and_set_data() method to allow
+        for better separation of initialization and data processing steps.
+        """
        pass
libraries/python/tests/test_tft_dataset.py (1)
188-244: Redundant assertion on Line 244.

Line 240 already checks that targets.shape[1] == 1, making the identical check on Line 244 redundant.
             targets = batch["targets"]
             assert targets.shape[1] == 1
             expected_tensor_dimensions = 2
             assert len(targets.shape) == expected_tensor_dimensions
             assert targets.shape[0] > 0
-            assert targets.shape[1] == 1
applications/portfoliomanager/src/portfoliomanager/risk_management.py (3)
216-224: Consider making the uncertainty threshold configurable.

The hardcoded minimum_inter_quartile_range = 0.75 might need adjustment based on market conditions or strategy requirements.
 def create_optimal_portfolio(
     predictions: pl.DataFrame,
     positions: pl.DataFrame,
     maximum_capital: float,
     current_timestamp: datetime,
+    minimum_inter_quartile_range: float = 0.75,
 ) -> pl.DataFrame:
     predictions = predictions.clone()
     positions = positions.clone()
 
-    minimum_inter_quartile_range = 0.75
     high_uncertainty_tickers = (
261-262: Consider making position count targets configurable.

The hardcoded target of 10 positions per side might need adjustment based on portfolio size or risk tolerance.
 def create_optimal_portfolio(
     predictions: pl.DataFrame,
     positions: pl.DataFrame,
     maximum_capital: float,
     current_timestamp: datetime,
+    target_positions_per_side: int = 10,
 ) -> pl.DataFrame:
Then update lines 261-262:
-    new_long_positions_needed = max(0, 10 - maintained_long_count)
-    new_short_positions_needed = max(0, 10 - maintained_short_count)
+    new_long_positions_needed = max(0, target_positions_per_side - maintained_long_count)
+    new_short_positions_needed = max(0, target_positions_per_side - maintained_short_count)
23-24: Remove unused noqa directives.

Multiple noqa directives for FBT003 and E501 are unused according to static analysis. These should be removed to keep the code clean.

Remove the # noqa: FBT003 and # noqa: E501 comments from lines 23, 24, 43, 153, 154, 173, 174, 201, 202, 203, 287, 288, 296, and 297, as they're not suppressing any actual linting errors.

Also applies to: 43-43, 153-154, 173-174, 201-203, 287-288, 296-297
libraries/python/src/internal/portfolio.py (3)
30-31: Remove unused noqa comments flagged by Ruff.

E501 isn’t enabled; drop the # noqa: E501 to satisfy RUF100.
-        message = f"Expected {total_positions_count} total positions, found: {data.lazyframe.collect().height}"  # noqa: E501
+        message = f"Expected {total_positions_count} total positions, found: {data.lazyframe.collect().height}"
-        message = f"Expected long and short dollar amount sums to be within {maximum_imbalance_percentage * 100}%, found long: {long_sum}, short: {short_sum}"  # noqa: E501
+        message = f"Expected long and short dollar amount sums to be within {maximum_imbalance_percentage * 100}%, found long: {long_sum}, short: {short_sum}"
Also applies to: 57-58

88-96: Pass check functions directly; avoid unnecessary lambdas.

Cleaner and keeps signatures discoverable.
-        pa.Check(
-            check_fn=lambda df: check_position_side_counts(df),
-            error="Each side must have expected position counts",
-        ),
-        pa.Check(
-            check_fn=lambda df: check_position_side_sums(df),
-            error="Position side sums must be approximately equal",
-        ),
+        pa.Check(
+            check_fn=check_position_side_counts,
+            error="Each side must have expected position counts",
+        ),
+        pa.Check(
+            check_fn=check_position_side_sums,
+            error="Position side sums must be approximately equal",
+        ),
65-83: Consider making columns non-nullable.

If nulls are not expected, enforce nullable=False to catch missing data early.
-        "ticker": pa.Column(
-            dtype=str,
-            checks=[pa.Check(is_uppercase)],
-        ),
+        "ticker": pa.Column(
+            dtype=str,
+            checks=[pa.Check(is_uppercase)],
+            nullable=False,
+        ),
...
-        "timestamp": pa.Column(
-            dtype=int,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "timestamp": pa.Column(
+            dtype=int,
+            checks=[pa.Check.greater_than(0)],
+            nullable=False,
+        ),
...
-        "side": pa.Column(
-            dtype=str,
-            checks=[
+        "side": pa.Column(
+            dtype=str,
+            checks=[
                 pa.Check.isin(["LONG", "SHORT"]),
                 pa.Check(is_uppercase),
-            ],
-        ),
+            ],
+            nullable=False,
+        ),
...
-        "dollar_amount": pa.Column(
-            dtype=float,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "dollar_amount": pa.Column(
+            dtype=float,
+            checks=[pa.Check.greater_than(0)],
+            nullable=False,
+        ),
libraries/python/tests/test_portfolio.py (3)
84-117: Assert error-message substrings are brittle; prefer tighter matching or parametrization.

Substring matches can pass unexpectedly. Consider parametrizing tests to cover both long/short mismatch cases and assert full messages.
-    with pytest.raises(SchemaError, match="Expected 10 long side positions, found: 15"):
+    with pytest.raises(
+        SchemaError,
+        match=r"^Expected 10 long side positions, found: 15$",
+    ):
         portfolio_schema.validate(data)
Optionally, parametrize LONG/SHORT imbalances to reduce duplication.

Also applies to: 119-153

38-40: Ruff S101 in tests.

If Ruff flags assert in tests, add a per-file ignore or disable S101 for tests in ruff config.

Example ruff.toml snippet:
[tool.ruff]
extend-exclude = ["libraries/python/tests/**"]

[tool.ruff.per-file-ignores]
"libraries/python/tests/**" = ["S101"]
10-36: Reduce boilerplate ticker lists.

Use a helper/fixture to generate N distinct tickers to keep tests short and focused.
-            "ticker": [
-                "AAPL", "GOOGL", "MSFT", "AMZN", "TSLA", "NVDA", "META", "NFLX", "BABA", "CRM",
-                "AMD", "INTC", "ORCL", "ADBE", "PYPL", "SHOP", "SPOT", "ROKU", "ZM", "DOCU",
-            ],
+            "ticker": [f"T{i:03d}" for i in range(20)],
Also applies to: 122-148

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89695a9 and 13825cb.

⛔ Files ignored due to path filters (2)

.flox/env/manifest.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (22)

.flox/env/manifest.toml (2 hunks)
applications/datamanager/src/datamanager/alpaca_client.py (0 hunks)
applications/datamanager/src/datamanager/s3_client.py (0 hunks)
applications/datamanager/tests/test_alpaca_client.py (0 hunks)
applications/datamanager/tests/test_s3_client.py (0 hunks)
applications/portfoliomanager/pyproject.toml (1 hunks)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1 hunks)
applications/portfoliomanager/src/portfoliomanager/risk_management.py (1 hunks)
applications/portfoliomanager/tests/test_risk_management.py (1 hunks)
infrastructure/stack.yml (1 hunks)
libraries/python/src/internal/company_information.py (1 hunks)
libraries/python/src/internal/equity_bar.py (1 hunks)
libraries/python/src/internal/mhsa_network.py (1 hunks)
libraries/python/src/internal/portfolio.py (1 hunks)
libraries/python/src/internal/prediction.py (1 hunks)
libraries/python/src/internal/tft_dataset.py (5 hunks)
libraries/python/src/internal/tft_model.py (2 hunks)
libraries/python/tests/test_equity_bar.py (0 hunks)
libraries/python/tests/test_portfolio.py (1 hunks)
libraries/python/tests/test_prediction.py (1 hunks)
libraries/python/tests/test_tft_dataset.py (6 hunks)
maskfile.md (25 hunks)

💤 Files with no reviewable changes (5)

libraries/python/tests/test_equity_bar.py
applications/datamanager/tests/test_s3_client.py
applications/datamanager/src/datamanager/s3_client.py
applications/datamanager/tests/test_alpaca_client.py
applications/datamanager/src/datamanager/alpaca_client.py

🧰 Additional context used

🧬 Code graph analysis (9)

libraries/python/tests/test_prediction.py (1)

libraries/python/tests/test_equity_bar.py (4)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_negative_timestamp (61-76)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

test_equity_bar_schema_missing_required_column (201-216)

libraries/python/src/internal/portfolio.py (2)

libraries/python/tests/test_company_information.py (1)

test_company_information_schema_sector_lowercase_fails (19-28)

libraries/python/tests/test_equity_bar.py (1)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

libraries/python/tests/test_portfolio.py (1)

libraries/python/tests/test_equity_bar.py (5)

test_equity_bar_schema_valid_data (7-22)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_multiple_rows (237-253)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

test_equity_bar_schema_zero_timestamp (79-94)

applications/portfoliomanager/tests/test_risk_management.py (1)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)

add_equity_bars_returns_and_realized_volatility_columns (29-71)

add_positions_pdt_locked_column (7-26)

add_positions_performance_columns (74-182)

add_predictions_zscore_ranked_columns (185-204)

create_optimal_portfolio (207-322)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)

libraries/python/src/internal/portfolio.py (1)

is_uppercase (6-9)

applications/datamanager/src/datamanager/alpaca_client.py (2)

__init__ (23-41)

fetch_latest_data (64-170)

libraries/python/src/internal/company_information.py (1)

libraries/python/tests/test_company_information.py (10)

test_company_information_schema_both_fields_uppercase_passes (43-53)

test_company_information_schema_type_coercion (139-152)

test_company_information_schema_no_whitespace_passes (80-90)

test_company_information_schema_special_characters (210-220)

test_company_information_schema_empty_string (185-194)

test_company_information_schema_missing_sector_column (117-125)

test_company_information_schema_industry_whitespace_fails (68-77)

test_company_information_schema_whitespace_fails (56-65)

test_company_information_schema_sector_lowercase_fails (19-28)

test_company_information_schema_mixed_case_fails (173-182)

libraries/python/src/internal/tft_dataset.py (1)

libraries/python/tests/test_equity_bar.py (1)

test_equity_bar_schema_type_coercion (180-198)

libraries/python/src/internal/equity_bar.py (1)

libraries/python/tests/test_equity_bar.py (8)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_negative_volume (144-159)

test_equity_bar_schema_valid_data (7-22)

test_equity_bar_schema_zero_timestamp (79-94)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

test_equity_bar_schema_negative_timestamp (61-76)

test_equity_bar_schema_negative_prices (97-123)

test_equity_bar_schema_zero_volume_not_allowed (162-177)

libraries/python/tests/test_tft_dataset.py (1)

libraries/python/src/internal/tft_dataset.py (3)

TFTDataset (26-392)

preprocess_and_set_data (32-204)

get_batches (261-339)

🪛 Ruff (0.12.2)

libraries/python/tests/test_prediction.py

19-19: Use of assert detected

(S101)

79-79: Use of assert detected

(S101)

143-143: Use of assert detected

(S101)

libraries/python/src/internal/mhsa_network.py

39-39: Unused noqa directive (non-enabled: ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/prediction.py

17-17: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

34-34: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/portfolio.py

30-30: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

57-57: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/src/portfoliomanager/risk_management.py

23-23: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

24-24: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

43-43: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

153-153: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

154-154: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

173-173: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

174-174: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

201-201: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

202-202: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

203-203: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

287-287: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

288-288: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

296-296: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

297-297: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_portfolio.py

39-39: Use of assert detected

(S101)

libraries/python/src/internal/tft_model.py

124-124: Unused noqa directive (non-enabled: E501, ERA001)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/tests/test_risk_management.py

30-30: Use of assert detected

(S101)

31-31: Use of assert detected

(S101)

31-31: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

50-50: Use of assert detected

(S101)

51-51: Use of assert detected

(S101)

51-51: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

74-74: Use of assert detected

(S101)

85-85: Use of assert detected

(S101)

86-86: Use of assert detected

(S101)

89-89: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

105-105: Use of assert detected

(S101)

106-106: Use of assert detected

(S101)

107-107: Use of assert detected

(S101)

107-107: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

110-110: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

127-127: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

148-148: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

170-170: Use of assert detected

(S101)

213-213: Use of assert detected

(S101)

214-214: Use of assert detected

(S101)

257-257: Use of assert detected

(S101)

258-258: Use of assert detected

(S101)

301-301: Use of assert detected

(S101)

304-304: Use of assert detected

(S101)

343-343: Use of assert detected

(S101)

344-344: Use of assert detected

(S101)

359-359: Use of assert detected

(S101)

360-360: Use of assert detected

(S101)

362-362: Use of assert detected

(S101)

363-363: Use of assert detected

(S101)

364-364: Use of assert detected

(S101)

381-381: Use of assert detected

(S101)

382-382: Use of assert detected

(S101)

385-385: Use of assert detected

(S101)

402-402: Use of assert detected

(S101)

403-403: Use of assert detected

(S101)

430-430: Use of assert detected

(S101)

430-430: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

431-431: Use of assert detected

(S101)

431-431: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

432-432: Use of assert detected

(S101)

432-432: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

436-436: Use of assert detected

(S101)

436-436: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

467-467: Use of assert detected

(S101)

467-467: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

468-468: Use of assert detected

(S101)

469-469: Use of assert detected

(S101)

470-470: Use of assert detected

(S101)

497-497: Use of assert detected

(S101)

498-498: Use of assert detected

(S101)

498-498: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

525-525: Use of assert detected

(S101)

525-525: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

527-527: Use of assert detected

(S101)

561-561: Use of assert detected

(S101)

561-561: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

571-571: Use of assert detected

(S101)

571-571: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

575-575: Use of assert detected

(S101)

575-575: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

611-611: Use of assert detected

(S101)

611-611: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

615-615: Use of assert detected

(S101)

619-619: Use of assert detected

(S101)

621-621: Use of assert detected

(S101)

622-622: Use of assert detected

(S101)

623-623: Use of assert detected

(S101)

624-624: Use of assert detected

(S101)

625-625: Use of assert detected

(S101)

625-625: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

628-628: Use of assert detected

(S101)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

27-27: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_tft_dataset.py

174-174: Use of assert detected

(S101)

177-177: Use of assert detected

(S101)

179-179: Use of assert detected

(S101)

184-184: Use of assert detected

(S101)

185-185: Use of assert detected

(S101)

235-235: Use of assert detected

(S101)

240-240: Use of assert detected

(S101)

242-242: Use of assert detected

(S101)

243-243: Use of assert detected

(S101)

244-244: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run Python quality checks
GitHub Check: Run Python tests

🔇 Additional comments (27)

.flox/env/manifest.toml (1)

26-26: LGTM on systems normalization.

Single-line systems array is fine; behavior is unchanged.

applications/portfoliomanager/pyproject.toml (1)

7-8: uv enforcement present for workspace dependency resolution – the project uniformly uses uv (in CI, Docker builds, tests.yaml, and documentation) to sync and run dependencies, ensuring “internal” is resolved correctly; no fallback needed unless pip-based installs are later introduced.

libraries/python/src/internal/tft_model.py (1)

21-21: Doc comment update to daily return — looks good.

Matches the dataset changes; no functional impact.

libraries/python/src/internal/prediction.py (1)

62-72: Confirm test-message coupling with pandera error propagation.

Since the validators raise ValueError with detailed messages, Pandera typically surfaces those in SchemaError. Your tests match on those exact strings—good—but brittle if Pandera changes messaging. Consider either:

returning bool in validators and relying solely on error= text, or

pinning Pandera minor version and keeping tests as-is.

libraries/python/tests/test_prediction.py (2)

1-20: LGTM: happy path coverage.

Validates schema shape and types as intended.

82-111: Test messages rely on underlying ValueError passthrough.

Good specificity, but may be fragile across Pandera versions. If flakiness appears, switch to matching the schema-level error= text or relax the regex.

Also applies to: 114-129

applications/portfoliomanager/tests/test_risk_management.py (2)

169-171: LGTM! Good handling of null returns

The test correctly verifies that null values are propagated for missing close prices, which is critical for data integrity in financial calculations.

406-437: Well-structured portfolio creation test

This test provides good coverage of the fresh start scenario with appropriate assertions for balanced long/short positions and capital allocation.

libraries/python/src/internal/tft_dataset.py (4)

169-174: LGTM! Proper handling of daily returns calculation

The calculation of daily returns using pct_change() over ticker groups and filtering out null values is correct and follows financial data processing best practices.

333-335: Good alignment with risk management objectives

The change to use daily_return as the target variable instead of close_price aligns well with the broader risk management module's focus on returns-based analysis.

341-392: Well-structured postprocessing implementation

The postprocessing method correctly:

Unscales the quantile predictions back to daily return values

Maps encoded tickers back to strings

Generates appropriate timestamps

Returns a properly structured DataFrame

This implementation provides a clean interface between the model predictions and downstream risk management components.

395-442: LGTM! Comprehensive schema update

The schema updates with explicit dtypes and appropriate validation checks improve data integrity. The addition of daily_return to the schema maintains consistency with the preprocessing logic.

libraries/python/tests/test_tft_dataset.py (6)

1-2: LGTM!

The math module import is appropriately added to support the new finite value checks for daily returns.

22-22: LGTM!

Converting volume values from floats to integers aligns with real-world expectations where volume represents whole share counts.

Also applies to: 63-63, 98-98

44-46: LGTM!

The new API pattern with no-argument constructor and explicit preprocess_and_set_data() call improves separation of concerns and makes the data loading step more explicit.

Also applies to: 71-73, 111-113

92-107: LGTM!

Adding a fourth day to the test data is necessary for batch generation with the specified input and output lengths. The test data structure properly supports the daily return calculation requirements.

144-186: LGTM!

The test properly validates the daily return calculation functionality:

Verifies the column exists after preprocessing

Confirms row filtering (first row excluded due to null return)

Validates all return values are finite floats

The test data with specific close prices provides clear expected returns for verification.

229-233: Verify the data_type parameter value.

The test uses data_type="train" but based on the method signature in the implementation, this should align with the intended test behavior. The change from "predict" to "train" means this test now validates training batch generation with targets.

libraries/python/src/internal/equity_bar.py (3)

7-13: LGTM!

The ticker uppercase validation is well-implemented using a lambda check with element-wise validation and a clear error message.

45-49: Good addition of data integrity constraints.

The addition of unique=["ticker", "timestamp"] and coerce=True at the schema level provides important data quality guarantees:

Prevents duplicate ticker-timestamp combinations

Enables automatic type coercion for compatible data types

These are valuable safeguards for downstream processing.

15-43: Verify downstream logic handles zero-volume bars
The schema change now permits volume == 0 (previously only > 0), and while the corresponding test (test_equity_bar_schema_zero_volume_not_allowed) was removed, we must ensure no part of the codebase—filters, aggregations, reporting, or metrics—implicitly assumes volume > 0. Manually search for any comparisons or filters on volume (e.g. > 0 or != 0) and update them if they should include zero.

applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)

7-26: LGTM!

The PDT (Pattern Day Trader) lock detection logic correctly identifies positions opened on the current date by converting timestamps to UTC dates for comparison.

29-44: Clear error handling for insufficient data.

The function properly validates that each ticker has at least 30 bars before computing realized volatility, with a descriptive error message listing the problematic tickers.

46-71: Good approach for maintaining row alignment.

Using row_idx to maintain alignment during the join operation ensures data integrity when computing rolling statistics. The function correctly handles null values and computes both daily returns and realized volatility.

134-182: Complex but correct position management logic.

The conditional logic for determining close_position and maintain_position properly handles:

PDT-locked positions that cannot be closed

Long positions hitting lower thresholds (stop-loss)

Short positions hitting upper thresholds (stop-loss)

Profitable positions to maintain

The logic correctly implements risk management rules.

189-189: Good defensive programming with fallback for standard deviation.

Using or 1e-8 prevents division by zero when all predictions have the same value, though this edge case might warrant logging.

libraries/python/tests/test_portfolio.py (1)

7-39: Good happy-path coverage.

Valid dataset shape and coercion expectations look solid.

coderabbitai · 2025-09-07T02:05:07Z

 cargo.pkg-path = "cargo"
 gum.pkg-path = "gum"
-
+mask.pkg-path = "mask"


💡 Verification agent

🧩 Analysis chain

Confirm the “mask” package is resolvable and actually used.

If this is the Nix attribute for the mask task runner, verify it exists in your channel and that the repo contains Maskfile tasks; otherwise it adds an unused dependency.

Run:

🏁 Script executed:

#!/bin/bash # 1) Check for mask tasks in the repo fd -HIa '^(Maskfile|maskfile|mask\.yml)$' -t f -S -0 | xargs -0 -I{} sh -c 'echo "found: {}" && sed -n "1,60p" "{}"' # 2) (Optional) Confirm package availability in nixpkgs (requires nix) command -v nix >/dev/null && nix search nixpkgs#mask | head -n 10 || echo "nix not available"

Length of output: 153

🏁 Script executed:

#!/bin/bash # 1) Search for Maskfile, maskfile or mask.yml in the repo fd -HI '^(Maskfile|maskfile|mask\.yml)$' -t f -x sh -c 'echo "found: {}" && sed -n "1,60p" "{}"' # 2) Confirm nix package availability if command -v nix >/dev/null; then nix search nixpkgs mask | head -n 10 else echo "nix not available" fi

Length of output: 17

Remove unused mask dependency
.flox/env/manifest.toml:17 sets mask.pkg-path = "mask", but no Maskfile, maskfile, or mask.yml exists in the repo. Remove this entry (or add a Maskfile if you intend to use the Mask task runner).

🤖 Prompt for AI Agents

.flox/env/manifest.toml around line 17: the manifest declares mask.pkg-path = "mask" but there are no Maskfile/maskfile/mask.yml in the repo, so remove this unused dependency entry from the file (or alternatively add a Maskfile if you intend to use the Mask task runner); update the manifest by deleting the mask.pkg-path line and verify flox operations still work.

@chrisaddy you're the expert here given our .md root Maskfile.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

coderabbitai

Actionable comments posted: 15

♻️ Duplicate comments (7)

libraries/python/src/internal/mhsa_network.py (1)
39-39: Remove unused noqa (RUF100).

ERA001 isn’t enabled; drop the directive.
-        # shape: (batch, heads_count, sequence_length, head_dimension)  # noqa: ERA001
+        # shape: (batch, heads_count, sequence_length, head_dimension)
maskfile.md (2)
434-436: Fix positional parameter parsing in test_endpoint.

"$1/$2/$3" are required; current values are literals.
-    local name="1$"
-    local url="2$"
-    local context="3$"
+    local name="$1"
+    local url="$2"
+    local context="$3"
472-473: Fix PortfolioManager port in production tests.

Use 8081 to match the local URLs and service mapping.
-    test_endpoint "PortfolioManager" "http://$MANAGER_IP:808/health" "production"
-    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:808/" "production"
+    test_endpoint "PortfolioManager" "http://$MANAGER_IP:8081/health" "production"
+    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:8081/" "production"
applications/portfoliomanager/src/portfoliomanager/risk_management.py (2)
96-107: Log-return guard looks good; resolves prior concern.

372-377: Narrow exception handling and (optionally) log.

Catching Exception hides real issues; catch the expected types.
-    except Exception:  # noqa: BLE001
-        return 0.0
+    except (TypeError, ValueError):
+        # logger.warning("Error calculating capital for side %s", side)
+        return 0.0
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)
49-49: Apply the previous review fix for side parsing
-                "side": str(account_position.side).replace("PositionSide.", "").upper(),
+                "side": str(account_position.side).replace("PositionSide.", "").upper(),
Note: This appears to already be correctly implemented, so this is just confirming the fix was applied.
applications/portfoliomanager/pyproject.toml (1)
12-12: Pin upper bound for alpaca-py 0.x series.

0.x minors can be breaking; bound to <0.43 to avoid surprise upgrades.
-    "alpaca-py>=0.42.1",
+    "alpaca-py>=0.42.1,<0.43",

🧹 Nitpick comments (27)

libraries/python/tests/test_company_information.py (2)
101-103: Defaults for None → "NOT AVAILABLE" look good; add a combined-null case

Nice pivot from exception-based behavior to defaults. Consider adding a test where both fields are None to lock in the intended behavior.

Example to add outside this hunk:
def test_company_information_schema_both_null_defaults() -> None:
    data = pl.DataFrame({"sector": [None], "industry": [None]})
    df = company_information_schema.validate(data)
    assert df["sector"][0] == "NOT AVAILABLE"
    assert df["industry"][0] == "NOT AVAILABLE"
101-103: Silence Ruff/Bandit S101 on asserts in tests (or ignore in config)

Ruff flags S101 here. If your pipeline enforces it on tests, either add per-line ignores or configure per-file ignores.

Per-line option (minimal change):
-    assert validated_df["sector"][0] == "NOT AVAILABLE"
-    assert validated_df["industry"][0] == "SOFTWARE"
+    assert validated_df["sector"][0] == "NOT AVAILABLE"  # noqa: S101
+    assert validated_df["industry"][0] == "SOFTWARE"  # noqa: S101
-    assert validated_df["industry"][0] == "NOT AVAILABLE"
-    assert validated_df["sector"][0] == "TECHNOLOGY"
+    assert validated_df["industry"][0] == "NOT AVAILABLE"  # noqa: S101
+    assert validated_df["sector"][0] == "TECHNOLOGY"  # noqa: S101
Config option (outside this file, preferred):
# pyproject.toml
[tool.ruff.lint.per-file-ignores]
"libraries/python/tests/test_company_information.py" = ["S101"]
Also applies to: 114-116
libraries/python/src/internal/tft_model.py (1)
124-124: Remove unused noqa (RUF100).

No need to suppress E501/ERA001 on a commented-out line.
-        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)  # noqa: E501, ERA001
+        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)
libraries/python/src/internal/prediction.py (3)
6-9: Reuse shared is_uppercase helper to avoid duplication.

Import the existing helper from internal.portfolio for consistency.
-import pandera.polars as pa
-import polars as pl
-from pandera.polars import PolarsData
+import pandera.polars as pa
+import polars as pl
+from pandera.polars import PolarsData
+from internal.portfolio import is_uppercase
@@
-def is_uppercase(data: PolarsData) -> pl.LazyFrame:
-    return data.lazyframe.select(
-        pl.col(data.key).str.to_uppercase() == pl.col(data.key)
-    )
+# reused from internal.portfolio
23-24: Drop unused E501 no-qa tags (RUF100).

Long f-strings don’t need suppression if you keep them; or wrap onto multiple lines.
-        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"  # noqa: E501
+        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"
@@
-                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"  # noqa: E501
+                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"
Also applies to: 40-41

62-71: Align Check.error messages with raised ValueErrors (keeps tests stable).

Pandera may surface either message; make them identical to the exceptions you raise.
-        pa.Check(
-            check_fn=lambda df: check_dates_count_per_ticker(df),
-            name="check_dates_count_per_ticker",
-            error="Each ticker must have expected date count",
-        ),
+        pa.Check(
+            check_fn=lambda df: check_dates_count_per_ticker(df),
+            name="check_dates_count_per_ticker",
+            error="Each ticker must have exactly 7 unique dates",
+        ),
@@
-        pa.Check(
-            check_fn=lambda df: check_same_dates_per_ticker(df),
-            name="check_same_dates_per_ticker",
-            error="All tickers must have same date values",
-        ),
+        pa.Check(
+            check_fn=lambda df: check_same_dates_per_ticker(df),
+            name="check_same_dates_per_ticker",
+            error="Expected all tickers to have the same dates",
+        ),
maskfile.md (2)
198-198: Use a non-zero SSH ConnectTimeout.

Zero causes immediate failure; use something reasonable for retries.
-    if ssh -o ConnectTimeout=0 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then
+    if ssh -o ConnectTimeout=10 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then
218-220: Standardize redirections.

Prefer >/dev/null 2>&1 when you intend to suppress both stdout and stderr.

Example:
-docker context use default >/dev/null >&1 || true
+docker context use default >/dev/null 2>&1 || true
Apply similarly across the listed lines.

Also applies to: 229-231, 341-341, 412-412, 479-479, 493-493, 511-511, 686-686, 692-692, 697-697
libraries/python/src/internal/tft_dataset.py (3)
130-149: Zero-fill vs. >0 constraints conflict. Choose one strategy.

You fill missing OHLC with 0.0, then enforce >0 in schema—this will fail whenever data were missing. Either:

allow zeros in schema (>= 0), or

avoid zero-fill for OHLC (forward-fill/backfill or drop rows) and keep >0.

Option A (permit zero placeholders):
-        "open_price": pa.Column(dtype=float, checks=pa.Check.greater_than(0)),
+        "open_price": pa.Column(dtype=float, checks=pa.Check.greater_than_or_equal_to(0)),
-        "high_price": pa.Column(dtype=float, checks=pa.Check.greater_than(0)),
+        "high_price": pa.Column(dtype=float, checks=pa.Check.greater_than_or_equal_to(0)),
-        "low_price": pa.Column(dtype=float, checks=pa.Check.greater_than(0)),
+        "low_price": pa.Column(dtype=float, checks=pa.Check.greater_than_or_equal_to(0)),
-        "close_price": pa.Column(dtype=float, checks=pa.Check.greater_than(0)),
+        "close_price": pa.Column(dtype=float, checks=pa.Check.greater_than_or_equal_to(0)),
Option B (preferred for data quality): replace the 0.0 fills for OHLC with a forward-fill/backfill per ticker and keep the >0 checks. I can draft that if you prefer.

Also applies to: 406-420

229-233: Use consistent Polars expressions for min/max dates.

Mirror earlier style for clarity and to avoid mixing Series/Python values in select.
-        minimum_date: date = self.data.select(self.data["date"].min()).item()
-        maximum_date: date = self.data.select(self.data["date"].max()).item()
+        minimum_date: date = self.data.select(pl.col("date").min()).item()
+        maximum_date: date = self.data.select(pl.col("date").max()).item()
341-347: Fix type hint for input_batch.

The function expects the batch dict, not a single Tensor.
-    def postprocess_predictions(
-        self,
-        input_batch: Tensor,  # static_categorical_features
+    def postprocess_predictions(
+        self,
+        input_batch: dict[str, Tensor],  # expects batch dict
libraries/python/tests/test_tft_dataset.py (3)
118-126: Make batch count assertion less brittle.

Future changes to windowing could validly yield >1 batch. Prefer ≥1.
-    assert len(batches) == 1
+    assert len(batches) >= 1
238-245: Remove duplicate assertion.

targets.shape[1] == 1 is asserted twice.
-            assert targets.shape[1] == 1
             expected_tensor_dimensions = 2
             assert len(targets.shape) == expected_tensor_dimensions
             assert targets.shape[0] > 0
             assert targets.shape[1] == 1
144-186: Ignore Bandit S101 for test asserts (or configure Ruff).

The S101 warnings are expected in pytest-style tests. Either keep as-is or add a test-only ignore in Ruff.

Add to pyproject.toml:
[tool.ruff.lint.per-file-ignores]
"libraries/python/tests/**" = ["S101"]
libraries/python/src/internal/equity_bar.py (1)
46-50: Nit: comment typo.

“partion” → “partition”.
-    strict="filter",  # allows DuckDB partion columns
+    strict="filter",  # allows DuckDB partition columns
applications/portfoliomanager/src/portfoliomanager/risk_management.py (6)
13-26: Vectorize PDT lock derivation; drop map_elements.

Leverage Polars datetime casting for speed and clarity.
-    return positions.with_columns(
-        pl.when(
-            pl.col("timestamp")
-            .cast(pl.Int64)
-            .map_elements(
-                lambda ts: datetime.fromtimestamp(ts, tz=UTC).date(),
-                return_dtype=pl.Date,
-            )
-            == current_date
-        )
-        .then(True)  # noqa: FBT003
-        .otherwise(False)  # noqa: FBT003
-        .alias("pdt_locked")
-    )
+    return positions.with_columns(
+        (
+            pl.col("timestamp")
+            .cast(pl.Int64)
+            .cast(pl.Datetime(time_unit="s", time_zone="UTC"))
+            .dt.date()
+            == pl.lit(current_date)
+        ).alias("pdt_locked")
+    )
111-135: Replace Python loop with vectorized cumulative returns for scalability.

Iterating rows will bottleneck with many positions. Consider precomputing per-ticker cumulative sums and using asof joins.

I can provide a vectorized Polars rewrite if you want it in this PR.

115-121: Use integer seconds once and reuse.

Avoid float comparisons and repeated conversions.
-    for row in position_predictions.iter_rows(named=True):
+    current_ts_s = int(current_timestamp.timestamp())
+    for row in position_predictions.iter_rows(named=True):
@@
-            & (pl.col("timestamp") <= current_timestamp.timestamp())
+            & (pl.col("timestamp") <= current_ts_s)
139-181: Null thresholds produce null predicates; default to False explicitly.

Fill missing thresholds to avoid null-tristate booleans leaking into when/then.
-    return positions_with_data.with_columns(
+    return positions_with_data.with_columns(
+        pl.col("original_lower_threshold").fill_null(float("inf")).alias("original_lower_threshold"),
+        pl.col("original_upper_threshold").fill_null(float("-inf")).alias("original_upper_threshold"),
222-246: Ensure ordering before head/tail selection.

If callers forget to sort predictions via add_predictions_zscore_ranked_columns, selection becomes arbitrary. Sort defensively here.
-    available_predictions = predictions.filter(
-        ~pl.col("ticker").is_in(excluded_tickers)
-    )
+    available_predictions = (
+        predictions.filter(~pl.col("ticker").is_in(excluded_tickers))
+        .sort(["composite_score", "inter_quartile_range"], descending=[True, False])
+    )
23-24: Remove unused noqa FBT003 directives.

Rule FBT003 isn’t enabled; the directives create noise.
-        .then(True)  # noqa: FBT003
-        .otherwise(False)  # noqa: FBT003
+        .then(True)
+        .otherwise(False)
(Apply similarly to other instances in this file.)

Also applies to: 43-43, 159-160, 179-180, 207-209, 293-294, 302-303
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)
27-27: Remove unused noqa directive

The noqa: FBT001 directive is not needed since FBT001 is not enabled.
-        is_paper: bool,  # noqa: FBT001
+        is_paper: bool,
68-71: Code duplication: is_uppercase function exists elsewhere

The is_uppercase function is duplicated between this file and libraries/python/src/internal/portfolio.py. Consider importing it from the internal library to reduce duplication.
+from internal.portfolio import is_uppercase
-
-def is_uppercase(data: pa.PolarsData) -> pl.LazyFrame:
-    return data.lazyframe.select(
-        pl.col(data.key).str.to_uppercase() == pl.col(data.key)
-    )
applications/portfoliomanager/tests/test_risk_management.py (3)
89-91: Remove unused noqa directives; keep asserts in tests by configuring Ruff.

Ruff flags unused “noqa: E501/PLR2004” here. Drop them and, if needed, ignore S101 for tests in Ruff config.

Minimal cleanup examples:
-def test_add_equity_bars_returns_and_realized_volatility_columns_sufficient_data_success() -> (  # noqa: E501
-    None
-):
+def test_add_equity_bars_returns_and_realized_volatility_columns_sufficient_data_success() -> None:
@@
-    assert len(result) == 35  # noqa: PLR2004
+    assert len(result) == 35
Optional Ruff config (root pyproject.toml):
[tool.ruff.lint.per-file-ignores]
"**/tests/**" = ["S101"]
Also applies to: 110-112, 127-129, 148-150, 430-436, 467-470, 498-499, 525-527, 561-575, 611-625

406-433: Add negative-path test for empty optimal portfolio.

When all predictions are excluded and there are no maintained positions, create_optimal_portfolio should raise ValueError. Add a test to lock this behavior.
+def test_create_optimal_portfolio_raises_when_no_components() -> None:
+    predictions = pl.DataFrame({"ticker": ["HIGH"], "composite_score": [1.0], "inter_quartile_range": [0.9]})
+    positions = pl.DataFrame({"ticker": [], "timestamp": [], "side": [], "dollar_amount": [], "close_position": [], "maintain_position": []})
+    with pytest.raises(ValueError, match="No portfolio components"):
+        create_optimal_portfolio(predictions, positions, 20000.0, datetime.now(tz=UTC))
473-499: Strengthen assertions: verify 1 long and 1 short after high-uncertainty exclusion.

This makes the intent explicit and guards against future ordering changes.
 result = create_optimal_portfolio(
     predictions, positions, 20000.0, datetime.now(tz=UTC)
 )
 
 assert "HIGH_UNCERT" not in result["ticker"].to_list()
-assert len(result) == 2  # only 2 available predictions
+assert len(result) == 2
+assert result.filter(pl.col("side") == "LONG").height == 1
+assert result.filter(pl.col("side") == "SHORT").height == 1
applications/portfoliomanager/pyproject.toml (1)
5-5: Optional: relax exact Python patch pin.

Using a range reduces churn in CI while staying within 3.12.
-requires-python = "==3.12.10"
+requires-python = ">=3.12,<3.13"

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13825cb and fd654f7.

⛔ Files ignored due to path filters (2)

.flox/env/manifest.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (23)

.flox/env/manifest.toml (2 hunks)
applications/datamanager/src/datamanager/alpaca_client.py (0 hunks)
applications/datamanager/src/datamanager/s3_client.py (0 hunks)
applications/datamanager/tests/test_alpaca_client.py (0 hunks)
applications/datamanager/tests/test_s3_client.py (0 hunks)
applications/portfoliomanager/pyproject.toml (1 hunks)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1 hunks)
applications/portfoliomanager/src/portfoliomanager/risk_management.py (1 hunks)
applications/portfoliomanager/tests/test_risk_management.py (1 hunks)
infrastructure/stack.yml (1 hunks)
libraries/python/src/internal/company_information.py (1 hunks)
libraries/python/src/internal/equity_bar.py (1 hunks)
libraries/python/src/internal/mhsa_network.py (1 hunks)
libraries/python/src/internal/portfolio.py (1 hunks)
libraries/python/src/internal/prediction.py (1 hunks)
libraries/python/src/internal/tft_dataset.py (5 hunks)
libraries/python/src/internal/tft_model.py (2 hunks)
libraries/python/tests/test_company_information.py (2 hunks)
libraries/python/tests/test_equity_bar.py (0 hunks)
libraries/python/tests/test_portfolio.py (1 hunks)
libraries/python/tests/test_prediction.py (1 hunks)
libraries/python/tests/test_tft_dataset.py (6 hunks)
maskfile.md (28 hunks)

💤 Files with no reviewable changes (5)

applications/datamanager/src/datamanager/s3_client.py
libraries/python/tests/test_equity_bar.py
applications/datamanager/tests/test_alpaca_client.py
applications/datamanager/src/datamanager/alpaca_client.py
applications/datamanager/tests/test_s3_client.py

🚧 Files skipped from review as they are similar to previous changes (4)

infrastructure/stack.yml
.flox/env/manifest.toml
libraries/python/src/internal/portfolio.py
libraries/python/src/internal/company_information.py

🧰 Additional context used

🧬 Code graph analysis (9)

libraries/python/src/internal/tft_dataset.py (1)

libraries/python/tests/test_equity_bar.py (1)

test_equity_bar_schema_type_coercion (180-198)

libraries/python/tests/test_tft_dataset.py (1)

libraries/python/src/internal/tft_dataset.py (3)

TFTDataset (26-392)

preprocess_and_set_data (32-204)

get_batches (261-339)

libraries/python/src/internal/equity_bar.py (1)

libraries/python/tests/test_equity_bar.py (9)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_negative_volume (144-159)

test_equity_bar_schema_valid_data (7-22)

test_equity_bar_schema_missing_required_column (201-216)

test_equity_bar_schema_negative_timestamp (61-76)

test_equity_bar_schema_multiple_rows (237-253)

test_equity_bar_schema_zero_timestamp (79-94)

test_equity_bar_schema_negative_prices (97-123)

test_equity_bar_schema_zero_volume_not_allowed (162-177)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)

libraries/python/src/internal/portfolio.py (1)

is_uppercase (6-9)

applications/datamanager/src/datamanager/alpaca_client.py (2)

__init__ (23-41)

fetch_latest_data (64-170)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (1)

libraries/python/src/internal/dates.py (1)

Date (8-31)

applications/portfoliomanager/tests/test_risk_management.py (1)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)

add_equity_bars_returns_and_realized_volatility_columns (29-71)

add_positions_pdt_locked_column (7-26)

add_positions_performance_columns (74-188)

add_predictions_zscore_ranked_columns (191-210)

create_optimal_portfolio (213-328)

libraries/python/src/internal/prediction.py (2)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

is_uppercase (68-71)

libraries/python/src/internal/portfolio.py (1)

is_uppercase (6-9)

libraries/python/tests/test_portfolio.py (1)

libraries/python/tests/test_equity_bar.py (7)

test_equity_bar_schema_valid_data (7-22)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_negative_volume (144-159)

test_equity_bar_schema_multiple_rows (237-253)

test_equity_bar_schema_negative_timestamp (61-76)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

test_equity_bar_schema_zero_timestamp (79-94)

libraries/python/tests/test_prediction.py (1)

libraries/python/tests/test_equity_bar.py (5)

test_equity_bar_schema_type_coercion (180-198)

test_equity_bar_schema_valid_data (7-22)

test_equity_bar_schema_negative_timestamp (61-76)

test_equity_bar_schema_multiple_rows (237-253)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

🪛 Ruff (0.12.2)

libraries/python/tests/test_company_information.py

102-102: Use of assert detected

(S101)

103-103: Use of assert detected

(S101)

115-115: Use of assert detected

(S101)

116-116: Use of assert detected

(S101)

libraries/python/tests/test_tft_dataset.py

174-174: Use of assert detected

(S101)

177-177: Use of assert detected

(S101)

179-179: Use of assert detected

(S101)

184-184: Use of assert detected

(S101)

185-185: Use of assert detected

(S101)

235-235: Use of assert detected

(S101)

240-240: Use of assert detected

(S101)

242-242: Use of assert detected

(S101)

243-243: Use of assert detected

(S101)

244-244: Use of assert detected

(S101)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

27-27: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/src/portfoliomanager/risk_management.py

23-23: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

24-24: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

43-43: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

159-159: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

160-160: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

179-179: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

180-180: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

207-207: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

208-208: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

209-209: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

293-293: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

294-294: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

302-302: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

303-303: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/tests/test_risk_management.py

30-30: Use of assert detected

(S101)

31-31: Use of assert detected

(S101)

31-31: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

50-50: Use of assert detected

(S101)

51-51: Use of assert detected

(S101)

51-51: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

74-74: Use of assert detected

(S101)

85-85: Use of assert detected

(S101)

86-86: Use of assert detected

(S101)

89-89: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

105-105: Use of assert detected

(S101)

106-106: Use of assert detected

(S101)

107-107: Use of assert detected

(S101)

107-107: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

110-110: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

127-127: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

148-148: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

170-170: Use of assert detected

(S101)

213-213: Use of assert detected

(S101)

214-214: Use of assert detected

(S101)

257-257: Use of assert detected

(S101)

258-258: Use of assert detected

(S101)

301-301: Use of assert detected

(S101)

304-304: Use of assert detected

(S101)

343-343: Use of assert detected

(S101)

344-344: Use of assert detected

(S101)

359-359: Use of assert detected

(S101)

360-360: Use of assert detected

(S101)

362-362: Use of assert detected

(S101)

363-363: Use of assert detected

(S101)

364-364: Use of assert detected

(S101)

381-381: Use of assert detected

(S101)

382-382: Use of assert detected

(S101)

385-385: Use of assert detected

(S101)

402-402: Use of assert detected

(S101)

403-403: Use of assert detected

(S101)

430-430: Use of assert detected

(S101)

430-430: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

431-431: Use of assert detected

(S101)

431-431: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

432-432: Use of assert detected

(S101)

432-432: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

436-436: Use of assert detected

(S101)

436-436: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

467-467: Use of assert detected

(S101)

467-467: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

468-468: Use of assert detected

(S101)

469-469: Use of assert detected

(S101)

470-470: Use of assert detected

(S101)

497-497: Use of assert detected

(S101)

498-498: Use of assert detected

(S101)

498-498: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

525-525: Use of assert detected

(S101)

525-525: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

527-527: Use of assert detected

(S101)

561-561: Use of assert detected

(S101)

561-561: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

571-571: Use of assert detected

(S101)

571-571: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

575-575: Use of assert detected

(S101)

575-575: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

611-611: Use of assert detected

(S101)

611-611: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

615-615: Use of assert detected

(S101)

619-619: Use of assert detected

(S101)

621-621: Use of assert detected

(S101)

622-622: Use of assert detected

(S101)

623-623: Use of assert detected

(S101)

624-624: Use of assert detected

(S101)

625-625: Use of assert detected

(S101)

625-625: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

628-628: Use of assert detected

(S101)

libraries/python/src/internal/mhsa_network.py

39-39: Unused noqa directive (non-enabled: ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/prediction.py

23-23: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

40-40: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/tft_model.py

124-124: Unused noqa directive (non-enabled: E501, ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_portfolio.py

39-39: Use of assert detected

(S101)

libraries/python/tests/test_prediction.py

19-19: Use of assert detected

(S101)

79-79: Use of assert detected

(S101)

143-143: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run Python tests

🔇 Additional comments (9)

libraries/python/src/internal/tft_model.py (1)

21-21: Doc comment alignment looks good.

Updating output_size to “daily return” matches the dataset changes.

libraries/python/tests/test_prediction.py (1)

7-20: Tests cover the happy path well.

Schema, shape, and types validated as expected.

libraries/python/tests/test_portfolio.py (1)

7-40: Solid baseline validation.

Balanced sides, unique tickers, and coercion paths are exercised.

maskfile.md (1)

318-321: Good fix on wait time.

Replacing sleep 0 with sleep 10 prevents races during stack teardown.

libraries/python/src/internal/tft_dataset.py (1)

169-174: Daily return derivation is correct.

pct_change over “ticker” with null-drop keeps sequences consistent.
libraries/python/tests/test_tft_dataset.py (1)
46-47: Potential timestamp unit mismatch in TFTDataset may break ordering/windows.

TFTDataset.fill logic mixes ms and s epochs when filling null timestamps, which can misorder rows. Please standardize epoch units inside TFTDataset before slicing batches.

Apply in internal/tft_dataset.py (outside this file):
-        data = data.with_columns(
-            pl.col("timestamp")
-            .cast(pl.Datetime)
-            .dt.replace_time_zone("UTC")
-            .cast(pl.Int64)
-            .floordiv(1000)
-        )
+        # normalize all timestamps to seconds since epoch
+        data = data.with_columns(
+            pl.when(pl.col("timestamp") > 10**12)   # ms -> s
+            .then(pl.col("timestamp").floordiv(1000))
+            .otherwise(pl.col("timestamp"))
+            .cast(pl.Int64)
+        )
libraries/python/src/internal/equity_bar.py (1)
35-43: Confirm whether zero volume should be allowed.

Schema permits 0 via greater_than_or_equal_to(0), but tests in snippets still reject zero volume. Align policy and tests.

If zero is not allowed:
-        "volume": pa.Column(
-            dtype=int,
-            checks=[pa.Check.greater_than_or_equal_to(0)],
-        ),
+        "volume": pa.Column(dtype=int, nullable=False, checks=[pa.Check.greater_than(0)]),
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)

37-65: LGTM: Well-structured Alpaca integration

The get_account method properly implements rate limiting, handles API responses with appropriate casting, and validates the data using pandera schemas. The error handling and data transformation logic is sound.

74-96: LGTM: Comprehensive position schema validation

The position schema correctly validates all critical fields with appropriate constraints:

Ticker symbols are uppercase strings

Side values are restricted to LONG/SHORT

Dollar and share amounts are positive floats

coderabbitai

Actionable comments posted: 6

♻️ Duplicate comments (5)

libraries/python/src/internal/mhsa_network.py (1)

39-39: Remove unused noqa directive (RUF100).

Drop the # noqa: ERA001 from the shape comment or enable ERA001 in Ruff config. The directive is currently unused.
-        # shape: (batch, heads_count, sequence_length, head_dimension)  # noqa: ERA001
+        # shape: (batch, heads_count, sequence_length, head_dimension)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (1)

373-386: Avoid broad except Exception; handle nulls directly.

Broad catches can hide real issues; summation can be expressed null-safe.

Apply:

     if filtered_positions.height == 0:
         return 0.0
 
-    try:
-        side_capital_amount = filtered_positions.select(pl.sum("dollar_amount")).item()
-        return float(side_capital_amount or 0)
-
-    except Exception:  # noqa: BLE001
-        return 0.0
+    side_capital_amount = filtered_positions.select(
+        pl.col("dollar_amount").sum()
+    ).item()
+    return float(side_capital_amount or 0.0)

libraries/python/src/internal/prediction.py (1)

45-51: Fix Polars uppercase check (use project helper or Polars string API).

Current lambda uses .upper() which is invalid for Polars Series in pandera-polars contexts; reuse the existing helper for consistency.

+from internal.portfolio import is_uppercase
@@
-        "ticker": pa.Column(
-            dtype=str,
-            checks=[
-                pa.Check(
-                    lambda s: s.upper() == s,
-                    error="Ticker must be uppercase",
-                    element_wise=True,
-                )
-            ],
-        ),
+        "ticker": pa.Column(
+            dtype=str,
+            checks=[pa.Check(is_uppercase, error="Ticker must be uppercase")],
+        ),

maskfile.md (2)

472-473: Correct PortfolioManager production port (808 → 8081).

Endpoints currently probe the wrong port.

-    test_endpoint "PortfolioManager" "http://$MANAGER_IP:808/health" "production"
-    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:808/" "production"
+    test_endpoint "PortfolioManager" "http://$MANAGER_IP:8081/health" "production"
+    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:8081/" "production"

434-449: Fix function parameters (placeholders printed literally today).

Positional args are malformed, causing incorrect output and statuses.

-test_endpoint() {
-    local name="1$"
-    local url="2$"
-    local context="3$"
+test_endpoint() {
+    local name="$1"
+    local url="$2"
+    local context="$3"

🧹 Nitpick comments (24)

libraries/python/src/internal/tft_model.py (1)
124-124: Remove unused noqa directives on a commented-out line (RUF100).

The NOTE line is commented and still carries # noqa: E501, ERA001. Either delete the line or drop the noqa suffixes.
-        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)  # noqa: E501, ERA001
+        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (3)
27-27: Remove unused noqa (RUF100).

FBT001 isn’t enabled; drop the directive.
-        is_paper: bool,  # noqa: FBT001
+        is_paper: bool,
60-65: Avoid double validation; validate in one place.

You validate positions here and again in AlpacaAccount. Keep only one (recommend keeping it in AlpacaAccount).
-        position_schema.validate(positions)
-
         return AlpacaAccount(
             cash_amount=float(cast("str", account.cash)),
             positions=positions,
         )
68-96: Consider deduping validation helpers.

is_uppercase duplicates logic in libraries/python/src/internal/portfolio.py. If import boundaries allow, reuse a single shared helper to avoid drift.
libraries/python/src/internal/portfolio.py (3)
6-9: Uppercase check helper mirrors another module.

To reduce duplication, consider centralizing is_uppercase for reuse (or import from portfoliomanager if layering permits).

12-15: Hard-coded total position count (20) may be too rigid.

If live portfolios aren’t always 20 positions, parameterize this via schema context/config or move the count check outside the schema.

78-112: Schema options: consider strictness.

If you want to forbid unexpected columns in portfolio inputs, set strict=True. If not, current coerce=True and unique=["ticker"] look fine.
-portfolio_schema = pa.DataFrameSchema(
+portfolio_schema = pa.DataFrameSchema(
     {
       ...
     },
     unique=["ticker"],
     coerce=True,
+    # strict=True,  # optionally enforce no extra columns
     checks=[
       ...
     ],
 )
libraries/python/tests/test_portfolio.py (3)
115-116: Relax brittle error-message matching.

Exact string match on the counts makes tests fragile. Prefer a stable substring.

Apply:
-with pytest.raises(SchemaError, match="Expected 10 long side positions, found: 15"):
+with pytest.raises(SchemaError, match=r"Expected\s+10\s+long\s+side\s+positions"):
151-152: Relax brittle error-message matching (amount sums).

Match on a stable phrase rather than the full sentence.

Apply:
-with pytest.raises(SchemaError, match="long and short dollar amount sums"):
+with pytest.raises(SchemaError, match=r"long\s+and\s+short\s+dollar\s+amount\s+sums"):
39-39: Ruff S101 in tests: ignore for test files.

Asserts in tests are fine. Consider a per-file-ignores rule instead of changing assertions.

Add to pyproject.toml:
[tool.ruff.lint.per-file-ignores]
"**/tests/**" = ["S101"]
applications/portfoliomanager/src/portfoliomanager/risk_management.py (6)
93-107: Cast end timestamp once to Int; avoid float comparison and repeated calls.

Safer and a touch faster.

Apply:
-    original_equity_bars_with_returns = original_equity_bars.sort(
-        ["ticker", "timestamp"]
-    )
+    original_equity_bars_with_returns = original_equity_bars.sort(
+        ["ticker", "timestamp"]
+    )
+    end_ts = int(current_timestamp.timestamp())
@@
-        ticker_bars = original_equity_bars_with_returns.filter(
+        ticker_bars = original_equity_bars_with_returns.filter(
             (pl.col("ticker") == ticker)
             & (pl.col("timestamp") >= position_timestamp)
-            & (pl.col("timestamp") <= current_timestamp.timestamp())
+            & (pl.col("timestamp") <= end_ts)
         )
130-179: Optional: validate dependency column presence.

If callers pass raw bars, log_daily_returns is missing and the sum yields None; fail fast with a clear message.

Apply near Line 130 before the join:
+    if "log_daily_returns" not in original_equity_bars_with_returns.columns:
+        raise ValueError("original_equity_bars must include 'log_daily_returns'; run add_equity_bars_returns_and_realized_volatility_columns first.")
198-201: Guard against empty predictions.

Mean/std on empty frames can produce None; return early.

Apply:
 def add_predictions_zscore_ranked_columns(predictions: pl.DataFrame) -> pl.DataFrame:
     predictions = predictions.clone()
+    if predictions.height == 0:
+        return predictions.with_columns(
+            pl.lit(0.0).alias("z_score_return"),
+            pl.lit(0.0).alias("inter_quartile_range"),
+            pl.lit(0.0).alias("composite_score"),
+            pl.lit(False).alias("pdt_locked"),
+            pl.lit(False).alias("close_position"),
+            pl.lit(False).alias("maintain_position"),
+        )
245-253: Naming: threshold is a maximum, not minimum.

minimum_inter_quartile_range reads backwards; rename for clarity.

Apply:
-    minimum_inter_quartile_range = 0.75
+    max_inter_quartile_range = 0.75
@@
-            pl.col("inter_quartile_range") > minimum_inter_quartile_range
+            pl.col("inter_quartile_range") > max_inter_quartile_range
281-297: Round monetary amounts to cents.

Avoid fractional cents in allocations.

Apply:
-        pl.lit(dollar_amount_per_long).alias("dollar_amount"),
+        pl.lit(dollar_amount_per_long).round(2).alias("dollar_amount"),
@@
-        pl.lit(dollar_amount_per_short).alias("dollar_amount"),
+        pl.lit(dollar_amount_per_short).round(2).alias("dollar_amount"),
23-24: Remove unused # noqa directives flagged by Ruff (RUF100).

These rules aren’t enabled; drop the comments to keep lint clean.

Example edits (apply similarly to all occurrences):
-        .then(True)  # noqa: FBT003
-        .otherwise(False)  # noqa: FBT003
+        .then(True)
+        .otherwise(False)
Also applies to: 42-42, 150-151, 170-171, 198-200, 286-287, 295-296
applications/portfoliomanager/tests/test_risk_management.py (2)
31-31: Clean up unused # noqa comments and consider ignoring S101 in tests.

These noqa entries (PLR2004/E501) aren’t active; remove them. If S101 triggers in tests, prefer a per-file-ignores config.

Example:
-    assert len(result) == 2  # noqa: PLR2004
+    assert len(result) == 2
And in pyproject.toml:
[tool.ruff.lint.per-file-ignores]
"**/tests/**" = ["S101"]
Also applies to: 51-51, 74-74, 85-86, 89-89, 105-108, 111-111, 128-128, 195-195, 536-538, 542-542, 573-573, 604-604, 631-631, 667-667, 677-677, 681-681, 717-731, 734-734

532-535: Stabilize time-dependent tests (optional).

Using datetime.now(tz=UTC) is fine here, but fixed timestamps improve determinism.

Apply, e.g.:
-    result = create_optimal_portfolio(
-        predictions, positions, 20000.0, datetime.now(tz=UTC)
-    )
+    fixed_now = datetime(2024, 1, 20, tzinfo=UTC)
+    result = create_optimal_portfolio(predictions, positions, 20000.0, fixed_now)
Also applies to: 569-572, 599-602, 627-629, 661-663, 710-715
libraries/python/src/internal/prediction.py (2)
10-18: Prefer n_unique over unique().len for performance and clarity.

Aggregate counts directly; avoid materializing per-ticker lists in Python.
-    grouped = data.lazyframe.group_by("ticker").agg(
-        pl.col("timestamp").unique().alias("unique_dates")
-    )
-
-    unique_dates_per_ticker = grouped.collect()["unique_dates"].to_list()
-
-    if not all(len(dates) == dates_count for dates in unique_dates_per_ticker):
-        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"  # noqa: E501
-        raise ValueError(message)
+    counts = data.lazyframe.group_by("ticker").agg(
+        pl.col("timestamp").n_unique().alias("n_unique")
+    ).collect()
+    invalid = counts.filter(pl.col("n_unique") != dates_count)
+    if invalid.height > 0:
+        raise ValueError(f"Each ticker must have exactly {dates_count} unique dates")
34-35: Remove unused noqa E501 directives.

Ruff flags these as unused; line lengths are fine now.
-        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"  # noqa: E501
+        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"
@@
-                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"  # noqa: E501
+                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"
Also applies to: 17-18
libraries/python/tests/test_prediction.py (1)
131-143: Add a test for quantile monotonicity (if you adopt the new check).

Covers the essential constraint that distributions are coherent.
 def test_prediction_schema_float_quantile_values() -> None:
@@
     validated_df = prediction_schema.validate(valid_data)
     assert validated_df.shape == (7, 5)
+
+
+def test_prediction_schema_quantiles_not_monotonic_fails() -> None:
+    data = pl.DataFrame(
+        {
+            "ticker": ["AAPL"] * 7,
+            "timestamp": [1, 2, 3, 4, 5, 6, 7],
+            "quantile_10": [100.0] * 7,
+            "quantile_50": [150.0] * 7,
+            "quantile_90": [140.0] * 7,  # lower than q50 should fail
+        }
+    )
+    with pytest.raises(SchemaError, match="Quantiles must be monotonic"):
+        prediction_schema.validate(data)
maskfile.md (3)
198-205: Use a non-zero SSH ConnectTimeout.

With 0s, every attempt fails immediately despite retries.
-    if ssh -o ConnectTimeout=0 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then
+    if ssh -o ConnectTimeout=10 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then
636-641: Show stack tasks (don’t discard stdout).

The intent is to display tasks; current redirection hides them.
-echo "️ Infrastructure Stack:"
-docker stack ps infrastructure >/dev/null || echo "Infrastructure stack not deployed"
+echo "️ Infrastructure Stack:"
+if ! docker stack ps infrastructure; then
+  echo "Infrastructure stack not deployed"
+fi
@@
-echo " Applications Stack:"
-docker stack ps applications >/dev/null || echo "Applications stack not deployed"
+echo " Applications Stack:"
+if ! docker stack ps applications; then
+  echo "Applications stack not deployed"
+fi
495-499: Actually print service list in health check.

Don’t redirect the table to /dev/null if you want operators to see it.
-        if docker service ls --format "table {{.Name}}\t{{.Replicas}}\t{{.Image}}" >/dev/null; then
-            echo " Services listed successfully"
+        if docker service ls --format "table {{.Name}}\t{{.Replicas}}\t{{.Image}}"; then
+            :
         else
             echo "  No services found or connection error"
         fi

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd654f7 and 15cc83e.

⛔ Files ignored due to path filters (2)

.flox/env/manifest.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (23)

.flox/env/manifest.toml (2 hunks)
applications/datamanager/src/datamanager/alpaca_client.py (0 hunks)
applications/datamanager/src/datamanager/s3_client.py (0 hunks)
applications/datamanager/tests/test_alpaca_client.py (0 hunks)
applications/datamanager/tests/test_s3_client.py (0 hunks)
applications/portfoliomanager/pyproject.toml (1 hunks)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1 hunks)
applications/portfoliomanager/src/portfoliomanager/risk_management.py (1 hunks)
applications/portfoliomanager/tests/test_risk_management.py (1 hunks)
infrastructure/stack.yml (1 hunks)
libraries/python/src/internal/company_information.py (1 hunks)
libraries/python/src/internal/equity_bar.py (1 hunks)
libraries/python/src/internal/mhsa_network.py (1 hunks)
libraries/python/src/internal/portfolio.py (1 hunks)
libraries/python/src/internal/prediction.py (1 hunks)
libraries/python/src/internal/tft_dataset.py (5 hunks)
libraries/python/src/internal/tft_model.py (2 hunks)
libraries/python/tests/test_company_information.py (2 hunks)
libraries/python/tests/test_equity_bar.py (0 hunks)
libraries/python/tests/test_portfolio.py (1 hunks)
libraries/python/tests/test_prediction.py (1 hunks)
libraries/python/tests/test_tft_dataset.py (6 hunks)
maskfile.md (28 hunks)

💤 Files with no reviewable changes (5)

libraries/python/tests/test_equity_bar.py
applications/datamanager/tests/test_s3_client.py
applications/datamanager/src/datamanager/s3_client.py
applications/datamanager/tests/test_alpaca_client.py
applications/datamanager/src/datamanager/alpaca_client.py

🚧 Files skipped from review as they are similar to previous changes (6)

infrastructure/stack.yml
applications/portfoliomanager/pyproject.toml
.flox/env/manifest.toml
libraries/python/src/internal/tft_dataset.py
libraries/python/src/internal/company_information.py
libraries/python/src/internal/equity_bar.py

🧰 Additional context used

🧬 Code graph analysis (5)

libraries/python/src/internal/portfolio.py (2)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

is_uppercase (68-71)

libraries/python/tests/test_equity_bar.py (1)

test_equity_bar_schema_ticker_lowercase_fails (25-40)

libraries/python/tests/test_tft_dataset.py (1)

libraries/python/src/internal/tft_dataset.py (3)

TFTDataset (26-392)

preprocess_and_set_data (32-204)

get_batches (261-339)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)

libraries/python/src/internal/portfolio.py (1)

is_uppercase (6-9)

applications/datamanager/src/datamanager/alpaca_client.py (1)

__init__ (23-41)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (1)

libraries/python/src/internal/dates.py (1)

Date (8-31)

applications/portfoliomanager/tests/test_risk_management.py (1)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)

add_equity_bars_returns_and_realized_volatility_columns (29-66)

add_positions_pdt_locked_column (7-26)

add_positions_performance_columns (69-179)

add_predictions_zscore_ranked_columns (182-201)

create_optimal_portfolio (204-303)

🪛 Ruff (0.12.2)

libraries/python/tests/test_tft_dataset.py

174-174: Use of assert detected

(S101)

177-177: Use of assert detected

(S101)

179-179: Use of assert detected

(S101)

184-184: Use of assert detected

(S101)

185-185: Use of assert detected

(S101)

235-235: Use of assert detected

(S101)

240-240: Use of assert detected

(S101)

242-242: Use of assert detected

(S101)

243-243: Use of assert detected

(S101)

244-244: Use of assert detected

(S101)

libraries/python/tests/test_company_information.py

102-102: Use of assert detected

(S101)

103-103: Use of assert detected

(S101)

115-115: Use of assert detected

(S101)

116-116: Use of assert detected

(S101)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

27-27: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/src/portfoliomanager/risk_management.py

23-23: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

24-24: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

42-42: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

150-150: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

151-151: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

170-170: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

171-171: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

198-198: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

199-199: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

200-200: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

286-286: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

287-287: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

295-295: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

296-296: Unused noqa directive (non-enabled: FBT003)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/tests/test_risk_management.py

30-30: Use of assert detected

(S101)

31-31: Use of assert detected

(S101)

31-31: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

50-50: Use of assert detected

(S101)

51-51: Use of assert detected

(S101)

51-51: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

74-74: Use of assert detected

(S101)

85-85: Use of assert detected

(S101)

86-86: Use of assert detected

(S101)

89-89: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

105-105: Use of assert detected

(S101)

106-106: Use of assert detected

(S101)

107-107: Use of assert detected

(S101)

108-108: Use of assert detected

(S101)

108-108: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

111-111: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

128-128: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

181-181: Use of assert detected

(S101)

182-182: Use of assert detected

(S101)

183-183: Use of assert detected

(S101)

184-184: Use of assert detected

(S101)

189-189: Use of assert detected

(S101)

190-190: Use of assert detected

(S101)

191-191: Use of assert detected

(S101)

192-192: Use of assert detected

(S101)

195-195: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

217-217: Use of assert detected

(S101)

259-259: Use of assert detected

(S101)

260-260: Use of assert detected

(S101)

302-302: Use of assert detected

(S101)

303-303: Use of assert detected

(S101)

345-345: Use of assert detected

(S101)

348-348: Use of assert detected

(S101)

390-390: Use of assert detected

(S101)

391-391: Use of assert detected

(S101)

449-449: Use of assert detected

(S101)

450-450: Use of assert detected

(S101)

465-465: Use of assert detected

(S101)

466-466: Use of assert detected

(S101)

468-468: Use of assert detected

(S101)

469-469: Use of assert detected

(S101)

470-470: Use of assert detected

(S101)

487-487: Use of assert detected

(S101)

488-488: Use of assert detected

(S101)

491-491: Use of assert detected

(S101)

508-508: Use of assert detected

(S101)

509-509: Use of assert detected

(S101)

536-536: Use of assert detected

(S101)

536-536: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

537-537: Use of assert detected

(S101)

537-537: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

538-538: Use of assert detected

(S101)

538-538: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

542-542: Use of assert detected

(S101)

542-542: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

573-573: Use of assert detected

(S101)

573-573: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

574-574: Use of assert detected

(S101)

575-575: Use of assert detected

(S101)

576-576: Use of assert detected

(S101)

603-603: Use of assert detected

(S101)

604-604: Use of assert detected

(S101)

604-604: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

631-631: Use of assert detected

(S101)

631-631: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

633-633: Use of assert detected

(S101)

667-667: Use of assert detected

(S101)

667-667: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

677-677: Use of assert detected

(S101)

677-677: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

681-681: Use of assert detected

(S101)

681-681: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

717-717: Use of assert detected

(S101)

717-717: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

721-721: Use of assert detected

(S101)

725-725: Use of assert detected

(S101)

727-727: Use of assert detected

(S101)

728-728: Use of assert detected

(S101)

729-729: Use of assert detected

(S101)

730-730: Use of assert detected

(S101)

731-731: Use of assert detected

(S101)

731-731: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

734-734: Use of assert detected

(S101)

libraries/python/src/internal/mhsa_network.py

39-39: Unused noqa directive (non-enabled: ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/prediction.py

17-17: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

34-34: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

libraries/python/src/internal/tft_model.py

124-124: Unused noqa directive (non-enabled: E501, ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_portfolio.py

39-39: Use of assert detected

(S101)

libraries/python/tests/test_prediction.py

19-19: Use of assert detected

(S101)

79-79: Use of assert detected

(S101)

143-143: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run Python tests
GitHub Check: Run Python quality checks

🔇 Additional comments (14)

libraries/python/src/internal/tft_model.py (1)

21-21: Doc update aligns with dataset change.

Comment switch to “daily return” is consistent with the new target. LGTM.

libraries/python/tests/test_company_information.py (2)

101-103: Tests correctly assert defaulting behavior for null sector.

Asserting “NOT AVAILABLE” matches the new schema behavior. LGTM.

114-116: Tests correctly assert defaulting behavior for null industry.

Behavior verified without relying on exceptions. LGTM.

libraries/python/tests/test_tft_dataset.py (3)

44-47: API usage updated to no-arg constructor + preprocess.

Pattern is clear and matches TFTDataset changes. LGTM.

Also applies to: 71-74, 111-114

144-186: Good coverage for daily_return creation and filtering.

Solid checks for existence, count after null-filter, and numeric finiteness.

188-245: Targets shape validation is appropriate.

Confirms [N, 1] target layout for training batches. LGTM.

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

46-53: Confirm alpaca-py Position field types
Verified that Position.qty is a str and Position.market_value is an Optional[str] in the pinned alpaca-py; casting them to float is correct.

libraries/python/src/internal/portfolio.py (1)

42-76: Efficient single-pass sum checks.

Good use of a single aggregation with null-safety; clear error message construction. LGTM.

libraries/python/tests/test_portfolio.py (1)

7-41: Good, comprehensive happy-path test for portfolio schema.

Covers shape and balanced constraints well.

applications/portfoliomanager/tests/test_risk_management.py (4)

14-31: PDT lock tests look solid.

Covers same-day, previous-day, and mixed scenarios.

149-193: Great regression test for per-ticker return partitioning.

Prevents cross-ticker bleed in pct_change/log returns.

394-451: Nice multi-ticker independence check for performance logic.

Validates maintain/close symmetry across tickers.

512-543: Portfolio construction happy-path is well covered.

Asserts side counts and near-equal capital split.

libraries/python/tests/test_prediction.py (1)

1-5: LGTM: solid coverage of schema constraints.

Happy path and key failure modes are covered; messages match schema errors.

forstmeier · 2025-09-08T13:45:38Z

@chrisaddy I'm fixing a slew of issues I discovered as part of the CodeRabbit feedback so feel free to review but don't merge this PR yet.

forstmeier · 2025-09-08T18:44:54Z

@chrisaddy this is ready for review now (all feedback addressed and checks passing). I'm going to cut a separate pull request for consolidating the pdt_locked, maintain_position and close_position columns into a single "action" column, add in pandera schemas (permanent or intermediary), and some overall cleanup so if you have thoughts on that I'll address it separately.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

libraries/python/src/internal/tft_model.py (1)

237-241: Use full quantile predictions in validation loss
In validate(), the call to quantile_loss passes only the median predictions (outputs["predictions"]) instead of the full quantile tensor, which mismatches its signature (expects shape (batch_size, output_size, num_quantiles)) and understates the loss. Mirror train() by reshaping and passing all quantiles:

 File libraries/python/src/internal/tft_model.py (around 237–241)
-            loss = quantile_loss(
-                outputs["predictions"],
-                inputs["targets"],
-                self.quantiles,
-            )
+            loss = quantile_loss(
+                outputs["quantiles"].reshape(
+                    -1, self.output_size, len(self.quantiles)
+                ),
+                inputs["targets"].reshape(-1, self.output_size),
+                self.quantiles,
+            )

♻️ Duplicate comments (8)

libraries/python/src/internal/mhsa_network.py (1)

39-39: Remove unused noqa (RUF100).

ERA001 isn’t enabled; drop the directive.

-        # shape: (batch, heads_count, sequence_length, head_dimension)  # noqa: ERA001
+        # shape: (batch, heads_count, sequence_length, head_dimension)

applications/portfoliomanager/pyproject.toml (1)

12-12: Pin alpaca-py upper bound (0.x minors can break).

Guard against future breaking changes.
-    "alpaca-py>=0.42.1",
+    "alpaca-py>=0.42.1,<0.43",

maskfile.md (3)

198-205: Fix SSH ConnectTimeout=0 (always fails immediately).

Use a sane positive timeout.

-    if ssh -o ConnectTimeout=0 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then
+    if ssh -o ConnectTimeout=10 pocketsizefund-production 'docker info -f "{{.ServerVersion}} {{.Swarm.LocalNodeState}}"' 2>/dev/null; then

434-449: Fix positional parameter expansion in test_endpoint.

Current “1$ 2$ 3$” never captures args.

-    local name="1$"
-    local url="2$"
-    local context="3$"
+    local name="$1"
+    local url="$2"
+    local context="$3"

472-473: Correct PortfolioManager production port.

Should be 8081 to match local and service mapping.

-    test_endpoint "PortfolioManager" "http://$MANAGER_IP:808/health" "production"
-    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:808/" "production"
+    test_endpoint "PortfolioManager" "http://$MANAGER_IP:8081/health" "production"
+    test_endpoint "PortfolioManager (root)" "http://$MANAGER_IP:8081/" "production"

libraries/python/src/internal/prediction.py (1)

55-64: Reuse shared uppercase checker for consistency.

Align with other schemas; avoids API differences.

-        "ticker": pa.Column(
-            dtype=str,
-            checks=[
-                pa.Check(
-                    lambda s: s.upper() == s,
-                    error="Ticker must be uppercase",
-                    element_wise=True,
-                )
-            ],
-        ),
+        "ticker": pa.Column(
+            dtype=str,
+            checks=[pa.Check(is_uppercase, error="Ticker must be uppercase")],
+        ),

Add import:

@@
-import pandera.polars as pa
+import pandera.polars as pa
 import polars as pl
 from pandera.polars import PolarsData
+from internal.portfolio import is_uppercase

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

63-65: Enum-to-string handling — fixed as discussed earlier.

Casting side to str before replace(...).upper() handles Enum inputs safely. Good catch.

applications/portfoliomanager/src/portfoliomanager/risk_management.py (1)

13-26: Avoid Python row-wise lambda; use vectorized tz-aware conversion (and trade-day timezone).

The map_elements lambda is slow and compares UTC date, not the trading day in America/New_York.

-    return positions.with_columns(
-        pl.when(
-            pl.col("timestamp")
-            .cast(pl.Float64)
-            .map_elements(
-                lambda ts: datetime.fromtimestamp(ts, tz=UTC).date(),
-                return_dtype=pl.Date,
-            )
-            == current_datetime.date()
-        )
-        .then(pl.lit("PDT_LOCKED"))
-        .otherwise(pl.lit("UNSPECIFIED"))
-        .alias("action")
-    )
+    eastern = "America/New_York"
+    current_date = (
+        pl.lit(current_datetime)
+        .cast(pl.Datetime(time_zone="UTC"))
+        .dt.convert_time_zone(eastern)
+        .dt.date()
+    )
+    return positions.with_columns(
+        pl.col("timestamp")
+        .cast(pl.Float64)
+        .cast(pl.Datetime(time_zone="UTC"))
+        .dt.convert_time_zone(eastern)
+        .dt.date()
+        .eq(current_date)
+        .map_elements(lambda x: "PDT_LOCKED" if x else "UNSPECIFIED", return_dtype=pl.Utf8)
+        .alias("action")
+    )

🧹 Nitpick comments (36)

applications/portfoliomanager/pyproject.toml (1)
5-5: Consider relaxing Python pin.

Exact patch pin “==3.12.10” can block security updates; prefer a range.
-requires-python = "==3.12.10"
+requires-python = ">=3.12,<3.13"
maskfile.md (2)
15-21: Standardize redirection: use 2>&1 (or drop if unnecessary).

“>/dev/null >&1” is odd/no-op; prefer “>/dev/null 2>&1”.
-if ! command -v docker >/dev/null >&1; then
+if ! command -v docker >/dev/null 2>&1; then
@@
-if ! command -v pulumi >/dev/null >&1; then
+if ! command -v pulumi >/dev/null 2>&1; then
495-499: Detect empty service list correctly.

Current check always prints “Services listed successfully” even if none.
-        if docker service ls --format "table {{.Name}}\t{{.Replicas}}\t{{.Image}}" >/dev/null; then
-            echo " Services listed successfully"
-        else
-            echo "  No services found or connection error"
-        fi
+        if [[ -n "$(docker service ls --format '{{.Name}}')" ]]; then
+            echo " Services listed successfully"
+        else
+            echo "  No services found or connection error"
+        fi
libraries/python/src/internal/prediction.py (2)
17-17: Remove unused noqa (RUF100).

E501 isn’t enabled; drop both instances.
-        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"  # noqa: E501
+        message = f"Each ticker must have exactly {dates_count} unique dates, found: {unique_dates_per_ticker}"
@@
-                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"  # noqa: E501
+                message = f"Expected all tickers to have the same dates, mismatch between: {first_ticker_dates} and: {set(dates)}"
Also applies to: 34-34

41-47: Rename variable for clarity (it’s not lazy after collect).

Minor readability tweak.
-def check_monotonic_quantiles(data: PolarsData) -> bool:
-    lazy_frame = data.lazyframe.collect()
+def check_monotonic_quantiles(data: PolarsData) -> bool:
+    df = data.lazyframe.collect()
@@
-        not (lazy_frame["quantile_10"] <= lazy_frame["quantile_50"]).all()
-        or not (lazy_frame["quantile_50"] <= lazy_frame["quantile_90"]).all()
+        not (df["quantile_10"] <= df["quantile_50"]).all()
+        or not (df["quantile_50"] <= df["quantile_90"]).all()
libraries/python/tests/test_prediction.py (3)
1-1: Silence Ruff S101 in tests (allow assert).

Tests conventionally use bare assert. Either configure Ruff or add a file-level ignore.
+# ruff: noqa: S101
 from datetime import UTC, datetime, timedelta
62-65: Avoid fragile cell assignment in Polars.

Direct data[1, "timestamp"] = -1.0 can be version-dependent. Use a vectorized update.
-    data[1, "timestamp"] = -1.0  # introduce a negative timestamp
+    # introduce a negative timestamp at row 1 robustly
+    data = (
+        data.with_row_count("row_nr")
+        .with_columns(
+            pl.when(pl.col("row_nr") == 1)
+            .then(pl.lit(-1.0))
+            .otherwise(pl.col("timestamp"))
+            .alias("timestamp")
+        )
+        .drop("row_nr")
+    )
146-163: Add a failing case for non‑monotonic quantiles.

You validate success cases but don’t assert that the monotonic constraint is enforced.
@@
 def test_prediction_schema_float_quantile_values() -> None:
@@
     validated_df = prediction_schema.validate(valid_data)
     assert validated_df.shape == (7, 5)
+
+
+def test_prediction_schema_non_monotonic_quantiles_fails() -> None:
+    base_date = datetime(2024, 1, 1, tzinfo=UTC)
+    bad = pl.DataFrame(
+        {
+            "ticker": ["AAPL"] * 7,
+            "timestamp": [(base_date + timedelta(days=i)).timestamp() for i in range(7)],
+            "quantile_10": [100.0] * 7,
+            "quantile_50": [150.0] * 7,
+            "quantile_90": [200.0] * 7,
+        }
+    )
+    # break monotonicity on one row
+    bad = bad.with_row_count("row_nr").with_columns(
+        pl.when(pl.col("row_nr") == 3).then(pl.lit(120.0)).otherwise(pl.col("quantile_10")).alias("quantile_10")
+    ).drop("row_nr")
+    with pytest.raises(SchemaError):
+        prediction_schema.validate(bad)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (4)
27-27: Remove unused noqa directive.

Ruff reports RUF100; the FBT001 rule isn’t enabled. Drop the inline suppression.
-        is_paper: bool,  # noqa: FBT001
+        is_paper: bool,
75-76: Avoid double validation.

You validate positions here and again in AlpacaAccount.__init__. Keep it in one place (constructor) to reduce overhead and centralize invariants.
-        position_schema.validate(positions)
-
         return AlpacaAccount(
             cash_amount=float(cast("str", account.cash)),
             positions=positions,
         )
83-86: DRY the uppercase check helper.

is_uppercase duplicates internal.portfolio.is_uppercase. Importing the shared helper keeps behavior consistent across schemas.

Example:
-from typing import cast
+from typing import cast
@@
-import pandera.polars as pa
+import pandera.polars as pa
 import polars as pl
 from alpaca.trading import Position, TradeAccount, TradingClient
+from internal.portfolio import is_uppercase
@@
-def is_uppercase(data: pa.PolarsData) -> pl.LazyFrame:
-    return data.lazyframe.select(
-        pl.col(data.key).str.to_uppercase() == pl.col(data.key)
-    )
37-44: Optional: wrap Alpaca calls with targeted exception handling.

Transient API errors will currently bubble up untyped. Consider catching alpaca.common.exceptions.APIError (and similar) to enrich context and preserve retry points.
libraries/python/src/internal/equity_bar.py (4)
8-14: Use vectorized uppercase check; disallow null tickers.

Element-wise lambdas are slower and fragile with nulls. Prefer a vectorized check and make the column non-nullable.
-        "ticker": pa.Column(
-            dtype=str,
-            checks=[
-                pa.Check(
-                    lambda s: s.upper() == s,
-                    error="Ticker must be uppercase",
-                    element_wise=True,
-                )
-            ],
-        ),
+        "ticker": pa.Column(
+            dtype=str,
+            nullable=False,
+            checks=[
+                pa.Check(
+                    lambda s: s.str.to_uppercase() == s,
+                    error="Ticker must be uppercase",
+                )
+            ],
+        ),
16-19: Prefer integer timestamps to avoid float precision pitfalls.

You currently coerce to Float64. Using integer epoch (ms) is safer for uniqueness and joins.
-        "timestamp": pa.Column(
-            dtype=pl.Float64,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "timestamp": pa.Column(
+            dtype=int,
+            checks=[pa.Check.greater_than(0)],
+        ),
20-44: Require non-null numeric fields.

Core OHLCV fields should be present; allowing nulls invites downstream errors.
-        "open_price": pa.Column(
-            dtype=float,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "open_price": pa.Column(dtype=float, nullable=False, checks=[pa.Check.greater_than(0)]),
-        "high_price": pa.Column(
-            dtype=float,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "high_price": pa.Column(dtype=float, nullable=False, checks=[pa.Check.greater_than(0)]),
-        "low_price": pa.Column(
-            dtype=float,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "low_price": pa.Column(dtype=float, nullable=False, checks=[pa.Check.greater_than(0)]),
-        "close_price": pa.Column(
-            dtype=float,
-            checks=[pa.Check.greater_than(0)],
-        ),
+        "close_price": pa.Column(dtype=float, nullable=False, checks=[pa.Check.greater_than(0)]),
-        "volume": pa.Column(
-            dtype=int,
-            checks=[pa.Check.greater_than_or_equal_to(0)],
-        ),
+        "volume": pa.Column(dtype=int, nullable=False, checks=[pa.Check.greater_than_or_equal_to(0)]),
45-51: Nit: fix comment typo; consider adding OHLC consistency checks.

“partion” → “partition”.

Optional: add schema-level checks to enforce high >= max(open, close) and low <= min(open, close).
-    strict="filter",  # allows DuckDB partion columns
+    strict="filter",  # allows DuckDB partition columns
Example OHLC checks (add alongside unique=):
checks=[
    pa.Check(lambda df: df.lazyframe.select(
        (pl.col("high_price") >= pl.max_horizontal(pl.col("open_price"), pl.col("close_price"))).all()
    ), error="high_price must be >= open/close"),
    pa.Check(lambda df: df.lazyframe.select(
        (pl.col("low_price") <= pl.min_horizontal(pl.col("open_price"), pl.col("close_price"))).all()
    ), error="low_price must be <= open/close"),
],
libraries/python/tests/test_equity_bar.py (2)
29-36: Float timestamps: good alignment; make first test consistent.

Nice switch to float timestamps throughout. For consistency, update the remaining int timestamp in test_equity_bar_schema_valid_data to float.
-            "timestamp": [1674000000],
+            "timestamp": [1674000000.0],
Also applies to: 47-54, 65-72, 83-90, 110-117, 130-137, 148-155, 166-173, 219-236

21-23: Consider adding a uniqueness test for (ticker, timestamp).

Schema now enforces uniqueness; add a quick test that duplicates a row and expects SchemaError to catch regressions.
applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)
36-45: Count non-null prices when enforcing minimum bars.

Using total row count can pass tickers with many null close_price values. Count valid prices instead.
-    ticker_counts = equity_bars.group_by("ticker").agg(pl.len().alias("count"))
-    insufficient_tickers = ticker_counts.filter(
-        pl.col("count") < minimum_bars_per_ticker_required
-    )
+    ticker_counts = equity_bars.group_by("ticker").agg(
+        pl.col("close_price").drop_nulls().count().alias("valid_price_count")
+    )
+    insufficient_tickers = ticker_counts.filter(
+        pl.col("valid_price_count") < minimum_bars_per_ticker_required
+    )
41-44: Remove unused noqa.

The E501 noqa is flagged as unused by Ruff; drop it.
-        message = f"Tickers with insufficient data (< {minimum_bars_per_ticker_required} rows): {insufficient_list}"  # noqa: E501
+        message = f"Tickers with insufficient data (< {minimum_bars_per_ticker_required} rows): {insufficient_list}"
91-101: Guard against missing log_daily_returns in input bars.

This function assumes original_equity_bars has log_daily_returns; add a fallback computation to avoid runtime errors when callers pass raw bars.
-    original_equity_bars_with_returns = original_equity_bars.sort(
-        ["ticker", "timestamp"]
-    )
+    bars = original_equity_bars.sort(["ticker", "timestamp"])
+    if "log_daily_returns" not in bars.columns:
+        dr = pl.col("close_price").pct_change().over("ticker")
+        bars = bars.with_columns(
+            pl.when(pl.col("close_price").is_not_null())
+            .then(pl.when((dr + 1) > 0).then((dr + 1).log()).otherwise(None))
+            .otherwise(None)
+            .alias("log_daily_returns")
+        )
@@
-        ticker_bars = original_equity_bars_with_returns.filter(
+        ticker_bars = bars.filter(
Also applies to: 107-113

345-350: Narrow exception handling in _filter_side_capital_amount.

Catching Exception hides real bugs; scope it to expected failures.
-    except Exception:  # noqa: BLE001
+    except (TypeError, ValueError):
         return 0.0
263-271: Selection policy depends on input order; consider explicit sorting.

head/tail assume predictions are pre-sorted. If that’s intentional, document it; otherwise sort by composite_score descending first.
-    available_predictions = predictions.filter(
-        ~pl.col("ticker").is_in(excluded_tickers)
-    )
+    available_predictions = (
+        predictions.filter(~pl.col("ticker").is_in(excluded_tickers))
+        .sort("composite_score", descending=True)
+    )
Also applies to: 272-281
applications/portfoliomanager/tests/test_risk_management.py (4)
14-33: Tests assume UTC trading day; update if switching to America/New_York.

If you adopt the tz-aware conversion above, set current_datetime to an Eastern-local datetime whose date matches the intended trading day, or compare against a derived current_date to keep tests deterministic.

170-189: Assert realized_volatility resets per ticker boundary.

After fixing realized_volatility grouping, add assertions that the first window within each ticker is None and does not use prior ticker data.

87-112: Remove unused noqa directives and rely on test-friendly Ruff config.

Multiple “unused noqa” (E501, PLR2004) are flagged. Either enable those rules or remove the comments; also consider per-file-ignores for S101 in tests instead of changing assertions.

Example cleanup:
-assert len(result) == 35  # noqa: PLR2004
+assert len(result) == 35
And drop unused “# noqa: E501” where present.

Also applies to: 126-143, 191-216, 523-535, 563-569, 592-594, 619-622, 659-674, 714-732

504-533: Stabilize time-dependent tests.

Using datetime.now(tz=UTC) is fine here since you don't assert on it, but consider freezing the timestamp for reproducibility in case future checks rely on it.
libraries/python/src/internal/tft_model.py (2)
124-124: Remove unused noqa/commented code (RUF100)

This dead, commented-out line with a noqa trips Ruff. Simply delete it.
-        # NOTE: static_continuous_features = Tensor.zeros(self.batch_size, 1, 0)  # noqa: E501, ERA001
113-121: Derive batch size at runtime to avoid shape coupling with sequence length

Several places assume self.batch_size == input_length. Compute batch_size from inputs (e.g., encoder_categorical_features.shape[0]) and use that for zeros/views/expands. Prevents subtle shape bugs as soon as batch>1 or sequence lengths vary.

Also applies to: 132-135, 144-151, 162-170, 176-181
libraries/python/tests/test_portfolio.py (2)
1-1: Silence Ruff S101 for pytest asserts (file-level)

Allow pytest-style asserts without noise.
+ # ruff: noqa: S101  # allow pytest-style asserts in tests
 from datetime import UTC, datetime
34-36: Prefer integer epoch seconds for timestamp fields

Schema likely expects integer-like values; int(...) avoids float artifacts.
-            "timestamp": [datetime(2025, 1, 1, 0, 0, 0, 0, tzinfo=UTC).timestamp()] * 20,
+            "timestamp": [int(datetime(2025, 1, 1, 0, 0, 0, 0, tzinfo=UTC).timestamp())] * 20,
Apply similarly in the other tests in this module.
libraries/python/tests/test_tft_dataset.py (1)
1-2: Silence Ruff S101 for pytest asserts (file-level)

Keep tests idiomatic with pytest asserts.
+ # ruff: noqa: S101  # allow pytest-style asserts in tests
 import math
libraries/python/src/internal/portfolio.py (2)
6-9: Consider consolidating duplicate is_uppercase helper functions.

There's an identical is_uppercase function in applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (lines 82-85). Consider extracting this shared validation logic to a common utilities module to avoid code duplication and ensure consistency.

78-112: Consider extracting default parameter values as module constants.

The hard-coded default values for total_positions_count=20 and maximum_imbalance_percentage=0.05 in the DataFrame-level checks could be extracted as module-level constants for better maintainability and reusability.
+DEFAULT_TOTAL_POSITIONS_COUNT = 20  # 10 long and 10 short
+DEFAULT_MAX_IMBALANCE_PERCENTAGE = 0.05  # 5%
+
 def check_position_side_counts(
     data: PolarsData,
-    total_positions_count: int = 20,  # 10 long and 10 short
+    total_positions_count: int = DEFAULT_TOTAL_POSITIONS_COUNT,
 ) -> bool:

 def check_position_side_sums(
     data: PolarsData,
-    maximum_imbalance_percentage: float = 0.05,  # 5%
+    maximum_imbalance_percentage: float = DEFAULT_MAX_IMBALANCE_PERCENTAGE,
 ) -> bool:
libraries/python/src/internal/tft_dataset.py (2)
341-392: Consider adding validation for prediction array shape.

The function assumes predictions has shape (batch_size, output_length, 1, 3) for 3 quantiles. Consider adding a shape validation at the beginning of the function to fail fast with a clear error message if the shape doesn't match expectations.
 def postprocess_predictions(
     self,
     input_batch: Tensor,  # static_categorical_features
     predictions: Tensor,  # quantiles
     current_datetime: datetime,
 ) -> pl.DataFrame:
     predictions_array = predictions.numpy()
 
     batch_size, output_length, _, _ = predictions_array.shape
+    
+    # Validate predictions shape
+    if predictions_array.shape[2] != 1 or predictions_array.shape[3] != 3:
+        raise ValueError(
+            f"Expected predictions shape (batch_size, output_length, 1, 3), "
+            f"got {predictions_array.shape}"
+        )
361-370: Consider using timezone-aware datetime for consistency.

The function creates timestamps without explicit timezone handling. Since the preprocessing step converts timestamps to UTC (line 144), consider making the datetime operations timezone-aware for consistency.
 for time_idx in range(output_length):
     timestamp = int(
         (current_datetime + timedelta(days=time_idx))
         .replace(
             hour=0,
             minute=0,
             second=0,
             microsecond=0,
+            tzinfo=current_datetime.tzinfo or timezone.utc,
         )
         .timestamp()
     )
Note: You'll need to import timezone from the datetime module.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89695a9 and 1ec02bd.

⛔ Files ignored due to path filters (2)

.flox/env/manifest.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (23)

.flox/env/manifest.toml (2 hunks)
applications/datamanager/src/datamanager/alpaca_client.py (0 hunks)
applications/datamanager/src/datamanager/s3_client.py (0 hunks)
applications/datamanager/tests/test_alpaca_client.py (0 hunks)
applications/datamanager/tests/test_s3_client.py (0 hunks)
applications/portfoliomanager/pyproject.toml (1 hunks)
applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1 hunks)
applications/portfoliomanager/src/portfoliomanager/risk_management.py (1 hunks)
applications/portfoliomanager/tests/test_risk_management.py (1 hunks)
infrastructure/stack.yml (1 hunks)
libraries/python/src/internal/company_information.py (1 hunks)
libraries/python/src/internal/equity_bar.py (1 hunks)
libraries/python/src/internal/mhsa_network.py (1 hunks)
libraries/python/src/internal/portfolio.py (1 hunks)
libraries/python/src/internal/prediction.py (1 hunks)
libraries/python/src/internal/tft_dataset.py (5 hunks)
libraries/python/src/internal/tft_model.py (2 hunks)
libraries/python/tests/test_company_information.py (2 hunks)
libraries/python/tests/test_equity_bar.py (12 hunks)
libraries/python/tests/test_portfolio.py (1 hunks)
libraries/python/tests/test_prediction.py (1 hunks)
libraries/python/tests/test_tft_dataset.py (6 hunks)
maskfile.md (28 hunks)

💤 Files with no reviewable changes (4)

applications/datamanager/tests/test_s3_client.py
applications/datamanager/tests/test_alpaca_client.py
applications/datamanager/src/datamanager/alpaca_client.py
applications/datamanager/src/datamanager/s3_client.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-09-08T01:38:43.412Z

Learnt from: forstmeier
PR: pocketsizefund/pocketsizefund#625
File: applications/portfoliomanager/src/portfoliomanager/alpaca_client.py:6-6
Timestamp: 2025-09-08T01:38:43.412Z
Learning: In alpaca-py, both flat imports (from alpaca.trading import TradingClient, TradeAccount, Position) and modular imports (from alpaca.trading.client import TradingClient; from alpaca.trading.models import TradeAccount, Position) are valid. The account model class is TradeAccount, not Account.

Applied to files:

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

🧬 Code graph analysis (5)

libraries/python/src/internal/portfolio.py (1)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

is_uppercase (83-86)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (2)

libraries/python/src/internal/portfolio.py (1)

is_uppercase (6-9)

applications/datamanager/src/datamanager/alpaca_client.py (2)

fetch_latest_data (64-170)

__init__ (23-41)

applications/portfoliomanager/tests/test_risk_management.py (1)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (5)

add_equity_bars_returns_and_realized_volatility_columns (29-64)

add_positions_action_column (7-26)

add_positions_performance_columns (67-177)

add_predictions_zscore_ranked_columns (180-201)

create_optimal_portfolio (204-303)

applications/portfoliomanager/src/portfoliomanager/risk_management.py (1)

applications/portfoliomanager/src/portfoliomanager/preprocess.py (1)

filter_equity_bars (4-21)

libraries/python/tests/test_tft_dataset.py (1)

libraries/python/src/internal/tft_dataset.py (3)

TFTDataset (26-392)

preprocess_and_set_data (32-204)

get_batches (261-339)

🪛 Ruff (0.12.2)

libraries/python/src/internal/mhsa_network.py

39-39: Unused noqa directive (non-enabled: ERA001)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_portfolio.py

42-42: Use of assert detected

(S101)

libraries/python/tests/test_company_information.py

102-102: Use of assert detected

(S101)

103-103: Use of assert detected

(S101)

115-115: Use of assert detected

(S101)

116-116: Use of assert detected

(S101)

libraries/python/src/internal/prediction.py

17-17: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

34-34: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_equity_bar.py

177-177: Use of assert detected

(S101)

libraries/python/src/internal/tft_model.py

124-124: Unused noqa directive (non-enabled: E501, ERA001)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

27-27: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)

applications/portfoliomanager/tests/test_risk_management.py

30-30: Use of assert detected

(S101)

31-31: Use of assert detected

(S101)

31-31: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

50-50: Use of assert detected

(S101)

51-51: Use of assert detected

(S101)

51-51: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

72-72: Use of assert detected

(S101)

83-83: Use of assert detected

(S101)

84-84: Use of assert detected

(S101)

87-87: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

103-103: Use of assert detected

(S101)

104-104: Use of assert detected

(S101)

105-105: Use of assert detected

(S101)

106-106: Use of assert detected

(S101)

106-106: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

109-109: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

126-126: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

177-177: Use of assert detected

(S101)

178-178: Use of assert detected

(S101)

179-179: Use of assert detected

(S101)

180-180: Use of assert detected

(S101)

185-185: Use of assert detected

(S101)

186-186: Use of assert detected

(S101)

187-187: Use of assert detected

(S101)

188-188: Use of assert detected

(S101)

191-191: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

213-213: Use of assert detected

(S101)

255-255: Use of assert detected

(S101)

297-297: Use of assert detected

(S101)

339-339: Use of assert detected

(S101)

383-383: Use of assert detected

(S101)

441-441: Use of assert detected

(S101)

442-442: Use of assert detected

(S101)

457-457: Use of assert detected

(S101)

458-458: Use of assert detected

(S101)

460-460: Use of assert detected

(S101)

461-461: Use of assert detected

(S101)

462-462: Use of assert detected

(S101)

479-479: Use of assert detected

(S101)

480-480: Use of assert detected

(S101)

483-483: Use of assert detected

(S101)

500-500: Use of assert detected

(S101)

501-501: Use of assert detected

(S101)

527-527: Use of assert detected

(S101)

527-527: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

528-528: Use of assert detected

(S101)

528-528: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

529-529: Use of assert detected

(S101)

529-529: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

533-533: Use of assert detected

(S101)

533-533: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

563-563: Use of assert detected

(S101)

563-563: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

564-564: Use of assert detected

(S101)

565-565: Use of assert detected

(S101)

566-566: Use of assert detected

(S101)

592-592: Use of assert detected

(S101)

593-593: Use of assert detected

(S101)

593-593: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

619-619: Use of assert detected

(S101)

619-619: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

621-621: Use of assert detected

(S101)

659-659: Use of assert detected

(S101)

659-659: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

669-669: Use of assert detected

(S101)

669-669: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

673-673: Use of assert detected

(S101)

673-673: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

714-714: Use of assert detected

(S101)

714-714: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

718-718: Use of assert detected

(S101)

722-722: Use of assert detected

(S101)

724-724: Use of assert detected

(S101)

725-725: Use of assert detected

(S101)

726-726: Use of assert detected

(S101)

727-727: Use of assert detected

(S101)

728-728: Use of assert detected

(S101)

728-728: Unused noqa directive (non-enabled: PLR2004)

Remove unused noqa directive

(RUF100)

731-731: Use of assert detected

(S101)

libraries/python/tests/test_prediction.py

25-25: Use of assert detected

(S101)

101-101: Use of assert detected

(S101)

162-162: Use of assert detected

(S101)

applications/portfoliomanager/src/portfoliomanager/risk_management.py

43-43: Unused noqa directive (non-enabled: E501)

Remove unused noqa directive

(RUF100)

libraries/python/tests/test_tft_dataset.py

174-174: Use of assert detected

(S101)

177-177: Use of assert detected

(S101)

179-179: Use of assert detected

(S101)

184-184: Use of assert detected

(S101)

185-185: Use of assert detected

(S101)

235-235: Use of assert detected

(S101)

240-240: Use of assert detected

(S101)

242-242: Use of assert detected

(S101)

243-243: Use of assert detected

(S101)

244-244: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run Python tests

🔇 Additional comments (20)

infrastructure/stack.yml (1)

143-143: LGTM on trailing newline.

.flox/env/manifest.toml (1)

17-17: Confirm mask package availability
Mask entry (mask.pkg-path = "mask") looks good; manually verify that the mask package exists in your nixpkgs/flox channels (lines 17 & 26) to avoid env resolution failures.

libraries/python/src/internal/company_information.py (1)

7-27: LGTM: defaults + checks align with tests.

Defaults are uppercase and checks enforce uppercase/stripped.

libraries/python/tests/test_company_information.py (1)

101-116: LGTM: tests reflect defaulting behavior.

libraries/python/src/internal/prediction.py (1)

65-68: Confirm timestamp dtype choice.

Epochs are typically Int64; if floats aren’t required, prefer pl.Int64.

applications/portfoliomanager/src/portfoliomanager/alpaca_client.py (1)

45-59: Typed empty positions frame — nice.

Returning a typed empty DataFrame when there are no positions avoids downstream schema errors. Good defensive design.

libraries/python/src/internal/tft_model.py (1)

21-21: Comment alignment LGTM

output_size=1 with “daily return” matches the dataset target shift.

libraries/python/tests/test_portfolio.py (2)

9-43: Good positive-case coverage

Balanced portfolio happy-path validates shape and schema as expected.

119-121: Nice: assert on explicit error message improves diagnosability

The targeted message match for unbalanced sides is helpful.

libraries/python/tests/test_tft_dataset.py (4)

44-47: API usage LGTM

Using no-arg constructor + preprocess_and_set_data(data) aligns with the refactor.

Also applies to: 71-74

138-142: Batch shapes validated appropriately

Predict-batch shapes assertions are clear and sufficient.

174-186: daily_return column/rows checks are on point

Presence, row count after first-row filter, and finiteness checks look solid.

235-245: Target tensor shape checks read well

Targets are 2D with single output as intended.

libraries/python/src/internal/portfolio.py (2)

12-39: LGTM! Efficient aggregation and clear error handling.

The implementation correctly performs a single aggregation to compute all counts, avoiding multiple collects. The error messages are specific and helpful for debugging validation failures.

42-75: LGTM! Robust handling of edge cases.

The implementation correctly:

Aggregates both sums in a single pass

Handles potential null values with fill_null(0.0)

Guards against division by zero with the total_sum <= 0.0 check

Provides clear error messages with actual values for debugging

libraries/python/src/internal/tft_dataset.py (5)

1-1: LGTM! Necessary imports for datetime operations.

The datetime imports are required for the new postprocessing functionality.

27-31: Good API design change to decouple initialization from data processing.

The change from passing data in the constructor to using a separate preprocess_and_set_data method provides better flexibility and clearer separation of concerns.

333-335: LGTM! Correct target update for return prediction.

The change from close_price to daily_return as the target variable aligns with the new focus on predicting returns rather than raw prices, which is a more standard approach in financial modeling.

395-442: Schema validation looks comprehensive and well-structured.

The updated schema with explicit dtypes and the addition of daily_return properly validates all the data transformations performed in the preprocessing step.

169-174: Verify drop of single-row ticker groups
No tests cover tickers with only one data point, so the filter(pl.col("daily_return").is_not_null()) will remove entire groups of length 1. Confirm this is intended behavior or add handling (e.g. fill, flag, or retain singletons) and include a test for that edge case.

forstmeier · 2025-09-12T15:53:33Z

@chrisaddy this is ready for review.

Copilot AI review requested due to automatic review settings September 7, 2025 01:53

graphite-app Bot assigned forstmeier Sep 7, 2025

forstmeier requested a review from chrisaddy September 7, 2025 01:53

forstmeier added application labels Sep 7, 2025

forstmeier added this to the Refactor milestone Sep 7, 2025

Copilot AI reviewed Sep 7, 2025

View reviewed changes

coderabbitai Bot reviewed Sep 7, 2025

View reviewed changes

forstmeier force-pushed the 09-06-build_initial_risk_management_resources branch from 13825cb to fd654f7 Compare September 7, 2025 17:30

coderabbitai Bot reviewed Sep 7, 2025

View reviewed changes

forstmeier force-pushed the 09-06-build_initial_risk_management_resources branch from fd654f7 to 15cc83e Compare September 8, 2025 01:16

coderabbitai Bot reviewed Sep 8, 2025

View reviewed changes

forstmeier force-pushed the 09-06-build_initial_risk_management_resources branch from 15cc83e to 3f49657 Compare September 8, 2025 15:20

Build initial risk management resources

1ec02bd

forstmeier force-pushed the 09-06-build_initial_risk_management_resources branch from 3f49657 to 1ec02bd Compare September 8, 2025 19:46

coderabbitai Bot reviewed Sep 8, 2025

View reviewed changes

Comment thread applications/portfoliomanager/src/portfoliomanager/alpaca_client.py

Comment thread applications/portfoliomanager/src/portfoliomanager/risk_management.py

forstmeier mentioned this pull request Sep 16, 2025

Datamanager predictions handlers #626

Merged

Fix merge conflicts with master branch

e91c4e7

forstmeier merged commit 3fddac8 into master Sep 16, 2025
4 checks passed

forstmeier deleted the 09-06-build_initial_risk_management_resources branch September 16, 2025 17:34

coderabbitai Bot mentioned this pull request Dec 7, 2025

Model rebuild #637

Merged

coderabbitai Bot mentioned this pull request Jan 21, 2026

portfolio fixes #669

Merged

coderabbitai Bot mentioned this pull request Mar 5, 2026

Build initial risk/data management functionality for statistical arbitrage #785

Merged

coderabbitai Bot mentioned this pull request Mar 25, 2026

Unify all float/datetime numeric value handling #807

Merged

Conversation

forstmeier commented Sep 7, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Changes

Comments

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Sep 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

forstmeier commented Sep 7, 2025

Uh oh!

graphite-app Bot commented Sep 7, 2025

Graphite Automations

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Sep 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

forstmeier Sep 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Sep 7, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

forstmeier commented Sep 7, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Sep 7, 2025 •

edited

Loading

coderabbitai Bot Sep 7, 2025 •

edited

Loading