Skip to content

Update tools package structure#763

Merged
forstmeier merged 4 commits intomasterfrom
update-tools-package-structure
Feb 24, 2026
Merged

Update tools package structure#763
forstmeier merged 4 commits intomasterfrom
update-tools-package-structure

Conversation

@forstmeier
Copy link
Copy Markdown
Collaborator

@forstmeier forstmeier commented Feb 22, 2026

Overview

Changes

  • restructure tools/ to include src/ as to align with uv specifications
  • add back unit tests (temporarily excluded from coverage)
  • update maskfile.md commands with uv package references

Context

Another tiny bit of cleanup. I bet most of the comments from the bots can be ignored because this is basically just a move.

Summary by CodeRabbit

  • New Features

    • Added equity details data synchronization capability with improved command options for data retrieval.
  • Documentation

    • Updated command signatures and added OPTIONS documentation for improved clarity on available parameters and flags.
  • Tests

    • Added comprehensive test coverage for model artifact downloads, training data preparation, and data synchronization functions.
  • Chores

    • Updated dependency constraints and added Polars library; refined error handling and artifact path management.

@forstmeier forstmeier self-assigned this Feb 22, 2026
@forstmeier forstmeier added python Python code updates markdown Markdown code updates labels Feb 22, 2026
@github-project-automation github-project-automation Bot moved this to To Do in Overview Feb 22, 2026
@forstmeier forstmeier moved this from To Do to In Progress in Overview Feb 22, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 22, 2026

📝 Walkthrough

Walkthrough

This change reorganizes tools into a proper Python package structure with UV, updates documentation for the invoke command with optional flags, refactors error handling in data sync modules, adds a new equity-details sync module, updates artifact and training paths, and introduces comprehensive test coverage for multiple tool modules.

Changes

Cohort / File(s) Summary
Documentation & Root Configuration
maskfile.md, pyproject.toml
Updated invoke command signature from required date_range to optional FLAGS (date_range and data_type); added test discovery path and coverage omit for tools; added polars dependency and tightened sagemaker constraint.
Tools Package Structure
tools/pyproject.toml, tools/src/tools/**
Added UV package configuration (package = true, src = ["src"]); removed massive dependency; refactored error handling in sync_equity_bars_data.py to use RuntimeError instead of logging; added new sync_equity_details_data.py module with HTTP POST sync functionality; updated download_model_artifacts.py path parsing to require 3 path components; removed docstring from prepare_training_data.py.
Test Coverage
tools/tests/test_download_model_artifacts.py, tools/tests/test_prepare_training_data.py, tools/tests/test_run_training_job.py, tools/tests/test_sync_equity_bars_data.py, tools/tests/test_sync_equity_details_data.py
Added comprehensive test suites covering model artifact download, training data preparation, training job execution, and equity data synchronization with mocked S3/HTTP clients and extensive assertion coverage.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 62.96% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Update tools package structure' accurately reflects the main objective of the PR—restructuring the tools directory to align with uv packaging specifications by introducing a src/ directory.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch update-tools-package-structure

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Feb 22, 2026

Greptile Summary

Restructured tools/ package to align with uv specifications by introducing a src/tools/ directory layout. Updated all command invocations in maskfile.md to use module form (python -m tools.module_name) instead of direct script paths. Re-added unit tests that were temporarily excluded and configured coverage to explicitly omit the tools package.

  • Moved all Python scripts from tools/ to tools/src/tools/ (5 scripts)
  • Updated tools/pyproject.toml to declare package structure with package = true and src = ["src"]
  • Removed unused massive dependency from tools
  • Added polars dependency to root workspace (already in use by prepare_training_data.py)
  • Added version constraint <3 to sagemaker dependency in root
  • Updated 7 mask commands to invoke scripts using module syntax
  • Re-added 5 test files with correct import paths
  • Updated vulture command to reference new whitelist path

All changes are mechanical refactoring with no logic modifications. The restructuring follows uv package conventions and maintains backward compatibility through proper __name__ == "__main__" blocks in all scripts.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • This is a pure structural refactoring with no logic changes. All Python modules were simply moved from tools/ to tools/src/tools/ to comply with uv package conventions. Command invocations were updated to use the proper module syntax (-m tools.module_name), which is the correct way to invoke packages. Dependencies were cleaned up (removed unused massive, added polars which was already in use). Tests were re-added with correct import paths. All scripts retain their __name__ == "__main__" blocks ensuring they work as modules.
  • No files require special attention

Important Files Changed

Filename Overview
maskfile.md Updated command invocations from direct script paths to module (-m) form, aligned with uv package structure
pyproject.toml Added polars dependency, sagemaker version constraint, tools/tests to testpaths, and tools/** to coverage omit
tools/pyproject.toml Configured as uv package with src layout, removed massive dependency (no longer used)
tools/src/tools/prepare_training_data.py Moved to src layout, removed module docstring

Last reviewed commit: d21ad98

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

28 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment thread pyproject.toml Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restructures the tools/ Python package to a src/ layout for uv packaging, restores/adjusts tooling tests, and expands the datamanager service to support syncing equity “details” (sector/industry) into S3 via a new POST endpoint.

Changes:

  • Restructure tools/ into tools/src/tools (uv src layout) and update invocation paths (python -m tools...), plus vulture whitelist path updates.
  • Add/restore Python unit tests for tools and adjust repo-level pytest + coverage configuration.
  • Add POST /equity-details sync flow in datamanager (fetch Massive tickers w/ pagination, filter, store CSV in S3) and update infra/workflow knobs (logging/env).

Reviewed changes

Copilot reviewed 22 out of 28 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tools/tests/test_sync_equity_details_data.py Adds unit test coverage for the equity-details sync client script.
tools/tests/test_sync_equity_bars_data.py Adds unit tests for date parsing and equity-bars sync request formatting.
tools/tests/test_run_training_job.py Adds unit test for SageMaker estimator .fit() invocation (module mocking).
tools/tests/test_prepare_training_data.py Adds unit tests for S3 reads/writes and Polars transforms in training data prep.
tools/tests/test_download_model_artifacts.py Adds unit test coverage for artifact download/extract logic.
tools/sync_equity_categories.py Removes legacy standalone categories sync script (now handled in service).
tools/src/tools/vulture_whitelist.py Adds vulture whitelist for type-only imports under new src layout.
tools/src/tools/sync_equity_details_data.py New CLI module to trigger datamanager /equity-details sync.
tools/src/tools/sync_equity_bars_data.py Adjusts error handling/messages for date-range parsing; retains bars backfill logic.
tools/src/tools/run_training_job.py New CLI module for launching SageMaker training.
tools/src/tools/prepare_training_data.py New CLI module to build consolidated training parquet and write to S3.
tools/src/tools/download_model_artifacts.py New CLI module to list/download/extract model artifacts from S3.
tools/src/tools/init.py Establishes the tools package under src/.
tools/pyproject.toml Marks tools as a uv package with src layout.
pyproject.toml Updates pytest discovery paths and coverage omit patterns; constrains SageMaker version.
maskfile.md Updates commands to python -m tools..., adds datamanager invoke flags and vulture path.
infrastructure/main.py Adds RUST_LOG env to deployed service configuration.
applications/datamanager/tests/test_storage.rs Adds test coverage for writing equity details CSV to S3; expands bars query test range.
applications/datamanager/tests/test_state_and_health.rs Updates env behavior expectation (required vars now panic).
applications/datamanager/tests/test_handlers.rs Adds extensive handler tests for equity-details sync and pagination/failure cases.
applications/datamanager/src/storage.rs Adds write_equity_details_dataframe_to_s3 and centralizes the details key constant.
applications/datamanager/src/state.rs Makes MASSIVE + S3 env vars required (panic if missing).
applications/datamanager/src/startup.rs Adjusts tracing default filter behavior; improves env-guard test utilities.
applications/datamanager/src/router.rs Adds POST route for /equity-details sync handler.
applications/datamanager/src/main.rs Ensures tests set additional env vars (SENTRY_DSN/ENVIRONMENT/RUST_LOG).
applications/datamanager/src/equity_details.rs Implements equity-details sync via Massive API with pagination + S3 upload.
.github/workflows/sync_data.yaml Updates scheduled sync command to pass --data-type equity-bars.
Comments suppressed due to low confidence (2)

tools/src/tools/sync_equity_bars_data.py:16

  • RuntimeError is raised from JSONDecodeError without any message, which yields an empty/opaque error (and you removed the prior logger context). Include a descriptive message (e.g., invalid JSON + the expected shape) when rethrowing so callers/CLI logs are useful.
    tools/src/tools/sync_equity_bars_data.py:30
  • RuntimeError is raised from ValueError here without a message, which obscures whether the issue was start_date, end_date, or format mismatch. Raise with a message that mentions the required format (YYYY-MM-DD) and the received values to make CLI failures actionable.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyproject.toml Outdated
Comment thread tools/pyproject.toml Outdated
Comment thread tools/tests/test_download_model_artifacts.py
Comment thread pyproject.toml Outdated
Comment thread maskfile.md
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@applications/datamanager/src/equity_details.rs`:
- Around line 219-243: This loop in equity_details.rs duplicates normalization
(uppercasing and filling missing with "NOT AVAILABLE") that's also done in
create_equity_details_dataframe in data.rs; extract a shared normalization
helper and constant (e.g., pub const NOT_AVAILABLE: &str and a function like
normalize_optional_uppercase(option: Option<String>) -> String) into a common
module, replace the match logic in the for loop (and the corresponding logic in
create_equity_details_dataframe) to call that helper, and ensure EQUITY_TYPES
check still uses ticker_type.unwrap_or_default() before passing through or
alongside the helper so behavior is preserved.
- Around line 277-284: The error response currently returns internal S3 SDK
details by interpolating err into the response body; update the Err branch that
contains the warn!("Failed to upload to S3: {}", err) and the
StatusCode::BAD_GATEWAY response so the HTTP response body uses a generic
message (e.g., "DataFrame created but S3 upload failed") without including err,
while keeping the warn! log with err for diagnostics; reference the existing
warn! call and the StatusCode::BAD_GATEWAY response construction to locate and
change only the response string interpolation.

In `@pyproject.toml`:
- Line 37: The comment next to the testpaths entry "tools/tests" is misleading:
it says tests are being excluded but the "testpaths" list (containing
"tools/tests") actually includes them; update the comment adjacent to
"tools/tests" in the testpaths array to accurately state that these tests are
being included/discovered, or if the intent was to exclude them instead, move
"tools/tests" from testpaths into the coverage omit pattern (the "tools/*" entry
in the omit list) and adjust the comment accordingly; reference the testpaths
array and the omit list when making the change.

In `@tools/pyproject.toml`:
- Line 6: Remove the duplicate boto3 declaration and consolidate it at the
workspace root using the higher constraint: delete the dependency entry
"boto3>=1.35.0" from the equitypricemodel pyproject and remove "boto3>=1.40.74"
from tools/pyproject.toml, then add or update the root pyproject.toml
dependencies to include "boto3>=1.40.74" (keeping "massive>=2.0.2" where
relevant) so the workspace uses the single, higher-version boto3 spec.

In `@tools/src/tools/sync_equity_bars_data.py`:
- Around line 19-20: The RuntimeError raised in the JSONDecodeError and
ValueError handlers currently use bare "raise RuntimeError from e" so the
RuntimeError has no message and caller logs show error=""; update those raises
to include a descriptive message (e.g., "Invalid date range JSON" or "Date range
values invalid, must include 'start_date' and 'end_date'") so the original
exception text appears in caller.logger.exception; locate the two bare raises in
the date-range parsing routine (the JSONDecodeError except block and the date
parsing/ValueError except block) and change them to raise
RuntimeError("<descriptive message>") from e, matching the pattern used where
message = "Date range JSON must contain 'start_date' and 'end_date' fields".

In `@tools/src/tools/sync_equity_details_data.py`:
- Line 20: Remove the unnecessary "# noqa: PLR2004" directives from the
conditional checks that compare status_code (the lines containing "if
status_code >= 400:"), and any other identical occurrences (e.g., the second
occurrence around line 32) so the file no longer contains unused noqa comments;
simply delete the " # noqa: PLR2004" suffixes from those "if status_code >=
400:" statements in sync_equity_details_data.py.
- Around line 41-51: Remove the unnecessary arguments dict and loop: instead of
building arguments = {"base_url": base_url} and iterating for argument in
[base_url], perform a direct check on base_url (if not base_url) and call
logger.error("Missing required positional argument(s)", base_url=base_url)
followed by sys.exit(1); update the check around the base_url variable where
it's currently validated to simplify the logic in sync_equity_details_data.py.

In `@tools/tests/test_download_model_artifacts.py`:
- Line 45: The test currently uses mock_tar.extractall.assert_called_once(),
which doesn't verify the path or filter arguments; update the assertion in
tools/tests/test_download_model_artifacts.py to assert that extractall was
called with the expected arguments (i.e.,
assert_called_once_with(path=destination_directory, filter="data") or the
equivalent kwarg form) so the test explicitly checks the destination_directory
and the "data" filter passed to mock_tar.extractall.
- Line 12: Update the mocked S3 key in tests to match the Prefix used by the
implementation and verify extractall args: change the test key from
"artifacts/run_20250601/output/model.tar.gz" to
"artifacts/equitypricemodel/run_20250601/output/model.tar.gz" (since
list_objects_v2 is called with Prefix="artifacts/equitypricemodel"), and adjust
the parsing in the implementation that currently uses parts[1] to extract the
run/version to use parts[2] (refer to the code that splits S3 key and constructs
target_path). Also replace the extractall assertion to
assert_called_once_with(path=destination_directory, filter="data") so the test
verifies the exact arguments.

In `@tools/tests/test_prepare_training_data.py`:
- Around line 98-115: The test test_read_equity_bars_from_s3_returns_dataframe
only checks that mock_s3_client.get_object was called once but not the
arguments; update the assertion to verify Bucket and Key just like the category
test: replace mock_s3_client.get_object.assert_called_once() with
mock_s3_client.get_object.assert_called_once_with(Bucket="test-bucket",
Key=f"equity_bars/{_TARGET_DATE}.parquet") (or the exact key format used by
read_equity_bars_from_s3) so the S3 key construction is validated; reference:
test_read_equity_bars_from_s3_returns_dataframe, read_equity_bars_from_s3, and
mock_s3_client.get_object.

In `@tools/tests/test_run_training_job.py`:
- Line 12: Remove the unused noqa directive on the import of run_training_job:
in the import statement that reads "from tools.run_training_job import
run_training_job  # noqa: E402" delete the " # noqa: E402" suffix so the line
becomes a normal import; this removes the unused directive causing the RUF100
warning while leaving the run_training_job import intact.

In `@tools/tests/test_sync_equity_bars_data.py`:
- Around line 19-83: Add three new unit tests to cover the error paths of
validate_and_parse_dates: implement
test_validate_and_parse_dates_raises_on_invalid_json which passes a malformed
JSON string and asserts validate_and_parse_dates raises (ValueError) with a
helpful message, test_validate_and_parse_dates_raises_on_missing_fields which
passes JSON missing start_date or end_date and asserts a raise with a message
indicating missing fields, and
test_validate_and_parse_dates_raises_when_start_exceeds_end_after_clamping which
supplies dates that clamp so start_date > end_date and asserts a raise;
reference the validate_and_parse_dates function in these tests and assert the
raised exception type and that the exception message contains the expected hint
text so regressions in the new error branches are detected.
- Line 60: Remove the unused noqa directive from the assertion so the line reads
simply "assert status_code == 200"; specifically edit the assertion in
tools/tests/test_sync_equity_bars_data.py (the failing assertion "assert
status_code == 200  # noqa: PLR2004") and delete the trailing "# noqa: PLR2004"
token.

In `@tools/tests/test_sync_equity_details_data.py`:
- Around line 25-33: Add a test that covers the error branch of
sync_equity_details_data by mocking requests.post to return a response with
status_code >= 400 (e.g., 500) and text (e.g., "internal server error"), then
assert that calling sync_equity_details_data(base_url="http://localhost:8080")
raises a RuntimeError with the expected message using pytest.raises; use
MagicMock for the response and patch
"tools.sync_equity_details_data.requests.post" to return it so the test (e.g.,
test_sync_equity_details_data_raises_on_error_status) verifies the error path.
- Line 18: Remove the unnecessary noqa directive on the assertion line: edit the
test assertion `assert status_code == 200  # noqa: PLR2004` and delete the `#
noqa: PLR2004` comment so the line is simply `assert status_code == 200`; this
removes a directive for a rule (PLR2004) that isn't enabled (Ruff RUF100) and
avoids dead noqa comments in `test_sync_equity_details_data.py`.

Comment thread applications/datamanager/src/equity_details.rs
Comment thread applications/datamanager/src/equity_details.rs
Comment thread pyproject.toml Outdated
Comment thread tools/pyproject.toml Outdated
Comment thread tools/src/tools/sync_equity_bars_data.py
Comment thread tools/tests/test_run_training_job.py
Comment thread tools/tests/test_sync_equity_bars_data.py
Comment thread tools/tests/test_sync_equity_bars_data.py
Comment thread tools/tests/test_sync_equity_details_data.py
Comment thread tools/tests/test_sync_equity_details_data.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pyproject.toml (1)

6-16: ⚠️ Potential issue | 🔴 Critical

Move sagemaker to tools/pyproject.toml and remove unused uvicorn dependency from root.

Per the coding guideline to scan and remove unused dependencies and move duplicates to workspace root:

  1. sagemaker>=2.256.0,<3: Used only in tools/src/tools/run_training_job.py. This dependency should live exclusively in tools/pyproject.toml rather than the workspace root, since it is not shared across multiple members.

  2. uvicorn>=0.35.0: No imports found anywhere in the codebase. This dependency is unused and should be removed.

  3. structlog, tinygrad, numpy: These are duplicated—they appear in both root pyproject.toml and applications/equitypricemodel/pyproject.toml. Remove these from the equitypricemodel member to consolidate them at the workspace root as per the guideline.

The sagemaker>=2.256.0,<3 upper bound constraint is a good defensive practice, but the dependency belongs in the member package, not at the workspace root.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 6 - 16, Remove the unused and misplaced
dependencies from the workspace root pyproject.toml and consolidate duplicates:
delete uvicorn>=0.35.0 from the root dependencies list; remove
sagemaker>=2.256.0,<3 from the root and add that exact requirement to the tools
package pyproject.toml where run_training_job.py uses it; and remove
structlog>=25.5.0, tinygrad>=0.10.3, and numpy>=1.26.4 from the
applications/equitypricemodel member so those packages remain only at the
workspace root (keep their versions as currently specified in root).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Line 56: The coverage omit pattern "tools/*" is too broad and excludes source
files under tools/; update the omit list in pyproject.toml to target only tests
(e.g., replace "tools/*" with "tools/tests/*" or a pattern like
"tools/**/test_*.py") so only test files are omitted; locate the omit entry
(omit = ["*/__init__.py", "**/test_*.py", "tools/*"]) and change the "tools/*"
element to a more specific tests-only pattern.

---

Outside diff comments:
In `@pyproject.toml`:
- Around line 6-16: Remove the unused and misplaced dependencies from the
workspace root pyproject.toml and consolidate duplicates: delete uvicorn>=0.35.0
from the root dependencies list; remove sagemaker>=2.256.0,<3 from the root and
add that exact requirement to the tools package pyproject.toml where
run_training_job.py uses it; and remove structlog>=25.5.0, tinygrad>=0.10.3, and
numpy>=1.26.4 from the applications/equitypricemodel member so those packages
remain only at the workspace root (keep their versions as currently specified in
root).

---

Duplicate comments:
In `@pyproject.toml`:
- Around line 34-38: The inline comment and exclusion are correct — leave the
testpaths array as-is and keep the "tools/tests" entry commented out so it
remains excluded from test discovery; specifically ensure the testpaths list
(the testpaths variable) still contains "applications/*/tests",
"libraries/python/tests", and the commented-out "tools/tests" line with its
explanatory comment.

Comment thread pyproject.toml Outdated
@forstmeier forstmeier changed the base branch from master to datamanager-equity-details-refactoring-and-cleanup February 22, 2026 15:50
coderabbitai[bot]
coderabbitai Bot previously approved these changes Feb 22, 2026
Base automatically changed from datamanager-equity-details-refactoring-and-cleanup to master February 24, 2026 14:45
@forstmeier forstmeier dismissed coderabbitai[bot]’s stale review February 24, 2026 14:45

The base branch was changed.

coderabbitai[bot]
coderabbitai Bot previously requested changes Feb 24, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
pyproject.toml (1)

53-53: "tools/**" still omits all tools source files from coverage — original concern unresolved.

Changing "tools/*""tools/**" makes the exclusion explicitly recursive, but it continues to silence coverage for source modules under tools/src/ in addition to tests. The earlier suggestion was to narrow this to "tools/tests/*" so that source coverage is still measured. If the intent genuinely is to exclude all tools code temporarily, the pattern is functional but source coverage gaps will be invisible.

🔧 Restore source-file coverage (original suggestion)
-omit = ["*/__init__.py", "**/test_*.py", "tools/**"]
+omit = ["*/__init__.py", "**/test_*.py", "tools/tests/*"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 53, The omit pattern in pyproject.toml currently uses
"tools/**" which recursively excludes all files under tools (including source
like tools/src/) from coverage; update the omit list entry for the coverage omit
variable to "tools/tests/*" (or another tests-only pattern) so only test files
under tools are excluded and source modules (e.g., tools/src/*) are still
measured by coverage; if the intent truly is to hide all tools code, keep
"tools/**" but add a clear comment explaining that decision.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Line 16: The dependency "polars" in pyproject.toml currently has no version
specifier; change it to include a minimum version (e.g., "polars>=1.38.1") to
prevent resolving to incompatible newer minors, then regenerate the lockfile
(e.g., poetry lock / pip-compile) and verify CI uses the updated lockfile;
update any dependency list that references "polars" to the same spec if present.

In `@tools/pyproject.toml`:
- Around line 8-10: The [tool.uv] table contains an undocumented key src =
["src"] which is likely ignored; remove the src = ["src"] line from the
[tool.uv] block and either leave only package = true (since src/ is the default)
or, if you intentionally want to override defaults, move the src declaration
into the [tool.uv.build-backend] table using the documented keys (e.g.,
module-root/module-name) instead of src; update the pyproject.toml so [tool.uv]
only contains valid keys like package = true or relocate/rename the source-root
setting under [tool.uv.build-backend].

In `@tools/src/tools/download_model_artifacts.py`:
- Around line 46-53: The current logic treats any S3 key with ≥3 path segments
as an artifact; tighten validation by requiring the object key to match the
expected artifact suffix (e.g., endswith "output/model.tar.gz") or a larger
minimum segment count before calling options.add and appending to
file_objects_with_timestamps. Update the block around file_object_name_parts,
options.add, and file_objects_with_timestamps to first verify file_object_name
(or file_object_name_parts) has the expected suffix (or at least the expected
number of segments) and skip/continue for keys that don't match so only real
artifact runs are selected.

---

Duplicate comments:
In `@pyproject.toml`:
- Line 53: The omit pattern in pyproject.toml currently uses "tools/**" which
recursively excludes all files under tools (including source like tools/src/)
from coverage; update the omit list entry for the coverage omit variable to
"tools/tests/*" (or another tests-only pattern) so only test files under tools
are excluded and source modules (e.g., tools/src/*) are still measured by
coverage; if the intent truly is to hide all tools code, keep "tools/**" but add
a clear comment explaining that decision.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 989f7af and d21ad98.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml
  • tools/pyproject.toml
  • tools/src/tools/download_model_artifacts.py
  • tools/src/tools/prepare_training_data.py
  • tools/tests/test_download_model_artifacts.py
💤 Files with no reviewable changes (1)
  • tools/src/tools/prepare_training_data.py

Comment thread pyproject.toml
Comment thread tools/pyproject.toml
Comment thread tools/src/tools/download_model_artifacts.py
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 16 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@forstmeier forstmeier dismissed coderabbitai[bot]’s stale review February 24, 2026 15:11

Dismissing and re-requesting because re-requesting doesn't seem to be triggering.

@forstmeier forstmeier merged commit bf5adac into master Feb 24, 2026
11 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Overview Feb 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

markdown Markdown code updates python Python code updates

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

2 participants