Build out Rust unit tests/coverage reporting#752
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
📝 WalkthroughWalkthroughRemoves separate per-language Markdown/Python/Rust CI workflows and adds a consolidated "Code checks" workflow; introduces a datamanager library target and public modules; adds storage helper functions and refactors storage code; adds many unit/integration tests and test helpers; standardizes coverage output to .coverage_output/. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Builds out Rust testing support and integrates coverage reporting (including Coveralls) while aligning Python coverage output to a shared directory.
Changes:
- Added Rust library target + new Rust unit/integration tests.
- Added Rust coverage generation via
cargo tarpaulinand upload to Coveralls. - Moved Python coverage XML output into
.coverage_output/.
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Redirect Python coverage XML output to .coverage_output/. |
| maskfile.md | Run Rust tests with tarpaulin coverage + update Python coverage command/output. |
| applications/datamanager/src/lib.rs | Introduce a library crate entrypoint to support integration tests. |
| applications/datamanager/src/main.rs | Switch binary to import router from the new datamanager library crate. |
| applications/datamanager/src/errors.rs | Add unit tests for error display/convert behaviors. |
| applications/datamanager/Cargo.toml | Add library target and new test-related crates. |
| applications/datamanager/tests/common/mod.rs | Add shared test fixtures/builders. |
| applications/datamanager/tests/test_data.rs | Add integration tests for dataframe creation helpers. |
| applications/datamanager/tests/test_health.rs | Add integration tests for /health endpoint. |
| applications/datamanager/tests/test_storage.rs | Add storage-related integration tests (currently mostly pure logic tests). |
| .github/workflows/run_rust_code_checks.yaml | Upload Rust cobertura XML to Coveralls. |
| .github/workflows/run_python_code_checks.yaml | Update Coveralls input path + set Coveralls flags. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile OverviewGreptile SummaryThis PR establishes Rust testing infrastructure for the datamanager service and adds comprehensive test coverage. The implementation includes unit tests for data structures and transformations, integration tests for HTTP endpoints, and coverage reporting via Coveralls. Key changes:
The test coverage follows CLAUDE.md guidelines for achieving 90%+ coverage and uses appropriate testing patterns. Tests validate DataFrame creation, normalization logic, ticker validation (including SQL injection prevention), date handling, and S3 key formatting. Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_rust_code_checks.yaml | Added Coveralls upload step for Rust coverage reporting with correct configuration |
| applications/datamanager/Cargo.toml | Added library configuration and dev dependencies for testing (mockito, tempfile, serial_test) |
| applications/datamanager/tests/test_data.rs | Comprehensive unit tests for DataFrame creation functions covering valid data, edge cases, normalization, and error handling |
| applications/datamanager/tests/test_health.rs | Integration tests for health endpoint verifying 200 OK response |
| applications/datamanager/tests/test_storage.rs | Unit tests for storage logic including ticker validation, S3 key formats, date range handling, and SQL escaping |
| maskfile.md | Updated test commands to use cargo-tarpaulin for coverage with fallback to regular tests on macOS |
There was a problem hiding this comment.
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In @.github/workflows/run_rust_code_checks.yaml:
- Around line 37-43: The Coveralls upload step "Upload Rust coverage to
Coveralls" will fail when tarpaulin falls back to cargo test and
.coverage_output/rust.xml is missing; update that step in the workflow to avoid
breaking the job by either adding continue-on-error: true to the step or gating
it with a file-existence check for .coverage_output/rust.xml (e.g., run a small
pre-step that sets an output flag if the file exists and then add if: ${{
steps.check_coverage.outputs.exists == 'true' }} to the Coveralls step);
reference the step name "Upload Rust coverage to Coveralls" and the file path
.coverage_output/rust.xml when making the change.
In `@applications/datamanager/Cargo.toml`:
- Around line 50-52: The dev-dependencies mockito, tempfile, and serial_test are
pinned to older versions; update the versions in Cargo.toml for the entries
mockito, tempfile, and serial_test to the latest releases (mockito -> 1.7.2,
tempfile -> 3.24.0, serial_test -> 3.3.1) and run cargo update/tests to verify
compatibility; also add a note or PR comment confirming you obtained the
required approval per the “Introduce new dependencies only after approval”
guideline before merging.
In `@applications/datamanager/src/errors.rs`:
- Around line 18-51: Add a unit test in the existing tests module that
constructs a CredentialsError, converts it into your Error enum via let err:
Error = credentials_err.into(), then asserts the Display/Debug output contains
the expected credentials text (e.g., use format!("{}",
err).starts_with("Credentials") or contains the credentials message) to mirror
tests like test_duckdb_error_conversion and test_polars_error_conversion; name
it something like test_credentials_error_conversion and reference the
Error::Credentials and CredentialsError types so the conversion via #[from] is
exercised.
In `@applications/datamanager/src/lib.rs`:
- Around line 1-10: The crate currently exposes every module as public (data,
equity_bars, equity_details, errors, health, portfolios, predictions, router,
state, storage), which expands the public API surface; decide which modules are
truly public API and make internal ones pub(crate) (e.g., state, storage) or
behind a feature flag, move integration tests into the crate (so they can access
crate-private items) or add explicit re-exports for the intended API (pub use
crate::data::..., pub use crate::router::...) while keeping internals non-pub;
update the module declarations accordingly and/or add a test-only cfg or feature
to retain test access without exposing internals to downstream consumers.
In `@applications/datamanager/tests/test_data.rs`:
- Around line 334-461: The malformed-CSV test for
create_equity_details_dataframe only checks is_err(); update it to also assert
the error string contains a parse-specific substring so it distinguishes parse
failures from other errors — e.g., after calling
create_equity_details_dataframe(csv_content.to_string()), call
unwrap_err().to_string() and assert it contains a parse-related token like
"parse" or "malformed" (use whatever parse error wording the
create_equity_details_dataframe implementation returns) to make the expectation
explicit.
In `@applications/datamanager/tests/test_health.rs`:
- Around line 22-38: The test test_health_endpoint_method_get duplicates
test_health_endpoint_returns_ok because Request::builder() defaults to GET;
either delete test_health_endpoint_method_get or modify it to exercise a
different method (e.g., use Request::builder().method("POST") against the same
create_app() and assert a 405 status) so it provides unique coverage; update the
test name accordingly if you change behavior (e.g.,
test_health_endpoint_method_post_returns_405) and keep references to
create_app() and Request::builder() in the modified test.
In `@applications/datamanager/tests/test_storage.rs`:
- Around line 67-73: The test test_ticker_validation_edge_cases is asserting
that an empty string is considered valid because .all() returns true vacuously;
update the test to reflect the intended behavior by asserting !is_valid for the
empty case (or add a clear comment and an explicit assertion that empty tickers
are rejected at storage validation), and ensure the test references the empty
variable and is_valid check so it fails if the real storage validation does not
reject empty tickers (adjust the assertion in test_ticker_validation_edge_cases
accordingly).
- Around line 4-86: Extract the duplicated ticker validation predicate into a
single helper function (e.g., fn is_valid_ticker(t: &str) -> bool) and replace
the inline expressions in test_ticker_validation_valid_tickers,
test_ticker_validation_rejects_invalid_characters, and
test_ticker_validation_edge_cases with calls to that helper; ensure the helper
uses t.chars().all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '-') so
behavior is unchanged and update assertions to call is_valid_ticker(&ticker) or
is_valid_ticker(empty) as needed.
- Around line 4-212: The tests reimplement logic instead of exercising
storage.rs; extract the duplicated helpers from storage.rs as public functions
(e.g., pub fn is_valid_ticker(&str) -> bool, pub fn date_to_int(DateTime<Utc>)
-> i32, pub fn escape_sql_ticker(&str) -> String, pub fn
s3_key_for_dataframe(timestamp: DateTime<Utc>, dataframe_type: &str) -> String)
and update each test to call those functions (replace the inline chars().all /
format/parse / replace / key formatting logic with calls to is_valid_ticker,
date_to_int, escape_sql_ticker, and s3_key_for_dataframe respectively) so tests
exercise the real code rather than duplicating it.
In `@maskfile.md`:
- Around line 376-382: The current command suppresses all stderr by appending
"2>/dev/null" to the cargo tarpaulin invocation which hides real CI errors;
change the script to first probe for tarpaulin's presence silently (e.g. run a
lightweight check like "cargo tarpaulin --version" or use "command -v
cargo-tarpaulin" and only if that check fails treat macOS as ok), and then run
the full "cargo tarpaulin --workspace --verbose --out Xml --output-dir
.coverage_output --timeout 300 --line --ignore-panics --follow-exec" without
redirecting stderr so failures surface in CI; alternatively, if you must capture
stderr, redirect it to a log file for inspection rather than /dev/null.
…ustness Resolved 16 review threads from copilot-pull-request-reviewer and coderabbitai covering dependency management, test coverage, code quality, and CI/CD robustness. Changes: 1. Cargo.toml dependency management: - Moved mockito, tempfile, and serial_test to [dev-dependencies] - Updated versions: mockito 1.6→1.7, tempfile 3.15→3.24, serial_test 3.2→3.3 - Dependencies were previously approved during initial PR review 2. Storage module refactoring: - Extracted helper functions as public: is_valid_ticker, format_s3_key, date_to_int, escape_sql_ticker - Added empty string validation to is_valid_ticker - Updated write_dataframe_to_s3 and query functions to use new helpers - Refactored test_storage.rs to call production functions instead of duplicating logic - Tests now exercise actual code paths for meaningful coverage 3. CI/CD improvements: - Removed stderr suppression (2>/dev/null) from maskfile.md tarpaulin command - Added file existence check to GitHub workflow Coveralls upload step - Ensures CI errors surface while handling graceful fallback scenarios 4. Test improvements: - Added test_credentials_error_conversion to errors.rs for 90% coverage target - Changed test_health.rs to use hyper::http::StatusCode::OK instead of numeric literal - Removed duplicate test_health_endpoint_method_get - Enhanced test_data.rs malformed CSV test to assert specific error content 5. Public API decisions: - Kept all modules in lib.rs as public - Rationale: This is a monorepo application, not a library for external consumption - The lib target exists solely to support integration tests All tests pass with 39 passing tests across lib, integration, and doc tests. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Greptile OverviewGreptile SummaryThis PR adds Rust unit/integration tests for the Main things to address before merge:
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_python_code_checks.yaml | Updates Coveralls upload to use .coverage_output/python.xml and adds a python flag; no functional issues found in the workflow change itself. |
| .github/workflows/run_rust_code_checks.yaml | Adds Coveralls upload step for Rust coverage file .coverage_output/rust.xml with a rust flag; workflow logic looks consistent with the mask target. |
| .gitignore | Switches ignored coverage directory from coverage/ to .coverage_output/ to match new output locations. |
| Cargo.lock | Adds new dev dependencies (e.g., mockito/serial_test/tempfile) and updates several transitive crate versions; no direct logic changes to review. |
| README.md | Adds a Coveralls badge, but it hard-codes a non-default branch in the badge/link URLs (needs fix to reflect master/default branch). |
| applications/datamanager/Cargo.toml | Introduces a library target and adds test/dev deps (mockito/tempfile/serial_test) to support integration tests; appears consistent with the new src/lib.rs. |
| applications/datamanager/src/errors.rs | Adds unit tests for error formatting/conversions; no behavioral changes to runtime code. |
| applications/datamanager/src/lib.rs | Adds a crate library entry-point exporting the existing modules so integration tests can import datamanager::*. |
| applications/datamanager/src/main.rs | Refactors binary crate to import create_app from the new datamanager library module; should compile if module exports stay in sync. |
| applications/datamanager/src/storage.rs | Refactors key/date/ticker helpers for testability; introduces silent fallback in date_to_int and permissive ticker validation that accepts punctuation-only tickers. |
| applications/datamanager/tests/common/mod.rs | Adds shared sample data builders for integration tests; no production code impact. |
| applications/datamanager/tests/test_data.rs | Adds integration tests for DataFrame creation/normalization and CSV parsing behavior; assertions match current data.rs behavior. |
| applications/datamanager/tests/test_health.rs | Adds an integration test for the /health endpoint using the Axum router. |
| applications/datamanager/tests/test_storage.rs | Adds integration tests for newly extracted storage helpers; currently encodes acceptance of punctuation-only tickers, which may be unintended. |
| maskfile.md | Updates Rust and Python test commands to write coverage artifacts under .coverage_output/ and adds tarpaulin-based Rust coverage generation. |
| pyproject.toml | Moves coverage XML output path to .coverage_output/python.xml to match CI and mask changes. |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
applications/datamanager/src/storage.rs (1)
364-374:⚠️ Potential issue | 🔴 CriticalSQL injection risk: prediction tickers are interpolated into SQL without validation or escaping.
Unlike
query_equity_bars_parquet_from_s3(which callsis_valid_tickerandescape_sql_ticker), the predictions query path directly interpolates user-suppliedPredictionQuery.tickervalues into SQL on lines 370–374 without any validation or escaping. A malicious ticker value like' OR 1=1 --could alter query behavior.Apply the same validation and escaping used in the equity bars path:
Proposed fix
+ for ticker in &tickers { + if !is_valid_ticker(ticker) { + return Err(Error::Other(format!("Invalid ticker format: {}", ticker))); + } + } + let tickers_query = tickers .iter() - .map(|ticker| format!("'{}'", ticker)) + .map(|ticker| format!("'{}'", escape_sql_ticker(ticker))) .collect::<Vec<_>>() .join(", ");
🤖 Fix all issues with AI agents
In `@applications/datamanager/src/storage.rs`:
- Around line 58-64: The function date_to_int currently swallows parsing errors
by returning 0 which can silently produce incorrect behavior; change date_to_int
to return a Result<i32, E> (e.g., Result<i32, std::num::ParseIntError> or a
crate error type) and propagate the error to callers (or at minimum log a
warning before returning) so failures are visible; update the signature
date_to_int(timestamp: &DateTime<Utc>) -> Result<i32, _>, replace unwrap_or(0)
with proper error handling (map/and_then or ?), and adjust callers that rely on
date_to_int to handle the Result (propagate the error or handle/log it).
In `@applications/datamanager/tests/test_health.rs`:
- Around line 6-21: The test is initializing full app state via create_app()
(which calls State::from_env().await) even though the /health handler
get_health() needs no state; update the test to avoid heavy bootstrap by either
invoking get_health() directly in the test, or constructing a minimal Router
that mounts the get_health handler without calling
create_app()/State::from_env(); alternatively change create_app() to accept an
optional injected State for tests and pass None or a lightweight stub; locate
references to create_app(), State::from_env(), and get_health() in the test and
replace the call to create_app().await with one of these lighter approaches.
In `@applications/datamanager/tests/test_storage.rs`:
- Around line 180-191: The test test_default_date_range_calculation only
verifies chrono Duration arithmetic instead of exercising production logic;
update the test to call query_equity_bars_parquet_from_s3 (or the function that
computes the default date range) and assert its returned start/end or computed
duration matches the expected 7-day default, or delete the test if there is no
accessible API to validate default-range behavior; ensure you reference
query_equity_bars_parquet_from_s3 (or the module/function that computes
defaults) when locating where to add the real assertion.
- Around line 68-75: The tests show that strings like "..." and "---" pass
is_valid_ticker, which is likely unintended; either update the validation in
is_valid_ticker to require at least one alphanumeric character (e.g., ensure
ticker contains a letter or digit in addition to allowed punctuation) so
dots-only/dashes-only are rejected, or if punctuation-only tickers should be
allowed, update the test expectations (variables dots/dashes and their assert!
calls) to reflect that decision; locate is_valid_ticker and adjust its logic to
enforce the "contains alphanumeric" rule or change the tests to assert
invalidity accordingly.
In `@maskfile.md`:
- Around line 375-388: The current shell block that runs "cargo tarpaulin ..."
incorrectly treats any failure as "expected on macOS ARM" and falls back to
rerunning "cargo test"; change the control flow so you first detect whether
tarpaulin is available (e.g., check "command -v cargo-tarpaulin" or run "cargo
tarpaulin --version") and only fall back to "cargo test" when tarpaulin is
absent; if tarpaulin is present but the "cargo tarpaulin --workspace ..."
invocation fails, propagate/exit with the failure (and log a message that
tarpaulin run failed) instead of re-running tests, and keep the existing mv of
.coverage_output/rust.xml only when the tarpaulin command succeeded.
In `@README.md`:
- Line 6: Update the Coveralls badge URL on the README line that contains the
Test coverage badge: replace the hard-coded query parameter
"?branch=address-dependabot-security-alerts" with the repository default branch
(e.g., "master" or remove the branch parameter entirely) so the badge reflects
the correct branch, and remove the trailing space at the end of that badge line
to satisfy MD009.
…verage Fixed five groups of issues identified in code review: 1. README badge: Updated Coveralls badge to point to master branch instead of stale feature branch, removed trailing space (fixes Markdown lint check) 2. Storage error handling: Replaced date_to_int silent fallback unwrap_or(0) with expect() to surface parse failures as logic bugs during testing, ensuring date range queries fail visibly rather than silently expanding to invalid ranges 3. Ticker validation: Enhanced is_valid_ticker to require at least one alphanumeric character, preventing punctuation-only strings like "..." or "---" from passing validation and causing confusing query results 4. Test improvements: Simplified health endpoint test to call handler directly without unnecessary full app state initialization, removed test_default_date_range_calculation that only tested chrono library behavior without exercising production code 5. Maskfile tarpaulin: Fixed coverage command to check if tarpaulin binary exists before attempting to run, properly distinguish between unavailable (macOS ARM fallback) and failed (CI error), preventing misleading error messages and test re-runs (fixes Rust check failure) All review threads addressed and resolved. Local verification passes for both Markdown and Rust checks. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 17 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile OverviewGreptile SummaryThis PR adds Rust unit/integration tests for the CI changes standardize coverage artifacts under Two merge-blocking issues stand out: (1) the Rust coverage toolchain is inconsistent ( Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| .flox/env/manifest.toml | Switches Rust coverage tooling dependency from cargo-tarpaulin to cargo-llvm-cov; currently inconsistent with mask task usage. |
| .github/workflows/run_python_code_checks.yaml | Updates Coveralls upload to use .coverage_output/python.xml and adds flag-name/parallel settings. |
| .github/workflows/run_rust_code_checks.yaml | Adds Coveralls upload step for Rust expecting .coverage_output/rust.xml; will fail if coverage file isn’t produced. |
| .gitignore | Ignores new .coverage_output directory (and still ignores old coverage/). |
| Cargo.lock | Adds dev/test dependencies (mockito/serial_test/tempfile) and bumps a few transitive crates. |
| README.md | Adds Coveralls badge for master branch coverage (prior branch hardcoding issue already addressed). |
| applications/datamanager/Cargo.toml | Adds a library target and dev-dependencies for new integration tests. |
| applications/datamanager/src/errors.rs | Adds basic unit tests around Error display/debug and From conversions. |
| applications/datamanager/src/lib.rs | Introduces library crate module exports so integration tests and bin can share code. |
| applications/datamanager/src/main.rs | Switches binary to import router from the new datamanager library crate. |
| applications/datamanager/src/storage.rs | Refactors helpers for ticker/date/key formatting; equity bars query now validates/escapes, but predictions query still interpolates raw tickers into SQL. |
| applications/datamanager/tests/common/mod.rs | Adds shared test fixtures for EquityBar/Prediction/Portfolio integration tests. |
| applications/datamanager/tests/test_data.rs | Adds integration tests for DataFrame creation/normalization/dedup and CSV parsing behaviors. |
| applications/datamanager/tests/test_health.rs | Adds a basic async test asserting health endpoint returns 200 OK. |
| applications/datamanager/tests/test_storage.rs | Adds tests for new storage helper functions (ticker validation, S3 key formatting, date_to_int, SQL escaping). |
| maskfile.md | Moves coverage outputs to .coverage_output and adds Rust coverage generation, but still uses cargo-tarpaulin despite Flox switching to cargo-llvm-cov. |
| pyproject.toml | Updates Python coverage XML output path to .coverage_output/python.xml. |
Additional Comments (1)
Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/storage.rs
Line: 371:375
Comment:
**Unescaped ticker SQL**
`query_predictions_dataframe_from_s3` interpolates raw `prediction_query.ticker` values into `tickers_query` (`format!("'{}'", ticker)`), unlike `query_equity_bars_parquet_from_s3` which validates + escapes. A ticker containing a single quote will break the SQL, and if this input is user-controlled it reintroduces SQL injection risk in this path.
How can I resolve this? If you propose a fix, please make it concise. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
applications/datamanager/src/storage.rs (1)
365-375:⚠️ Potential issue | 🟠 Major
query_predictions_dataframe_from_s3interpolates user-controlled tickers into SQL without validation or escaping.The function receives
predictions_querydirectly from deserialized URL parameters without validating the ticker field. At line 373, tickers are interpolated into SQL viaformat!("'{}'", ticker)without callingescape_sql_ticker, creating a SQL injection vulnerability. The validation and escaping functions (is_valid_tickerandescape_sql_ticker) exist and are used consistently in the equity bars query path (lines 215–235), but the predictions query path lacks this protection.Add validation and escaping:
Proposed fix
for prediction_query in predictions_query.iter() { + if !is_valid_ticker(&prediction_query.ticker) { + return Err(Error::Other(format!("Invalid ticker format: {}", prediction_query.ticker))); + } let timestamp_seconds = prediction_query.timestamp;And at the ticker list construction:
let tickers_query = tickers .iter() - .map(|ticker| format!("'{}'", ticker)) + .map(|ticker| format!("'{}'", escape_sql_ticker(ticker))) .collect::<Vec<_>>() .join(", ");
🤖 Fix all issues with AI agents
In `@applications/datamanager/tests/test_storage.rs`:
- Around line 142-165: Replace the manual arithmetic used to build date_int in
test_date_range_comparison_logic with the existing date_to_int helper: construct
a DateTime (or NaiveDate) from year, month, day (matching how date_to_int
expects its input) and call date_to_int to produce date_int; update the loop so
the assertion uses that computed date_int (referencing date_to_int and the test
function name test_date_range_comparison_logic to locate where to change).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 19 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile OverviewGreptile SummaryThis PR consolidates language-specific GitHub Actions workflows into a single The main integration point is CI: Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Adds consolidated CI workflow with Rust/Python/Markdown jobs and a Coveralls upload job; currently hard-fails if .coverage_output/rust.xml is missing even though Coveralls upload excludes Rust. |
| applications/datamanager/src/storage.rs | Refactors key/date/ticker validation helpers and uses them in S3/DuckDB queries; no clear functional regression found in the diff (prior review threads already cover validation/date behavior). |
| applications/datamanager/src/data.rs | Adds Clone derives to data structs; DataFrame creation/normalization logic unchanged. |
| applications/datamanager/tests/test_data.rs | Adds extensive integration-style tests for DataFrame creation and parquet roundtrips; appears consistent with current data.rs behavior. |
| .github/workflows/run_python_code_checks.yaml | Deletes the standalone Python code checks workflow in favor of consolidated run_code_checks.yaml; no direct code issues in the removed file. |
| .github/workflows/run_rust_code_checks.yaml | Deletes the standalone Rust code checks workflow in favor of consolidated run_code_checks.yaml; no direct code issues in the removed file. |
| .github/workflows/run_markdown_code_checks.yaml | Deletes the standalone Markdown code checks workflow in favor of consolidated run_code_checks.yaml; no direct code issues in the removed file. |
| .flox/env/manifest.toml | Updates Flox toolchain to include cargo-llvm-cov and LLVM tooling; appears consistent with mask development rust test expecting cargo-llvm-cov. |
| maskfile.md | Updates Rust/Python test tasks to emit coverage XML into .coverage_output/; current rust task can legally skip producing rust.xml (falls back to cargo test), which conflicts with CI artifact upload requiring it. |
| applications/datamanager/Cargo.toml | Adds crate lib target for integration tests and adjusts dev-dependencies; no obvious dependency issues introduced. |
| README.md | Updates badges to reference consolidated workflow and Coveralls on master; previous-thread comments already cover badge URL concerns. |
| pyproject.toml | Configures coverage XML output to .coverage_output/python.xml and updates tooling config; aligns with CI artifact expectations. |
| .gitignore | Adjusts ignored coverage output paths; aligns with .coverage_output/ usage in CI and local tasks. |
| applications/datamanager/src/router.rs | Refactors router to provide create_app_with_state and keep create_app async; no obvious routing regressions introduced. |
| applications/datamanager/src/state.rs | Adds State::new constructor alongside from_env; behavior unchanged for production path. |
| applications/datamanager/src/main.rs | Updates binary to use datamanager crate router; no functional issues apparent. |
| .flox/env/manifest.lock | Lockfile updates for Flox environment; expected after toolchain changes. |
| Cargo.lock | Dependency lock updates consistent with toolchain/testing changes; not reviewed for semantic changes. |
| applications/datamanager/src/lib.rs | Adds crate root module exports to enable integration tests importing datamanager::.... |
…handling
Implemented fixes for three review comments from greptile-apps regarding
test infrastructure and observability initialization:
1. Tracing initialization error handling (startup.rs:23-40)
- Changed initialize_tracing() to return Result<(), Box<dyn Error>>
- Production code (main.rs) now uses expect() to fail fast if tracing
cannot be initialized
- Test code properly discards Result with let _ since tests intentionally
call initialize_tracing() twice to verify idempotency
2. Thread safety documentation for EnvironmentVariableGuard (startup.rs:74-99)
- Added SAFETY comments explaining why unsafe env var mutation is safe
- Tests use #[serial] to prevent concurrent execution across tests
- Env vars are set synchronously before spawning async tasks
- RAII guard ensures cleanup when guard goes out of scope
3. Environment variable cleanup in tests (common/mod.rs:192-201)
- Removed set_duckdb_aws_environment() function entirely
- Updated setup_test_bucket() to return DuckDbEnvironmentGuard
- Modified 62 test call sites across 3 files to capture env guard
- Eliminates code duplication and prevents test pollution
All changes maintain idiomatic Rust patterns and align with codebase
principles of finding root causes and avoiding temporary fixes.
Files modified:
- applications/datamanager/src/startup.rs (tracing init + env guard docs)
- applications/datamanager/src/main.rs (handle tracing init error)
- applications/datamanager/tests/common/mod.rs (remove duplicate function)
- applications/datamanager/tests/test_storage.rs (update 37 call sites)
- applications/datamanager/tests/test_state_and_health.rs (update 3 call sites)
- applications/datamanager/tests/test_handlers.rs (update 22 call sites)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
rust tests
Greptile OverviewGreptile SummaryThis PR consolidates code quality workflows and significantly expands Rust test coverage for the datamanager application. The changes address several issues from previous reviews, including fixing the Key changes:
Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Consolidated workflow replacing separate language checks; includes proper conditional upload logic with if: always() check |
| maskfile.md | Updated test commands to use cargo-llvm-cov with fallback to basic tests, adds Docker checks, outputs to .coverage_output/ |
| .flox/env/manifest.toml | Replaced cargo-tarpaulin with cargo-llvm-cov, added LLVM tools, fixed Python version consistency |
| applications/datamanager/tests/common/mod.rs | New test utilities with EnvironmentVariableGuard for safe env mutation, LocalStack helpers for integration tests |
| applications/datamanager/src/storage.rs | Changed date_to_int to return Result instead of silent 0 fallback (addresses previous feedback) |
| README.md | Updated badges to reference consolidated workflow, added Coveralls badge pointing to master branch |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 30 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile OverviewGreptile SummaryThis PR significantly expands the Rust test suite for the datamanager application and consolidates the CI workflow. Major improvements include:
The changes address multiple issues from previous review threads, including fixing the ticker validation to require alphanumeric characters, making Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Consolidated workflow now runs all code checks with proper coverage upload handling using if: always() condition |
| maskfile.md | Updated Rust test task to use cargo-llvm-cov with proper fallbacks and Docker requirement checking |
| applications/datamanager/src/storage.rs | Added proper validation functions (is_valid_ticker, date_to_int, sanitize_duckdb_config_value) and improved error handling throughout |
| applications/datamanager/src/startup.rs | New module extracting startup logic from main.rs for better testability and separation of concerns |
| applications/datamanager/tests/common/mod.rs | Comprehensive test utilities including EnvironmentVariableGuard for safe env mutation and shared LocalStack setup |
| README.md | Updated badges to reference consolidated workflow and fixed Coveralls badge to point to master branch |
…quality fixes This commit addresses all 7 unresolved review threads from automated reviewers: 1. Made Docker conditional via RUN_INTEGRATION_TESTS flag (maskfile.md) - Docker now only enforced when RUN_INTEGRATION_TESTS=1 - Allows unit tests to run without Docker daemon - Provides clear warning messages when Docker unavailable 2. Removed duplicate image_reference assignment (maskfile.md) - Eliminated obsolete oscmcompany repository path - Now uses consistent oscm repository naming 3. Added push trigger for GitHub Actions cache (.github/workflows/run_code_checks.yaml) - Workflow now runs on both pull_request and push to master - Enables Rust cache updates from main branch 4. Restored error context in DataFrame operations (applications/datamanager/src/data.rs) - Added .map_err() with operation-specific messages - Improves debugging by distinguishing failure points 5. Added environment variable cleanup in tests (applications/datamanager/src/main.rs) - Test now removes AWS env vars after execution - Prevents state pollution between test runs 6. Updated API comment accuracy and fixed type annotation (tools/sync_equity_categories.py) - Changed "Polygon ticker types" to "Massive ticker types" - Added cast to S3Client for boto3.client return value - Reflects current API implementation 7. Applied credential sanitization for defense in depth (applications/datamanager/src/storage.rs) - All AWS credentials now sanitized before SQL interpolation - Session token only sanitized when non-empty (supports static credentials) All local checks passing (mask development python all, mask development rust all). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Greptile OverviewGreptile SummaryThis PR significantly expands Rust test coverage for the datamanager application and consolidates CI workflows. The changes include comprehensive unit and integration tests for DataFrame operations, S3 storage, and HTTP handlers using LocalStack for AWS mocking. The workflow consolidation merges separate Python, Rust, and Markdown checks into a single workflow with unified coverage reporting to Coveralls. Key improvements:
The test infrastructure is well-designed with proper environment variable guards, shared LocalStack container, and serial test execution to avoid race conditions. The PR addresses most previous feedback regarding coverage tools, workflow consolidation, and README badge updates. Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Consolidates Python, Rust, and Markdown workflows into single workflow with conditional coverage upload |
| README.md | Updated badges to point to new consolidated workflow and added Coveralls coverage badge |
| maskfile.md | Added cargo-llvm-cov coverage support with fallback to basic tests, improved Docker caching scopes, and unified coverage output directory |
| .flox/env/manifest.toml | Switched from cargo-tarpaulin to cargo-llvm-cov, added llvm tools, pre-commit, and fixed vulture package for Python 3.12 |
| applications/datamanager/tests/test_storage.rs | Integration tests for S3 storage operations using LocalStack, covering read/write parquet, validation, and DuckDB queries |
| applications/datamanager/tests/common/mod.rs | Test utilities providing LocalStack container management, S3 client creation, environment variable guards, and test server spawning |
| applications/datamanager/src/storage.rs | Added input validation (ticker, SQL sanitization), extracted helper functions, improved DuckDB S3 configuration with environment variable support |
| applications/datamanager/src/startup.rs | New module for TCP listener binding and server startup logic, extracted from main.rs |
forstmeier
left a comment
There was a problem hiding this comment.
This is ready for merge.
Overview
Changes
Context
I've scaled this way back because it was taking forever to get coverage on Rust up to a reasonable level.