rust tests#753
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing touches🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile OverviewGreptile SummaryThis PR expands Rust test coverage for Key areas touched:
Blocking issues to address before merge are mainly around newly introduced panics in request/runtime code paths and a resource leak in the LocalStack test harness. Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| .gitignore | Adds ignores for coverage/ and some local dev artifacts; no functional code changes. |
| Cargo.lock | Updates lockfile for new/changed Rust dev dependencies (testcontainers/localstack, duckdb bundled). |
| applications/datamanager/Cargo.toml | Adds duckdb bundled feature and introduces testcontainers dev dependencies for LocalStack-backed integration tests; review new Docker dependency in CI. |
| applications/datamanager/src/data.rs | Simplifies DataFrame construction by relying on Polars errors directly; behavior unchanged aside from less contextual logging on failures. |
| applications/datamanager/src/equity_details.rs | Refactors CSV response generation but introduces .expect() panics in the HTTP handler on CSV/UTF-8 errors. |
| applications/datamanager/src/lib.rs | Exports new startup module for reuse in main/tests. |
| applications/datamanager/src/main.rs | Moves startup logic into reusable functions and adds unit tests for exit-code mapping; overall behavior preserved. |
| applications/datamanager/src/startup.rs | New startup module with tracing/sentry init and server runner plus tests; uses unsafe env var mutation in tests and network polling. |
| applications/datamanager/src/storage.rs | Adds DuckDB custom S3 endpoint env support, but changes date_to_int to unwrap and can panic; otherwise query logic unchanged. |
| applications/datamanager/tests/common/mod.rs | Adds shared test helpers with LocalStack via testcontainers; currently leaks the container with Box::leak and mutates env vars unsafely. |
| applications/datamanager/tests/test_data.rs | Adds tracing init to existing DataFrame unit tests; no logic changes to assertions. |
| applications/datamanager/tests/test_errors.rs | Adds basic display/formatting coverage for custom Error enum. |
| applications/datamanager/tests/test_handlers.rs | Adds end-to-end handler tests using LocalStack + mockito; relies on Docker availability and serial env var mutation. |
| applications/datamanager/tests/test_state_and_health.rs | Adds state/env and basic routing tests; uses unsafe env var mutation with serial execution. |
| applications/datamanager/tests/test_storage.rs | Adds storage round-trip tests using LocalStack; includes coverage of ticker validation and default date range paths. |
There was a problem hiding this comment.
Pull request overview
This PR expands the Rust test suite for the datamanager application (including integration tests against S3 via LocalStack), refactors server startup/observability initialization into a dedicated module, and adjusts DuckDB/S3 configuration and related dependencies to improve CI robustness.
Changes:
- Add extensive integration tests for handlers, storage, state/env, and error formatting (with a shared LocalStack-based S3 test harness).
- Refactor service startup concerns into
startup.rsand simplifymain.rsto delegate initialization and server run logic. - Update DuckDB/S3 configuration handling (custom endpoint/SSL via env) and dependency setup (DuckDB bundled, add testcontainers).
Reviewed changes
Copilot reviewed 13 out of 15 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| applications/datamanager/tests/common/mod.rs | Adds shared integration-test utilities: LocalStack S3, bucket setup/cleanup, DuckDB env setup, and a spawned Axum server helper. |
| applications/datamanager/tests/test_storage.rs | New storage-focused integration tests covering S3 read/write and DuckDB query paths. |
| applications/datamanager/tests/test_state_and_health.rs | New tests for /health, router 404 behavior, and State::from_env behavior with env guards. |
| applications/datamanager/tests/test_handlers.rs | New end-to-end handler tests for predictions/portfolios/equity-details/equity-bars, including error-path coverage. |
| applications/datamanager/tests/test_errors.rs | Adds a small unit test for Error display formatting. |
| applications/datamanager/tests/test_data.rs | Adds test tracing initialization and calls it from existing DataFrame-related tests. |
| applications/datamanager/src/storage.rs | Updates date_to_int signature/behavior and refactors DuckDB S3 configuration to support custom endpoint/SSL via env vars. |
| applications/datamanager/src/startup.rs | New module for Sentry/tracing initialization and server run/serve helpers (with tests). |
| applications/datamanager/src/main.rs | Refactors binary entrypoint to use datamanager::startup and adds tests for exit-code behavior. |
| applications/datamanager/src/lib.rs | Exposes the new startup module. |
| applications/datamanager/src/equity_details.rs | Simplifies CSV response writing logic for equity details. |
| applications/datamanager/src/data.rs | Simplifies error propagation in DataFrame creation/transforms by relying on ? conversions. |
| applications/datamanager/Cargo.toml | Enables DuckDB bundled feature and adds testcontainers dependencies for LocalStack-backed tests. |
| Cargo.lock | Updates lockfile to include new transitive dependencies (testcontainers/bollard/etc.) and version resolution changes. |
| .gitignore | Ignores coverage output and additional local/dev artifacts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile OverviewGreptile SummaryThis PR refactors Key things to address before merge:
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Adds Rust coverage artifact to Coveralls upload; no functional code changes. |
| applications/datamanager/src/startup.rs | Adds centralized sentry/tracing init and server runner; issues found with ignored tracing init errors and unsafe env var mutation in tests. |
| applications/datamanager/src/storage.rs | Adds DuckDB S3 endpoint/SSL env config; issue found where predictions ticker list is interpolated into SQL without escaping/validation. |
| applications/datamanager/tests/common/mod.rs | Adds shared integration-test infra (LocalStack, env guards, server spawning); issue found where some env var mutations are unguarded and leak across tests (plus previously noted LocalStack leak). |
| applications/datamanager/tests/test_handlers.rs | Adds end-to-end handler tests using LocalStack and mockito; relies on shared env setup. |
Additional Comments (1)
In Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/storage.rs
Line: 417:436
Comment:
**Ticker SQL injection risk**
In `query_predictions_dataframe_from_s3`, `tickers_query` is built with `format!("'{}'", ticker)` without escaping/validation. If `PredictionQuery.ticker` is user-controlled (it comes from request query parsing in the predictions handler), a ticker containing `'` can break out of the string literal and inject SQL into DuckDB. This is different from the equity bars query, which validates and escapes tickers. Apply the same validation/escaping here (or use parameters instead of string interpolation).
How can I resolve this? If you propose a fix, please make it concise. |
…handling
Implemented fixes for three review comments from greptile-apps regarding
test infrastructure and observability initialization:
1. Tracing initialization error handling (startup.rs:23-40)
- Changed initialize_tracing() to return Result<(), Box<dyn Error>>
- Production code (main.rs) now uses expect() to fail fast if tracing
cannot be initialized
- Test code properly discards Result with let _ since tests intentionally
call initialize_tracing() twice to verify idempotency
2. Thread safety documentation for EnvironmentVariableGuard (startup.rs:74-99)
- Added SAFETY comments explaining why unsafe env var mutation is safe
- Tests use #[serial] to prevent concurrent execution across tests
- Env vars are set synchronously before spawning async tasks
- RAII guard ensures cleanup when guard goes out of scope
3. Environment variable cleanup in tests (common/mod.rs:192-201)
- Removed set_duckdb_aws_environment() function entirely
- Updated setup_test_bucket() to return DuckDbEnvironmentGuard
- Modified 62 test call sites across 3 files to capture env guard
- Eliminates code duplication and prevents test pollution
All changes maintain idiomatic Rust patterns and align with codebase
principles of finding root causes and avoiding temporary fixes.
Files modified:
- applications/datamanager/src/startup.rs (tracing init + env guard docs)
- applications/datamanager/src/main.rs (handle tracing init error)
- applications/datamanager/tests/common/mod.rs (remove duplicate function)
- applications/datamanager/tests/test_storage.rs (update 37 call sites)
- applications/datamanager/tests/test_state_and_health.rs (update 3 call sites)
- applications/datamanager/tests/test_handlers.rs (update 22 call sites)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Greptile OverviewGreptile SummaryThis PR refactors Overall the refactor aligns server startup with more testable seams (e.g., Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| .github/workflows/run_code_checks.yaml | Updates Coveralls upload to include Rust coverage XML alongside Python. |
| applications/datamanager/Cargo.toml | Enables DuckDB bundled feature and adds testcontainers dependencies for integration tests. |
| applications/datamanager/src/equity_details.rs | Refactors CSV response handling but introduces request-handler panics via expect/from_utf8 (already flagged in previous threads). |
| applications/datamanager/src/main.rs | Moves observability/server startup into startup; adds tests but one async test calls real init paths and can be flaky/order-dependent. |
| applications/datamanager/src/startup.rs | Adds startup/observability module with tests; contains env var mutation patterns and tracing init handling issues (covered in previous threads). |
| applications/datamanager/src/storage.rs | Adds DuckDB S3 endpoint/SSL env configuration and sanitization helper; still interpolates AWS credentials into SQL without escaping, enabling breakage/injection. |
| applications/datamanager/tests/common/mod.rs | Adds shared LocalStack + env guard helpers; includes intentional container leak and unsafe env mutation patterns (covered in previous threads). |
| applications/datamanager/tests/test_storage.rs | Adds storage integration tests; exercises DuckDB S3 configuration paths, including new endpoint/SSL env vars. |
Additional Comments (1)
Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/storage.rs
Line: 152:155
Comment:
**Config injection not sanitized**
`DUCKDB_S3_ENDPOINT` is sanitized, but the AWS credential values interpolated into the DuckDB `SET` statements (`access_key_id`, `secret_access_key`, and `session_token`) are not. Those fields can legally contain characters like `'` or `;` (especially session tokens), which will break the SQL batch and can allow statement injection into DuckDB configuration. You likely want to either (1) escape single quotes for *all* interpolated values (credentials + region), or (2) set these options via DuckDB parameters/PRAGMAs that don’t require string interpolation.
How can I resolve this? If you propose a fix, please make it concise. |
Greptile OverviewGreptile SummaryThis PR primarily improves Most of the Rust-side changes are organizational, but there are a few new runtime/CI hazards introduced by the refactor and test additions (notably: a GitHub Actions workflow input that will fail, new Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| .github/workflows/launch_infrastructure.yaml | Adds Docker Buildx setup and removes conditional artifact download step. |
| .github/workflows/run_code_checks.yaml | Adds rust-cache settings and Coveralls rust.xml upload, but includes an invalid input (run-id) for actions/download-artifact@v4 which will break CI. |
| applications/datamanager/src/equity_bars.rs | Adds response header setting via .parse().unwrap(); these unwraps can panic in handler path. |
| applications/datamanager/src/main.rs | Main delegates to startup; test mutates env vars unsafely without restoring, leaking state across tests. |
| applications/datamanager/src/startup.rs | Introduces startup module and tests; previous threads already cover tracing init handling and env var guard unsafety. |
| applications/datamanager/src/state.rs | State initialization uses reqwest::Client::builder().build().unwrap() which can panic during env-based initialization. |
| applications/datamanager/src/storage.rs | Adds DuckDB S3 endpoint config and parquet serialization; uses full DataFrame clones when writing parquet, which can double memory usage on large frames. |
| applications/datamanager/tests/common/mod.rs | Adds LocalStack + env guards; previous threads already flag intentional container leak and unsafe env mutation patterns. |
Additional Comments (4)
Prompt To Fix With AIThis is a comment left during a code review.
Path: .github/workflows/run_code_checks.yaml
Line: 82:88
Comment:
**CI fails on artifact input**
`actions/download-artifact@v4` doesn’t accept a `run-id` input; artifacts are already scoped to the current workflow run. With this in place, the step will error with an “Unexpected input” failure and block coverage upload.
```suggestion
merge-multiple: true
```
How can I resolve this? If you propose a fix, please make it concise.
These Also appears in: Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/equity_bars.rs
Line: 96:104
Comment:
**Unwrap can panic in handler**
These `.parse().unwrap()` calls will panic the whole service if header parsing fails (it’s a request path, so panics crash the process). Even though the literals look safe, this is still a runtime abort in production; please avoid `unwrap()` in handlers.
Also appears in: `applications/datamanager/src/equity_bars.rs:98`, `applications/datamanager/src/equity_bars.rs:104`.
How can I resolve this? If you propose a fix, please make it concise.
Consider propagating the error instead of unwrapping. Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/state.rs
Line: 24:27
Comment:
**Panic during state init**
`reqwest::Client::builder().build().unwrap()` can panic if the client builder fails (e.g., invalid TLS backend configuration). Since this is used in `State::from_env()`, it can take down the service during startup rather than returning an error path.
Consider propagating the error instead of unwrapping.
How can I resolve this? If you propose a fix, please make it concise.
Same pattern also appears in Prompt To Fix With AIThis is a comment left during a code review.
Path: applications/datamanager/src/storage.rs
Line: 106:112
Comment:
**Full DataFrame clone before parquet**
`writer.finish(&mut dataframe.clone())` clones the entire DataFrame just to obtain a mutable reference for serialization. For large frames this doubles memory usage and can lead to OOMs or major slowdowns on the write-to-S3 path.
Same pattern also appears in `query_equity_bars_parquet_from_s3` at `applications/datamanager/src/storage.rs:358-365`.
How can I resolve this? If you propose a fix, please make it concise. |
42d8f15
into
rust-unit-tests-and-coverage-updates
This pull request makes several improvements to the
datamanagerapplication, focusing on observability, error handling, configuration flexibility, and testability. The main changes include refactoring the server startup and observability initialization into a newstartupmodule, simplifying error handling in DataFrame creation, allowing custom DuckDB S3 endpoints, and adding new integration tests. There are also dependency updates to improve testing capabilities.Observability and Server Startup Refactor:
startupmodule (startup.rs). This improves code organization, testability, and separation of concerns. The main entrypoint (main.rs) is now much simpler and includes tests for server startup logic. [1] [2] [3]Error Handling Simplification:
data.rs). [1] [2] [3] [4] [5]equity_details.rs).Configuration and Flexibility:
storage.rs).date_to_intutility to panic on parse errors, as the format is guaranteed, simplifying its usage (storage.rs). [1] [2]Testing Improvements:
startup.rs,main.rs). [1] [2]testcontainersandtestcontainers-modulesfor improved integration testing, and enabled thebundledfeature for DuckDB to ease local development (Cargo.toml). [1] [2]Dependency Updates:
testcontainersandtestcontainers-modulesfor localstack integration in tests, and enabled thebundledfeature for DuckDB to facilitate local builds (Cargo.toml). [1] [2]These changes collectively improve the maintainability, configurability, and reliability of the
datamanagerservice.