Update data/portfolio manager scheduled invocations#823
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 17 minutes and 23 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThe PR introduces a recurring daily scheduler for equity bar synchronization in the data_manager application. A new scheduler module implements timezone-aware sync scheduling at 18:00 Eastern time, skipping weekends. The equity_bars sync handler is refactored to extract a Changes
Sequence DiagramsequenceDiagram
participant Tokio as Tokio Runtime
participant Scheduler as Sync Scheduler
participant Time as Time Check
participant API as Massive API
participant Parser as JSON Parser
participant S3 as S3 Storage
participant Log as Logger
Tokio->>Scheduler: spawn_sync_scheduler(state)
loop Daily Cycle
Scheduler->>Scheduler: compute_seconds_until_sync()
Scheduler->>Time: sleep for N seconds
Time-->>Scheduler: wake at 18:00 ET
Scheduler->>Scheduler: check if Eastern weekday is weekend
alt Weekend Detected
Scheduler->>Log: log "Weekend, skipping"
Scheduler->>Scheduler: continue to next cycle
else Weekday
Scheduler->>API: fetch equity bars for date
API-->>Scheduler: JSON response with results
alt No Results Field
Scheduler->>Log: log Ok(None) - no data
else Results Present
Scheduler->>Parser: parse results to DataFrame
Parser-->>Scheduler: DataFrame
Scheduler->>S3: upload to S3
S3-->>Scheduler: Ok(s3_key)
Scheduler->>Log: log Ok(Some(key)) - success
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds a self-scheduling daily equity-bar sync to the data manager (firing at 18:00 Eastern, weekdays only) by extracting Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| applications/data_manager/src/scheduler.rs | New scheduler module: correctly computes the next 18:00 Eastern sync time and skips weekends. Minor test assertion uses .as_secs() > 0 which can evaluate to 0 for sub-second durations. |
| applications/data_manager/src/equity_bars.rs | Clean extraction of fetch_and_store; date now converted to Eastern for the API URL. Weekend guard in the sync HTTP handler still uses the UTC weekday, inconsistent with the Eastern date passed to fetch_and_store. |
| applications/data_manager/src/startup.rs | State is now built once and shared between the scheduler and the Axum app via create_app_with_state; straightforward and correct. |
| applications/portfolio_manager/src/portfolio_manager/scheduler.py | Adds a defensive weekend check after waking from sleep; catch_up is already False at that point so continue correctly re-enters the sleep path via _seconds_until_next_rebalance(). |
| applications/data_manager/tests/test_handlers.rs | Test timestamps updated from midnight UTC to noon UTC to avoid the UTC/Eastern day-boundary issue; status code expectations corrected to 500 to match the simplified error path. |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: applications/data_manager/src/equity_bars.rs
Line: 297-298
Comment:
**Weekend check uses UTC weekday, API query uses Eastern date**
`payload.date.weekday()` evaluates the UTC weekday, but `fetch_and_store` formats the Massive API request URL using the Eastern date. During the UTC/Eastern day boundary (midnight–05:00 UTC in winter), the two can disagree — e.g. a UTC Monday payload could resolve to a Sunday Eastern date, producing a wasted API call rather than a skip. The scheduler path avoids this by firing at 22:00–23:00 UTC, but direct callers of the HTTP endpoint are exposed. For consistency, convert to Eastern before checking the weekday here.
```suggestion
let weekday = payload.date.with_timezone(&Eastern).weekday();
if weekday == Weekday::Sat || weekday == Weekday::Sun {
info!("Skipping weekend date: {}", payload.date.with_timezone(&Eastern).format("%Y-%m-%d"));
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: applications/data_manager/src/scheduler.rs
Line: 88-93
Comment:
**`as_secs() > 0` can silently fail for sub-second durations**
If this test runs within one second before 18:00:00 Eastern, `target_eastern` is still today (the `>= target_eastern` guard only advances when `now_eastern` is equal or later), the resulting `Duration` is less than one second, and `.as_secs()` truncates to `0` — failing the assertion. Comparing against `Duration::ZERO` directly avoids the truncation.
```suggestion
assert!(
duration > std::time::Duration::ZERO,
"Expected positive wait time, got {:?}",
duration
);
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (2): Last reviewed commit: "Address PR #823 review feedback on data_..." | Re-trigger Greptile
There was a problem hiding this comment.
Pull request overview
This PR updates scheduled invocations across services by adding weekday-aware scheduling for portfolio rebalancing and introducing a new daily equity-bars sync scheduler in data_manager.
Changes:
- Skip scheduled portfolio rebalances on weekends in
portfolio_manager. - Add a new
data_managerscheduler that runs an automated equity-bars sync at 18:00 US/Eastern on weekdays. - Refactor equity-bars sync logic into a reusable
fetch_and_storefunction and update related tests/error expectations.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| applications/portfolio_manager/src/portfolio_manager/scheduler.py | Adds a weekend guard to skip scheduled rebalances on Saturdays/Sundays. |
| applications/data_manager/src/startup.rs | Builds State once, spawns the new sync scheduler, and wires the router with shared state. |
| applications/data_manager/src/scheduler.rs | Introduces the new time-based weekday scheduler for equity-bars sync plus basic unit tests. |
| applications/data_manager/src/equity_bars.rs | Extracts fetch_and_store for reuse by HTTP handler + scheduler; updates error handling paths. |
| applications/data_manager/src/lib.rs | Exposes the new scheduler module. |
| applications/data_manager/tests/test_handlers.rs | Updates handler tests for new status-code behavior and naming. |
| applications/data_manager/Cargo.toml / Cargo.lock | Adds chrono-tz dependency for US/Eastern scheduling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/data_manager/src/equity_bars.rs`:
- Around line 120-130: The code in fetch_and_store uses the provided
DateTime<Utc> directly to build date_str causing UTC calendar-day drift; convert
the input timestamp to the market timezone (America/New_York / Eastern Time) and
derive a date-only value (local date) before performing weekend checks in sync()
and before building the upstream path (i.e., compute the market-local date and
then format that to date_str); update fetch_and_store (and the sync caller) to
accept either a Date-only value or to convert the DateTime via chrono's timezone
conversion so all logic (weekend skip and URL construction) uses the
market-local date rather than the UTC calendar day.
- Around line 207-215: The map_err for serde_json::from_value in the BarResult
parsing (and the similar branch handling DataFrame rendering) must not return
raw payloads like text_content or full DataFrame strings as the Err value;
instead log the detailed payload locally (use warn! or error! with
results/first_result/text_content/DataFrame samples) and return a stable, small
error string such as "failed to parse bar results" or "failed to render
DataFrame" from the closure. Update the closure around
serde_json::from_value::<Vec<BarResult>>(results.clone()).map_err(...) and the
corresponding DataFrame-to-string error path to log details but replace the
returned String with a generic message so downstream HTTP 500s don’t leak or
bloat with raw payloads.
In `@applications/data_manager/src/scheduler.rs`:
- Around line 23-29: The current block that advances target_date when
now_eastern >= target_eastern (variables: now_eastern, target_eastern,
target_date, Eastern.from_local_datetime, sync_time, succ_opt()) always skips
the current trading day for late restarts; modify this path to first consult the
last successful sync (e.g., a function like get_last_successful_sync_date() or
by probing S3 partitions for the partition for target_date) and only advance
target_date if that last_synced_date >= target_date (meaning today already
processed); otherwise keep target_date as today and schedule a catch-up run for
target_date; if no last_successful_sync is available, fall back to checking S3
partition existence for target_date before calling
target_date.succ_opt().unwrap() and recomputing target_eastern so you don't
silently skip a missed run.
- Around line 39-41: The current calculation uses
(target_eastern.with_timezone(&Utc) - now_utc).num_seconds().max(0) as u64 which
truncates sub-second precision; instead compute the full chrono::Duration
between target_eastern.with_timezone(&Utc) and now_utc, convert that to a
std::time::Duration (handling negatives by treating them as zero), and pass that
exact Duration to the sleeper (e.g., tokio::time::sleep). Replace uses of
num_seconds()/max(0) as u64 with a to_std()-based conversion and zero-clamp so
sleeps use the precise remaining time.
In `@applications/data_manager/src/startup.rs`:
- Around line 52-54: The scheduler is currently started via
spawn_sync_scheduler(state.clone()) before the server listener is bound,
allowing the detached task to run even if TcpListener::bind fails; move the call
to spawn_sync_scheduler so it runs only after the listener is successfully
created (i.e., after TcpListener::bind completes) and arrange for the scheduler
task to be tied to the server process lifetime (for example, spawn it from the
same async context that starts the server or attach it to the server future),
keeping references to State (from State::from_env()) and using
create_app_with_state(state) after bind so the scheduler is not started on bind
failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 55cd5655-15fa-48f6-9b2b-d77f174ace4a
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
applications/data_manager/Cargo.tomlapplications/data_manager/src/equity_bars.rsapplications/data_manager/src/lib.rsapplications/data_manager/src/scheduler.rsapplications/data_manager/src/startup.rsapplications/data_manager/tests/test_handlers.rsapplications/portfolio_manager/src/portfolio_manager/scheduler.py
Five groups of issues addressed based on bot review threads:
Raw payload exposure (equity_bars.rs): The BarResult parse failure and
DataFrame creation failure map_err closures were returning text_content
or raw data directly into the Err value, which propagated to HTTP 500
bodies and scheduler error logs. Replaced with sanitized messages;
existing warn! calls already capture the details. The S3 upload failure
now logs row/column counts and date instead of rendering the full
DataFrame via {}.
UTC vs ET date for API URL (equity_bars.rs): date_str used the UTC
calendar date. When an arbitrary UTC timestamp is passed (e.g., midnight
UTC on Jan 1 = Dec 31 ET), the wrong date was fetched. Changed to
date.with_timezone(&Eastern).format(...) and added chrono_tz::US::Eastern
import. Updated four integration tests that passed midnight UTC timestamps
to use noon UTC so the ET calendar date still maps to the expected date.
Double Utc::now() in sync_loop (scheduler.rs): now_eastern and today each
called Utc::now() independently, risking a clock skew between the weekend
check and the fetch date. Captured once as now_utc and derived both from it.
Bind before spawning scheduler (startup.rs): spawn_sync_scheduler was called
before TcpListener::bind, so a failed bind still left the scheduler running.
Reordered to bind first, then spawn.
Duration precision (scheduler.rs): seconds_until_next_sync() truncated the
wait time to u64 seconds via num_seconds(), making the test flaky at the
boundary and sleeping slightly early. Renamed to duration_until_next_sync()
returning std::time::Duration via to_std().unwrap_or(Duration::ZERO).
Updated sync_loop to sleep on the Duration directly and log .as_secs().
Updated tests accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai or @greptile-apps can you approve this? |
|
Tip For best results, initiate chat on the files or code changes.
A quick note: coverage dropped by ~1.4% (77.31% vs. 78.71% on [approve] |
|
Reviewed the diff — changes look good:
I tried to submit an approval but the tool call was blocked — you'll need to approve manually on GitHub. Tip: You can customize Greptile's behavior for this repo with |
Stack created with GitHub Stacks CLI • Give Feedback 💬
Summary by CodeRabbit
New Features
Chores