feat(cdr): Phase 3.1 ranking foundation (cheap-rank + orchestrator + deep-rank)#64
Conversation
…e 3.1) First commit of Phase 3.1 multi-plan ranking engine. Adds: - ``matches_geography(plan, *, state, postcode, distributor)`` — AND-combined filter against ``geography.includedPostcodes / excludedPostcodes / distributors / state``. Wildcard semantics when a filter is None, national-plan semantics when a field is omitted. - ``cheap_rank_score(plan)`` — heuristic ranking score: ``peak_rate_cents * 0.7 + daily_supply_cents * 0.3``. Both terms in cents to land on roughly the same numeric scale (~30 c/kWh peak vs ~100 c/day supply). Picks the highest unitPrice across the first tariffPeriod's timeOfUseRates as "peak" — works for both flat SINGLE_RATE plans and TOU plans without parsing time-of-use schedules. - ``filter_eligible_plans(plans, **filters)`` — bulk geography pass. - ``cheap_rank(plans, *, top_k=20)`` — sort by ascending score, truncate to top-K. Plans that can't be scored (None) are dropped, not zero-scored, so malformed payloads can't falsely surface as cheapest. Tests: 29 new tests cover all four functions including edge cases (malformed unitPrice mid-list, missing tariffPeriod, case-insensitive distributor match, national plans w/o geography block). Smoke-tested against real CDR fixture (GLO731031MR@VEC ZEROHERO): score 56.7 cents = 36c*0.7 + 105c*0.3 ✓. Refs PHASE-3-ROADMAP.md §3.1. Next commit: extend registry with ``eligible_plans_for(...)`` query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3.1 commit 2 of 4-6. Adds the async end-to-end pipeline that ties together list-fetch + detail-fetch + geography filter + cheap-rank. - ``fetch_plans_for_retailer(session, retailer, *, cache, detail_delay_sec)`` Pulls plan list (1 call) then plan detail per planId (N calls) with a configurable sleep between detail fetches to respect the documented 25-29 req/s EME proxy budget. Default 0.05s = 20 req/s leaves headroom. Optional ``cache`` dict keyed by planId skips re-fetches across daily runs; coordinator owns TTL semantics. Per-plan failures (``CdrPlanNotFound``, ``CdrAPIError``, ``CdrUnavailable``) are logged and skipped so a single stale planId doesn't sink the whole retailer. Whole-retailer list-failures return empty rather than raising — one bad retailer mustn't block ranking against the others. ``cdr_brand`` discriminator flows through to both underlying calls so shared-base-URI brands (Energy Locals, OVO, Radian, Future X) get disambiguated. - ``rank_alternatives(session, retailers, *, state, postcode, distributor, top_k, cache, detail_delay_sec)`` — top-level orchestrator. Iterates retailers, accumulates eligible PlanDetailV2 bodies, runs ``filter_eligible_plans`` + ``cheap_rank``. Caller decides which retailers to scan; this module doesn't pre-filter the registry by state because EME refdata2 doesn't carry per-retailer region info. Pragmatic v1 callers will pass current retailer + AGL/Origin/EA/Red competitor set. - Constants: ``DEFAULT_DETAIL_DELAY_SEC = 0.05`` (20 req/s). Tests: 12 new orchestrator tests cover happy path, cache reuse, missing planId, retailer-level CDR failures, per-plan failures, brand discriminator propagation, end-to-end ranking across multiple retailers, geography drop, top-K truncation, failed retailer isolation. 41/41 ranking tests pass; 689/689 full suite green. Next commit: deep-rank via evaluator (consumption replay) — narrows the top-K cheap-rank survivors down to true projected savings using the user's actual HA history. Refs PHASE-3-ROADMAP.md §3.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3.1 commit 3 of 4-6. Adds the second pass that narrows cheap-rank top-K survivors down to true projected savings using the user's actual HA consumption. - ``deep_rank(plans, slots, *, entry_options)`` — runs the full streaming evaluator (TOU, stepped, controlled load, all 8 retailer incentive parsers) against the user's actual half-hour consumption slots and sorts ascending by ``total_aud_inc_gst``. Returns ``[(plan, breakdown), ...]`` so callers have both the projected cost (for ranking + savings display) and the plan body (for the "switch to X" prompt). Defensive behavior: - Empty slots → empty list. Can't rank without consumption. - Plans whose evaluator returns ``slot_count == 0`` are dropped, not zero-scored (would falsely surface as "free"). Same rationale as cheap_rank's None-drop on malformed plans. - Per-plan evaluator exceptions are logged + skipped so a single poison plan doesn't sink the batch. - ``entry_options`` (opt-in fields like ``ovo_interest_balance_aud``, ``vpp_batteries_enrolled``) flows through so per-plan credit math respects user-supplied data. Tests: 6 new tests cover ordering, breakdown return shape, empty slots, zero-slot drop, evaluator-exception isolation, entry_options propagation. 47/47 ranking tests pass; 695/695 full suite green. Next commit: coordinator hook — async_track_time_change(00:30 local) to run rank_alternatives → deep_rank daily; persist top-K to coordinator data. Refs PHASE-3-ROADMAP.md §3.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. 🗂️ Base branches to auto review (2)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR adds Phase 3.1, a complete multi-plan CDR ranking engine: geography-aware plan filtering, cheap heuristic scoring (TOU peak + daily supply), async per-retailer fetch with caching, cross-retailer aggregation, and deep re-ranking using consumption projection. Includes 405 new lines of implementation and 756 lines of comprehensive test coverage. ChangesMulti-Plan CDR Ranking Engine
Sequence Diagram(s)sequenceDiagram
participant Caller
participant rank_alternatives
participant fetch_plans_for_retailer
participant plan_list_api
participant plan_detail_api
participant cheap_rank
participant deep_rank
participant evaluate
Caller->>rank_alternatives: retailers, state, postcode, distributor, top_k
loop for each retailer
rank_alternatives->>fetch_plans_for_retailer: retailer, cache
fetch_plans_for_retailer->>plan_list_api: fetch_plan_list
plan_list_api-->>fetch_plans_for_retailer: plan summaries
loop for each summary with planId
fetch_plans_for_retailer->>plan_detail_api: fetch_plan_detail
plan_detail_api-->>fetch_plans_for_retailer: plan details (or error)
end
fetch_plans_for_retailer-->>rank_alternatives: plan list
end
rank_alternatives->>cheap_rank: aggregate filtered by geography
cheap_rank-->>rank_alternatives: top-K by heuristic score
rank_alternatives-->>Caller: cheap-ranked shortlist
Caller->>deep_rank: shortlist, consumption slots
loop for each plan
deep_rank->>evaluate: plan, slots, entry_options
evaluate-->>deep_rank: CostBreakdown (or exception)
end
deep_rank-->>Caller: top-K by total_aud_inc_gst
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Reviewer's GuideIntroduces a new multi-plan ranking engine for CDR plans, including a fast heuristic ranker, an orchestrator to fetch and filter plan data across retailers with caching and rate limiting, and a deep ranking pass that re-scores top candidates using the existing streaming evaluator against HA consumption slots. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
rank_alternativesyou awaitfetch_plans_for_retailerin a loop, which serializes retailer calls; consider running these withasyncio.gather(with an optional concurrency limit) so retailers are fetched in parallel while still respecting per-retailer rate limits. - The rate limiting in
fetch_plans_for_retaileris implemented as a fixeddetail_delay_secsleep per plan; you might want to centralize this into a shared limiter/token-bucket so multiple callers don’t accidentally exceed the global 25 req/s budget. - In
deep_rankyou catch a broadExceptionand log only the message; usingexc_info=True(or equivalent) would make it much easier to diagnose malformed plans that cause evaluator failures without changing runtime behavior.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `rank_alternatives` you await `fetch_plans_for_retailer` in a loop, which serializes retailer calls; consider running these with `asyncio.gather` (with an optional concurrency limit) so retailers are fetched in parallel while still respecting per-retailer rate limits.
- The rate limiting in `fetch_plans_for_retailer` is implemented as a fixed `detail_delay_sec` sleep per plan; you might want to centralize this into a shared limiter/token-bucket so multiple callers don’t accidentally exceed the global 25 req/s budget.
- In `deep_rank` you catch a broad `Exception` and log only the message; using `exc_info=True` (or equivalent) would make it much easier to diagnose malformed plans that cause evaluator failures without changing runtime behavior.
## Individual Comments
### Comment 1
<location path="custom_components/pricehawk/cdr/ranking.py" line_range="391-395" />
<code_context>
+ if not slots:
+ return []
+ scored: list[tuple[Decimal, dict[str, Any], CostBreakdown]] = []
+ for plan in plans:
+ try:
+ bd = evaluate(
+ plan,
+ {"slots": slots},
+ entry_options=entry_options,
+ )
+ except Exception as err: # noqa: BLE001 — one bad plan must not sink the batch
+ _LOGGER.info(
+ "deep_rank: plan %s evaluator raised %s; skipping",
</code_context>
<issue_to_address>
**suggestion:** Using bare `Exception` with `info` logging may hide useful debugging detail when `evaluate` fails.
Catching `Exception` makes sense to keep the batch running, but logging only at `info` without a traceback will make evaluator failures hard to debug. Consider raising the log level to `warning`/`error` or using `_LOGGER.exception` (optionally behind a flag or sampling) so stack traces are available while still swallowing the error to continue ranking.
```suggestion
except Exception as err: # noqa: BLE001 — one bad plan must not sink the batch
_LOGGER.exception(
"deep_rank: plan %s evaluator raised %s; skipping",
plan.get("planId", "?"),
err,
)
```
</issue_to_address>
### Comment 2
<location path="tests/test_cdr_ranking.py" line_range="678-687" />
<code_context>
+class TestDeepRank:
</code_context>
<issue_to_address>
**suggestion (testing):** The evaluator-exception test may be brittle; consider explicitly controlling evaluate()’s failure mode.
In `test_evaluator_exception_doesnt_sink_batch`, the `poison` plan currently depends on the real evaluator raising on `unitPrice: object()`. If the evaluator later handles invalid rates more defensively, this test could still pass without exercising the exception path. Consider patching `evaluate` to raise a synthetic exception for a specific planId (e.g. "poison") and asserting that `deep_rank` still returns the other plan. That would verify the "one bad plan must not sink the batch" behaviour without depending on evaluator internals.
Suggested implementation:
```python
class TestDeepRank:
def test_orders_by_total_projected_cost(self):
"""Lower per-kWh + lower supply must rank cheaper for the same load."""
cheap = _make_full_plan(plan_id="cheap", peak="0.20", supply="0.80")
mid = _make_full_plan(plan_id="mid", peak="0.30", supply="1.00")
expensive = _make_full_plan(plan_id="exp", peak="0.45", supply="1.20")
slots = _consumption_slots(n_days=2, kwh_per_slot=0.5)
ranked = deep_rank([expensive, cheap, mid], slots)
ids = [p.get("planId") for p, _ in ranked]
assert ids == ["cheap", "mid", "exp"]
def test_evaluator_exception_doesnt_sink_batch(self, monkeypatch):
"""A single bad plan must not prevent other plans from being ranked."""
good = _make_full_plan(plan_id="good", peak="0.25", supply="1.00")
poison = _make_full_plan(plan_id="poison", peak="0.30", supply="1.10")
slots = _consumption_slots(n_days=1, kwh_per_slot=1.0)
# Synthetic evaluator: explode only for the poison plan, otherwise defer.
def fake_evaluate(plan, *args, **kwargs):
if plan.get("planId") == "poison":
raise RuntimeError("synthetic evaluator failure")
from cdr_ranking import evaluate as real_evaluate # module path to be confirmed
return real_evaluate(plan, *args, **kwargs)
# Patch the evaluator used by deep_rank so that only the poison plan fails.
monkeypatch.setattr("cdr_ranking.evaluate", fake_evaluate)
ranked = deep_rank([poison, good], slots)
ids = [p.get("planId") for p, _ in ranked]
# The good plan should still be ranked even if the poison plan evaluation fails.
assert ids == ["good"]
```
1. Replace `cdr_ranking` in the `from cdr_ranking import evaluate` import and in `monkeypatch.setattr("cdr_ranking.evaluate", ...)` with the actual module path where both `deep_rank` and `evaluate` are defined (e.g. `myapp.cdr.ranking` or similar).
2. If `evaluate` has a different signature than `(plan, slots, ...)`, adjust the `fake_evaluate` wrapper to match the real signature (keep `*args, **kwargs` to forward any extra parameters unchanged).
3. Ensure that `deep_rank` indeed calls the patched `evaluate` symbol (and not e.g. a differently named helper); if it uses a different name (e.g. `_evaluate_plan`), patch that symbol instead in the `monkeypatch.setattr` call.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Two Sourcery findings on the ranking foundation PR: - ranking.py:395 — ``deep_rank`` exception handler now uses ``_LOGGER.exception(...)`` instead of ``_LOGGER.info(..., err)``. Captures the full traceback so a malformed CDR plan that crashes the evaluator (rare but observed during parser rollout) is actually debuggable without re-running the rank job in verbose mode. ``noqa: BLE001`` retained — broad catch is the contract (one bad plan must not sink the batch). - test_cdr_ranking.py — replaced ``test_evaluator_exception_doesnt_sink_batch`` (which relied on ``object()`` triggering a real evaluator failure on ``unitPrice``) with ``test_evaluator_exception_doesnt_sink_batch_explicit_mock`` that patches ``evaluate`` to raise selectively. Test now exercises the exception-isolation contract directly and survives any future evaluator hardening that would silently no-op the old test. Tests: 47/47 ranking, 695/695 full suite still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_cdr_ranking.py`:
- Around line 147-237: Add a unit test to document and assert behavior when a
negative unitPrice appears: in tests/test_cdr_ranking.py add a method (e.g.,
test_negative_unit_price_handled) that builds a plan via
_make_plan(peak="-0.10", supply="1.00"), calls cheap_rank_score(plan) and
asserts the expected Decimal score (Decimal("23.00")) to ensure
_extract_peak_rate_cents and cheap_rank_score correctly handle negative rates
without raising.
- Around line 716-756: Move the MagicMock import out of the individual test
functions and into the module-level imports so both
test_evaluator_exception_doesnt_sink_batch_explicit_mock and
test_passes_entry_options_through use the shared MagicMock symbol; update the
top-of-file import list to include MagicMock (alongside AsyncMock and patch) and
remove the local "from unittest.mock import MagicMock" lines inside those test
functions so they reference the module-level MagicMock.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9bccd7ba-f42e-40f9-bb57-d2cee019696c
📒 Files selected for processing (2)
custom_components/pricehawk/cdr/ranking.pytests/test_cdr_ranking.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Useasync/awaitfor all I/O operations
NEVER hardcode tokens, API keys, or credentials in any file — use HA config entry storage
State restore MUST validate storage version before loading
from_dict() methods MUST receive an explicit HA-timezone date — no date.today() fallback
Files:
tests/test_cdr_ranking.pycustom_components/pricehawk/cdr/ranking.py
⚙️ CodeRabbit configuration file
**/*.py: Check for: type hints on all public functions, no bareexcept:, SQL injection risks, missing input sanitisation, secrets not in code, Flask Blueprint structure respected, APScheduler job error handling.
Files:
tests/test_cdr_ranking.pycustom_components/pricehawk/cdr/ranking.py
**/test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Tariff rate calculation changes require edge case tests (negative rates, midnight boundaries, empty windows)
Files:
tests/test_cdr_ranking.py
🔇 Additional comments (14)
custom_components/pricehawk/cdr/ranking.py (7)
1-82: LGTM!
84-132: LGTM!
135-200: LGTM!
203-235: LGTM!
243-309: LGTM!
312-350: LGTM!
358-405: LGTM!tests/test_cdr_ranking.py (7)
1-70: LGTM!
78-139: LGTM!
245-256: LGTM!
264-302: LGTM!
310-480: LGTM!
483-608: LGTM!
678-714: LGTM!
Two CodeRabbit findings on PR #64: - test_cdr_ranking.py — added ``test_negative_unit_price_scores_negatively`` to satisfy the AEGIS rule: "tariff rate calculation changes require negative rate edge case tests". Documents that ``_extract_peak_rate_cents`` accepts negative unitPrice (rare but observed on AEMO export-only feed-in plans) and that negative-peak plans rank cheaper than positive-peak ones. Sanity-checks the cheap-rank ordering contract under negative input. - test_cdr_ranking.py — hoisted ``MagicMock`` to the module-level import on line 17 (alongside ``AsyncMock`` and ``patch``) and removed the in-function imports at the two TestDeepRank methods that used it. Consistency nit, no behavior change. Tests: 48/48 ranking, 696/696 full suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PriceHawk's integration branch is ``dev``, not ``main`` or ``develop``. The current ``auto_review.base_branches`` list omitted ``dev``, so CR auto-skipped every PR opened against it (e.g. #64), requiring a manual ``@coderabbitai review`` trigger. Adding ``dev`` here means future feature PRs against ``dev`` get CR review immediately without manual intervention. Other base branches kept untouched. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(coordinator): Phase 3.1 commit 4 — daily ranking job hook Wires the multi-plan ranking pipeline (from PR #64) into the coordinator on a daily 00:30 local schedule. Cheap-rank only for now; deep-rank (consumption replay) joins in Phase 3.2 when the universal HA-history backfill ships. Architecture ------------ Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``: - ``get_user_geography(options)`` — extracts ``(state, postcode, distributor)`` from a config_entry's options. ``distributor`` comes from the user's current cdr_plan's ``geography.distributors[0]`` (the plan the user already accepted, so its distributor is by definition theirs). - ``get_competitor_retailers(session, options, *, competitor_fragments)`` — returns the retailer list scanned daily. Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``) + hardcoded big-4 nationally-active competitors (AGL, Origin, EnergyAustralia, Red Energy). Dedup by brand_id so the user's retailer isn't double-scanned when it's also a big-4. Custom fragment tuple supported for tests + future user-configurable list. - ``run_ranking_job(session, options, *, top_k, plan_cache, competitor_fragments)`` — top-level orchestrator. Returns the cheap-ranked top-K plans. Coordinator-side wrappers ------------------------- - New attributes on PriceHawkCoordinator: - ``_cheap_ranked_alternatives: list[dict]`` - ``_ranking_last_run_at: datetime | None`` - ``_ranking_plan_cache: dict[str, dict]`` - ``_ranking_unsub: CALLBACK_TYPE | None`` - ``schedule_daily_ranking()`` registers ``async_track_time_change`` for ``hour=0, minute=30`` local. Safe to call twice (second call replaces first). - ``cancel_ranking()`` clears the handle. - ``async_run_ranking_job(*, top_k)`` is a thin delegate around ``ranking_job.run_ranking_job``. Owns HA-side concerns: session acquisition, exception swallowing across the daily boundary, state persistence, cache reset. Empty-result runs keep prior ranking (transient failure → don't zero yesterday's data). Why a separate module --------------------- PriceHawkCoordinator inherits from ``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by ``tests/conftest.py``. The class is therefore unreachable in unit tests. Extracting the pure logic to ``ranking_job`` keeps it unit-testable while the class methods stay as thin delegates. Tests ----- 17 new tests in ``tests/test_coordinator_ranking.py``: - Module constants (competitor fragments, run-time). - ``get_user_geography``: 6 cases (happy path, missing fields, list ordering, empty options). - ``get_competitor_retailers``: 5 cases (current retailer first, dedup, missing brand, unmatchable fragment, custom override). - ``run_ranking_job``: 4 cases (empty retailers, geography forwarding, cache passthrough, top_k forwarding). Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py`` mock list — coordinator.py imports it; without the mock, importing the coordinator silently fell back to a MagicMock class. Full suite: 713/713 pass (was 696 — +17 new). Ruff clean. Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service ``pricehawk.rank_alternatives`` in ``__init__.py``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: PR #68 CR findings — cache TTL + distributors type guard Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook). cache TTL (Sourcery + CR Major, same issue): - coordinator.py — replaced unconditional ``cache.clear()`` at end of ``async_run_ranking_job`` with a date-rollover check at the start. Same-day reruns (scheduled at 00:30, manual via rank_alternatives service) now reuse cached plans as the docstring promised. New local day → cache clears before the run so overnight republished plans get fresh data. - New ``_ranking_cache_date: date | None`` attribute tracks the cache's "as-of" day. distributors type guard (CR Minor): - ranking_job.py — ``get_user_geography`` now type-guards ``distributors`` with ``isinstance(..., list)`` before indexing. Malformed CDR payload could ship distributors as a string (e.g. ``"United Energy"``) and ``[0]`` would silently become ``"U"`` — skewing the ranking filter without raising. Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard). Full suite 714/714. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: PR #68 round-2 CR — nested shape guards + asyncio.Lock Three CodeRabbit findings on the round-1 fixes (f998d16): 1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper that walks ``cdr_plan["data"]`` with isinstance gates at every level. Previously ``cdr_plan["data"]`` was assumed to be a dict; a malformed payload could ship it as a string (e.g. ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on ``.get("geography")``. Same for ``brand`` — now ``isinstance(..., str)`` before ``.strip()``. ``geography`` non-dict also guarded in ``get_user_geography``. First-distributor-must-be-str check added to prevent dict / int from sneaking through as a distributor filter value. 2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to serialise concurrent runs. Scheduled 00:30 callback + manual service trigger could enter ``async_run_ranking_job`` simultaneously, interleaving ``_ranking_plan_cache`` mutations and duplicating every expensive CDR detail fetch. The lock means the second caller blocks briefly then returns the freshly-populated cache results. 3. tests/test_coordinator_ranking.py — added 4 new tests for non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``, and non-string first distributor. Covers the AttributeError regression paths CR flagged. Tests: 22/22 ranking + 722/722 full suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ifecycle (#72) * feat(coordinator): Phase 3.1 commit 4 — daily ranking job hook Wires the multi-plan ranking pipeline (from PR #64) into the coordinator on a daily 00:30 local schedule. Cheap-rank only for now; deep-rank (consumption replay) joins in Phase 3.2 when the universal HA-history backfill ships. Architecture ------------ Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``: - ``get_user_geography(options)`` — extracts ``(state, postcode, distributor)`` from a config_entry's options. ``distributor`` comes from the user's current cdr_plan's ``geography.distributors[0]`` (the plan the user already accepted, so its distributor is by definition theirs). - ``get_competitor_retailers(session, options, *, competitor_fragments)`` — returns the retailer list scanned daily. Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``) + hardcoded big-4 nationally-active competitors (AGL, Origin, EnergyAustralia, Red Energy). Dedup by brand_id so the user's retailer isn't double-scanned when it's also a big-4. Custom fragment tuple supported for tests + future user-configurable list. - ``run_ranking_job(session, options, *, top_k, plan_cache, competitor_fragments)`` — top-level orchestrator. Returns the cheap-ranked top-K plans. Coordinator-side wrappers ------------------------- - New attributes on PriceHawkCoordinator: - ``_cheap_ranked_alternatives: list[dict]`` - ``_ranking_last_run_at: datetime | None`` - ``_ranking_plan_cache: dict[str, dict]`` - ``_ranking_unsub: CALLBACK_TYPE | None`` - ``schedule_daily_ranking()`` registers ``async_track_time_change`` for ``hour=0, minute=30`` local. Safe to call twice (second call replaces first). - ``cancel_ranking()`` clears the handle. - ``async_run_ranking_job(*, top_k)`` is a thin delegate around ``ranking_job.run_ranking_job``. Owns HA-side concerns: session acquisition, exception swallowing across the daily boundary, state persistence, cache reset. Empty-result runs keep prior ranking (transient failure → don't zero yesterday's data). Why a separate module --------------------- PriceHawkCoordinator inherits from ``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by ``tests/conftest.py``. The class is therefore unreachable in unit tests. Extracting the pure logic to ``ranking_job`` keeps it unit-testable while the class methods stay as thin delegates. Tests ----- 17 new tests in ``tests/test_coordinator_ranking.py``: - Module constants (competitor fragments, run-time). - ``get_user_geography``: 6 cases (happy path, missing fields, list ordering, empty options). - ``get_competitor_retailers``: 5 cases (current retailer first, dedup, missing brand, unmatchable fragment, custom override). - ``run_ranking_job``: 4 cases (empty retailers, geography forwarding, cache passthrough, top_k forwarding). Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py`` mock list — coordinator.py imports it; without the mock, importing the coordinator silently fell back to a MagicMock class. Full suite: 713/713 pass (was 696 — +17 new). Ruff clean. Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service ``pricehawk.rank_alternatives`` in ``__init__.py``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: PR #68 CR findings — cache TTL + distributors type guard Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook). cache TTL (Sourcery + CR Major, same issue): - coordinator.py — replaced unconditional ``cache.clear()`` at end of ``async_run_ranking_job`` with a date-rollover check at the start. Same-day reruns (scheduled at 00:30, manual via rank_alternatives service) now reuse cached plans as the docstring promised. New local day → cache clears before the run so overnight republished plans get fresh data. - New ``_ranking_cache_date: date | None`` attribute tracks the cache's "as-of" day. distributors type guard (CR Minor): - ranking_job.py — ``get_user_geography`` now type-guards ``distributors`` with ``isinstance(..., list)`` before indexing. Malformed CDR payload could ship distributors as a string (e.g. ``"United Energy"``) and ``[0]`` would silently become ``"U"`` — skewing the ranking filter without raising. Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard). Full suite 714/714. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: PR #68 round-2 CR — nested shape guards + asyncio.Lock Three CodeRabbit findings on the round-1 fixes (f998d16): 1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper that walks ``cdr_plan["data"]`` with isinstance gates at every level. Previously ``cdr_plan["data"]`` was assumed to be a dict; a malformed payload could ship it as a string (e.g. ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on ``.get("geography")``. Same for ``brand`` — now ``isinstance(..., str)`` before ``.strip()``. ``geography`` non-dict also guarded in ``get_user_geography``. First-distributor-must-be-str check added to prevent dict / int from sneaking through as a distributor filter value. 2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to serialise concurrent runs. Scheduled 00:30 callback + manual service trigger could enter ``async_run_ranking_job`` simultaneously, interleaving ``_ranking_plan_cache`` mutations and duplicating every expensive CDR detail fetch. The lock means the second caller blocks briefly then returns the freshly-populated cache results. 3. tests/test_coordinator_ranking.py — added 4 new tests for non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``, and non-string first distributor. Covers the AttributeError regression paths CR flagged. Tests: 22/22 ranking + 722/722 full suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(integration): Phase 3.1 commit 5 — rank_alternatives HA service + lifecycle Wires the ranking job into the integration's lifecycle and exposes a manual-trigger HA service. Completes Phase 3.1's user-visible plumbing (sensor + dashboard exposure is commit 6). Lifecycle wiring ---------------- - ``async_setup_entry`` now calls ``coordinator.schedule_daily_ranking()`` to register the 00:30 local-time callback, plus ``hass.async_create_task`` for an immediate first run so the alternatives data isn't empty until midnight on a fresh install. - ``async_unload_entry`` calls ``coordinator.cancel_ranking()`` mirroring the existing ``cancel_persist()`` pattern. HA service: pricehawk.rank_alternatives --------------------------------------- - ``services.yaml`` declares the service with a single ``top_k`` int field (1-100, default 20). - Handler clamps top_k to [1, 100] and delegates to ``coordinator.async_run_ranking_job(top_k=...)``. - Most useful right after a plan switch: lets the user force-run the ranking pipeline so the alternatives sensor reflects the new distributor / postcode without waiting for the next 00:30 schedule. - Unregistered alongside the other two services in ``async_unload_entry`` when the last config entry goes away. No tests -------- The service handler is a closure inside ``async_setup_entry`` (tightly coupled to HA's service registration plumbing — can't be unit-tested without HA itself). The underlying ``async_run_ranking_job`` is already covered by ``tests/test_coordinator_ranking.py`` (17 tests via the ``ranking_job`` module). Live smoke test will validate the wiring once the integration runs in HA. Full suite: 713/713 still passing. Ruff clean. services.yaml validates. Refs PHASE-3-ROADMAP.md §3.1. Next (optional commit 6): sensor for ranked alternatives + dashboard exposure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: PR #69 CR finding — handle invalid top_k payload defensively CR found that ``int(call.data.get("top_k", 20))`` raises ValueError/TypeError on malformed service data (e.g. a YAML typo sending ``top_k: "abc"``). The exception propagates back to the service caller and fails the run instead of degrading gracefully. Now wraps the coercion in try/except, logs a warning with the bad value, and falls back to the default 20. Clamp [1, 100] still applied after. Same kind of defence as elsewhere in the codebase (``safe_int``, ``safe_decimal`` for incentive parsers). Tests: 714/714 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
First 3 of 4-6 planned commits for Phase 3.1 (multi-plan ranking engine).
Pure additive module: no existing code touched, no behavior change at runtime
yet. Provides the building blocks the coordinator hook + HA service will use
in the next PR.
What was broken
Nothing was broken. Before this PR, PriceHawk could only compute cost for the
user's CURRENT plan + Amber wholesale comparator. It had no machinery to
answer the question that motivates the whole project: "what plan should
I switch to?"
What this fixes
Adds the pure-logic foundation for answering that question:
(
peak_rate * 0.7 + daily_supply * 0.3) so we can narrow ~300+residential plans per state down to a top-K in O(n) without firing
the heavy evaluator on every plan.
rate limiting + caching, applies geography filter, runs cheap-rank.
using the user's actual HA consumption replay through the streaming
evaluator.
No coordinator wiring yet — that's the next PR.
Test plan
pytest tests/test_cdr_ranking.py -v— 47/47 passpytest -q— 695/695 full suite pass (633 baseline + 47 new + 15drive-by from prior PRs already on dev)
ruff check custom_components/pricehawk/cdr/ranking.py tests/test_cdr_ranking.py— cleanGLO731031MR@VECfixture: cheap_rank_scorereturns 56.7 cents (matches manual calc: 36c peak * 0.7 + 105c
supply * 0.3)
Changes
New files
custom_components/pricehawk/cdr/ranking.pytests/test_cdr_ranking.pyCommits
24d3643—cheap_rank_score+matches_geography+filter_eligible_plans+cheap_rank(29 tests)07c0459—fetch_plans_for_retailer+rank_alternatives(12 tests; rate-limit + cache + failure-isolation)540602e—deep_rank(6 tests; evaluator integration, zero-slot drop, entry_options passthrough)Why
Per
.planning/PHASE-3-ROADMAP.md§3.1. This is the first of 5 sub-phases(3.0 already shipped in v1.5.0; 3.2–3.5 follow). Foundation is pure-logic
so it can be tested + reviewed in isolation before wiring it into the
coordinator (next PR) where the blast radius is bigger.
Heuristic design choice: peak-rate-dominated because peak rate drives ~80% of
the bill for most AU households; supply charge is a meaningful but smaller
fraction. Both terms in cents (peak c/kWh, supply c/day) to keep the same
numeric scale (~30 vs ~100). All ex-GST since all plans share the multiplier;
ranking order is preserved.
Defensive-by-default everywhere:
DROPPED from results, not zero-scored. Otherwise garbage would surface as
"cheapest" and mislead users.
logged + skipped so one bad plan doesn't sink a retailer's batch.
bad retailer doesn't block ranking against the others.
Breaking Changes
None. Pure additive — no existing code path touches the new module yet.
Runtime behavior unchanged on
dev.Files Changed
custom_components/pricehawk/cdr/ranking.py(+401, new)tests/test_cdr_ranking.py(+758, new)🤖 Generated with Claude Code
Summary by Sourcery
Introduce a new multi-plan ranking module for CDR plans, including cheap heuristic scoring, orchestration across retailers, and deep ranking by projected cost, along with comprehensive tests.
New Features:
Tests:
Summary
Added a new multi-plan CDR ranking module (Phase 3.1) that provides three primary components for scoring and ranking CDR electricity plans without modifying existing code paths:
Changes Made
Geography Matching: Implemented
matches_geography()to filter plans based on included/excluded postcodes, distributor matching, state, and nationwide fallback logic.Cheap-Rank Heuristic Scorer: Added
cheap_rank_score()and supporting functions to quickly score plans using a weighted formula (peak_rate_cents × 0.7 + daily_supply_cents × 0.3), with robust handling of malformed plans (dropped from results). Filters eligible plans viafilter_eligible_plans()and ranks them viacheap_rank()to produce a top-K shortlist.Async Orchestrator: Implemented
fetch_plans_for_retailer()andrank_alternatives()to:planIdDeep-Rank Re-Scorer: Added
deep_rank()to re-rank the cheap-rank shortlist using the streaming evaluator against provided consumption slots, projecting true costs and filtering out plans with zero billable slots. Evaluator exceptions are caught per-plan to prevent one failure from failing the batch.Comprehensive Test Suite: Added 47 new tests covering geography matching, heuristic scoring, bulk filtering, orchestration/caching, cross-retailer ranking, and deep-rank/evaluator behavior. All 695 tests passing locally. Smoke test validates
cheap_rank_scoreon fixture GLO731031MR@VEC returns expected 56.7 cents.Why These Changes
These components form the foundation for intelligent multi-plan comparison and selection in the CDR system. The staged approach (fast heuristic → detailed evaluation) optimizes performance while maintaining accuracy.
Breaking Changes
None. The module is purely additive and not yet integrated into the coordinator. Coordinator-level integration deferred to subsequent PR(s).