Skip to content

feat(coordinator): Phase 3.1 commit 4 — daily ranking job hook#68

Merged
Artic0din merged 3 commits into
devfrom
phase-3-1-coordinator-hook
May 17, 2026
Merged

feat(coordinator): Phase 3.1 commit 4 — daily ranking job hook#68
Artic0din merged 3 commits into
devfrom
phase-3-1-coordinator-hook

Conversation

@Artic0din
Copy link
Copy Markdown
Owner

@Artic0din Artic0din commented May 16, 2026

Summary

Commit 4 of Phase 3.1. Wires the ranking foundation (PR #64) into the coordinator on a daily 00:30 local schedule. Cheap-rank only; deep-rank deferred to Phase 3.2 (needs HA-history backfill).

What was broken

The ranking foundation (cdr/ranking.py) was merged in #64 but had no consumer. Nothing scheduled it; nothing persisted results. Just inert library code.

What this fixes

  1. schedule_daily_ranking() registers async_track_time_change at 00:30 local. Daily job runs after midnight rollover so today's daily_cost_history is final.
  2. async_run_ranking_job() orchestrates get_competitor_retailersrank_alternatives and persists top-K to coordinator._cheap_ranked_alternatives.
  3. Plan-detail cache keyed by planId so daily reruns skip unchanged plans.
  4. Empty-result runs preserve prior ranking (transient failure → don't zero yesterday's data).

Still not consumer-visible — sensor wiring + lifecycle invocation come in commits 5-6.

Test plan

  • pytest tests/test_coordinator_ranking.py -v — 17/17 pass
  • pytest -q — 713/713 full suite pass (was 696 + 17 new)
  • ruff check — clean
  • Coordinator integration test deferred until commit 5 wires lifecycle
  • Live CDR endpoint smoke test deferred until commit 5

Changes

New files

File Lines Purpose
custom_components/pricehawk/cdr/ranking_job.py 152 Pure-logic orchestration (testable without HA)
tests/test_coordinator_ranking.py 247 17 tests: geography, retailers, run_job

Modified

File Δ Purpose
custom_components/pricehawk/coordinator.py +84 / -1 New attrs + 3 method delegates
tests/conftest.py +2 Mock homeassistant.helpers.aiohttp_client

Coordinator-side additions

# __init__:
self._cheap_ranked_alternatives: list[dict] = []
self._ranking_last_run_at: datetime | None = None
self._ranking_plan_cache: dict[str, dict] = {}
self._ranking_unsub: CALLBACK_TYPE | None = None

# Methods (thin delegates to ranking_job module):
def schedule_daily_ranking(self) -> None
def cancel_ranking(self) -> None
async def async_run_ranking_job(self, *, top_k=20) -> list[dict]

Why

Per .planning/PHASE-3-ROADMAP.md §3.1. The roadmap calls for both cheap-rank and deep-rank in Phase 3.1, but deep-rank needs HA-history backfill (Phase 3.2) to have meaningful consumption to rank against. Shipping cheap-rank-only now means users see ranked alternatives by daily supply + peak rate immediately; deep-rank slots in as a refinement once Phase 3.2 lands.

Pure logic lives in cdr/ranking_job.py (not coordinator.py) because PriceHawkCoordinator inherits from DataUpdateCoordinator[dict[str, Any]] which gets mocked away by conftest. Class is unreachable in tests. Module functions stay testable; class methods are thin delegates.

Breaking Changes

None. Pure additive on coordinator — no existing attribute or method touched. Schedule isn't wired to lifecycle yet (commit 5), so runtime behavior unchanged on dev.

Files Changed

  • custom_components/pricehawk/cdr/ranking_job.py (+152, new)
  • custom_components/pricehawk/coordinator.py (+84, -1)
  • tests/test_coordinator_ranking.py (+247, new)
  • tests/conftest.py (+2)

🤖 Generated with Claude Code

Summary by Sourcery

Integrate a daily multi-plan ranking job into the coordinator and introduce a testable orchestration layer for ranking competitors based on user geography and configuration.

New Features:

  • Add a coordinator-managed daily ranking job scheduled for 00:30 local time that computes and persists top-K cheaper alternatives.
  • Introduce a ranking_job orchestration module that derives user geography, resolves competitor retailers, and runs the cheap-rank pipeline over the retailer registry.

Enhancements:

  • Extend the coordinator with state for cached ranked alternatives, last run time, and a per-plan cache to avoid re-fetching unchanged plans while handling empty-result runs gracefully.

Tests:

  • Add unit tests for ranking_job orchestration covering module constants, user geography extraction, competitor retailer selection, and end-to-end run_ranking_job behavior.
  • Update test configuration to mock aiohttp_client to support the new ranking job tests.
  • What changed

    • Schedules and wires a daily "cheap-ranking" job into the coordinator to run at 00:30 local time.
    • Adds a pure-orchestration ranking job module (custom_components/pricehawk/cdr/ranking_job.py) that:
      • Safely extracts user geography (postcode, distributor) from config-entry options with defensive guards for malformed CDR payloads.
      • Builds a deduplicated retailer list (current retailer + competitor fragments, default "big-4") and resolves competitors via the registry.
      • Calls the ranking pipeline (rank_alternatives) with top_k and an optional per-plan cache; returns [] when no retailers resolved.
    • Coordinator additions (custom_components/pricehawk/coordinator.py):
      • New ranking state: _cheap_ranked_alternatives (list), _ranking_last_run_at (datetime | None), _ranking_plan_cache (dict), _ranking_cache_date (date | None), _ranking_unsub (CALLBACK_TYPE | None), and a new asyncio.Lock() _ranking_lock to serialize runs.
      • New public delegate methods: schedule_daily_ranking(), cancel_ranking(), async_run_ranking_job(top_k=DEFAULT_TOP_K).
      • schedule_daily_ranking registers async_track_time_change at 00:30 local (replacing any prior schedule); cancel_ranking clears the subscription.
      • async_run_ranking_job resets per-plan cache on local date-rollover, invokes run_ranking_job with coordinator options and cache, persists non-empty results and last-run timestamp, preserves prior results when a run returns empty, and logs/returns prior alternatives on pipeline exceptions.
    • Robustness and cache behavior (follow-up):
      • Added _safe_plan_data and defensive isinstance guards to avoid AttributeError on malformed CDR payloads (non-dict shapes, non-list distributors, non-string distributor values).
      • Introduced self._ranking_lock = asyncio.Lock() to serialize concurrent runs and prevent concurrent mutations of the plan cache.
      • Cache clearing changed to date-rollover semantics so same-day reruns reuse fetched plan details.
    • Tests and test infra:
      • New tests: tests/test_coordinator_ranking.py (geography extraction, competitor composition, orchestration, parameter forwarding, empty-registry behavior, and guarded failure paths).
      • tests/conftest.py updated to register homeassistant.helpers.aiohttp_client mock for test imports.
      • Local full test suite reported passing; ranking tests expanded after follow-up fixes.
  • Why

    • To run the previously inert ranking library on a daily basis and persist cheap-ranking results in the coordinator for downstream features and future sensor exposure (lifecycle/sensor wiring deferred to later commits).
    • To avoid redundant work on same-day reruns by caching plan details per planId and only refreshing on local date rollover.
    • To harden the orchestration against malformed CDR payloads and concurrent runs.
  • Breaking changes

    • None. Scheduling and orchestration are added but not yet wired into runtime lifecycle; no public API breaking changes.
  • Files changed

File Lines Added Lines Removed
custom_components/pricehawk/cdr/ranking_job.py 169 0
custom_components/pricehawk/coordinator.py 115 2
tests/test_coordinator_ranking.py 300 0
tests/conftest.py 2 0
Total 586 2

Review Change Stack

Wires the multi-plan ranking pipeline (from PR #64) into the
coordinator on a daily 00:30 local schedule. Cheap-rank only for
now; deep-rank (consumption replay) joins in Phase 3.2 when the
universal HA-history backfill ships.

Architecture
------------

Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``:

- ``get_user_geography(options)`` — extracts ``(state, postcode,
  distributor)`` from a config_entry's options. ``distributor``
  comes from the user's current cdr_plan's
  ``geography.distributors[0]`` (the plan the user already accepted,
  so its distributor is by definition theirs).

- ``get_competitor_retailers(session, options, *,
  competitor_fragments)`` — returns the retailer list scanned daily.
  Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``)
  + hardcoded big-4 nationally-active competitors (AGL, Origin,
  EnergyAustralia, Red Energy). Dedup by brand_id so the user's
  retailer isn't double-scanned when it's also a big-4. Custom
  fragment tuple supported for tests + future user-configurable list.

- ``run_ranking_job(session, options, *, top_k, plan_cache,
  competitor_fragments)`` — top-level orchestrator. Returns the
  cheap-ranked top-K plans.

Coordinator-side wrappers
-------------------------

- New attributes on PriceHawkCoordinator:
  - ``_cheap_ranked_alternatives: list[dict]``
  - ``_ranking_last_run_at: datetime | None``
  - ``_ranking_plan_cache: dict[str, dict]``
  - ``_ranking_unsub: CALLBACK_TYPE | None``

- ``schedule_daily_ranking()`` registers ``async_track_time_change``
  for ``hour=0, minute=30`` local. Safe to call twice (second call
  replaces first).
- ``cancel_ranking()`` clears the handle.
- ``async_run_ranking_job(*, top_k)`` is a thin delegate around
  ``ranking_job.run_ranking_job``. Owns HA-side concerns: session
  acquisition, exception swallowing across the daily boundary,
  state persistence, cache reset. Empty-result runs keep prior
  ranking (transient failure → don't zero yesterday's data).

Why a separate module
---------------------

PriceHawkCoordinator inherits from
``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by
``tests/conftest.py``. The class is therefore unreachable in unit
tests. Extracting the pure logic to ``ranking_job`` keeps it
unit-testable while the class methods stay as thin delegates.

Tests
-----

17 new tests in ``tests/test_coordinator_ranking.py``:

- Module constants (competitor fragments, run-time).
- ``get_user_geography``: 6 cases (happy path, missing fields, list
  ordering, empty options).
- ``get_competitor_retailers``: 5 cases (current retailer first,
  dedup, missing brand, unmatchable fragment, custom override).
- ``run_ranking_job``: 4 cases (empty retailers, geography
  forwarding, cache passthrough, top_k forwarding).

Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py``
mock list — coordinator.py imports it; without the mock, importing
the coordinator silently fell back to a MagicMock class.

Full suite: 713/713 pass (was 696 — +17 new). Ruff clean.

Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service
``pricehawk.rank_alternatives`` in ``__init__.py``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented May 16, 2026

Reviewer's Guide

Wires the previously standalone ranking foundation into the coordinator via a daily 00:30 scheduled job, implementing HA-independent orchestration logic in a new ranking_job module, adding coordinator state and wrappers for scheduling and execution, and covering the behavior with unit tests for geography, retailer selection, and ranking job orchestration.

File-Level Changes

Change Details Files
Introduce HA-independent ranking job orchestration that drives competitor discovery and ranking using existing ranking logic.
  • Add get_user_geography helper to extract postcode and distributor from config options.
  • Add get_competitor_retailers to build a prioritized, de-duplicated list of the user’s current retailer plus big-4 competitors from the registry.
  • Add run_ranking_job to glue together retailer resolution, geography extraction, and rank_alternatives, with support for top_k and a plan cache.
custom_components/pricehawk/cdr/ranking_job.py
Extend the coordinator with state, scheduling, and execution hooks for the daily ranking job.
  • Define constants for the daily ranking run time (00:30) in local time.
  • Add coordinator attributes for cached ranked alternatives, last run timestamp, per-plan cache, and unsubscribe handle.
  • Implement schedule_daily_ranking using async_track_time_change, with idempotent rescheduling behavior.
  • Implement cancel_ranking to tear down the scheduled callback safely.
  • Implement async_run_ranking_job to invoke run_ranking_job with HA session and options, handle exceptions by preserving previous results, clear the plan cache after successful runs, and update persisted ranking and last-run timestamp only when non-empty results are returned.
custom_components/pricehawk/coordinator.py
Adjust test scaffolding and add tests for ranking job orchestration, geography, and competitor selection.
  • Extend the Home Assistant mocks to include aiohttp_client for client-session acquisition in coordinator code.
  • Add tests verifying competitor fragment constants and the configured ranking run time.
  • Add tests for get_user_geography across present/missing/ambiguous data cases.
  • Add tests for get_competitor_retailers covering ordering, de-duplication, missing brands, unmatched fragments, and custom fragment overrides.
  • Add tests for run_ranking_job covering empty retailers, geography forwarding, plan cache passthrough, and top_k forwarding.
tests/conftest.py
tests/test_coordinator_ranking.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 16528a09-25ee-4a62-804f-0e2d14b3343f

📥 Commits

Reviewing files that changed from the base of the PR and between f998d16 and 60153a3.

📒 Files selected for processing (3)
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py
  • tests/test_coordinator_ranking.py
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use async/await for all I/O operations
NEVER hardcode tokens, API keys, or credentials in any file — use HA config entry storage
State restore MUST validate storage version before loading
from_dict() methods MUST receive an explicit HA-timezone date — no date.today() fallback

Files:

  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/coordinator.py
  • custom_components/pricehawk/cdr/ranking_job.py

⚙️ CodeRabbit configuration file

**/*.py: Check for: type hints on all public functions, no bare except:, SQL injection risks, missing input sanitisation, secrets not in code, Flask Blueprint structure respected, APScheduler job error handling.

Files:

  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/coordinator.py
  • custom_components/pricehawk/cdr/ranking_job.py
**/test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Tariff rate calculation changes require edge case tests (negative rates, midnight boundaries, empty windows)

Files:

  • tests/test_coordinator_ranking.py
🔇 Additional comments (3)
custom_components/pricehawk/cdr/ranking_job.py (1)

31-36: LGTM!

Also applies to: 39-67, 70-82, 84-127, 129-170

custom_components/pricehawk/coordinator.py (1)

6-7: LGTM!

Also applies to: 16-17, 49-50, 67-73, 256-277, 1195-1272

tests/test_coordinator_ranking.py (1)

1-301: LGTM!


Walkthrough

Adds Phase 3.1 daily multi-plan ranking: a new ranking_job module extracts user geography, builds prioritized/deduped retailer lists, and calls rank_alternatives. The coordinator schedules daily runs at 00:30, caches per-plan data with day rollover, persists top-K results, and handles failures safely. Tests validate parsing, composition, and orchestration.

Changes

Phase 3.1 Daily Ranking

Layer / File(s) Summary
Ranking job module and orchestration
custom_components/pricehawk/cdr/ranking_job.py
New module with DEFAULT_COMPETITOR_BRAND_FRAGMENTS constant, get_user_geography() to extract postcode and distributor from options, _safe_plan_data() for defensive plan access, get_competitor_retailers() to build a prioritized deduped retailer list (current retailer first, then competitors by fragment match), and run_ranking_job() to orchestrate the full pipeline and forward parameters to rank_alternatives.
Coordinator scheduling and execution
custom_components/pricehawk/coordinator.py
Adds _RANKING_RUN_HOUR/_MINUTE (00:30), imports async_track_time_change, DEFAULT_TOP_K, and run_ranking_job. New coordinator state: cached top-K alternatives, last-run timestamp, per-plan plan cache with local-day rollover, unsubscribe handle, and an async lock. Implements schedule_daily_ranking(), cancel_ranking(), and async_run_ranking_job() with exception safety and cache rollover.
Test infrastructure and validation
tests/conftest.py, tests/test_coordinator_ranking.py
Registers mocked homeassistant.helpers.aiohttp_client in conftest. New tests exercise coordinator constants, get_user_geography (including malformed inputs), get_competitor_retailers (ordering, deduplication, fallback, fragment override), and run_ranking_job orchestration (empty retailers, forwarding postcode/distributor/top_k, and plan_cache passthrough).

Sequence Diagram

sequenceDiagram
  participant Coordinator as PriceHawkCoordinator
  participant RankingJob as custom_components.pricehawk.cdr.ranking_job
  participant Registry as RetailerRegistry (get_registry)
  participant RankService as rank_alternatives
  Coordinator->>RankingJob: run_ranking_job(session, options, top_k, plan_cache)
  RankingJob->>RankingJob: get_user_geography(options)
  RankingJob->>Registry: get_registry(session)
  Registry-->>RankingJob: retailer endpoints
  RankingJob->>RankingJob: get_competitor_retailers(endpoints, fragments)
  RankingJob->>RankService: rank_alternatives(state, postcode, distributor, top_k, cache)
  RankService-->>RankingJob: ranked alternatives
  RankingJob-->>Coordinator: results or []
  alt Non-empty results
    Coordinator->>Coordinator: persist alternatives, update last_run
  else Empty or exception
    Coordinator->>Coordinator: log, return prior cached alternatives
  end
  Coordinator->>Coordinator: clear plan_cache on day rollover
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Artic0din/ha-pricehawk#64: Ranking engine PR providing rank_alternatives and DEFAULT_TOP_K, which this orchestration layer imports and invokes.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 52.94% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly describes the main change: introducing a daily ranking job hook into the coordinator as part of Phase 3.1, which is the primary objective of this PR.
Description check ✅ Passed Description covers summary, changes, test plan, and breaking changes; all sections are filled. Some template checkboxes remain unchecked but are appropriately addressed (lifecycle/smoke tests deferred to commits 5-6).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch phase-3-1-coordinator-hook
  • 🛠️ scrub-secrets
  • 🛠️ no-hardcoded-rates
  • 🛠️ amber-api-limits
  • 🛠️ dashboard-protocol-safety

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In the tests you repeatedly use asyncio.run(...) to invoke async functions; consider marking the tests as async (e.g. via pytest.mark.asyncio or HA’s async test helpers) and awaiting directly to avoid potential event-loop nesting issues and to better match how the code runs under Home Assistant.
  • In async_run_ranking_job, _ranking_last_run_at is only updated when ranked is non-empty, so a run that successfully completes but yields an empty result will appear as if it never ran; if that’s not intentional, consider updating the timestamp whenever the pipeline finishes without raising, regardless of whether any alternatives were found.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In the tests you repeatedly use `asyncio.run(...)` to invoke async functions; consider marking the tests as async (e.g. via `pytest.mark.asyncio` or HA’s async test helpers) and `await`ing directly to avoid potential event-loop nesting issues and to better match how the code runs under Home Assistant.
- In `async_run_ranking_job`, `_ranking_last_run_at` is only updated when `ranked` is non-empty, so a run that successfully completes but yields an empty result will appear as if it never ran; if that’s not intentional, consider updating the timestamp whenever the pipeline finishes without raising, regardless of whether any alternatives were found.

## Individual Comments

### Comment 1
<location path="custom_components/pricehawk/coordinator.py" line_range="1250-1252" />
<code_context>
+            _LOGGER.info(
+                "ranking: persisted %d alternative(s)", len(ranked),
+            )
+        # Reset the cache for tomorrow's run — plans may republish
+        # overnight and we want fresh data daily.
+        self._ranking_plan_cache.clear()
+        return ranked or self._cheap_ranked_alternatives
+
</code_context>
<issue_to_address>
**issue:** Cache clearing semantics conflict with the stated per-day TTL and reduce reuse for same-day manual runs.

Clearing `_ranking_plan_cache` after every successful run means same-day manual reruns won’t reuse cached plans; the effective TTL is per run, not per day. If the intent is same-day reuse (scheduled + manual), either keep the cache across runs and only clear it when the date changes, or update the docstring to describe a per-run cache so behavior and documentation match.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread custom_components/pricehawk/coordinator.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@custom_components/pricehawk/cdr/ranking_job.py`:
- Around line 54-56: The code assumes geo.get("distributors") is a list and does
distributors[0] unconditionally; validate the type first to avoid using a
string/dict as a distributor. Update the block around distributors =
geo.get("distributors") or [] so you only pick distributors[0] when
isinstance(distributors, list) and len(distributors) > 0, otherwise set
distributor = None (or handle a dict case explicitly if your domain requires
it). Ensure the change affects the return tuple (currently return None,
postcode, distributor) so malformed distributor values are not propagated into
ranking filters.

In `@custom_components/pricehawk/coordinator.py`:
- Around line 1250-1253: The code currently unconditionally calls
self._ranking_plan_cache.clear() at the end of the ranking run which prevents
subsequent async_run_ranking_job invocations from reusing cached plan details;
remove that unconditional clear and instead only clear the cache when the
calendar day rolls over (or via an explicit cache invalidation API). Concretely:
delete the self._ranking_plan_cache.clear() call in the function that returns
"ranked or self._cheap_ranked_alternatives", add a persistent marker (e.g.
self._last_ranking_date) updated to date.today() each successful run in
async_run_ranking_job, and only call self._ranking_plan_cache.clear() when
date.today() != self._last_ranking_date (or expose a clear_ranking_plan_cache()
method invoked by a daily scheduler). Update code to still return "ranked or
self._cheap_ranked_alternatives" as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5b29680b-c8c2-4ec2-bad1-1e2335d766d0

📥 Commits

Reviewing files that changed from the base of the PR and between 0b885c2 and bfc89a1.

📒 Files selected for processing (4)
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py
  • tests/conftest.py
  • tests/test_coordinator_ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CodeRabbit Review Gate
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use async/await for all I/O operations
NEVER hardcode tokens, API keys, or credentials in any file — use HA config entry storage
State restore MUST validate storage version before loading
from_dict() methods MUST receive an explicit HA-timezone date — no date.today() fallback

Files:

  • tests/conftest.py
  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py

⚙️ CodeRabbit configuration file

**/*.py: Check for: type hints on all public functions, no bare except:, SQL injection risks, missing input sanitisation, secrets not in code, Flask Blueprint structure respected, APScheduler job error handling.

Files:

  • tests/conftest.py
  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py
**/test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Tariff rate calculation changes require edge case tests (negative rates, midnight boundaries, empty windows)

Files:

  • tests/test_coordinator_ranking.py
🔇 Additional comments (4)
custom_components/pricehawk/cdr/ranking_job.py (1)

1-53: LGTM!

Also applies to: 59-143

custom_components/pricehawk/coordinator.py (1)

16-72: LGTM!

Also applies to: 256-269, 1183-1249

tests/conftest.py (1)

23-23: LGTM!

Also applies to: 36-36

tests/test_coordinator_ranking.py (1)

1-257: LGTM!

Comment thread custom_components/pricehawk/cdr/ranking_job.py Outdated
Comment thread custom_components/pricehawk/coordinator.py Outdated
Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook).

cache TTL (Sourcery + CR Major, same issue):
- coordinator.py — replaced unconditional ``cache.clear()`` at end of
  ``async_run_ranking_job`` with a date-rollover check at the start.
  Same-day reruns (scheduled at 00:30, manual via rank_alternatives
  service) now reuse cached plans as the docstring promised. New
  local day → cache clears before the run so overnight republished
  plans get fresh data.
- New ``_ranking_cache_date: date | None`` attribute tracks the
  cache's "as-of" day.

distributors type guard (CR Minor):
- ranking_job.py — ``get_user_geography`` now type-guards
  ``distributors`` with ``isinstance(..., list)`` before indexing.
  Malformed CDR payload could ship distributors as a string (e.g.
  ``"United Energy"``) and ``[0]`` would silently become ``"U"`` —
  skewing the ranking filter without raising.

Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard).
Full suite 714/714.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@custom_components/pricehawk/cdr/ranking_job.py`:
- Around line 51-53: The code reads nested fields from options/cdr_plan
(variables cdr_plan, plan_data, geo and brand) without verifying types, which
can raise AttributeError for malformed payloads; fix by adding explicit type
guards (e.g., if isinstance(cdr_plan, dict): plan_data = cdr_plan.get("data") if
isinstance(cdr_plan.get("data"), dict) else {}; else: plan_data = {}; similarly
ensure geography is a dict before using .get and ensure brand is a str before
calling .strip(), falling back to safe defaults), update any public function
signatures in this module to include type hints for these inputs, and replace
any bare excepts around parsing with specific exceptions so malformed inputs are
handled gracefully and do not abort the ranking run.

In `@custom_components/pricehawk/coordinator.py`:
- Around line 1216-1263: async_run_ranking_job can run concurrently and corrupt
_ranking_plan_cache; fix by serializing runs with an asyncio.Lock: add a
persistent instance attribute (e.g. self._ranking_lock = asyncio.Lock()) during
coordinator init, then wrap the body of async_run_ranking_job (including the
today check, cache clear, calling run_ranking_job, and updating
_cheap_ranked_alternatives/_ranking_last_run_at) in an "async with
self._ranking_lock:" block so only one execution mutates _ranking_plan_cache at
a time and other invocations await completion.

In `@tests/test_coordinator_ranking.py`:
- Around line 101-109: Extend the test_non_list_distributors_safely_skipped test
to also assert that get_user_geography returns None for malformed nested shapes
by adding cases where cdr_plan.data is not a dict (e.g., string/int/list) and
where cdr_plan.data.geography is not a dict (e.g., string/int/list); for each
case construct opts like {"cdr_plan": <bad>} and {"cdr_plan": {"data": <bad>}}
respectively, call get_user_geography(opts) and assert the returned distributor
is None so the nested-shape guards around cdr_plan.data and geography are
covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 15c014ef-6b0f-4faf-8cc6-3fee7ed4246d

📥 Commits

Reviewing files that changed from the base of the PR and between bfc89a1 and f998d16.

📒 Files selected for processing (3)
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py
  • tests/test_coordinator_ranking.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use async/await for all I/O operations
NEVER hardcode tokens, API keys, or credentials in any file — use HA config entry storage
State restore MUST validate storage version before loading
from_dict() methods MUST receive an explicit HA-timezone date — no date.today() fallback

Files:

  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/coordinator.py
  • custom_components/pricehawk/cdr/ranking_job.py

⚙️ CodeRabbit configuration file

**/*.py: Check for: type hints on all public functions, no bare except:, SQL injection risks, missing input sanitisation, secrets not in code, Flask Blueprint structure respected, APScheduler job error handling.

Files:

  • tests/test_coordinator_ranking.py
  • custom_components/pricehawk/coordinator.py
  • custom_components/pricehawk/cdr/ranking_job.py
**/test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Tariff rate calculation changes require edge case tests (negative rates, midnight boundaries, empty windows)

Files:

  • tests/test_coordinator_ranking.py

Comment thread custom_components/pricehawk/cdr/ranking_job.py Outdated
Comment thread custom_components/pricehawk/coordinator.py Outdated
Comment thread tests/test_coordinator_ranking.py
Artic0din added a commit that referenced this pull request May 16, 2026
Three CodeRabbit findings on the round-1 fixes (f998d16):

1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper
   that walks ``cdr_plan["data"]`` with isinstance gates at every
   level. Previously ``cdr_plan["data"]`` was assumed to be a dict;
   a malformed payload could ship it as a string (e.g.
   ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on
   ``.get("geography")``. Same for ``brand`` — now ``isinstance(...,
   str)`` before ``.strip()``. ``geography`` non-dict also guarded
   in ``get_user_geography``. First-distributor-must-be-str check
   added to prevent dict / int from sneaking through as a
   distributor filter value.

2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to
   serialise concurrent runs. Scheduled 00:30 callback + manual
   service trigger could enter ``async_run_ranking_job``
   simultaneously, interleaving ``_ranking_plan_cache`` mutations
   and duplicating every expensive CDR detail fetch. The lock means
   the second caller blocks briefly then returns the
   freshly-populated cache results.

3. tests/test_coordinator_ranking.py — added 4 new tests for
   non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``,
   and non-string first distributor. Covers the AttributeError
   regression paths CR flagged.

Tests: 22/22 ranking + 722/722 full suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three CodeRabbit findings on the round-1 fixes (f998d16):

1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper
   that walks ``cdr_plan["data"]`` with isinstance gates at every
   level. Previously ``cdr_plan["data"]`` was assumed to be a dict;
   a malformed payload could ship it as a string (e.g.
   ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on
   ``.get("geography")``. Same for ``brand`` — now ``isinstance(...,
   str)`` before ``.strip()``. ``geography`` non-dict also guarded
   in ``get_user_geography``. First-distributor-must-be-str check
   added to prevent dict / int from sneaking through as a
   distributor filter value.

2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to
   serialise concurrent runs. Scheduled 00:30 callback + manual
   service trigger could enter ``async_run_ranking_job``
   simultaneously, interleaving ``_ranking_plan_cache`` mutations
   and duplicating every expensive CDR detail fetch. The lock means
   the second caller blocks briefly then returns the
   freshly-populated cache results.

3. tests/test_coordinator_ranking.py — added 4 new tests for
   non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``,
   and non-string first distributor. Covers the AttributeError
   regression paths CR flagged.

Tests: 22/22 ranking + 722/722 full suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Artic0din Artic0din merged commit bbfa51e into dev May 17, 2026
37 of 43 checks passed
@Artic0din Artic0din deleted the phase-3-1-coordinator-hook branch May 17, 2026 02:02
Artic0din added a commit that referenced this pull request May 17, 2026
…ifecycle (#72)

* feat(coordinator): Phase 3.1 commit 4 — daily ranking job hook

Wires the multi-plan ranking pipeline (from PR #64) into the
coordinator on a daily 00:30 local schedule. Cheap-rank only for
now; deep-rank (consumption replay) joins in Phase 3.2 when the
universal HA-history backfill ships.

Architecture
------------

Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``:

- ``get_user_geography(options)`` — extracts ``(state, postcode,
  distributor)`` from a config_entry's options. ``distributor``
  comes from the user's current cdr_plan's
  ``geography.distributors[0]`` (the plan the user already accepted,
  so its distributor is by definition theirs).

- ``get_competitor_retailers(session, options, *,
  competitor_fragments)`` — returns the retailer list scanned daily.
  Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``)
  + hardcoded big-4 nationally-active competitors (AGL, Origin,
  EnergyAustralia, Red Energy). Dedup by brand_id so the user's
  retailer isn't double-scanned when it's also a big-4. Custom
  fragment tuple supported for tests + future user-configurable list.

- ``run_ranking_job(session, options, *, top_k, plan_cache,
  competitor_fragments)`` — top-level orchestrator. Returns the
  cheap-ranked top-K plans.

Coordinator-side wrappers
-------------------------

- New attributes on PriceHawkCoordinator:
  - ``_cheap_ranked_alternatives: list[dict]``
  - ``_ranking_last_run_at: datetime | None``
  - ``_ranking_plan_cache: dict[str, dict]``
  - ``_ranking_unsub: CALLBACK_TYPE | None``

- ``schedule_daily_ranking()`` registers ``async_track_time_change``
  for ``hour=0, minute=30`` local. Safe to call twice (second call
  replaces first).
- ``cancel_ranking()`` clears the handle.
- ``async_run_ranking_job(*, top_k)`` is a thin delegate around
  ``ranking_job.run_ranking_job``. Owns HA-side concerns: session
  acquisition, exception swallowing across the daily boundary,
  state persistence, cache reset. Empty-result runs keep prior
  ranking (transient failure → don't zero yesterday's data).

Why a separate module
---------------------

PriceHawkCoordinator inherits from
``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by
``tests/conftest.py``. The class is therefore unreachable in unit
tests. Extracting the pure logic to ``ranking_job`` keeps it
unit-testable while the class methods stay as thin delegates.

Tests
-----

17 new tests in ``tests/test_coordinator_ranking.py``:

- Module constants (competitor fragments, run-time).
- ``get_user_geography``: 6 cases (happy path, missing fields, list
  ordering, empty options).
- ``get_competitor_retailers``: 5 cases (current retailer first,
  dedup, missing brand, unmatchable fragment, custom override).
- ``run_ranking_job``: 4 cases (empty retailers, geography
  forwarding, cache passthrough, top_k forwarding).

Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py``
mock list — coordinator.py imports it; without the mock, importing
the coordinator silently fell back to a MagicMock class.

Full suite: 713/713 pass (was 696 — +17 new). Ruff clean.

Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service
``pricehawk.rank_alternatives`` in ``__init__.py``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: PR #68 CR findings — cache TTL + distributors type guard

Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook).

cache TTL (Sourcery + CR Major, same issue):
- coordinator.py — replaced unconditional ``cache.clear()`` at end of
  ``async_run_ranking_job`` with a date-rollover check at the start.
  Same-day reruns (scheduled at 00:30, manual via rank_alternatives
  service) now reuse cached plans as the docstring promised. New
  local day → cache clears before the run so overnight republished
  plans get fresh data.
- New ``_ranking_cache_date: date | None`` attribute tracks the
  cache's "as-of" day.

distributors type guard (CR Minor):
- ranking_job.py — ``get_user_geography`` now type-guards
  ``distributors`` with ``isinstance(..., list)`` before indexing.
  Malformed CDR payload could ship distributors as a string (e.g.
  ``"United Energy"``) and ``[0]`` would silently become ``"U"`` —
  skewing the ranking filter without raising.

Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard).
Full suite 714/714.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: PR #68 round-2 CR — nested shape guards + asyncio.Lock

Three CodeRabbit findings on the round-1 fixes (f998d16):

1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper
   that walks ``cdr_plan["data"]`` with isinstance gates at every
   level. Previously ``cdr_plan["data"]`` was assumed to be a dict;
   a malformed payload could ship it as a string (e.g.
   ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on
   ``.get("geography")``. Same for ``brand`` — now ``isinstance(...,
   str)`` before ``.strip()``. ``geography`` non-dict also guarded
   in ``get_user_geography``. First-distributor-must-be-str check
   added to prevent dict / int from sneaking through as a
   distributor filter value.

2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to
   serialise concurrent runs. Scheduled 00:30 callback + manual
   service trigger could enter ``async_run_ranking_job``
   simultaneously, interleaving ``_ranking_plan_cache`` mutations
   and duplicating every expensive CDR detail fetch. The lock means
   the second caller blocks briefly then returns the
   freshly-populated cache results.

3. tests/test_coordinator_ranking.py — added 4 new tests for
   non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``,
   and non-string first distributor. Covers the AttributeError
   regression paths CR flagged.

Tests: 22/22 ranking + 722/722 full suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(integration): Phase 3.1 commit 5 — rank_alternatives HA service + lifecycle

Wires the ranking job into the integration's lifecycle and exposes
a manual-trigger HA service. Completes Phase 3.1's user-visible
plumbing (sensor + dashboard exposure is commit 6).

Lifecycle wiring
----------------

- ``async_setup_entry`` now calls ``coordinator.schedule_daily_ranking()``
  to register the 00:30 local-time callback, plus ``hass.async_create_task``
  for an immediate first run so the alternatives data isn't empty until
  midnight on a fresh install.
- ``async_unload_entry`` calls ``coordinator.cancel_ranking()`` mirroring
  the existing ``cancel_persist()`` pattern.

HA service: pricehawk.rank_alternatives
---------------------------------------

- ``services.yaml`` declares the service with a single ``top_k`` int field
  (1-100, default 20).
- Handler clamps top_k to [1, 100] and delegates to
  ``coordinator.async_run_ranking_job(top_k=...)``.
- Most useful right after a plan switch: lets the user force-run the
  ranking pipeline so the alternatives sensor reflects the new
  distributor / postcode without waiting for the next 00:30 schedule.
- Unregistered alongside the other two services in
  ``async_unload_entry`` when the last config entry goes away.

No tests
--------

The service handler is a closure inside ``async_setup_entry`` (tightly
coupled to HA's service registration plumbing — can't be unit-tested
without HA itself). The underlying ``async_run_ranking_job`` is already
covered by ``tests/test_coordinator_ranking.py`` (17 tests via the
``ranking_job`` module). Live smoke test will validate the wiring once
the integration runs in HA.

Full suite: 713/713 still passing. Ruff clean. services.yaml validates.

Refs PHASE-3-ROADMAP.md §3.1. Next (optional commit 6): sensor for
ranked alternatives + dashboard exposure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: PR #69 CR finding — handle invalid top_k payload defensively

CR found that ``int(call.data.get("top_k", 20))`` raises
ValueError/TypeError on malformed service data (e.g. a YAML typo
sending ``top_k: "abc"``). The exception propagates back to the
service caller and fails the run instead of degrading gracefully.

Now wraps the coercion in try/except, logs a warning with the bad
value, and falls back to the default 20. Clamp [1, 100] still
applied after.

Same kind of defence as elsewhere in the codebase (``safe_int``,
``safe_decimal`` for incentive parsers).

Tests: 714/714 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant