feat(integration): Phase 3.1 commit 5 — rank_alternatives service + lifecycle#72
Conversation
Wires the multi-plan ranking pipeline (from PR #64) into the coordinator on a daily 00:30 local schedule. Cheap-rank only for now; deep-rank (consumption replay) joins in Phase 3.2 when the universal HA-history backfill ships. Architecture ------------ Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``: - ``get_user_geography(options)`` — extracts ``(state, postcode, distributor)`` from a config_entry's options. ``distributor`` comes from the user's current cdr_plan's ``geography.distributors[0]`` (the plan the user already accepted, so its distributor is by definition theirs). - ``get_competitor_retailers(session, options, *, competitor_fragments)`` — returns the retailer list scanned daily. Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``) + hardcoded big-4 nationally-active competitors (AGL, Origin, EnergyAustralia, Red Energy). Dedup by brand_id so the user's retailer isn't double-scanned when it's also a big-4. Custom fragment tuple supported for tests + future user-configurable list. - ``run_ranking_job(session, options, *, top_k, plan_cache, competitor_fragments)`` — top-level orchestrator. Returns the cheap-ranked top-K plans. Coordinator-side wrappers ------------------------- - New attributes on PriceHawkCoordinator: - ``_cheap_ranked_alternatives: list[dict]`` - ``_ranking_last_run_at: datetime | None`` - ``_ranking_plan_cache: dict[str, dict]`` - ``_ranking_unsub: CALLBACK_TYPE | None`` - ``schedule_daily_ranking()`` registers ``async_track_time_change`` for ``hour=0, minute=30`` local. Safe to call twice (second call replaces first). - ``cancel_ranking()`` clears the handle. - ``async_run_ranking_job(*, top_k)`` is a thin delegate around ``ranking_job.run_ranking_job``. Owns HA-side concerns: session acquisition, exception swallowing across the daily boundary, state persistence, cache reset. Empty-result runs keep prior ranking (transient failure → don't zero yesterday's data). Why a separate module --------------------- PriceHawkCoordinator inherits from ``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by ``tests/conftest.py``. The class is therefore unreachable in unit tests. Extracting the pure logic to ``ranking_job`` keeps it unit-testable while the class methods stay as thin delegates. Tests ----- 17 new tests in ``tests/test_coordinator_ranking.py``: - Module constants (competitor fragments, run-time). - ``get_user_geography``: 6 cases (happy path, missing fields, list ordering, empty options). - ``get_competitor_retailers``: 5 cases (current retailer first, dedup, missing brand, unmatchable fragment, custom override). - ``run_ranking_job``: 4 cases (empty retailers, geography forwarding, cache passthrough, top_k forwarding). Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py`` mock list — coordinator.py imports it; without the mock, importing the coordinator silently fell back to a MagicMock class. Full suite: 713/713 pass (was 696 — +17 new). Ruff clean. Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service ``pricehawk.rank_alternatives`` in ``__init__.py``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook). cache TTL (Sourcery + CR Major, same issue): - coordinator.py — replaced unconditional ``cache.clear()`` at end of ``async_run_ranking_job`` with a date-rollover check at the start. Same-day reruns (scheduled at 00:30, manual via rank_alternatives service) now reuse cached plans as the docstring promised. New local day → cache clears before the run so overnight republished plans get fresh data. - New ``_ranking_cache_date: date | None`` attribute tracks the cache's "as-of" day. distributors type guard (CR Minor): - ranking_job.py — ``get_user_geography`` now type-guards ``distributors`` with ``isinstance(..., list)`` before indexing. Malformed CDR payload could ship distributors as a string (e.g. ``"United Energy"``) and ``[0]`` would silently become ``"U"`` — skewing the ranking filter without raising. Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard). Full suite 714/714. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three CodeRabbit findings on the round-1 fixes (f998d16): 1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper that walks ``cdr_plan["data"]`` with isinstance gates at every level. Previously ``cdr_plan["data"]`` was assumed to be a dict; a malformed payload could ship it as a string (e.g. ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on ``.get("geography")``. Same for ``brand`` — now ``isinstance(..., str)`` before ``.strip()``. ``geography`` non-dict also guarded in ``get_user_geography``. First-distributor-must-be-str check added to prevent dict / int from sneaking through as a distributor filter value. 2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to serialise concurrent runs. Scheduled 00:30 callback + manual service trigger could enter ``async_run_ranking_job`` simultaneously, interleaving ``_ranking_plan_cache`` mutations and duplicating every expensive CDR detail fetch. The lock means the second caller blocks briefly then returns the freshly-populated cache results. 3. tests/test_coordinator_ranking.py — added 4 new tests for non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``, and non-string first distributor. Covers the AttributeError regression paths CR flagged. Tests: 22/22 ranking + 722/722 full suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…+ lifecycle Wires the ranking job into the integration's lifecycle and exposes a manual-trigger HA service. Completes Phase 3.1's user-visible plumbing (sensor + dashboard exposure is commit 6). Lifecycle wiring ---------------- - ``async_setup_entry`` now calls ``coordinator.schedule_daily_ranking()`` to register the 00:30 local-time callback, plus ``hass.async_create_task`` for an immediate first run so the alternatives data isn't empty until midnight on a fresh install. - ``async_unload_entry`` calls ``coordinator.cancel_ranking()`` mirroring the existing ``cancel_persist()`` pattern. HA service: pricehawk.rank_alternatives --------------------------------------- - ``services.yaml`` declares the service with a single ``top_k`` int field (1-100, default 20). - Handler clamps top_k to [1, 100] and delegates to ``coordinator.async_run_ranking_job(top_k=...)``. - Most useful right after a plan switch: lets the user force-run the ranking pipeline so the alternatives sensor reflects the new distributor / postcode without waiting for the next 00:30 schedule. - Unregistered alongside the other two services in ``async_unload_entry`` when the last config entry goes away. No tests -------- The service handler is a closure inside ``async_setup_entry`` (tightly coupled to HA's service registration plumbing — can't be unit-tested without HA itself). The underlying ``async_run_ranking_job`` is already covered by ``tests/test_coordinator_ranking.py`` (17 tests via the ``ranking_job`` module). Live smoke test will validate the wiring once the integration runs in HA. Full suite: 713/713 still passing. Ruff clean. services.yaml validates. Refs PHASE-3-ROADMAP.md §3.1. Next (optional commit 6): sensor for ranked alternatives + dashboard exposure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CR found that ``int(call.data.get("top_k", 20))`` raises
ValueError/TypeError on malformed service data (e.g. a YAML typo
sending ``top_k: "abc"``). The exception propagates back to the
service caller and fails the run instead of degrading gracefully.
Now wraps the coercion in try/except, logs a warning with the bad
value, and falls back to the default 20. Clamp [1, 100] still
applied after.
Same kind of defence as elsewhere in the codebase (``safe_int``,
``safe_decimal`` for incentive parsers).
Tests: 714/714 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer's GuideImplements the daily multi-plan ranking lifecycle for the PriceHawk integration by adding a coordinator-driven scheduled ranking job, a manual Home Assistant service to trigger it with configurable top_k, robust option-parsing and concurrency control, and a new ranking_job orchestration module with accompanying tests. File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
WalkthroughPhase 3.1 adds a daily 00:30 local-time ranking job and on-demand service to fetch cheaper plan alternatives. A new ChangesPhase 3.1 Ranking Pipeline Implementation
Sequence DiagramsequenceDiagram
participant User as User/Setup
participant Setup as async_setup_entry
participant Coord as Coordinator
participant Sched as HA Scheduler
participant Lock as asyncio.Lock
participant Job as run_ranking_job
participant Service as Service Handler
participant Rank as rank_alternatives<br/>(external)
User ->> Setup: Setup integration
Setup ->> Coord: schedule_daily_ranking()
Coord ->> Sched: register 00:30 callback
Setup ->> Coord: async_run_ranking_job()
Coord ->> Lock: acquire
Coord ->> Job: invoke
Job ->> Rank: forward retailers, geography
Rank -->> Job: alternatives
Job -->> Coord: top-K list
Coord ->> Coord: cache result
Lock ->> Coord: release
Note over Sched: Daily at 00:30
Sched ->> Coord: async_run_ranking_job()
Coord ->> Lock: acquire
Coord ->> Job: invoke
Job ->> Rank: forward retailers, geography
Rank -->> Job: alternatives
Job -->> Coord: top-K list
Coord ->> Coord: cache result
Lock ->> Coord: release
User ->> Service: Call rank_alternatives<br/>(top_k=N)
Service ->> Coord: async_run_ranking_job(top_k=N)
Coord ->> Lock: acquire
Coord ->> Job: invoke
Job ->> Rank: forward retailers, geography
Rank -->> Job: alternatives
Job -->> Coord: top-K list
Coord ->> Coord: cache result
Lock ->> Coord: release
Service -->> User: N result(s) logged
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes Rationale: New orchestration module is straightforward (safe extraction helpers + function composition), but coordinator adds concurrent-execution semantics (locking, cache invalidation across date rollover, exception swallowing). Service handler is simple delegation. Test suite is large (30+ tests) but validates clear expected behaviors with mocks. Integration wiring is minimal. Moderate complexity comes from understanding the locking strategy and cache reset logic; pure logic is easy to follow. Possibly Related PRs
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- In
handle_rank_alternatives, you currently hardcode20in multiple places; consider reusing the sharedDEFAULT_TOP_Kconstant so the service default stays in sync with the ranking logic. - The
sessionparameter inget_competitor_retailers/run_ranking_jobis annotated asaiohttp.ClientSessionbut your tests passNone; either relax the type toClientSession | Noneor stop passingNoneso the type hints match actual usage.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `handle_rank_alternatives`, you currently hardcode `20` in multiple places; consider reusing the shared `DEFAULT_TOP_K` constant so the service default stays in sync with the ranking logic.
- The `session` parameter in `get_competitor_retailers`/`run_ranking_job` is annotated as `aiohttp.ClientSession` but your tests pass `None`; either relax the type to `ClientSession | None` or stop passing `None` so the type hints match actual usage.
## Individual Comments
### Comment 1
<location path="custom_components/pricehawk/__init__.py" line_range="193-201" />
<code_context>
+ # waiting for the next 00:30 schedule fire. Most useful right after
+ # switching plans (so the alternatives ranking reflects the new
+ # distributor / postcode immediately).
+ async def handle_rank_alternatives(call: object) -> None:
+ # CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
+ # a typo in a YAML automation) would raise ValueError/TypeError
+ # and fail the call. Coerce defensively + fall back to default.
+ raw = call.data.get("top_k", 20) # type: ignore[attr-defined]
+ try:
+ top_k = int(raw)
+ except (TypeError, ValueError):
+ _LOGGER.warning(
+ "rank_alternatives: invalid top_k=%r, using default 20", raw
+ )
</code_context>
<issue_to_address>
**suggestion:** Avoid duplicating the default `top_k` value; reuse the shared constant to prevent drift.
`20` is hardcoded in the default for `call.data.get`, in the warning text, and implicitly as the clamp bound. Prefer a single source of truth: either import `DEFAULT_TOP_K` from the ranking layer and use it in all three places, or define a module-level constant here and reference it everywhere. That avoids divergence if the default changes elsewhere.
Suggested implementation:
```python
RANK_ALTERNATIVES_DEFAULT_TOP_K = 20
# Phase 3.1 commit 5 — manual ranking trigger. Lets users force-run
# the ranking pipeline from Developer Tools → Services without
# waiting for the next 00:30 schedule fire. Most useful right after
# switching plans (so the alternatives ranking reflects the new
# distributor / postcode immediately).
async def handle_rank_alternatives(call: object) -> None:
# CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
# a typo in a YAML automation) would raise ValueError/TypeError
# and fail the call. Coerce defensively + fall back to default.
raw = call.data.get("top_k", RANK_ALTERNATIVES_DEFAULT_TOP_K) # type: ignore[attr-defined]
try:
top_k = int(raw)
except (TypeError, ValueError):
_LOGGER.warning(
"rank_alternatives: invalid top_k=%r, using default %d",
raw,
RANK_ALTERNATIVES_DEFAULT_TOP_K,
)
top_k = RANK_ALTERNATIVES_DEFAULT_TOP_K
top_k = max(1, min(top_k, 100))
result = await coordinator.async_run_ranking_job(top_k=top_k)
_LOGGER.info(
"rank_alternatives service: ran successfully, %d result(s)",
len(result),
)
```
If there is already a shared default (e.g. `DEFAULT_TOP_K`) in your ranking layer, you may prefer to:
1. Import that constant at the top of this file.
2. Replace `RANK_ALTERNATIVES_DEFAULT_TOP_K` with the imported constant name.
This preserves a single source of truth across modules.
</issue_to_address>
### Comment 2
<location path="custom_components/pricehawk/cdr/ranking_job.py" line_range="84-93" />
<code_context>
+async def get_competitor_retailers(
</code_context>
<issue_to_address>
**suggestion:** Handle the case where the current brand exists but does not resolve to a registry entry more explicitly.
If `current_brand` is set but `find_by_brand` returns no match, we silently ignore it and only use the hardcoded competitors. Consider either logging this condition (debug/info) or adding a more forgiving lookup (e.g., case/whitespace normalization) so it’s easier to detect when the registry and stored plan are out of sync.
Suggested implementation:
```python
current_brand = plan_data.get("brand")
current_retailer: RetailerEndpoint | None = None
if current_brand:
# First, try the exact lookup used previously
current_retailer = registry.find_by_brand(current_brand)
if not current_retailer:
# Fallback: trim whitespace and do a case-insensitive comparison against
# known registry brands to recover from minor formatting drift.
normalized_brand = current_brand.strip().lower()
for retailer in registry:
try:
retailer_brand = retailer.brand
except AttributeError:
# Be defensive if registry entries are not uniform
continue
if retailer_brand and retailer_brand.strip().lower() == normalized_brand:
current_retailer = retailer
break
if not current_retailer:
# Explicitly log when the stored plan refers to a brand that we cannot
# resolve in the registry so configuration drift is discoverable.
_LOGGER.info(
"CDR ranking: current brand %r from plan did not match any registry entry",
current_brand,
)
```
I assumed the existence of:
1. A `registry` object that is iterable and whose items have a `.brand` attribute.
2. A module-level `_LOGGER` logger instance.
To integrate this cleanly, you should:
1. Ensure `registry` is iterable and that its items expose a `brand` attribute (or adjust the loop to match your actual type, e.g., `registry.retailers` or `registry.values()`).
2. Confirm `_LOGGER` is defined in this module (typical Home Assistant pattern: `import logging` and `_LOGGER = logging.getLogger(__name__)`). If it is named differently, update the logging call accordingly.
3. If `RetailerEndpoint` is not the correct type for `current_retailer`, adjust the annotation accordingly or remove it to match your existing conventions.
</issue_to_address>
### Comment 3
<location path="tests/test_coordinator_ranking.py" line_range="151" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+class TestGetCompetitorRetailers:
+ def test_includes_user_current_retailer_first(self):
+ opts = {"cdr_plan": {"data": {"brand": "GloBird"}}}
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for non-string `brand` values in `cdr_plan.data.brand` to cover the type-guard logic
`get_competitor_retailers` only accepts `str` brands and intentionally ignores `None`/non-string values to avoid `.strip()`/matching errors. Current tests cover a missing `brand`, but not the case where `brand` exists and is malformed (e.g. `None`, `42`, `{}`), which is what the guards are for.
To cover that behavior, you could add a parametrized test like:
```python
@pytest.mark.parametrize("bad_brand", [None, 42, {}, []])
def test_non_string_brand_is_ignored(bad_brand):
opts = {"cdr_plan": {"data": {"brand": bad_brand}}}
agl = _retailer("AGL Energy", brand_id="200")
with patch(
"custom_components.pricehawk.cdr.ranking_job.get_registry",
AsyncMock(return_value=([agl], "live")),
):
result = asyncio.run(get_competitor_retailers(None, opts))
assert all(r.brand_name != bad_brand for r in result)
```
This ensures malformed `brand` values are ignored and do not break the function or enter the competitor list.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| async def handle_rank_alternatives(call: object) -> None: | ||
| # CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from | ||
| # a typo in a YAML automation) would raise ValueError/TypeError | ||
| # and fail the call. Coerce defensively + fall back to default. | ||
| raw = call.data.get("top_k", 20) # type: ignore[attr-defined] | ||
| try: | ||
| top_k = int(raw) | ||
| except (TypeError, ValueError): | ||
| _LOGGER.warning( |
There was a problem hiding this comment.
suggestion: Avoid duplicating the default top_k value; reuse the shared constant to prevent drift.
20 is hardcoded in the default for call.data.get, in the warning text, and implicitly as the clamp bound. Prefer a single source of truth: either import DEFAULT_TOP_K from the ranking layer and use it in all three places, or define a module-level constant here and reference it everywhere. That avoids divergence if the default changes elsewhere.
Suggested implementation:
RANK_ALTERNATIVES_DEFAULT_TOP_K = 20
# Phase 3.1 commit 5 — manual ranking trigger. Lets users force-run
# the ranking pipeline from Developer Tools → Services without
# waiting for the next 00:30 schedule fire. Most useful right after
# switching plans (so the alternatives ranking reflects the new
# distributor / postcode immediately).
async def handle_rank_alternatives(call: object) -> None:
# CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
# a typo in a YAML automation) would raise ValueError/TypeError
# and fail the call. Coerce defensively + fall back to default.
raw = call.data.get("top_k", RANK_ALTERNATIVES_DEFAULT_TOP_K) # type: ignore[attr-defined]
try:
top_k = int(raw)
except (TypeError, ValueError):
_LOGGER.warning(
"rank_alternatives: invalid top_k=%r, using default %d",
raw,
RANK_ALTERNATIVES_DEFAULT_TOP_K,
)
top_k = RANK_ALTERNATIVES_DEFAULT_TOP_K
top_k = max(1, min(top_k, 100))
result = await coordinator.async_run_ranking_job(top_k=top_k)
_LOGGER.info(
"rank_alternatives service: ran successfully, %d result(s)",
len(result),
)If there is already a shared default (e.g. DEFAULT_TOP_K) in your ranking layer, you may prefer to:
- Import that constant at the top of this file.
- Replace
RANK_ALTERNATIVES_DEFAULT_TOP_Kwith the imported constant name.
This preserves a single source of truth across modules.
| async def get_competitor_retailers( | ||
| session: aiohttp.ClientSession, | ||
| options: dict[str, Any], | ||
| *, | ||
| competitor_fragments: tuple[str, ...] = DEFAULT_COMPETITOR_BRAND_FRAGMENTS, | ||
| ) -> list[RetailerEndpoint]: | ||
| """Build the retailer list scanned during the daily ranking job. | ||
|
|
||
| Composition (in priority order, dedup by ``brand_id``): | ||
| 1. User's CURRENT retailer (from ``cdr_plan.data.brand``). |
There was a problem hiding this comment.
suggestion: Handle the case where the current brand exists but does not resolve to a registry entry more explicitly.
If current_brand is set but find_by_brand returns no match, we silently ignore it and only use the hardcoded competitors. Consider either logging this condition (debug/info) or adding a more forgiving lookup (e.g., case/whitespace normalization) so it’s easier to detect when the registry and stored plan are out of sync.
Suggested implementation:
current_brand = plan_data.get("brand")
current_retailer: RetailerEndpoint | None = None
if current_brand:
# First, try the exact lookup used previously
current_retailer = registry.find_by_brand(current_brand)
if not current_retailer:
# Fallback: trim whitespace and do a case-insensitive comparison against
# known registry brands to recover from minor formatting drift.
normalized_brand = current_brand.strip().lower()
for retailer in registry:
try:
retailer_brand = retailer.brand
except AttributeError:
# Be defensive if registry entries are not uniform
continue
if retailer_brand and retailer_brand.strip().lower() == normalized_brand:
current_retailer = retailer
break
if not current_retailer:
# Explicitly log when the stored plan refers to a brand that we cannot
# resolve in the registry so configuration drift is discoverable.
_LOGGER.info(
"CDR ranking: current brand %r from plan did not match any registry entry",
current_brand,
)I assumed the existence of:
- A
registryobject that is iterable and whose items have a.brandattribute. - A module-level
_LOGGERlogger instance.
To integrate this cleanly, you should:
- Ensure
registryis iterable and that its items expose abrandattribute (or adjust the loop to match your actual type, e.g.,registry.retailersorregistry.values()). - Confirm
_LOGGERis defined in this module (typical Home Assistant pattern:import loggingand_LOGGER = logging.getLogger(__name__)). If it is named differently, update the logging call accordingly. - If
RetailerEndpointis not the correct type forcurrent_retailer, adjust the annotation accordingly or remove it to match your existing conventions.
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| class TestGetCompetitorRetailers: |
There was a problem hiding this comment.
suggestion (testing): Add a test for non-string brand values in cdr_plan.data.brand to cover the type-guard logic
get_competitor_retailers only accepts str brands and intentionally ignores None/non-string values to avoid .strip()/matching errors. Current tests cover a missing brand, but not the case where brand exists and is malformed (e.g. None, 42, {}), which is what the guards are for.
To cover that behavior, you could add a parametrized test like:
@pytest.mark.parametrize("bad_brand", [None, 42, {}, []])
def test_non_string_brand_is_ignored(bad_brand):
opts = {"cdr_plan": {"data": {"brand": bad_brand}}}
agl = _retailer("AGL Energy", brand_id="200")
with patch(
"custom_components.pricehawk.cdr.ranking_job.get_registry",
AsyncMock(return_value=([agl], "live")),
):
result = asyncio.run(get_competitor_retailers(None, opts))
assert all(r.brand_name != bad_brand for r in result)This ensures malformed brand values are ignored and do not break the function or enter the competitor list.
Summary
Reopen of #69 (auto-closed when #68's base branch deleted on merge). Phase 3.1 commit 5 — HA service + lifecycle wiring for the daily ranking job from #68.
This PR contains the same commits as the closed #69 (no content drift; just a fresh PR against
devnow that #68 has merged).What's in this PR
async_setup_entrycallscoordinator.schedule_daily_ranking()+ immediate first-run taskasync_unload_entrycallscoordinator.cancel_ranking()pricehawk.rank_alternativesHA service withtop_kfield (1-100, default 20)int(top_k)coercion (try/except, falls back to default on bad input)Test plan
pytest -q— full suite greenruff check— cleanBreaking Changes
None. Additive lifecycle + new service.
Files Changed
custom_components/pricehawk/__init__.pycustom_components/pricehawk/services.yaml🤖 Generated with Claude Code
Summary by Sourcery
Add a daily multi-plan ranking job and expose it via a new Home Assistant service and coordinator orchestration.
New Features:
Enhancements:
Tests:
Summary
This PR implements Phase 3.1 of the ranking pipeline by wiring the daily multi-plan ranking job into the integration lifecycle and exposing a manual service for on-demand ranking.
What Changed
Lifecycle integration:
async_setup_entry()now schedules a daily ranking task at 00:30 local time and triggers an immediate first-run ranking job to populate alternatives on fresh installs;async_unload_entry()cancels the scheduled job and unregisters the ranking service on last-entry teardown.Ranking orchestration module: New
ranking_job.pymodule provides coordinator-facing helpers:get_user_geography()safely extracts postcode and distributor from nested CDR plan payloads, tolerating malformed shapes (non-dict values, missing fields).get_competitor_retailers()asynchronously resolves the user's current retailer (by brand) and hardcoded big-4 competitor brands with deduplication.run_ranking_job()orchestrates the full pipeline: resolves retailers, derives geography, callsrank_alternatives(), and returns the top-K ranked plans.Coordinator enhancements: Added ranking state management with cache TTL behavior (per-day rollover), concurrent-run serialization via
asyncio.Lock, and three new methods:schedule_daily_ranking()registers the 00:30 callback viaasync_track_time_change.cancel_ranking()unsubscribes from scheduled callbacks.async_run_ranking_job(top_k=20)runs the ranking pipeline under lock, updates cached alternatives only on non-empty results, and returns cached results if the run produces nothing.Manual service: New
rank_alternativesservice exposes ranking on demand with atop_kparameter (1–100, default 20); defensiveint()coercion with try/except fallback to default on bad input.Test coverage: Comprehensive pytest suite validates shape/type guards for malformed config, geography extraction, competitor retailer composition rules (dedup, ordering, fragment overrides), and ranking job orchestration.
Why
The changes enable both scheduled (daily 00:30) and manual (service-triggered) ranking to update alternatives after plan switches or on demand. Defensive parsing and cache TTL behavior prevent errors on malformed configs and avoid needless cache clears on same-day reruns. Serialization via lock ensures concurrent scheduled/manual runs don't race.
Breaking Changes
None. All changes are additive; existing unit tests continue to pass.
Files Changed
custom_components/pricehawk/__init__.pycustom_components/pricehawk/cdr/ranking_job.pycustom_components/pricehawk/coordinator.pycustom_components/pricehawk/services.yamltests/conftest.pytests/test_coordinator_ranking.py