Skip to content

feat(integration): Phase 3.1 commit 5 — rank_alternatives service + lifecycle#72

Merged
Artic0din merged 5 commits into
devfrom
phase-3-1-rank-service
May 17, 2026
Merged

feat(integration): Phase 3.1 commit 5 — rank_alternatives service + lifecycle#72
Artic0din merged 5 commits into
devfrom
phase-3-1-rank-service

Conversation

@Artic0din
Copy link
Copy Markdown
Owner

@Artic0din Artic0din commented May 17, 2026

Summary

Reopen of #69 (auto-closed when #68's base branch deleted on merge). Phase 3.1 commit 5 — HA service + lifecycle wiring for the daily ranking job from #68.

This PR contains the same commits as the closed #69 (no content drift; just a fresh PR against dev now that #68 has merged).

What's in this PR

  • async_setup_entry calls coordinator.schedule_daily_ranking() + immediate first-run task
  • async_unload_entry calls coordinator.cancel_ranking()
  • New pricehawk.rank_alternatives HA service with top_k field (1-100, default 20)
  • Defensive int(top_k) coercion (try/except, falls back to default on bad input)
  • Service unregistered alongside others on last-entry teardown

Test plan

  • pytest -q — full suite green
  • ruff check — clean

Breaking Changes

None. Additive lifecycle + new service.

Files Changed

  • custom_components/pricehawk/__init__.py
  • custom_components/pricehawk/services.yaml

🤖 Generated with Claude Code

Summary by Sourcery

Add a daily multi-plan ranking job and expose it via a new Home Assistant service and coordinator orchestration.

New Features:

  • Schedule a daily 00:30 local-time ranking job that computes top-K cheaper alternative plans and runs once immediately on setup.
  • Expose a new pricehawk.rank_alternatives Home Assistant service with an optional top_k parameter (1–100, default 20) to trigger ranking on demand.

Enhancements:

  • Introduce coordinator-side ranking orchestration with caching, locking, and safe reuse of previous results on failures.
  • Refine option and registry handling for CDR-based geography and retailer resolution to better support multi-retailer ranking.

Tests:

  • Add unit tests for ranking job orchestration, geography extraction, competitor retailer selection, and run_ranking_job behavior.
  • Extend Home Assistant test shims to cover aiohttp_client usage in the new ranking flow.

Summary

This PR implements Phase 3.1 of the ranking pipeline by wiring the daily multi-plan ranking job into the integration lifecycle and exposing a manual service for on-demand ranking.

What Changed

  • Lifecycle integration: async_setup_entry() now schedules a daily ranking task at 00:30 local time and triggers an immediate first-run ranking job to populate alternatives on fresh installs; async_unload_entry() cancels the scheduled job and unregisters the ranking service on last-entry teardown.

  • Ranking orchestration module: New ranking_job.py module provides coordinator-facing helpers:

    • get_user_geography() safely extracts postcode and distributor from nested CDR plan payloads, tolerating malformed shapes (non-dict values, missing fields).
    • get_competitor_retailers() asynchronously resolves the user's current retailer (by brand) and hardcoded big-4 competitor brands with deduplication.
    • run_ranking_job() orchestrates the full pipeline: resolves retailers, derives geography, calls rank_alternatives(), and returns the top-K ranked plans.
  • Coordinator enhancements: Added ranking state management with cache TTL behavior (per-day rollover), concurrent-run serialization via asyncio.Lock, and three new methods:

    • schedule_daily_ranking() registers the 00:30 callback via async_track_time_change.
    • cancel_ranking() unsubscribes from scheduled callbacks.
    • async_run_ranking_job(top_k=20) runs the ranking pipeline under lock, updates cached alternatives only on non-empty results, and returns cached results if the run produces nothing.
  • Manual service: New rank_alternatives service exposes ranking on demand with a top_k parameter (1–100, default 20); defensive int() coercion with try/except fallback to default on bad input.

  • Test coverage: Comprehensive pytest suite validates shape/type guards for malformed config, geography extraction, competitor retailer composition rules (dedup, ordering, fragment overrides), and ranking job orchestration.

Why

The changes enable both scheduled (daily 00:30) and manual (service-triggered) ranking to update alternatives after plan switches or on demand. Defensive parsing and cache TTL behavior prevent errors on malformed configs and avoid needless cache clears on same-day reruns. Serialization via lock ensures concurrent scheduled/manual runs don't race.

Breaking Changes

None. All changes are additive; existing unit tests continue to pass.

Files Changed

File Lines Added Lines Removed
custom_components/pricehawk/__init__.py 36 0
custom_components/pricehawk/cdr/ranking_job.py 169 0
custom_components/pricehawk/coordinator.py 115 2
custom_components/pricehawk/services.yaml 21 0
tests/conftest.py 2 0
tests/test_coordinator_ranking.py 300 0
Total 643 2

Review Change Stack

Artic0din and others added 5 commits May 17, 2026 08:22
Wires the multi-plan ranking pipeline (from PR #64) into the
coordinator on a daily 00:30 local schedule. Cheap-rank only for
now; deep-rank (consumption replay) joins in Phase 3.2 when the
universal HA-history backfill ships.

Architecture
------------

Pure logic in new ``custom_components/pricehawk/cdr/ranking_job.py``:

- ``get_user_geography(options)`` — extracts ``(state, postcode,
  distributor)`` from a config_entry's options. ``distributor``
  comes from the user's current cdr_plan's
  ``geography.distributors[0]`` (the plan the user already accepted,
  so its distributor is by definition theirs).

- ``get_competitor_retailers(session, options, *,
  competitor_fragments)`` — returns the retailer list scanned daily.
  Composition: user's CURRENT retailer (via ``cdr_plan.data.brand``)
  + hardcoded big-4 nationally-active competitors (AGL, Origin,
  EnergyAustralia, Red Energy). Dedup by brand_id so the user's
  retailer isn't double-scanned when it's also a big-4. Custom
  fragment tuple supported for tests + future user-configurable list.

- ``run_ranking_job(session, options, *, top_k, plan_cache,
  competitor_fragments)`` — top-level orchestrator. Returns the
  cheap-ranked top-K plans.

Coordinator-side wrappers
-------------------------

- New attributes on PriceHawkCoordinator:
  - ``_cheap_ranked_alternatives: list[dict]``
  - ``_ranking_last_run_at: datetime | None``
  - ``_ranking_plan_cache: dict[str, dict]``
  - ``_ranking_unsub: CALLBACK_TYPE | None``

- ``schedule_daily_ranking()`` registers ``async_track_time_change``
  for ``hour=0, minute=30`` local. Safe to call twice (second call
  replaces first).
- ``cancel_ranking()`` clears the handle.
- ``async_run_ranking_job(*, top_k)`` is a thin delegate around
  ``ranking_job.run_ranking_job``. Owns HA-side concerns: session
  acquisition, exception swallowing across the daily boundary,
  state persistence, cache reset. Empty-result runs keep prior
  ranking (transient failure → don't zero yesterday's data).

Why a separate module
---------------------

PriceHawkCoordinator inherits from
``DataUpdateCoordinator[dict[str, Any]]`` which gets mocked away by
``tests/conftest.py``. The class is therefore unreachable in unit
tests. Extracting the pure logic to ``ranking_job`` keeps it
unit-testable while the class methods stay as thin delegates.

Tests
-----

17 new tests in ``tests/test_coordinator_ranking.py``:

- Module constants (competitor fragments, run-time).
- ``get_user_geography``: 6 cases (happy path, missing fields, list
  ordering, empty options).
- ``get_competitor_retailers``: 5 cases (current retailer first,
  dedup, missing brand, unmatchable fragment, custom override).
- ``run_ranking_job``: 4 cases (empty retailers, geography
  forwarding, cache passthrough, top_k forwarding).

Also added ``homeassistant.helpers.aiohttp_client`` to ``conftest.py``
mock list — coordinator.py imports it; without the mock, importing
the coordinator silently fell back to a MagicMock class.

Full suite: 713/713 pass (was 696 — +17 new). Ruff clean.

Refs PHASE-3-ROADMAP.md §3.1. Next commit: HA service
``pricehawk.rank_alternatives`` in ``__init__.py``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two findings from CodeRabbit and Sourcery on PR #68 (coordinator hook).

cache TTL (Sourcery + CR Major, same issue):
- coordinator.py — replaced unconditional ``cache.clear()`` at end of
  ``async_run_ranking_job`` with a date-rollover check at the start.
  Same-day reruns (scheduled at 00:30, manual via rank_alternatives
  service) now reuse cached plans as the docstring promised. New
  local day → cache clears before the run so overnight republished
  plans get fresh data.
- New ``_ranking_cache_date: date | None`` attribute tracks the
  cache's "as-of" day.

distributors type guard (CR Minor):
- ranking_job.py — ``get_user_geography`` now type-guards
  ``distributors`` with ``isinstance(..., list)`` before indexing.
  Malformed CDR payload could ship distributors as a string (e.g.
  ``"United Energy"``) and ``[0]`` would silently become ``"U"`` —
  skewing the ranking filter without raising.

Tests: 18/18 ranking tests pass (was 17 + 1 new for type guard).
Full suite 714/714.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three CodeRabbit findings on the round-1 fixes (f998d16):

1. ranking_job.py — extracted ``_safe_plan_data(options)`` helper
   that walks ``cdr_plan["data"]`` with isinstance gates at every
   level. Previously ``cdr_plan["data"]`` was assumed to be a dict;
   a malformed payload could ship it as a string (e.g.
   ``cdr_plan = {"data": "broken"}``) and trigger AttributeError on
   ``.get("geography")``. Same for ``brand`` — now ``isinstance(...,
   str)`` before ``.strip()``. ``geography`` non-dict also guarded
   in ``get_user_geography``. First-distributor-must-be-str check
   added to prevent dict / int from sneaking through as a
   distributor filter value.

2. coordinator.py — added ``self._ranking_lock = asyncio.Lock()`` to
   serialise concurrent runs. Scheduled 00:30 callback + manual
   service trigger could enter ``async_run_ranking_job``
   simultaneously, interleaving ``_ranking_plan_cache`` mutations
   and duplicating every expensive CDR detail fetch. The lock means
   the second caller blocks briefly then returns the
   freshly-populated cache results.

3. tests/test_coordinator_ranking.py — added 4 new tests for
   non-dict ``cdr_plan``, non-dict ``data``, non-dict ``geography``,
   and non-string first distributor. Covers the AttributeError
   regression paths CR flagged.

Tests: 22/22 ranking + 722/722 full suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…+ lifecycle

Wires the ranking job into the integration's lifecycle and exposes
a manual-trigger HA service. Completes Phase 3.1's user-visible
plumbing (sensor + dashboard exposure is commit 6).

Lifecycle wiring
----------------

- ``async_setup_entry`` now calls ``coordinator.schedule_daily_ranking()``
  to register the 00:30 local-time callback, plus ``hass.async_create_task``
  for an immediate first run so the alternatives data isn't empty until
  midnight on a fresh install.
- ``async_unload_entry`` calls ``coordinator.cancel_ranking()`` mirroring
  the existing ``cancel_persist()`` pattern.

HA service: pricehawk.rank_alternatives
---------------------------------------

- ``services.yaml`` declares the service with a single ``top_k`` int field
  (1-100, default 20).
- Handler clamps top_k to [1, 100] and delegates to
  ``coordinator.async_run_ranking_job(top_k=...)``.
- Most useful right after a plan switch: lets the user force-run the
  ranking pipeline so the alternatives sensor reflects the new
  distributor / postcode without waiting for the next 00:30 schedule.
- Unregistered alongside the other two services in
  ``async_unload_entry`` when the last config entry goes away.

No tests
--------

The service handler is a closure inside ``async_setup_entry`` (tightly
coupled to HA's service registration plumbing — can't be unit-tested
without HA itself). The underlying ``async_run_ranking_job`` is already
covered by ``tests/test_coordinator_ranking.py`` (17 tests via the
``ranking_job`` module). Live smoke test will validate the wiring once
the integration runs in HA.

Full suite: 713/713 still passing. Ruff clean. services.yaml validates.

Refs PHASE-3-ROADMAP.md §3.1. Next (optional commit 6): sensor for
ranked alternatives + dashboard exposure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CR found that ``int(call.data.get("top_k", 20))`` raises
ValueError/TypeError on malformed service data (e.g. a YAML typo
sending ``top_k: "abc"``). The exception propagates back to the
service caller and fails the run instead of degrading gracefully.

Now wraps the coercion in try/except, logs a warning with the bad
value, and falls back to the default 20. Clamp [1, 100] still
applied after.

Same kind of defence as elsewhere in the codebase (``safe_int``,
``safe_decimal`` for incentive parsers).

Tests: 714/714 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented May 17, 2026

Reviewer's Guide

Implements the daily multi-plan ranking lifecycle for the PriceHawk integration by adding a coordinator-driven scheduled ranking job, a manual Home Assistant service to trigger it with configurable top_k, robust option-parsing and concurrency control, and a new ranking_job orchestration module with accompanying tests.

File-Level Changes

Change Details Files
Add coordinator-managed daily ranking job with caching, locking, and error-tolerant orchestration.
  • Introduce ranking-related coordinator state for cached results, last-run timestamp, per-plan cache, cache date, unsubscribe handle, and an asyncio lock to serialize runs.
  • Define 00:30 local run-time constants and schedule/cancel helpers using async_track_time_change.
  • Implement async_run_ranking_job wrapper that resets cache on date rollover, calls cdr.ranking_job.run_ranking_job with a shared aiohttp session and options, swallows unexpected exceptions while keeping prior results, and conditionally updates persisted ranking and last-run timestamp.
custom_components/pricehawk/coordinator.py
Wire the coordinator ranking lifecycle into HA setup/unload and expose a rank_alternatives service with defensive top_k handling.
  • During async_setup_entry, schedule the daily ranking job, trigger an immediate first run via a background task, and continue existing persistence and panel setup.
  • Register rank_alternatives HA service that coerces and clamps the optional top_k payload (default 20, range 1–100), invokes the coordinator ranking job, and logs outcome.
  • Ensure ranking scheduling is cancelled on config entry unload and the rank_alternatives service is unregistered alongside other services when the last entry is removed.
custom_components/pricehawk/__init__.py
custom_components/pricehawk/services.yaml
Extract pure ranking orchestration logic into a new cdr.ranking_job module to keep coordinator-side logic testable without HA context.
  • Add DEFAULT_COMPETITOR_BRAND_FRAGMENTS and a get_competitor_retailers helper that composes the ordered retailer list (current retailer then big four), with safe brand handling and deduplication by brand_id.
  • Implement get_user_geography and _safe_plan_data to robustly derive (state, postcode, distributor) from config_entry options, tolerating malformed shapes without raising.
  • Provide run_ranking_job to wire together competitor resolution, geography extraction, and rank_alternatives invocation with top_k and optional plan_cache, logging and returning an empty list when no retailers resolve.
custom_components/pricehawk/cdr/ranking_job.py
Extend tests and HA mocks to cover ranking orchestration and registry/geography logic.
  • Update the test Home Assistant mocks to include homeassistant.helpers.aiohttp_client so the new ranking code can acquire client sessions without failing tests.
  • Add test_coordinator_ranking.py to validate competitor fragments, run-time constants, geography extraction robustness against malformed option payloads, competitor retailer composition and deduplication, and run_ranking_job integration with registry and rank_alternatives (including plan cache and top_k forwarding).
tests/conftest.py
tests/test_coordinator_ranking.py

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 17, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 32cc9eaf-bf13-421d-baa1-bd0e13f57461

📥 Commits

Reviewing files that changed from the base of the PR and between bbfa51e and d022187.

📒 Files selected for processing (6)
  • custom_components/pricehawk/__init__.py
  • custom_components/pricehawk/cdr/ranking_job.py
  • custom_components/pricehawk/coordinator.py
  • custom_components/pricehawk/services.yaml
  • tests/conftest.py
  • tests/test_coordinator_ranking.py

Walkthrough

Phase 3.1 adds a daily 00:30 local-time ranking job and on-demand service to fetch cheaper plan alternatives. A new ranking_job module orchestrates competitor retailer selection and ranking. The coordinator wraps this with scheduling, caching, and concurrency control. Integration lifecycle wires the scheduler on setup and cancels on teardown. A Home Assistant service exposes manual triggering with clamped top_k parameter.

Changes

Phase 3.1 Ranking Pipeline Implementation

Layer / File(s) Summary
Ranking Job Orchestration
custom_components/pricehawk/cdr/ranking_job.py
New module defines DEFAULT_COMPETITOR_BRAND_FRAGMENTS (big-4 identifiers), get_user_geography() safely extracts postcode/distributor from nested CDR payloads, get_competitor_retailers() fetches registry and composes user retailer + competitors with dedup by brand_id, and run_ranking_job() orchestrates the full pipeline: resolve retailers, derive geography, invoke rank_alternatives() and return top-K. Extensive malformed-payload guarding throughout (missing dicts, non-list distributors, type coercion).
Coordinator Scheduling and State Management
custom_components/pricehawk/coordinator.py
Adds Phase 3.1 imports (date, async_track_time_change, DEFAULT_TOP_K, run_ranking_job), scheduling constants (_RANKING_RUN_HOUR=0, _RANKING_RUN_MINUTE=30), and six coordinator state fields for cached alternatives, timestamps, per-day cache with date invalidation, and locking. schedule_daily_ranking() registers HA time-change callback at 00:30 local; cancel_ranking() unregisters. async_run_ranking_job(top_k) executes under lock with per-day cache reset before running, invokes the pure run_ranking_job(), swallows exceptions, and updates cached results only when non-empty (preserving prior results on failure).
Service Exposure and Lifecycle Integration
custom_components/pricehawk/__init__.py, custom_components/pricehawk/services.yaml
async_setup_entry calls coordinator.schedule_daily_ranking() + immediately spawns async_run_ranking_job() so alternatives populate on fresh install. Service handler rank_alternatives reads top_k, coerces to int (default 20), clamps to [1, 100], calls coordinator.async_run_ranking_job(top_k=...), logs result count. async_unload_entry calls coordinator.cancel_ranking() and unregisters service on last-entry teardown. Service schema defines top_k (number, 1–100, default 20, slider mode).
Ranking Pipeline Test Coverage
tests/conftest.py, tests/test_coordinator_ranking.py
Mocks homeassistant.helpers.aiohttp_client for test imports. test_coordinator_ranking.py with 30+ test cases across 4 classes: validates module constants (big-4 fragments, 00:30), get_user_geography() postcode/distributor extraction and malformed payload safety, get_competitor_retailers() ordering/dedup/fallback behavior, and run_ranking_job() orchestration (empty retailers, geography forwarding, cache/top_k passthrough).

Sequence Diagram

sequenceDiagram
  participant User as User/Setup
  participant Setup as async_setup_entry
  participant Coord as Coordinator
  participant Sched as HA Scheduler
  participant Lock as asyncio.Lock
  participant Job as run_ranking_job
  participant Service as Service Handler
  participant Rank as rank_alternatives<br/>(external)
  
  User ->> Setup: Setup integration
  Setup ->> Coord: schedule_daily_ranking()
  Coord ->> Sched: register 00:30 callback
  Setup ->> Coord: async_run_ranking_job()
  Coord ->> Lock: acquire
  Coord ->> Job: invoke
  Job ->> Rank: forward retailers, geography
  Rank -->> Job: alternatives
  Job -->> Coord: top-K list
  Coord ->> Coord: cache result
  Lock ->> Coord: release
  
  Note over Sched: Daily at 00:30
  Sched ->> Coord: async_run_ranking_job()
  Coord ->> Lock: acquire
  Coord ->> Job: invoke
  Job ->> Rank: forward retailers, geography
  Rank -->> Job: alternatives
  Job -->> Coord: top-K list
  Coord ->> Coord: cache result
  Lock ->> Coord: release
  
  User ->> Service: Call rank_alternatives<br/>(top_k=N)
  Service ->> Coord: async_run_ranking_job(top_k=N)
  Coord ->> Lock: acquire
  Coord ->> Job: invoke
  Job ->> Rank: forward retailers, geography
  Rank -->> Job: alternatives
  Job -->> Coord: top-K list
  Coord ->> Coord: cache result
  Lock ->> Coord: release
  Service -->> User: N result(s) logged
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Rationale: New orchestration module is straightforward (safe extraction helpers + function composition), but coordinator adds concurrent-execution semantics (locking, cache invalidation across date rollover, exception swallowing). Service handler is simple delegation. Test suite is large (30+ tests) but validates clear expected behaviors with mocks. Integration wiring is minimal. Moderate complexity comes from understanding the locking strategy and cache reset logic; pure logic is easy to follow.

Possibly Related PRs

  • Artic0din/ha-pricehawk#68: Introduces the PriceHawkCoordinator scheduling and state-management methods that this PR wires into the integration lifecycle and exposes via the service.
  • Artic0din/ha-pricehawk#64: Defines the rank_alternatives() API and plan-ranking logic that run_ranking_job() orchestrates and calls as its final step.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch phase-3-1-rank-service
  • 🛠️ scrub-secrets
  • 🛠️ no-hardcoded-rates
  • 🛠️ amber-api-limits
  • 🛠️ dashboard-protocol-safety

Comment @coderabbitai help to get the list of available commands and usage tips.

@Artic0din Artic0din merged commit 1a80159 into dev May 17, 2026
4 of 5 checks passed
@Artic0din Artic0din deleted the phase-3-1-rank-service branch May 17, 2026 02:04
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • In handle_rank_alternatives, you currently hardcode 20 in multiple places; consider reusing the shared DEFAULT_TOP_K constant so the service default stays in sync with the ranking logic.
  • The session parameter in get_competitor_retailers/run_ranking_job is annotated as aiohttp.ClientSession but your tests pass None; either relax the type to ClientSession | None or stop passing None so the type hints match actual usage.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `handle_rank_alternatives`, you currently hardcode `20` in multiple places; consider reusing the shared `DEFAULT_TOP_K` constant so the service default stays in sync with the ranking logic.
- The `session` parameter in `get_competitor_retailers`/`run_ranking_job` is annotated as `aiohttp.ClientSession` but your tests pass `None`; either relax the type to `ClientSession | None` or stop passing `None` so the type hints match actual usage.

## Individual Comments

### Comment 1
<location path="custom_components/pricehawk/__init__.py" line_range="193-201" />
<code_context>
+    # waiting for the next 00:30 schedule fire. Most useful right after
+    # switching plans (so the alternatives ranking reflects the new
+    # distributor / postcode immediately).
+    async def handle_rank_alternatives(call: object) -> None:
+        # CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
+        # a typo in a YAML automation) would raise ValueError/TypeError
+        # and fail the call. Coerce defensively + fall back to default.
+        raw = call.data.get("top_k", 20)  # type: ignore[attr-defined]
+        try:
+            top_k = int(raw)
+        except (TypeError, ValueError):
+            _LOGGER.warning(
+                "rank_alternatives: invalid top_k=%r, using default 20", raw
+            )
</code_context>
<issue_to_address>
**suggestion:** Avoid duplicating the default `top_k` value; reuse the shared constant to prevent drift.

`20` is hardcoded in the default for `call.data.get`, in the warning text, and implicitly as the clamp bound. Prefer a single source of truth: either import `DEFAULT_TOP_K` from the ranking layer and use it in all three places, or define a module-level constant here and reference it everywhere. That avoids divergence if the default changes elsewhere.

Suggested implementation:

```python
    RANK_ALTERNATIVES_DEFAULT_TOP_K = 20

    # Phase 3.1 commit 5 — manual ranking trigger. Lets users force-run
    # the ranking pipeline from Developer Tools → Services without
    # waiting for the next 00:30 schedule fire. Most useful right after
    # switching plans (so the alternatives ranking reflects the new
    # distributor / postcode immediately).
    async def handle_rank_alternatives(call: object) -> None:
        # CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
        # a typo in a YAML automation) would raise ValueError/TypeError
        # and fail the call. Coerce defensively + fall back to default.
        raw = call.data.get("top_k", RANK_ALTERNATIVES_DEFAULT_TOP_K)  # type: ignore[attr-defined]
        try:
            top_k = int(raw)
        except (TypeError, ValueError):
            _LOGGER.warning(
                "rank_alternatives: invalid top_k=%r, using default %d",
                raw,
                RANK_ALTERNATIVES_DEFAULT_TOP_K,
            )
            top_k = RANK_ALTERNATIVES_DEFAULT_TOP_K
        top_k = max(1, min(top_k, 100))
        result = await coordinator.async_run_ranking_job(top_k=top_k)
        _LOGGER.info(
            "rank_alternatives service: ran successfully, %d result(s)",
            len(result),
        )

```

If there is already a shared default (e.g. `DEFAULT_TOP_K`) in your ranking layer, you may prefer to:
1. Import that constant at the top of this file.
2. Replace `RANK_ALTERNATIVES_DEFAULT_TOP_K` with the imported constant name.
This preserves a single source of truth across modules.
</issue_to_address>

### Comment 2
<location path="custom_components/pricehawk/cdr/ranking_job.py" line_range="84-93" />
<code_context>
+async def get_competitor_retailers(
</code_context>
<issue_to_address>
**suggestion:** Handle the case where the current brand exists but does not resolve to a registry entry more explicitly.

If `current_brand` is set but `find_by_brand` returns no match, we silently ignore it and only use the hardcoded competitors. Consider either logging this condition (debug/info) or adding a more forgiving lookup (e.g., case/whitespace normalization) so it’s easier to detect when the registry and stored plan are out of sync.

Suggested implementation:

```python
    current_brand = plan_data.get("brand")

    current_retailer: RetailerEndpoint | None = None
    if current_brand:
        # First, try the exact lookup used previously
        current_retailer = registry.find_by_brand(current_brand)

        if not current_retailer:
            # Fallback: trim whitespace and do a case-insensitive comparison against
            # known registry brands to recover from minor formatting drift.
            normalized_brand = current_brand.strip().lower()
            for retailer in registry:
                try:
                    retailer_brand = retailer.brand
                except AttributeError:
                    # Be defensive if registry entries are not uniform
                    continue

                if retailer_brand and retailer_brand.strip().lower() == normalized_brand:
                    current_retailer = retailer
                    break

        if not current_retailer:
            # Explicitly log when the stored plan refers to a brand that we cannot
            # resolve in the registry so configuration drift is discoverable.
            _LOGGER.info(
                "CDR ranking: current brand %r from plan did not match any registry entry",
                current_brand,
            )

```

I assumed the existence of:
1. A `registry` object that is iterable and whose items have a `.brand` attribute.
2. A module-level `_LOGGER` logger instance.

To integrate this cleanly, you should:
1. Ensure `registry` is iterable and that its items expose a `brand` attribute (or adjust the loop to match your actual type, e.g., `registry.retailers` or `registry.values()`).
2. Confirm `_LOGGER` is defined in this module (typical Home Assistant pattern: `import logging` and `_LOGGER = logging.getLogger(__name__)`). If it is named differently, update the logging call accordingly.
3. If `RetailerEndpoint` is not the correct type for `current_retailer`, adjust the annotation accordingly or remove it to match your existing conventions.
</issue_to_address>

### Comment 3
<location path="tests/test_coordinator_ranking.py" line_range="151" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+class TestGetCompetitorRetailers:
+    def test_includes_user_current_retailer_first(self):
+        opts = {"cdr_plan": {"data": {"brand": "GloBird"}}}
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for non-string `brand` values in `cdr_plan.data.brand` to cover the type-guard logic

`get_competitor_retailers` only accepts `str` brands and intentionally ignores `None`/non-string values to avoid `.strip()`/matching errors. Current tests cover a missing `brand`, but not the case where `brand` exists and is malformed (e.g. `None`, `42`, `{}`), which is what the guards are for.

To cover that behavior, you could add a parametrized test like:

```python
@pytest.mark.parametrize("bad_brand", [None, 42, {}, []])
def test_non_string_brand_is_ignored(bad_brand):
    opts = {"cdr_plan": {"data": {"brand": bad_brand}}}
    agl = _retailer("AGL Energy", brand_id="200")
    with patch(
        "custom_components.pricehawk.cdr.ranking_job.get_registry",
        AsyncMock(return_value=([agl], "live")),
    ):
        result = asyncio.run(get_competitor_retailers(None, opts))
    assert all(r.brand_name != bad_brand for r in result)
```

This ensures malformed `brand` values are ignored and do not break the function or enter the competitor list.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +193 to +201
async def handle_rank_alternatives(call: object) -> None:
# CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
# a typo in a YAML automation) would raise ValueError/TypeError
# and fail the call. Coerce defensively + fall back to default.
raw = call.data.get("top_k", 20) # type: ignore[attr-defined]
try:
top_k = int(raw)
except (TypeError, ValueError):
_LOGGER.warning(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Avoid duplicating the default top_k value; reuse the shared constant to prevent drift.

20 is hardcoded in the default for call.data.get, in the warning text, and implicitly as the clamp bound. Prefer a single source of truth: either import DEFAULT_TOP_K from the ranking layer and use it in all three places, or define a module-level constant here and reference it everywhere. That avoids divergence if the default changes elsewhere.

Suggested implementation:

    RANK_ALTERNATIVES_DEFAULT_TOP_K = 20

    # Phase 3.1 commit 5 — manual ranking trigger. Lets users force-run
    # the ranking pipeline from Developer Tools → Services without
    # waiting for the next 00:30 schedule fire. Most useful right after
    # switching plans (so the alternatives ranking reflects the new
    # distributor / postcode immediately).
    async def handle_rank_alternatives(call: object) -> None:
        # CR-fix: malformed service payload (e.g. ``top_k: "abc"`` from
        # a typo in a YAML automation) would raise ValueError/TypeError
        # and fail the call. Coerce defensively + fall back to default.
        raw = call.data.get("top_k", RANK_ALTERNATIVES_DEFAULT_TOP_K)  # type: ignore[attr-defined]
        try:
            top_k = int(raw)
        except (TypeError, ValueError):
            _LOGGER.warning(
                "rank_alternatives: invalid top_k=%r, using default %d",
                raw,
                RANK_ALTERNATIVES_DEFAULT_TOP_K,
            )
            top_k = RANK_ALTERNATIVES_DEFAULT_TOP_K
        top_k = max(1, min(top_k, 100))
        result = await coordinator.async_run_ranking_job(top_k=top_k)
        _LOGGER.info(
            "rank_alternatives service: ran successfully, %d result(s)",
            len(result),
        )

If there is already a shared default (e.g. DEFAULT_TOP_K) in your ranking layer, you may prefer to:

  1. Import that constant at the top of this file.
  2. Replace RANK_ALTERNATIVES_DEFAULT_TOP_K with the imported constant name.
    This preserves a single source of truth across modules.

Comment on lines +84 to +93
async def get_competitor_retailers(
session: aiohttp.ClientSession,
options: dict[str, Any],
*,
competitor_fragments: tuple[str, ...] = DEFAULT_COMPETITOR_BRAND_FRAGMENTS,
) -> list[RetailerEndpoint]:
"""Build the retailer list scanned during the daily ranking job.

Composition (in priority order, dedup by ``brand_id``):
1. User's CURRENT retailer (from ``cdr_plan.data.brand``).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Handle the case where the current brand exists but does not resolve to a registry entry more explicitly.

If current_brand is set but find_by_brand returns no match, we silently ignore it and only use the hardcoded competitors. Consider either logging this condition (debug/info) or adding a more forgiving lookup (e.g., case/whitespace normalization) so it’s easier to detect when the registry and stored plan are out of sync.

Suggested implementation:

    current_brand = plan_data.get("brand")

    current_retailer: RetailerEndpoint | None = None
    if current_brand:
        # First, try the exact lookup used previously
        current_retailer = registry.find_by_brand(current_brand)

        if not current_retailer:
            # Fallback: trim whitespace and do a case-insensitive comparison against
            # known registry brands to recover from minor formatting drift.
            normalized_brand = current_brand.strip().lower()
            for retailer in registry:
                try:
                    retailer_brand = retailer.brand
                except AttributeError:
                    # Be defensive if registry entries are not uniform
                    continue

                if retailer_brand and retailer_brand.strip().lower() == normalized_brand:
                    current_retailer = retailer
                    break

        if not current_retailer:
            # Explicitly log when the stored plan refers to a brand that we cannot
            # resolve in the registry so configuration drift is discoverable.
            _LOGGER.info(
                "CDR ranking: current brand %r from plan did not match any registry entry",
                current_brand,
            )

I assumed the existence of:

  1. A registry object that is iterable and whose items have a .brand attribute.
  2. A module-level _LOGGER logger instance.

To integrate this cleanly, you should:

  1. Ensure registry is iterable and that its items expose a brand attribute (or adjust the loop to match your actual type, e.g., registry.retailers or registry.values()).
  2. Confirm _LOGGER is defined in this module (typical Home Assistant pattern: import logging and _LOGGER = logging.getLogger(__name__)). If it is named differently, update the logging call accordingly.
  3. If RetailerEndpoint is not the correct type for current_retailer, adjust the annotation accordingly or remove it to match your existing conventions.

# ---------------------------------------------------------------------------


class TestGetCompetitorRetailers:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test for non-string brand values in cdr_plan.data.brand to cover the type-guard logic

get_competitor_retailers only accepts str brands and intentionally ignores None/non-string values to avoid .strip()/matching errors. Current tests cover a missing brand, but not the case where brand exists and is malformed (e.g. None, 42, {}), which is what the guards are for.

To cover that behavior, you could add a parametrized test like:

@pytest.mark.parametrize("bad_brand", [None, 42, {}, []])
def test_non_string_brand_is_ignored(bad_brand):
    opts = {"cdr_plan": {"data": {"brand": bad_brand}}}
    agl = _retailer("AGL Energy", brand_id="200")
    with patch(
        "custom_components.pricehawk.cdr.ranking_job.get_registry",
        AsyncMock(return_value=([agl], "live")),
    ):
        result = asyncio.run(get_competitor_retailers(None, opts))
    assert all(r.brand_name != bad_brand for r in result)

This ensures malformed brand values are ignored and do not break the function or enter the competitor list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant