Phase 3.1 prep — EME refdata2 registry + brand disambiguation#54
Conversation
Reviewer's GuideReplaces the legacy jxeeno-based CDR registry with Energy Made Easy (EME) refdata2 as the sole retailer source, adds a brand discriminator to registry and HTTP client so shared-base-URI retailers can be disambiguated, wires this through the config flow, and hardens live-registry fetching with an EME snapshot fallback and expanded tests. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. 🗂️ Base branches to auto review (2)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
fetch_plan_detailyou unconditionally append?brand=..., which will break ifbase_urlis ever given with its own query string; consider usingurllib.parse.urlparse/urlunparseorurllib.parse.parse_qsto merge query parameters safely instead of manual concatenation. - The baked‑in registry tests assert
len(orgs) > 50/len(endpoints) > 50; tying tests to approximate current registry size can make them brittle over time, so it may be more robust to assert the presence of specific expected orgIds/brands or invariants (e.g. that shared‑base‑URI brands retain distinctcdr_brand) instead of a numeric threshold. - You now have two very similar HTTP mocking helpers in the tests (
_mock_sessionand_mock_session_capturing); consider consolidating them into a single, parameterized helper to reduce duplication and keep future test changes (e.g. adding headers or timeouts) in one place.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `fetch_plan_detail` you unconditionally append `?brand=...`, which will break if `base_url` is ever given with its own query string; consider using `urllib.parse.urlparse/urlunparse` or `urllib.parse.parse_qs` to merge query parameters safely instead of manual concatenation.
- The baked‑in registry tests assert `len(orgs) > 50`/`len(endpoints) > 50`; tying tests to approximate current registry size can make them brittle over time, so it may be more robust to assert the presence of specific expected orgIds/brands or invariants (e.g. that shared‑base‑URI brands retain distinct `cdr_brand`) instead of a numeric threshold.
- You now have two very similar HTTP mocking helpers in the tests (`_mock_session` and `_mock_session_capturing`); consider consolidating them into a single, parameterized helper to reduce duplication and keep future test changes (e.g. adding headers or timeouts) in one place.
## Individual Comments
### Comment 1
<location path="custom_components/pricehawk/cdr/registry.py" line_range="118-123" />
<code_context>
+ display = o.get("tradingName") or o.get("orgName")
+ if not (cdr_code and display):
continue
+ logo_path = o.get("logo")
+ logo_uri = (
+ f"{_EME_LOGO_PREFIX}{logo_path}"
+ if isinstance(logo_path, str) and logo_path.startswith("/")
+ else logo_path
+ )
out.append(
RetailerEndpoint(
- brand_id=str(bid),
- brand_name=str(brand),
- base_uri=str(base).rstrip("/"),
- logo_uri=e.get("logoUri"),
- abn=e.get("abn"),
- last_updated=e.get("lastUpdated"),
+ brand_id=str(org_id),
+ brand_name=str(display),
+ base_uri=_EME_BASE_URI_TEMPLATE.format(cdr_code=cdr_code),
+ logo_uri=logo_uri,
+ abn=str(o.get("abn")) if o.get("abn") else None,
+ last_updated=None, # EME envelope has no per-row mtime
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Normalize `logo_uri` to a string or `None` instead of passing through arbitrary types.
Currently `logo_uri` is a `https://energymadeeasy.gov.au/...` string only when `logo_path` is a string starting with `/`; otherwise we pass through whatever truthy value `logo_path` is. If upstream changes `logo` to a non-string (e.g. dict/list), we’ll store that in `RetailerEndpoint.logo_uri` despite its type being `str | None`. Consider normalising non-string values to `None` (or `str(...)`) so downstream consumers can safely rely on the declared type.
```suggestion
logo_path = o.get("logo")
if isinstance(logo_path, str) and logo_path:
logo_uri = (
f"{_EME_LOGO_PREFIX}{logo_path}"
if logo_path.startswith("/")
else logo_path
)
else:
logo_uri = None
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
CR-driven correctness + security fixes on PR #28 plus v1.5.0 release artefacts (manifest version, CHANGELOG section). Correctness - cdr_client.fetch_plan_list now dedups by planId — republish-boundary page-flips no longer double-count plans. - cdr_client.fetch_plan_list: 404 from the LIST endpoint raises CdrAPIError (bad URL / proxy regression), not CdrPlanNotFound which is reserved for stale planId on the detail endpoint. - coordinator.daily_rollover skips saving_month_aud accumulation when Amber is not configured (was computing fake savings against $0). - streaming.from_dict only restores _last_update when the stored state belongs to today — restoring yesterday's timestamp produced a synthetic delta on the first tick of a new day → over-counted energy/cost. - providers/cdr_plan guards float(dailySupplyCharge) against malformed CDR values — defaults to $0/day instead of crashing provider setup. - vpp_rebate.parse_rule uses safe_int for batteries_enrolled; garbage option values no-op the credit instead of aborting parser dispatch. Wizard - Removed dead "Skip CDR — enter rates manually" sentinel from the install-flow retailer + plan-select dropdowns. With manual entry deleted in Phase 3.0f the skip handlers re-entered the same step, creating an infinite loop. Options-flow keeps a Cancel sentinel (legitimate escape back to init menu). - Deleted async_step_cdr_override and CONF_CDR_OVERRIDE_JSON — the step was wired into strings/const but never routed from any other step, so it was unreachable dead code. Helpers _deep_merge_dict + _parse_override_json + their tests removed too. - strings.json + translations/en.json: removed "skip CDR" / "enter rates manually" copy from cdr_retailer, cdr_plan_select, cdr_error step descriptions and cdr_empty_unavailable error. Release scope - manifest.json: 1.4.0-beta.2 → 1.5.0-beta.1. - CHANGELOG: full [1.5.0-beta.1] section documenting CDR-native engine, 8 retailer parsers, opt-in fields, streaming evaluator, wizard rewrite, and all the carried-over fixes. Known limitations (tracked, not fixed in this commit) - tiered_fit PERIOD cap multiplies by distinct days in slots. Correct for full billing-period evals (the common case); under-credits partial windows, over-credits cross-period windows. Proper fix needs electricityContract.billingPeriod parsing — deferred. Documented inline in tiered_fit.py. Test plan - 633 tests pass (was 658 on PR #54 — PR #28 branch doesn't have the registry-EME changes yet). - ruff clean on all modified files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Around line 7-35: The changelog entries currently only live under "##
[Unreleased]"; add a concrete released section header (e.g. "## [x.y.z] -
YYYY-MM-DD") and either move or duplicate the listed "Changed", "Added", and
"Removed" subsections under that new version header while preserving the Keep a
Changelog formatting and the exact bullet text (retailer registry switched to
EME refdata2, RetailerEndpoint.cdr_brand, fetch_plan_list/fetch_plan_detail
brand parameter, baked-in snapshot path, and removed cdr_endpoints.json) so the
PR contains a new versioned release section alongside Unreleased.
In `@custom_components/pricehawk/cdr/registry.py`:
- Around line 115-117: The display value taken from o.get("tradingName") or
o.get("orgName") should be trimmed of leading/trailing whitespace before use and
before persisting as brand_name; update the code path around the display
variable (the lines that compute display and where brand_name is assigned) to
call .strip() on the chosen string (and guard against None), and apply the same
trimming to the other occurrence mentioned (the block around lines that set
brand_name at 127-128) so upstream names with trailing spaces are normalized
before slugging/storage.
- Around line 111-114: The code assumes o.get("cdrCode")/o.get("cdrBrand") are
strings and calls .strip(), which can raise AttributeError or other exceptions
and currently only some exception types are converted to CdrUnavailable; update
get_registry() so access/parsing of cdr_code and cdr_brand (the lines assigning
cdr_code and cdr_brand) are defensive: wrap the retrieval and .strip() calls in
a small try/except that catches AttributeError (or more generally Exception) and
converts any parsing failure into raising CdrUnavailable (the same fallback used
elsewhere), ensuring malformed payloads cannot crash get_registry(); apply the
same defensive pattern to the similar parsing block around lines 165-171.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 02a1eed6-96f8-474c-b665-641f9e8e770d
📒 Files selected for processing (10)
CHANGELOG.mdcustom_components/pricehawk/cdr/cdr_client.pycustom_components/pricehawk/cdr/data/cdr_endpoints.jsoncustom_components/pricehawk/cdr/data/eme_refdata.jsoncustom_components/pricehawk/cdr/registry.pycustom_components/pricehawk/config_flow.pycustom_components/pricehawk/strings.jsoncustom_components/pricehawk/translations/en.jsontests/test_cdr_client.pytests/test_cdr_registry.py
💤 Files with no reviewable changes (1)
- custom_components/pricehawk/cdr/data/cdr_endpoints.json
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.md
⚙️ CodeRabbit configuration file
**/*.md: Verify: no broken links, code examples match actual implementation, version numbers are current, no TODO left unfixed.
Files:
CHANGELOG.md
**/CHANGELOG.md
⚙️ CodeRabbit configuration file
**/CHANGELOG.md: Entries MUST follow Keep a Changelog format. New version section MUST be present for this PR's changes.
Files:
CHANGELOG.md
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Useasync/awaitfor all I/O operations
NEVER hardcode tokens, API keys, or credentials in any file — use HA config entry storage
State restore MUST validate storage version before loading
from_dict() methods MUST receive an explicit HA-timezone date — no date.today() fallback
Files:
tests/test_cdr_client.pycustom_components/pricehawk/cdr/cdr_client.pycustom_components/pricehawk/config_flow.pycustom_components/pricehawk/cdr/registry.pytests/test_cdr_registry.py
⚙️ CodeRabbit configuration file
**/*.py: Check for: type hints on all public functions, no bareexcept:, SQL injection risks, missing input sanitisation, secrets not in code, Flask Blueprint structure respected, APScheduler job error handling.
Files:
tests/test_cdr_client.pycustom_components/pricehawk/cdr/cdr_client.pycustom_components/pricehawk/config_flow.pycustom_components/pricehawk/cdr/registry.pytests/test_cdr_registry.py
**/test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Tariff rate calculation changes require edge case tests (negative rates, midnight boundaries, empty windows)
Files:
tests/test_cdr_client.pytests/test_cdr_registry.py
🔇 Additional comments (5)
custom_components/pricehawk/strings.json (1)
228-228: LGTM!custom_components/pricehawk/translations/en.json (1)
228-228: LGTM!custom_components/pricehawk/config_flow.py (1)
1330-1332: LGTM!Also applies to: 1525-1527, 1548-1550, 2147-2149, 2171-2173
custom_components/pricehawk/cdr/cdr_client.py (1)
62-63: LGTM!Also applies to: 70-73, 78-86, 108-110, 117-119, 121-123
tests/test_cdr_client.py (1)
191-212: LGTM!Also applies to: 214-248
…ation What was broken - jxeeno community registry covered 78 retailers vs the AER's full 117. It also published 2 known wrong base URIs (ARCLINE → /arcline/ instead of /energy-locals/; iO Energy → /io-energy/ instead of /energy-locals/). Plans hosted on shared base URIs (Energy Locals hosts 7 brands; OVO hosts 3) had no way to disambiguate which brand's plans were being requested. What this fixes - Switches the live registry source to EME refdata2 (117 orgs). - Ships the EME snapshot baked in as the offline fallback. - Drops jxeeno entirely — two unreliable sources are not better than one good source with an offline cache. - Adds the cdrBrand discriminator on every RetailerEndpoint and threads it through cdr_client as an optional ?brand= query param. Shared-base-URI plans (ARCLINE, RAA, Cooperative, Indigo, Sonnen, iO, MYOB, OVO CTM, Sunswitch, etc.) are now correctly identified. - Hardens the parser: trailing-whitespace bug in upstream EME cdrBrand fields (Amber, Aurora, Brighte) is stripped. Malformed payloads raise CdrUnavailable so the wizard falls back to baked-in rather than crashing. Test plan - 658 tests pass (added 18 new for EME parsing, baked-in health, shared-base-URI disambiguation, malformed-body fallback, and brand= query-string composition). - Ruff clean on all changed files. Why - Required prep for Phase 3.1 multi-plan ranking. Without cdrBrand, the wizard can't distinguish two brands hosted on the same CDR base URI, so picking the wrong one returns the wrong plan list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
00b7784 to
8d3d9b5
Compare
|
@coderabbitai re-review please — all 3 round-1 findings fixed in 8d3d9b5: CHANGELOG folded into [1.5.0-beta.1] versioned section, defensive _safe_str helper added for non-string payloads + AttributeError caught in fallback, display name whitespace stripped. 647 tests pass. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 188308
Fix 1 — CHANGELOG versioned: Additions land inside the Fix 2 —
Fix 3 — Display name whitespace stripped: The rest of the PR (brand query-param threading through [approve] |
Previously `logo_uri` could be passed through as-is when EME shipped a non-string `logo` value (dict, int, list). RetailerEndpoint.logo_uri is typed `str | None` so downstream consumers expect that contract. Now: `isinstance(logo_path, str) and logo_path` gates the assignment; anything else (None, dict, empty string) becomes None. Test: tests/test_cdr_registry.py::test_logo_uri_normalised_to_str_or_none covers dict, empty string, None, absolute URL, relative path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
PriceHawk's retailer list comes from a public registry of who supports the Consumer Data Right (CDR) energy API. This PR switches the registry source from a community-maintained list (78 retailers) to the official Energy Made Easy refdata service (117 retailers) and fixes a bug where multiple retailers sharing one address couldn't be told apart.
What was broken
What this fixes
?brand=arcline(orcooperative,raaetc) and gets the correct plan list back.cdrBrandis now stripped; malformed payloads fall back to the bundled offline snapshot instead of crashing the wizard.Test plan
pytest— 658 passing (18 new for this PR)?brand=<x>appended when set on plan list + plan detail, omitted when Noneruff check— clean on all changed filesChanges
Registry rewrite (
cdr/registry.py)api.energymadeeasy.gov.au/refdata2?keys=organisations,thirdParties)RetailerEndpointaddscdr_brand: str | NonefieldcdrBrandCdrUnavailablesoget_registry()falls through to baked-inHTTP client (
cdr/cdr_client.py)fetch_plan_list(brand=...)appends?brand=<x>when setfetch_plan_detail(brand=...)appends?brand=<x>when setbrand=and behavior is unchangedConfig flow (
config_flow.py)retailer.cdr_brandUser-facing strings
strings.json,translations/en.json)Data
cdr/data/eme_refdata.json— 86KB EME snapshot (117 orgs, 72 brokers)cdr/data/cdr_endpoints.jsondeleted (legacy jxeeno snapshot)Why
This is the foundation for Phase 3.1 (multi-plan ranking). Without
cdr_brandwe'd recommend the wrong plan to anyone on a shared-base-URI brand. Without 117-retailer coverage, we'd silently exclude ~40 retailers from the comparison. Both are correctness bugs that have to be fixed before ranking math runs against arbitrary user-selected plans.Breaking Changes
None for users — the wizard UX is unchanged.
For code:
LIVE_REGISTRY_URL_JXEENO,_parse_jxeeno_entries,parse_entries_for_test,baked_in_jxeeno_path_for_testremoved. No internal consumers, but if any external code imported the legacy names they will break.cdr/data/cdr_endpoints.jsonremoved.Files Changed
CHANGELOG.md—[Unreleased]sectioncustom_components/pricehawk/cdr/cdr_client.py—brand=param on both fetcherscustom_components/pricehawk/cdr/registry.py— full rewrite, EME-onlycustom_components/pricehawk/cdr/data/eme_refdata.json— new 86KB snapshotcustom_components/pricehawk/cdr/data/cdr_endpoints.json— deletedcustom_components/pricehawk/config_flow.py— 4 call sites passretailer.cdr_brandcustom_components/pricehawk/strings.json— error messagecustom_components/pricehawk/translations/en.json— error messagetests/test_cdr_registry.py— rewritten for EME shape, +9 new teststests/test_cdr_client.py— +4 brand-param testsStacked on #28. Retarget to
devafter #28 merges.🤖 Generated with Claude Code
Summary by Sourcery
Switch the CDR retailer registry to the official Energy Made Easy (EME) refdata2 source and plumb brand discriminators through to plan fetches to correctly handle retailers sharing a base URI.
New Features:
Enhancements:
Documentation:
Chores:
Summary of Changes
Key Changes
api.energymadeeasy.gov.au/refdata2?keys=organisations,thirdPartiescdr_brand: str | Nonefield inRetailerEndpointto disambiguate retailers sharing the same base CDR URIfetch_plan_list()andfetch_plan_detail()to accept optionalbrandparameter, which appends?brand=<value>to CDR requests when providedcdr_brandto CDR API calls (four call sites total)cdr/data/eme_refdata.jsoncontaining 117 organizations as fallback when live fetch failscdrBrandfield; catches and logs parse failures asCdrUnavailableto ensure graceful fallback to offline snapshotstrings.jsonandtranslations/en.jsonto reference EME service instead of jxeeno/GitHubcdr/data/cdr_endpoints.jsonand jxeeno snapshot referencesBreaking Changes
None for end users. Some internal implementation details and legacy filenames have changed, which may affect external imports.
Files Changed