Skip to content

fix(codex): P1-1 named_comparator + P1-2 service singleton + P1-4 singleRate + P1-3 research#111

Merged
Artic0din merged 4 commits into
devfrom
fix/codex-p1-1-p1-2-p1-4
May 23, 2026
Merged

fix(codex): P1-1 named_comparator + P1-2 service singleton + P1-4 singleRate + P1-3 research#111
Artic0din merged 4 commits into
devfrom
fix/codex-p1-1-p1-2-p1-4

Conversation

@Artic0din
Copy link
Copy Markdown
Owner

@Artic0din Artic0din commented May 23, 2026

Summary

Four codex P1 findings from the 2026-05-23 full-repo review. Three code fixes + one research outcome (deferred pending UX call).

Code fixes (3 commits)

P1-1 — Named comparator OptionsFlow read legacy `hass.data`

`config_flow.py:2734` looked up the coordinator via `self.hass.data[DOMAIN][entry_id]` — a v2-era pattern. v3 stores it on `entry.runtime_data`. The legacy lookup always returned None on a v3 install → step aborted with `no_ranked_alternatives` even when the coordinator HAD a populated ranked-alternatives list. User couldn't pin a plan from the ranked list as a Named Comparator.

Fix: Read `self.config_entry.runtime_data.coordinator`.

P1-2 — Services registered per-entry with closure-captured `entry`

Three services (`analyze_csv`, `backfill_history`, `rank_alternatives`) were registered inside `async_setup_entry` with handlers that closed over `entry`. HA's registry stores one handler per (domain, name), so multi-entry installs routed every service call to whichever entry set up LAST. Silent — no error, just wrong data.

Fix: Lift handlers into module-level `_register_services_once` with `has_service` idempotency guard. Handlers resolve the target entry at call time via `hass.config_entries.async_entries(DOMAIN)` honoring an optional `entry_id` field. Multiple entries without `entry_id` → `ServiceValidationError`.

P1-4 — Cheap-rank ignored `singleRate` plans

`_extract_peak_rate_cents` only walked `tariffPeriod[].timeOfUseRates[]`. Real CDR PlanDetailV2 with `rateBlockUType: "singleRate"` carries rates in a different key. Flat plans scored `None` → silently excluded from the ranked list → bias against simpler plans.

Fix: Read BOTH `timeOfUseRates` and `singleRate` blocks, take max across them.

Research outcome (1 commit, docs-only)

P1-3 — Monotonic-sum claim doesn't apply to external statistics

Context7 query against HA developer docs found:

  • The "monotonic, ignore negative deltas" rule lives on sensor-derived stats with `state_class=TOTAL_INCREASING`.
  • External statistics (which we use) have no `state_class` field; the integration provides absolute `sum` values directly and HA stores them as given.
  • Energy Dashboard computes "today's cost" as `sum[end] - sum[start]`. A decreasing sum produces a negative today's-cost — correct for a net-export day.

Codex may have applied the sensor-stats rule to the external-stats path incorrectly. Our current single-net-cost stat is plausibly correct.

Code change deferred pending live-UAT eyeball test (do negative bars on Energy Dashboard look broken on a real export-heavy day?). `TODOS.md` updated with the research + decision criteria. Priority downgraded P1 → P2 (now blocked on UX call, not engineering).

Tests

```
1087 passing (was 1078 — +9 regression tests)
ruff: All checks passed
```

Finding New tests Where
P1-1 2 `test_codex_p0_p1_fixes.py` (`TestNamedComparatorRuntimeDataLookup`)
P1-2 4 `test_codex_p0_p1_fixes.py` (`TestServicesSingletonRegistration`) + fixture update
P1-4 3 `test_cdr_ranking.py`
P1-3 n/a docs only

Commits

```
8f10069 fix(ranking): include singleRate plans in cheap-rank extractor
d7b4512 fix(config_flow): named_comparator reads coordinator from runtime_data
e7cb3be fix(services): register once with call-time entry resolution
d4792b8 docs(todos): research P1-3 monotonic-sum claim, defer pending UX call
```

Test plan

  • pytest — 1087 passing
  • ruff — clean
  • Deploy to live HA — open Options → Named Comparator → confirm dropdown populated (P1-1)
  • Multi-entry: Developer Tools → Services → call `pricehawk.rank_alternatives` with explicit `entry_id` → confirm correct entry's coordinator runs (P1-2)
  • Wait for next ranking pass; confirm singleRate plans appear in `sensor.pricehawk_ranked_alternatives` (P1-4)
  • When solar season produces an export-heavy day, eyeball Energy Dashboard cost chart and decide on P1-3 (TODOS.md has decision criteria)

Codex queue status

All 9 codex full-repo review findings now triaged. Six closed (P0-1, P0-2, P1-1, P1-2, P1-4, P1-5, P1-6 + two P2 quality items), one deferred (P1-3 with research), two filed (P2 MagicMock-stubs + P2 pycache cleanup — quality work, not blocking).

Artic0din and others added 4 commits May 23, 2026 15:44
Codex P1-4 (2026-05-23): cheap_rank_score documented support for
SINGLE_RATE plans, but _extract_peak_rate_cents() only walked
tariffPeriod[].timeOfUseRates[]. Real CDR PlanDetailV2 bodies with
rateBlockUType="singleRate" carry their rates in a singleRate.rates[]
block — NOT in timeOfUseRates. Those plans scored None and got
silently excluded from the ranked-alternatives list, biasing the
output against flat/single-rate plans.

Fix: read BOTH blocks and take the max unitPrice across them. The
existing test fixture encodes single-rate plans inside timeOfUseRates
with type="SINGLE_RATE" for fixture compat — that shape is still
handled. Real CDR endpoints (which use the singleRate key) now work
too. The demand-charges path (TODO-5 in TODOS.md) remains
intentionally out of scope — that's a different cost model that
needs evaluator work, not just rate extraction.

3 regression tests in TestCheapRankScore:
- single-rate block extraction (real CDR shape)
- stepped singleRate picks max
- hybrid plan with both blocks picks max across them

1081 passing (was 1078), ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P1-1 (2026-05-23): the named-comparator OptionsFlow step read
the coordinator from ``self.hass.data[DOMAIN][entry_id]``, but v3
stores the coordinator on ``entry.runtime_data`` (PriceHawkData
dataclass) — see data.py and the integration's async_setup_entry.

The legacy hass.data lookup always returned None on a v3 install, so
``alternatives`` stayed empty and the step aborted with
``no_ranked_alternatives`` even when the coordinator HAD a populated
ranked-alternatives list. The user couldn't pin any plan from the
ranked list as a Named Comparator — the whole feature was a no-op.

Fix: read ``self.config_entry.runtime_data.coordinator`` instead.
The Silver-checklist "no legacy hass.data" test (test_runtime_data
test_no_legacy_hass_data_reads) already grep-banned this pattern
package-wide; codex caught that this specific call site slipped
through because the test fixture matched only the literal string
``hass.data[DOMAIN]`` (with the brackets), and the broken line used
``hass.data.get(DOMAIN, {}).get(...)`` instead.

2 regression tests in tests/test_codex_p0_p1_fixes.py:
- named_comparator reads from runtime_data
- named_comparator does not use legacy access patterns

1083 passing (was 1081), ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P1-2 (2026-05-23): the three services (analyze_csv,
backfill_history, rank_alternatives) were registered inside
``async_setup_entry`` with handlers that captured the current
``entry`` in a closure. HA's service registry stores one handler per
(domain, name), so each entry's setup silently overwrote the
previous registration. Multi-entry installs ended up routing every
service call to whichever entry was set up LAST — calls to a service
intended for entry-A could land on entry-B's coordinator.

Fix:
- Lift the three handlers + a shared ``_resolve_service_target_entry``
  helper into a module-level ``_register_services_once`` function.
- Guard registration with ``hass.services.has_service`` so the second
  entry's setup is a no-op for service registration.
- Handlers resolve the target entry at call time via
  ``hass.config_entries.async_entries(DOMAIN)``, honoring an optional
  ``entry_id`` field in the service call data. Multiple loaded entries
  without an ``entry_id`` raise ServiceValidationError so the caller
  has to be explicit (silent routing is the bug we're fixing).
- ``async_unload_entry`` already removes the singleton services on
  the LAST entry's unload (pre-existing pattern, lines 240-247) — pairs
  cleanly with this idempotent-register.

Tests:
- tests/test_runtime_data.py::test_service_handlers_resolve_fresh_coordinator
  updated: the hass mock now pre-populates ``config_entries.async_entries``
  with the test entry so the new resolution path can find it.
- tests/test_runtime_data.py::_make_hass: ``has_service`` default
  return_value=False so the idempotent guard doesn't skip registration
  in test runs.
- tests/test_codex_p0_p1_fixes.py: 4 new tests pinning the singleton
  helper + has_service guard + call-time resolver + ambiguous-call
  rejection.

1087 passing (was 1083), ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P1-3 (2026-05-23) flagged that `async_add_external_statistics`
with `has_sum=True` is "monotonic", and our cumulative cost sum can
decrease on export-heavy days. The fix codex proposed (split import +
export streams) is a schema change that would break existing users'
stat history, so verify before changing.

Context7 query against HA developer docs confirms:
- The monotonic rule lives on SENSOR-derived stats with
  `state_class=TOTAL_INCREASING`: "if the difference is negative,
  nothing is added to the sum."
- For `state_class=TOTAL`: sum updates include negative diffs.
- External statistics (via `async_add_external_statistics`) carry no
  `state_class` field on `StatisticMetaData`. The integration provides
  absolute `sum` values directly; HA stores them as given.
- Energy Dashboard's cost source computes today's cost as
  `sum[end_of_day] - sum[start_of_day]`. A decreasing sum produces a
  negative today's-cost — correct for a net-export day where the user
  earned more than they spent.

Codex may have been applying the sensor-stats monotonic rule to the
external-stats path incorrectly. Our current single-net-cost stat is
plausibly correct; the remaining concern is Energy Dashboard UX (do
users see broken-looking negative bars on export-heavy days?).

Updated TODO-CODEX-P1-3 to reflect:
- Research complete
- Code change deferred pending live-UAT eyeball test on a known
  export-heavy day (will happen naturally during solar season on the
  installed entry)
- Decision criteria written: if negative-bar UX is acceptable, close
  as won't-fix; otherwise, split streams in a separate migration PR
- Priority downgraded P1 → P2 (now blocked on UX call, not engineering)

No code change in this commit — pure docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Artic0din Artic0din changed the title fix(codex): P1-1 named_comparator + P1-2 service singleton + P1-4 singleRate ranking fix(codex): P1-1 named_comparator + P1-2 service singleton + P1-4 singleRate + P1-3 research May 23, 2026
@Artic0din Artic0din merged commit 277447b into dev May 23, 2026
7 of 11 checks passed
@Artic0din Artic0din deleted the fix/codex-p1-1-p1-2-p1-4 branch May 23, 2026 05:49
Artic0din added a commit that referenced this pull request May 23, 2026
… Jinja (#145)

Two real bugs surfaced by a Copilot-CLI retro-review of the 22 merged PRs
that the prior @claude batch couldn't reach (OIDC workflow-validation
against stale main).

statistics.py — external_statistic_id now sanitizes via regex:
  CDR-derived provider_ids carry the plan id verbatim
  (e.g. ``agl_AGL-CDR-N0001`` for AGL via CDR), so `.lower()` alone
  left hyphens that the recorder's ``[a-z0-9_]+`` regex silently
  rejected. Every CDR user's dual-write would fail and the Energy
  Dashboard would never receive their cost data — same silent-failure
  class #107 fixed for uppercase ULIDs. Added
  ``_STATISTIC_ID_OBJECT_SAFE`` compiled regex that coerces ANY
  character outside ``[a-z0-9_]`` to underscore. New regression test
  ``test_cdr_plan_id_with_hyphens_is_sanitized`` + adjusted
  ``test_entry_id_sliced_to_8_chars`` for the post-sanitization shape.
  Surfaced by retro-review of PRs #93, #95.

blueprints — variables block replaces !input inside Jinja:
  ``daily_7pm_summary.yaml`` and ``wholesale_spike_alert.yaml`` had
  templates like ``{{ states(!input today_cost_sensor) }}``. ``!input``
  is a YAML tag (resolved at YAML parse time) and is invalid inside
  Jinja ``{{ }}`` — Jinja parses ``!`` as an invalid operator and the
  template never renders. Replaced with the standard HA pattern: a
  ``variables:`` block at action level that binds blueprint inputs as
  Jinja identifiers (``today_cost_entity: !input today_cost_sensor``
  etc.), then ``{{ states(today_cost_entity) }}`` in the message.
  Surfaced by retro-review of PRs #99, #100.

22 Copilot reviews ran (PRs #85, #87-#101, #104, #105, #108-#111).
Most findings were false positives — Copilot flagged ``ServiceValidationError``
and ``async_items`` as broken HA APIs (they're both fine), and several
findings duplicated bugs already fixed by codex P0/P1 work
(token-in-URL #109, DWT reset #109). Findings library + triage notes
archived in ``.planning/copilot-retro/``.

Full test suite: 1120 passing.
Artic0din added a commit that referenced this pull request May 24, 2026
)

Bumps manifest to 1.6.0-beta.2 — first HACS-beta tag carrying the full
Phase 7-11 work landed since 1.6.0-beta.1. Live UAT against the production
HA install on 2026-05-24 confirmed the prior fixes (NEMWeb regex #107,
statistic_id sanitization #114/#145, bootstrap block #107, codex P0/P1
work in #109/#110/#111) had not reached users because the last tagged
release was v1.4.0-beta.1 from 2026-05-02.

Also fixes a UAT-surfaced bug not previously caught:

explanation.py — DWT winners no longer produce empty bullets.
  ``build_explanation`` branched on literal provider IDs ("amber",
  "globird", "flow_power", "localvolts") but Dynamic Wholesale Tariff
  providers carry IDs like "dwt_aemo_direct" / "dwt_openelectricity",
  so every DWT win fell through to an empty bullet list. The Best
  Provider sensor's winner_explanation attribute showed bullets=[]
  with margin_aud=0 — no "why" surfaced to the user. Added a
  ``winner_id.startswith("dwt_")`` branch + ``_dwt_won_bullets``
  builder using the standard provider snapshot shape: wholesale spot
  rate (c/kWh derived from $/MWh), today's import volume + cost,
  daily supply charge, and a stale-price warning when the wholesale
  price is over 10 min old. 5 regression tests in
  ``TestDwtWinnerBullets``.

Full test suite: 1125 passing.

Tag + GitHub pre-release follow immediately after this merge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant