Skip to content

fix(codex): P0-1 token-out-of-URL + P0-2 reset-all-providers + P1-6 cancel-on-unload#109

Merged
Artic0din merged 3 commits into
devfrom
fix/codex-p0-p1-bundle
May 23, 2026
Merged

fix(codex): P0-1 token-out-of-URL + P0-2 reset-all-providers + P1-6 cancel-on-unload#109
Artic0din merged 3 commits into
devfrom
fix/codex-p0-p1-bundle

Conversation

@Artic0din

@Artic0din Artic0din commented May 23, 2026

Copy link
Copy Markdown
Owner

Summary

Three codex full-repo review findings from 2026-05-23 — all P0/P1 fixes that don't require deeper architectural investigation. Each commit is a single focused finding.

Findings + fixes

P0-1 — `setup_panel_iframe` appended HA token to URL (security)

Was: `dashboard_url += f"&token={ha_token}"` — the iframe panel registered with a URL containing the user's HA long-lived access token.

Leak paths: browser history, Referer headers, screenshots, panel-config dumps, log files (the existing log-line redaction only covered one path).

Fix: Stop appending the token from the Python side. The iframe page (`dashboard.html`) already has a four-method auth fallback chain — methods 2/3/4 (parent frame `hassConnection`, parent `localStorage.hassTokens`, local `localStorage`) all work in HA's same-origin iframe context.

The new Lit `setup_panel_custom_v2` panel never used URL tokens — it's the long-term replacement. Iframe panel remains available during the migration.

P0-2 — Daily rollover skipped every provider except Amber (correctness)

Was: Coordinator's daily-rollover branch only ever called `self._amber.reset_daily()` (and only inside the monthly-reset branch lower down). DWT, CdrPlan, FlowPower, and LocalVolts providers carried `today_cost` across midnight indefinitely.

Impact: Energy Dashboard `sensor.pricehawk_today_cost` showed accumulated multi-day total instead of today only. External statistics dual-write corrupt because they read from the same accumulator.

Fix: Iterate `self._providers.values()` in the rollover branch, call `reset_daily()` on each, wrapped in try/except so one buggy provider can't break the day's reset.

P1-6 — Background tasks not cancelled on unload (lifecycle)

Was: PR #107 switched the initial ranking + backfill scheduler to `hass.async_create_background_task` but threw the task handles away.

Impact: Reload/unload left tasks running against an unloaded coordinator — visible as pytest "coroutine was never awaited" warnings AND as real data races when reload happens mid-ranking.

Fix: Capture both task handles; register `entry.async_on_unload(task.cancel)` per task.

Tests

```
1078 passing (was 1070 — +8 regression tests)
ruff: All checks passed
```

8 tests in `tests/test_codex_p0_p1_fixes.py` (one class per finding):

Class Pins
`TestDailyRolloverResetsAllProviders` (3 tests) P0-2
`TestBackgroundTaskCancellationOnUnload` (3 tests) P1-6 + PR #107
`TestIframePanelNoTokenInUrl` (2 tests) P0-1

Test plan

  • pytest — 1078 passing
  • ruff — clean
  • Deploy to live HA — open the legacy iframe panel, confirm dashboard loads + WebSocket auths via method 2/3/4 (P0-1 live proof)
  • `sensor.pricehawk_today_cost` resets to $0 after midnight (P0-2)
  • Reload integration via kebab → no orphaned tasks (P1-6)

Remaining open codex findings (separate PRs)

  • P1-1 options-flow named-comparator reads legacy `hass.data`
  • P1-2 services captured per-entry instead of resolved at call time
  • P1-3 external-stats cumulative `sum` adds net (can go negative) — needs HA-docs research first
  • P1-4 cheap-rank only reads `timeOfUseRates` (single-rate plans silently excluded)
  • P1-5 DWT `from_dict(today)` ignores the `today` arg

Artic0din and others added 3 commits May 23, 2026 10:34
…lover

Codex full-repo review (2026-05-23, P0-2) flagged that the daily
rollover persists yesterday's history but only resets the Amber
provider's accumulators (and only inside the monthly-reset branch
lower down). DWT, CdrPlan, FlowPower, and LocalVolts providers carry
``today_cost`` across days indefinitely — corrupting Energy Dashboard
cost sensors and external statistics.

Add a `for provider in self._providers.values(): provider.reset_daily()`
loop after history capture, wrapped in try/except so a single buggy
provider can't take out the whole day's reset. Logs the offending
provider's id on failure for triage.

3 source-level regression tests pin the contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tasks

Codex full-repo review (2026-05-23, P1-6) flagged that PR #107
switched the initial ranking job + backfill scheduler to
``hass.async_create_background_task`` to fix the bootstrap-blocking
bug, but didn't retain the task handles. On reload/unload the tasks
survive against an unloaded coordinator — visible as pytest
"coroutine was never awaited" warnings AND as real data races in
production when a reload happens mid-ranking.

Capture both task handles and register ``entry.async_on_unload(
task.cancel)`` for each. HA now cancels both cleanly on reload/unload.

3 regression tests pin the contract (in the same file as P0-2's tests):
- ranking task handle captured + cancel-on-unload registered
- backfill task handle captured + cancel-on-unload registered
- bare ``hass.async_create_task(...)`` never re-introduced for these
  call sites (regression guard against accidentally reverting PR #107
  while patching task lifecycle)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex full-repo review (2026-05-23, P0-1) flagged that
`setup_panel_iframe` appended the HA long-lived access token to the
iframe URL as `&token=<jwt>`. Tokens in URLs leak via:

- browser history
- HTTP Referer headers
- screenshots and screen-share
- panel-config dumps (HA backups, frontend storage)
- log files (despite the existing `&token=<REDACTED>` log redaction —
  the redaction only covered our log line, not the other paths)

The iframe page (`dashboard.html`) already has a four-method auth
fallback chain documented inside the file:

  1) URL ?token=               (the leaky path)
  2) parent.hassConnection      (panel-iframe parent frame)
  3) localStorage.hassTokens    (HA frontend auth cache)
  4) parent.localStorage.hassTokens

Methods 2/3/4 work in HA's same-origin iframe context — the iframe
loads from `/local/...` which shares an origin with the HA frontend.
Removing method 1 from the Python side leaves the page authenticating
cleanly via the existing fallback paths.

dashboard.html itself is unchanged — keeping method-1 in the JS as
dead code is harmless and protects users running stale dashboards
during the upgrade window. Future cleanup PR can strip it.

The new Lit `setup_panel_custom_v2` panel (Phase 10 PR-13) is the
long-term replacement — it never used URL tokens. The iframe panel
stays available during the migration window per its existing intent.

2 regression tests pin the contract:
- no `&token=` concatenation inside `setup_panel_iframe`
- no `entry.data["ha_token"]` access patterns inside the function

1078 passing (+2), ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Artic0din Artic0din changed the title fix(codex): P0-2 daily-reset all providers + P1-6 cancel background tasks on unload fix(codex): P0-1 token-out-of-URL + P0-2 reset-all-providers + P1-6 cancel-on-unload May 23, 2026
@Artic0din Artic0din merged commit 72403b8 into dev May 23, 2026
7 of 11 checks passed
@Artic0din Artic0din deleted the fix/codex-p0-p1-bundle branch May 23, 2026 04:51
Artic0din added a commit that referenced this pull request May 23, 2026
Codex full-repo review (2026-05-23, P1-5): the DWT provider's
``from_dict(data, today)`` required ``today`` as a contract argument
but then immediately ``del today``'d it and restored daily counters
unconditionally. After a restart that crosses midnight, yesterday's
``import_kwh_today`` / ``import_cost_today_c`` / export equivalents
were resurrected as today's — corrupting Energy Dashboard cost
sensors and the external statistics push for the rest of the day.

This is closely related to the codex P0-2 fix (PR #109) but covers a
different code path. P0-2 added ``reset_daily()`` to the coordinator's
midnight rollover branch — that handles the "HA stays running"
midnight case. P1-5 covers the "HA restarts after midnight" case —
``async_restore_state`` runs and pulls the stored daily counters from
the JSON store; without a date check those counters land in today's
running totals.

Fix:
- ``to_dict()`` persists ``state_date`` = ``_last_tick.date().isoformat()``,
  the HA-tz date the daily counters apply to. ``None`` when no tick
  has run yet (fresh provider, treated as cross-midnight on restore).
- ``from_dict()`` parses ``state_date``, compares to the supplied
  ``today`` HA-tz date. If they match, restore counters (existing
  roundtrip contract). If not, zero counters and log INFO with the
  two dates so operators can see why the dashboard reset.
- Missing ``state_date`` (pre-fix snapshots) and malformed values are
  treated as cross-midnight — safe default = zero rather than risk
  carrying yesterday's totals. Malformed strings additionally emit a
  WARNING for triage.
- ``last_price`` and ``last_tick`` are still restored regardless —
  they are not daily accumulators and survive midnight cleanly,
  giving the new day a known rate to start with.

4 regression tests in test_dynamic_wholesale_tariff_provider.py:
- cross-midnight restart resets counters (the codex case)
- same-day restart preserves counters (existing contract)
- missing state_date treated as cross-midnight (back-compat)
- malformed state_date treated as cross-midnight

1082 passing (was 1078), ruff clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Artic0din added a commit that referenced this pull request May 23, 2026
… Jinja (#145)

Two real bugs surfaced by a Copilot-CLI retro-review of the 22 merged PRs
that the prior @claude batch couldn't reach (OIDC workflow-validation
against stale main).

statistics.py — external_statistic_id now sanitizes via regex:
  CDR-derived provider_ids carry the plan id verbatim
  (e.g. ``agl_AGL-CDR-N0001`` for AGL via CDR), so `.lower()` alone
  left hyphens that the recorder's ``[a-z0-9_]+`` regex silently
  rejected. Every CDR user's dual-write would fail and the Energy
  Dashboard would never receive their cost data — same silent-failure
  class #107 fixed for uppercase ULIDs. Added
  ``_STATISTIC_ID_OBJECT_SAFE`` compiled regex that coerces ANY
  character outside ``[a-z0-9_]`` to underscore. New regression test
  ``test_cdr_plan_id_with_hyphens_is_sanitized`` + adjusted
  ``test_entry_id_sliced_to_8_chars`` for the post-sanitization shape.
  Surfaced by retro-review of PRs #93, #95.

blueprints — variables block replaces !input inside Jinja:
  ``daily_7pm_summary.yaml`` and ``wholesale_spike_alert.yaml`` had
  templates like ``{{ states(!input today_cost_sensor) }}``. ``!input``
  is a YAML tag (resolved at YAML parse time) and is invalid inside
  Jinja ``{{ }}`` — Jinja parses ``!`` as an invalid operator and the
  template never renders. Replaced with the standard HA pattern: a
  ``variables:`` block at action level that binds blueprint inputs as
  Jinja identifiers (``today_cost_entity: !input today_cost_sensor``
  etc.), then ``{{ states(today_cost_entity) }}`` in the message.
  Surfaced by retro-review of PRs #99, #100.

22 Copilot reviews ran (PRs #85, #87-#101, #104, #105, #108-#111).
Most findings were false positives — Copilot flagged ``ServiceValidationError``
and ``async_items`` as broken HA APIs (they're both fine), and several
findings duplicated bugs already fixed by codex P0/P1 work
(token-in-URL #109, DWT reset #109). Findings library + triage notes
archived in ``.planning/copilot-retro/``.

Full test suite: 1120 passing.
Artic0din added a commit that referenced this pull request May 24, 2026
)

Bumps manifest to 1.6.0-beta.2 — first HACS-beta tag carrying the full
Phase 7-11 work landed since 1.6.0-beta.1. Live UAT against the production
HA install on 2026-05-24 confirmed the prior fixes (NEMWeb regex #107,
statistic_id sanitization #114/#145, bootstrap block #107, codex P0/P1
work in #109/#110/#111) had not reached users because the last tagged
release was v1.4.0-beta.1 from 2026-05-02.

Also fixes a UAT-surfaced bug not previously caught:

explanation.py — DWT winners no longer produce empty bullets.
  ``build_explanation`` branched on literal provider IDs ("amber",
  "globird", "flow_power", "localvolts") but Dynamic Wholesale Tariff
  providers carry IDs like "dwt_aemo_direct" / "dwt_openelectricity",
  so every DWT win fell through to an empty bullet list. The Best
  Provider sensor's winner_explanation attribute showed bullets=[]
  with margin_aud=0 — no "why" surfaced to the user. Added a
  ``winner_id.startswith("dwt_")`` branch + ``_dwt_won_bullets``
  builder using the standard provider snapshot shape: wholesale spot
  rate (c/kWh derived from $/MWh), today's import volume + cost,
  daily supply charge, and a stale-price warning when the wholesale
  price is over 10 min old. 5 regression tests in
  ``TestDwtWinnerBullets``.

Full test suite: 1125 passing.

Tag + GitHub pre-release follow immediately after this merge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant