fix(nemweb): regex tolerates full server path prefix + bump 1.6.0-beta.3#147
Conversation
Critical hotpatch. The earlier _LEGACY fix (#107) silenced one cause of NEMWeb listing parse failures but not the actual one. Live UAT against the production NEMWeb directory on 2026-05-24 showed the real shape: <A HREF="/Reports/CURRENT/DispatchIS_Reports/PUBLIC_DISPATCHIS_202605221100_..._.zip"> The prior regex was case-insensitive (via re.IGNORECASE) but required PUBLIC_DISPATCHIS_ to sit immediately after the opening quote — so it matched zero files even though both the case and the _LEGACY-optional fixes were correct. AEMO-Direct DWT and Flow Power's AEMO poll were both silently broken in production since the PR #87 / #107 work. Fix: insert a non-greedy ``[^"]*?`` between ``href="`` and the filename capture group, keeping the captured filename clean so the existing lexical-sort logic in _pick_latest_dispatch_file works unchanged. Verified against the live directory: regex now finds 576 matches in the 2026-05-24 snapshot (was 0). Two regression tests pin the new path-prefix tolerance — one using the exact shape pulled from the live directory, one defensive with mixed case + absolute URLs. Manifest bumped to 1.6.0-beta.3 for HACS-beta re-deploy. Silver-checklist test version assertion updated to match. Full test suite: 1130 passing.
There was a problem hiding this comment.
Code Review
This pull request addresses a production issue where the NEMWeb directory listing failed to match files due to unexpected server path prefixes in the HREF attributes. The regex in aemo_api.py was updated to handle these prefixes, and the version was bumped to 1.6.0-beta.3 with accompanying regression tests. A review comment identifies a latent bug where the case-insensitive regex matching combined with case-sensitive lexical sorting could lead to incorrect file selection if filename casing varies on the server.
| # prefix) keeps working unchanged. | ||
| _FILE_RE = re.compile( | ||
| r'href="(PUBLIC_DISPATCHIS_\d{12}_\d+(?:_LEGACY)?\.zip)"', | ||
| r'href="[^"]*?(PUBLIC_DISPATCHIS_\d{12}_\d+(?:_LEGACY)?\.zip)"', |
There was a problem hiding this comment.
The regex uses re.IGNORECASE, which allows matching filenames with inconsistent casing (e.g., public_dispatchis_...). However, the sorting logic in _pick_latest_dispatch_file (line 134) performs a default lexical sort, which is case-sensitive. This introduces a latent bug: if the directory listing ever contains a mix of cases, an older file might be selected as the 'latest' simply because its filename is lowercase (since 'p' > 'P' in ASCII). While NEMWeb filenames are currently uppercase, the code should ideally use a case-insensitive sort (e.g., sorted(matches, key=str.lower)) to remain robust against server-side casing changes or proxy normalization.
…#153) * fix(stack): four gemini-review findings I missed across beta.4-beta.8 Ryan caught that I'd been merging today's PRs with --auto without reading gemini-code-assist reviews. These are the legitimate findings from those reviews, fixed in one batch. #148 (gemini): per-tick build_explanation duplicates the rollover branch. Beta.4 added a per-tick rebuild but left the daily-rollover rebuild in place. At midnight the rollover branch built the "yesterday's summary" snapshot, then the per-tick rebuild a few seconds later clobbered it with the post-reset all-zeros snapshot. Per-tick rebuild fires on the midnight tick too, so the rollover one was redundant AND actively harmful. Removed. #147 (gemini): _pick_latest_dispatch_file lex-sort is case-sensitive. _FILE_RE uses re.IGNORECASE — could capture mixed-case filenames. sorted(matches)[-1] then compared Unicode codepoints; uppercase sorts before lowercase. A hypothetical mixed-case listing would put an older uppercase file last after sort. NEMWeb today is all uppercase but the contract should match the regex's tolerance. key=str.upper normalises. #150 (gemini): rebuild_engine never re-reads _grid_power_entity. The options→data fallback added in beta.6 fires once at __init__ and never refreshes. Users editing the grid sensor via options-flow saw the integration silently keep pointing at the prior entity until HA restart. rebuild_engine is the options-update path — now re-resolves. #152 (gemini): reset_today service silently succeeds with no entries. Silver action-exceptions rule: service handlers should raise when they cannot perform the action. A user with all entries in failed-load state would see "Successfully executed pricehawk.reset_today" and observe no dashboard change. Now raises HomeAssistantError telling them to reload the integration. PR #146 (export-credit bullet) and PR #149 (loop optimization) are real findings too but lower priority — left as future work. Manifest bumped to 1.6.0-beta.9. Silver-checklist version assertion updated to match. Full test suite: 1140 passing. * fix(coordinator): drop duplicate _grid_power_entity overwrite in rebuild_engine Per gemini review of PR #153: the fallback I added at the TOP of rebuild_engine was being silently negated by a duplicate options-only assignment at the END of the same function. Removed the second assignment; kept a comment marker so this isn't accidentally reintroduced. Bug effect: rebuild_engine fired on every options-flow save would overwrite my fix with the empty-string default if entry.options didn't have CONF_GRID_POWER_SENSOR. Same regression as the original bug beta.6 was meant to fix. NOTE: gemini also flagged that rebuild_engine creates new provider instances on every options update, losing today's daily accumulators (import_kwh_today, import_cost_today_c, etc.). That's a real concern but a bigger refactor — left for a follow-up PR.
Critical hotpatch found via beta.2 UAT.
The earlier _LEGACY fix (#107) silenced the wrong cause. Real NEMWeb HTML:
Prior regex was case-insensitive but required
PUBLIC_DISPATCHIS_to sit immediately after the opening quote → matched zero files. AEMO-Direct DWT and Flow Power's AEMO poll have been silently broken since the PR #87 / #107 work.Fix: insert
[^"]*?betweenhref="and the filename group. Verified live: 576 matches vs 0.Test plan
TestPickLatestFilepytest— 1130 passingruff check— clean