docs(spec): three quick wins from spec-completeness audit (81/100) #108
Merged
Spec audit (5-layer framework) scored the repo 81/100 — Safe tier for
agent delegation — with L2 (interface) + L5 (cultural) flagged yellow.
The three quick wins close the highest-ROI gaps without forcing a
codebase rewrite.
Win 1 — pyproject.toml (L2 + L5 fix)
Pin ruff + pyright + pytest config. Ruff ruleset matches what the
codebase already passes (E + F), with a documented future-strict
ladder (W, I, B, UP, S, BLE, RUF) and pre-counted violations per
rule so each can be adopted incrementally. Pyright basic-mode +
reportMissingImports=warning to tolerate HA libs being absent in
the sandbox while still catching real type errors.
Win 2 — docs/development.md naming + reference implementations (L5)
Append three sections:
- Naming conventions for entity_id, provider_id, statistic_id,
config-flow step IDs, service IDs, CONF_* constants. Pins the
live-UAT 2026-05-23 lowercase-statistic_id contract in writing.
- Reference implementations table — canonical file per pattern
(external-SDK price source, public-endpoint price source,
CDR-derived provider, composition wrapper, Energy-Dashboard
sensor, reauth dispatcher, external-stats push).
- Versioning policy — SemVer + HACS rules, beta tag handling,
when version bumps happen in a stack.
Win 3 — agent orientation + codex findings in TODOS.md (L1 + L4)
- CLAUDE.md gains an "Agent orientation" section pointing at the
six artefacts an agent must read before non-trivial work
(.paul/STATE.md, .paul/phases/<phase>/PLAN.md, DECISIONS.md,
TODOS.md, docs/architecture.md, docs/development.md).
- TODOS.md grows a "Codex full-repo review findings (2026-05-23)"
section enumerating the P0/P1/P2 items the codex pass surfaced
(dashboard ha_token in URL, DWT midnight rollover gap, legacy
hass.data in options flow, per-entry service capture, monotonic
sum violation, single-rate ranking gap, DWT from_dict ignores
today, background task cleanup, MagicMock test drift, pycache
secret-shaped strings). Each entry has file:line + fix + priority
so a future agent doesn't silently re-discover them.
E741 cleanup in tests/test_blueprints.py — three list comprehensions
used `l` as the loop variable; renamed to `line` so the codebase
clears bare `ruff check` against the new pyproject ruleset.
1070 passing, ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Artic0din added a commit that referenced this pull request on May 23, 2026:
… Jinja (#145)

Two real bugs surfaced by a Copilot-CLI retro-review of the 22 merged PRs that the prior @claude batch couldn't reach (OIDC workflow-validation against stale main).

statistics.py: external_statistic_id now sanitizes via regex. CDR-derived provider_ids carry the plan id verbatim (e.g. ``agl_AGL-CDR-N0001`` for AGL via CDR), so `.lower()` alone left hyphens that the recorder's ``[a-z0-9_]+`` regex silently rejected. Every CDR user's dual-write would fail and the Energy Dashboard would never receive their cost data, the same silent-failure class #107 fixed for uppercase ULIDs. Added ``_STATISTIC_ID_OBJECT_SAFE`` compiled regex that coerces ANY character outside ``[a-z0-9_]`` to underscore. New regression test ``test_cdr_plan_id_with_hyphens_is_sanitized`` + adjusted ``test_entry_id_sliced_to_8_chars`` for the post-sanitization shape. Surfaced by retro-review of PRs #93, #95.

blueprints: variables block replaces !input inside Jinja. ``daily_7pm_summary.yaml`` and ``wholesale_spike_alert.yaml`` had templates like ``{{ states(!input today_cost_sensor) }}``. ``!input`` is a YAML tag (resolved at YAML parse time) and is invalid inside Jinja ``{{ }}``: Jinja parses ``!`` as an invalid operator and the template never renders. Replaced with the standard HA pattern: a ``variables:`` block at action level that binds blueprint inputs as Jinja identifiers (``today_cost_entity: !input today_cost_sensor`` etc.), then ``{{ states(today_cost_entity) }}`` in the message. Surfaced by retro-review of PRs #99, #100.

22 Copilot reviews ran (PRs #85, #87-#101, #104, #105, #108-#111). Most findings were false positives: Copilot flagged ``ServiceValidationError`` and ``async_items`` as broken HA APIs (they're both fine), and several findings duplicated bugs already fixed by codex P0/P1 work (token-in-URL #109, DWT reset #109). Findings library + triage notes archived in ``.planning/copilot-retro/``.

Full test suite: 1120 passing.
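The sanitization described in that commit can be sketched roughly as follows. This is a minimal illustration, not the code from statistics.py: the constant name comes from the commit message, but the exact pattern and the helper name `sanitize_statistic_object_id` are assumptions.

```python
import re

# Assumed pattern: the commit message names the constant and its effect,
# not the exact regex it compiles.
_STATISTIC_ID_OBJECT_SAFE = re.compile(r"[^a-z0-9_]")


def sanitize_statistic_object_id(raw: str) -> str:
    """Lowercase, then replace every character outside [a-z0-9_] with "_".

    Hypothetical helper name, used here for illustration only.
    """
    return _STATISTIC_ID_OBJECT_SAFE.sub("_", raw.lower())


# The CDR plan id from the commit message: .lower() alone would leave hyphens.
print(sanitize_statistic_object_id("agl_AGL-CDR-N0001"))  # agl_agl_cdr_n0001
```

Lowercasing before substitution keeps uppercase letters as letters instead of turning them into underscores, which matches the earlier uppercase-ULID fix the commit refers to.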
Summary
Spec-completeness audit scored the repo 81/100 — Safe tier for agent delegation, with L2 (interface) and L5 (cultural) flagged yellow. This PR closes the highest-ROI gaps via three quick wins.
Wins
1. `pyproject.toml` — pin tooling discipline
Ruff, pyright, and pytest config, with the ruff ruleset pinned to what the codebase already passes (`E` + `F`). A documented future-strict ladder lists `W`/`I`/`B`/`UP`/`S`/`BLE`/`RUF` with pre-counted violations per rule, so each can be adopted incrementally without big-bang code churn.
Pyright in basic mode with `reportMissingImports=warning` — tolerates HA libs being absent in the sandbox while still catching real type errors.
`[tool.pytest.ini_options]` declares `hypothesis` as a required plugin so bare `pytest` no longer fails collection (codex finding from the 2026-05-23 review).
2. `docs/development.md` — naming + reference implementations
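Three new sections:
- Naming conventions for `entity_id`, `provider_id`, `statistic_id`, config-flow step IDs, service IDs, and `CONF_*` constants. Pins the live-UAT 2026-05-23 lowercase-`statistic_id` contract in writing.
- Reference implementations table: the canonical file per pattern (external-SDK price source, public-endpoint price source, CDR-derived provider, composition wrapper, Energy-Dashboard sensor, reauth dispatcher, external-stats push).
- Versioning policy: SemVer + HACS rules, beta tag handling, and when version bumps happen in a stack.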
3. `CLAUDE.md` agent orientation + `TODOS.md` codex findings
`CLAUDE.md` gains an "Agent orientation" section pointing at the six artefacts an agent must read before non-trivial work.
`TODOS.md` grows a "Codex full-repo review findings (2026-05-23)" section — the P0/P1/P2 items from `/tmp/codex-fullrepo-review.log` are now recorded with file:line + fix + priority so a future agent doesn't silently re-discover them as new bugs.
Bonus cleanup
3× E741 `l` → `line` renames in `tests/test_blueprints.py` so the codebase clears bare `ruff check` against the new pyproject ruleset.
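For readers unfamiliar with E741 (ambiguous variable name), the rename has the following shape. This is a hypothetical comprehension over made-up blueprint text, not the actual code in `tests/test_blueprints.py`.

```python
# Hypothetical blueprint text; the real tests are assumed to read YAML files.
blueprint_text = "blueprint:\n  name: Daily summary\n\n  domain: automation\n"

# Before (flagged by ruff E741, since `l` is easily confused with `1` or `I`):
#   stripped_lines = [l.strip() for l in blueprint_text.splitlines() if l.strip()]

# After: identical behaviour with an unambiguous loop variable.
stripped_lines = [
    line.strip() for line in blueprint_text.splitlines() if line.strip()
]
```

The change only affects the loop variable's name, so it does not alter what the tests assert.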
Tests
```
1070 passing
ruff: All checks passed
```
Not in scope
This PR does NOT fix the codex P0/P1 findings themselves — those are filed in TODOS.md with priorities. Separate PRs per finding.
Test plan