diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md
index d75f4dd3..4e17a3c1 100644
--- a/docs/BACKLOG.md
+++ b/docs/BACKLOG.md
@@ -1530,6 +1530,45 @@ within each priority tier.
 the invitation for any 3rd member is sent. Effort: S (minutes after sign-off). Reviewer: Aminata (threat-model-critic).
+- [ ] **Map-completeness audit — proactive "do our maps cover
+ the surfaces we touch?" cadence** (round 44 surface-map-drift
+ absorb) — Aaron 2026-04-22: *"missing map hygene on backlog?"*
+ after agent tripped on a surface (GitHub org spending-budget)
+ that was never in the map. FACTORY-HYGIENE row #60 covers the
+ *reactive* smell (wrong URL on a mapped surface); this row
+ covers the complementary *proactive* audit: enumerate every
+ surface the factory actually touches — from `gh api` calls in
+ `tools/**`, `.github/workflows/**`, `.claude/skills/**`, and
+ docs — cross-reference against each mapping doc
+ (`docs/HARNESS-SURFACES.md`,
+ `docs/research/github-surface-map-complete-2026-04-22.md`,
+ `docs/AGENT-GITHUB-SURFACES.md`, `docs/GITHUB-SETTINGS.md`),
+ and flag surfaces-used-but-unmapped. **Known gaps surfaced by
+ the triggering incident:** (1) GitHub org spending-budget UI
+ at `https://github.com/organizations/{org}/billing/budgets`
+ (added to map as `ui-only` row 2026-04-22); (2) Copilot
+ Business per-feature toggle state
+ (`public_code_suggestions`, `ide_chat`, `cli`, `platform_chat`
+ — values documented in
+ `memory/feedback_lfg_paid_copilot_teams_throttled_experiments_allowed.md`
+ but not yet mapped as a declarative surface); (3) the
+ `coding-agent` / `internet-search` / custom-instruction
+ enablement flags Aaron referenced ("turned all them on") —
+ each is a distinct UI toggle with no declarative home yet.
+ **Detection script:** `tools/hygiene/audit-map-completeness.sh`
+ greps `gh api` usage + URL patterns across the tree,
+ normalises to `{org-or-repo}/`, diffs against mapping
+ docs' enumerated endpoints, outputs surface-used-but-unmapped
+ list. **Cadence:** every 5-10 rounds (same cadence as
+ skill-tune-up, row #46, row #48) so the map isn't a write-once
+ artifact that rots. **Companion:** FACTORY-HYGIENE row for
+ map-completeness (a new row #51 once this lands). **Effort:**
+ S (detection script + first sweep) + S per gap closure.
+ **Reviewer:** Dejan (devops-engineer) for the detection
+ mechanics; Architect (Kenji) for map-extension decisions
+ (which gaps land where). Related: `memory/feedback_surface_map_consultation_before_guessing_urls.md`;
+ FACTORY-HYGIENE row #60.
+
 - [ ] **Orthogonal-axes cadenced audit — make the factory's axis set an orthogonal basis (round 44 absorb)** — Aaron 2026-04-22: *"also we need to make sure all our axises are
diff --git a/docs/FACTORY-HYGIENE.md b/docs/FACTORY-HYGIENE.md
index 0b9122eb..c578cbd0 100644
--- a/docs/FACTORY-HYGIENE.md
+++ b/docs/FACTORY-HYGIENE.md
@@ -99,6 +99,7 @@ is never destructive; retiring one requires an ADR in
 | 59 | Memory-reference-existence CI check (every `](foo.md)` link target in `memory/MEMORY.md` MUST resolve to an actual file under `memory/`) | Every pull_request + push-to-main touching `memory/**` or the audit tool / workflow; workflow-dispatch manual run available | Automated (`.github/workflows/memory-reference-existence-lint.yml`); any contributor resolves on fail | factory | `tools/hygiene/audit-memory-references.sh --enforce` parses link targets of the form `](.md)` in the supplied file (default `memory/MEMORY.md`), resolves each against a base dir (default `memory/`), and fails (exit 2 under `--enforce`) on any broken reference. Supports `--file PATH` and `--base DIR` for custom use.
**Why this row exists:** Amara 2026-04-23 4th-ferry absorb (PR #221 Determinize-stage action) — her commit samples show repeated cleanup passes for memory paths that didn't exist; this is the retrieval-drift class she named. First-run baseline (2026-04-24): in-repo `memory/MEMORY.md` 44 refs all resolve; per-user MEMORY.md 391 refs all resolve (PR #220 memory-index-integrity CI has kept the substrate clean). **Third leg of memory-index hygiene:** row #58 (same-commit-pairing) + AceHack PR #12 (no duplicates) + this row (refs resolve) = three complementary checks. **Classification (row #47):** **prevention-bearing** — blocks merge before broken refs land. Ships to project-under-construction: adopters inherit the tool + workflow + three-leg hygiene pattern. | CI job result; first-run baseline captured in PR body. Optional fire-history file if longer-than-90-day retention wanted. | `.github/workflows/memory-reference-existence-lint.yml` + `tools/hygiene/audit-memory-references.sh` + sibling rows #58 (PR #220) + AceHack PR #12 duplicate-lint + `docs/aurora/2026-04-23-amara-memory-drift-alignment-claude-to-memories-drift.md` |
 | 58 | Memory-index-integrity CI check (PR/push that adds or modifies `memory/*.md` MUST also update `memory/MEMORY.md` in the same range) | Every pull_request + push-to-main touching `memory/**`; workflow-dispatch manual run available | Automated (`.github/workflows/memory-index-integrity.yml`); human-maintainer or any contributor resolves on fail | factory | Scope triggers: top-level `memory/*.md` add-or-modify (excluding `memory/README.md` and `memory/MEMORY.md` itself, and excluding `memory/persona/**` which has its own lifecycle). Check: if any trigger-qualifying file changed in the PR/push range, `memory/MEMORY.md` MUST also be in that range. Fail message cites NSA-001 (canonical incident: new memory landed without MEMORY.md pointer → undiscoverable from fresh session).
Safe-pattern compliant per row #43 (SHA-pinned actions, explicit minimum permissions, no user-authored context interpolation, concurrency group, pinned runs-on). **Why this row exists:** Amara 2026-04-23 decision-proxy + technical review courier report (absorbed as PR #219) — action item #1 in her "10 immediate fixes" list, highest-value by her own ranking. Directly addresses the NSA-001 measured failure mode. **Classification (row #47):** **prevention-bearing** — the check runs at PR author-time, blocks merge before the memory substrate can diverge from its index. Ships to project-under-construction: adopters inherit the workflow unchanged; the `memory/**.md` and `memory/MEMORY.md` conventions are factory-generic. | CI job result + annotated fail message in PR checks + `docs/hygiene-history/memory-index-integrity-fires.md` (per-fire schema per row #44 — optional; CI log is durable for 90 days so fire-history file exists only if the human maintainer wants longer retention) | `.github/workflows/memory-index-integrity.yml` (detection + fail message) + `docs/hygiene-history/nsa-test-history.md` (NSA-001 canonical incident) + `docs/aurora/2026-04-23-amara-decision-proxy-technical-review.md` (ferry with proposal) + FACTORY-HYGIENE row #25 (pointer-integrity audit — covers dangling-pointer from the other direction) |
 | 55 | Machine-specific content scrubber (cadenced audit of in-repo tracked files for user-home paths, Claude Code harness paths, Windows user-profile paths, hostname leaks) | Detect-only (landed 2026-04-23); cadenced detection once per round-close (same cadence as rows #50 / #51 / #52 meta-audits) + opportunistic on-touch when a tick migrates per-user content to in-repo. Enforcement (`--enforce` exit-2) deferred until baseline is green. | Dejan (devops-engineer) on cadenced detection + CI-enforcement sign-off when baseline is green; the migrating agent (self-administered) on on-touch — every in-repo-first migration runs the audit before committing.
| factory | `tools/hygiene/audit-machine-specific-content.sh` scans all tracked files (`git ls-files`) for machine-specific patterns: `/Users//`, `/home//`, `C:\Users\`, `C:/Users/`. Excludes: `docs/ROUND-HISTORY.md`, `docs/hygiene-history/**`, `docs/DECISIONS/**`, and the audit script itself. `--list` prints offending files; `--enforce` flips exit 2 on any gap. **Why this row exists:** Aaron 2026-04-23 Otto-27 — *"we can have a machine specific scrubber/lint hygene task for anyting that makes it in by default. just run on a cadence."* Following the Option D in-repo-first policy shift (per-user memory migrations to in-repo became the default), machine-specific content leakage becomes a real risk — content comfortably per-user now crosses the factory's public repo boundary. Baseline at first fire (2026-04-23) was 9 gaps: `/Users/` patterns in several SKILL.md files, 2 PDFs (metadata scan), a scratch-recon doc, a parallel-worktree research doc; `C:\Users\` pattern in 1 SKILL.md + `memory/feedback_path_hygiene.md` (anti-example reference — legitimate). **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (on-touch during in-repo-first migrations) and surfaces the gap before it lands. Ships to project-under-construction: adopters inherit the audit + pattern list + exclusion-list discipline. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/machine-specific-content-audit-history.md` (per-fire schema per row #44 — date, agent, gaps count, files touched, actions taken, next-fire-expected-date); BACKLOG row per gap at triage time if cleanup doesn't fit on-touch. 
| `tools/hygiene/audit-machine-specific-content.sh` (detection + pattern list + exclusion list) + cross-refs: `memory/feedback_path_hygiene.md` + `memory/CURRENT-aaron.md` + `memory/CURRENT-amara.md` (in-repo-first migration boundary surfaces this audit's need) |
+| 60 | Surface-map-drift smell (wrong URL on a mapped surface fires a hygiene alarm) | Pre-call: every `gh api ` (or equivalent platform call) on a surface that has a mapping doc — grep the map first, use its path, otherwise record a map-gap. Post-call: every 410 / 301 / "endpoint moved" response on a mapped endpoint auto-proposes a map-update. Cadenced sweep every 5-10 rounds replays the full set of mapped endpoints against the current platform to catch silent drift (endpoint renamed without 410). | Any agent calling `gh api` (self-administered on pre-call / post-call); Dejan (devops-engineer) on the cadenced sweep; Kenji (Architect) on map-update PRs when drift lands. Bounded to surfaces with a mapping doc under `docs/research/*surface-map*.md` / `docs/AGENT-*-SURFACES.md` / `docs/HARNESS-SURFACES.md` / `docs/GITHUB-SETTINGS.md`. | factory | **Pre-call (prevention-bearing):** before invoking any `gh api` call against org / enterprise / Copilot / billing / settings surfaces, `grep -li "" ` and use the path the map lists. If the map lacks the path, **file a map-gap finding** in the same audit's output — agent may still call a best-guess endpoint if confident the surface exists, but must log the gap so the next round-close sweep extends the map. **Post-call (detection-bearing):** any `410 Gone` / `301 Moved Permanently` / `"endpoint moved"` response from a mapped endpoint triggers a map-update task (write the new path to the map; note old-path + redirect-doc + drift-date in a "Map drift log" section).
**Cadenced (detection-bearing):** every 5-10 rounds, replay the full set of mapped endpoints against the current platform to catch silent renames (200 OK from a stale path that silently redirects, or 404 from an endpoint removed without deprecation). **Why this row exists:** Aaron 2026-04-22 after agent invented `/orgs/.../billing/budgets` (404) for LFG budget audit despite task #195 having already produced the complete map: *"i'm supprised you got the url wrong given you mapped it"* + *"that should be a smell when that happen to a surface you already have mapped"*. Same incident revealed a second drift class — `/orgs/{org}/settings/billing/actions` (map §A.17) returned 410 with `documentation_url: https://gh.io/billing-api-updates-org`, meaning GitHub moved the endpoint between 2026-04-22 (map author-time) and 2026-04-22 (this fire, hours later). Two orthogonal failure modes compound: (a) **not-consulting** an existing map (guess without grep), (b) **consulting-but-stale** map (correct path + platform drift). **UI-only surfaces** (e.g., GitHub org budget management at `https://github.com/organizations/{org}/billing/budgets`, no REST equivalent) are legitimate map entries — the map should mark them as `ui-only` so agents know "no API path exists" before trying. **Classification (row #47):** **prevention-bearing** — the pre-call grep discipline is the prevention layer; the post-call 410 handler is a complementary detection layer; the cadenced sweep is the insurance detection layer for silent renames. See `memory/feedback_surface_map_consultation_before_guessing_urls.md`. Ships to project-under-construction: adopters inherit the smell pattern + the pre-call grep obligation + the map-update-on-410 trigger. | Pre-call: grep output shown in the audit (map-hit / map-miss). Post-call: map-update PR when 410/301 lands, with "Map drift log" row recording old-path + redirect-doc + drift-date. 
Cadenced: sweep output logged to `docs/hygiene-history/surface-map-drift-history.md` (per-fire schema per row #44). ROUND-HISTORY row when a drift resolves. | `memory/feedback_surface_map_consultation_before_guessing_urls.md` (authoritative) + `docs/research/github-surface-map-complete-2026-04-22.md` (primary target for GitHub surfaces) + `docs/AGENT-GITHUB-SURFACES.md` (ten-surface playbook) + `docs/HARNESS-SURFACES.md` + `docs/GITHUB-SETTINGS.md` + this row's enforcement discipline (agent-self-administered pre-call, detection scripts TBD under `tools/hygiene/audit-surface-map-drift.sh`) |
 ## Ships to project-under-construction
diff --git a/docs/research/github-surface-map-complete-2026-04-22.md b/docs/research/github-surface-map-complete-2026-04-22.md
index 49a322e3..0d327bf1 100644
--- a/docs/research/github-surface-map-complete-2026-04-22.md
+++ b/docs/research/github-surface-map-complete-2026-04-22.md
@@ -314,13 +314,34 @@ not script.
 - `GET /orgs/{org}/audit-log` — audit-log entries (GHE/GHEC; on Team it's UI-only).
-- `GET /orgs/{org}/settings/billing/actions` — Actions billing.
-- `GET /orgs/{org}/settings/billing/packages` — Packages billing.
+- `GET /orgs/{org}/settings/billing/actions` — **MOVED
+ 2026-04-22** (`410 Gone`; `documentation_url:
+ https://gh.io/billing-api-updates-org`). Old-path kept here
+ for drift-log purposes; successor endpoint TBD per the
+ migration doc. See "Map drift log" at the foot of this doc.
+- `GET /orgs/{org}/settings/billing/packages` — Packages billing
+ (likely also affected by the 2026-04-22 billing-API migration;
+ **re-verify before use**).
 - `GET /orgs/{org}/settings/billing/shared-storage` — shared
- storage billing.
+ storage billing (same caveat).
 - `GET /orgs/{org}/settings/network-configurations` — GHE Cloud private networking (not applicable here).
+**UI-only companion surfaces** (no REST equivalent; `ui-only`
+tag):
+
+- Org **spending-budget management** —
+ `https://github.com/organizations/{org}/billing/budgets` (web
+ UI only; no public REST endpoint to read or write budgets
+ programmatically). Budget-cap-change is still in the
+ *forbidden* class per
+ `memory/feedback_lfg_paid_copilot_teams_throttled_experiments_allowed.md`;
+ audit is **human-only via UI screenshot** until GitHub ships a
+ Budgets API.
+- Org-level audit-log (on Team plan) —
+ `/organizations/{org}/settings/audit-log` — web UI only;
+ GraphQL `auditLog` returns a subset (noted above).
+
 **Team-plan limit:** audit log is UI-only under `/organizations/{org}/settings/audit-log`; no REST on Team. Workaround: GraphQL `auditLog` query returns a subset.
@@ -746,3 +767,29 @@ verification before they can land as rows:
 - GitHub REST API top-level: `https://docs.github.com/en/rest` — the source-of-truth for endpoint categories used to build the per-scope tables above.
+
+## Map drift log
+
+Any mapped endpoint that returns `410 Gone` / `301 Moved
+Permanently` / `404 Not Found` due to platform drift (not scope
+issues) lands here with: old-path, drift-date, observed
+response, and new-path (or "pending — see migration doc"). This
+log is the **post-call arm of FACTORY-HYGIENE row #60**
+(surface-map-drift smell). Agents encountering a drift on any
+listed endpoint MUST append a row.
+
+| Old path | Drift date | Response | New path | Notes |
+|---|---|---|---|---|
+| `GET /orgs/{org}/settings/billing/actions` | 2026-04-22 | `410 Gone`; `documentation_url: https://gh.io/billing-api-updates-org`; requires `admin:org` scope | pending — see migration doc | Discovered during LFG budget audit when `admin:org` scope was also absent from token. Token carried `gist, read:org, repo, workflow`; 410 fires regardless of scope per test with `read:org`.
Successor endpoint per migration doc TBD; re-verify `/orgs/{org}/settings/billing/packages` and `/orgs/{org}/settings/billing/shared-storage` simultaneously since all three are in the same migration batch. |
+
+## UI-only surfaces (no REST equivalent at map-time)
+
+Some GitHub surfaces have no public REST endpoint and must be
+audited by human-in-the-loop (screenshot / CSV export /
+manual-read). These are legitimate map entries so agents don't
+waste attempts on non-existent paths. Tag: `ui-only`.
+
+| Surface | UI path | Audit workaround | Notes |
+|---|---|---|---|
+| Org spending-budget management | `https://github.com/organizations/{org}/billing/budgets` | Human screenshot / manual-read; agent cannot read-or-write programmatically | **Forbidden class** per `memory/feedback_lfg_paid_copilot_teams_throttled_experiments_allowed.md` — agent cannot change budgets without Aaron renegotiation; audit is read-intent only. |
+| Org audit-log (Team plan) | `/organizations/{org}/settings/audit-log` | GraphQL `auditLog` returns subset | Documented above in §A.16. |
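
The BACKLOG row in this diff describes the map-completeness detection as: grep `gh api` usage, normalise the paths, diff against the endpoints the mapping docs enumerate, and list what is used but unmapped. The sketch below illustrates that logic under stated assumptions; it is not the contents of `tools/hygiene/audit-map-completeness.sh`, which this diff does not include. The regexes, the function names, and the `{owner}/{repo}` placeholder are hypothetical, and the sketch is written in Python rather than shell only to keep it short and testable.

```python
import re

# Hypothetical patterns, not taken from the real audit script.
# Captures the first argument after `gh api`, with an optional opening quote.
# Calls that put flags before the path (e.g. `gh api -H ... /path`) are
# skipped here; a real script would need to handle them.
GH_API_CALL = re.compile(r"\bgh api\s+['\"]?([^\s'\"`|;)]+)")

# Captures endpoints written in a mapping doc as `GET /orgs/{org}/...`.
MAPPED_ENDPOINT = re.compile(r"`(?:GET|POST|PUT|PATCH|DELETE)\s+(/[^`\s]+)`")


def normalise(path: str) -> str:
    """Drop the query string and replace concrete org/repo names with placeholders."""
    path = path.split("?", 1)[0].strip("/")
    path = re.sub(r"^orgs/[^/]+", "orgs/{org}", path)
    path = re.sub(r"^repos/[^/]+/[^/]+", "repos/{owner}/{repo}", path)
    return "/" + path


def used_endpoints(source_text: str) -> set[str]:
    """Endpoints called via `gh api` in scripts, workflows, skills, or docs."""
    calls = GH_API_CALL.findall(source_text)
    return {normalise(c) for c in calls if not c.startswith("-")}


def mapped_endpoints(map_text: str) -> set[str]:
    """Endpoints enumerated in a mapping doc."""
    return {normalise(m) for m in MAPPED_ENDPOINT.findall(map_text)}


def unmapped(source_text: str, map_text: str) -> list[str]:
    """Surfaces used but not mapped, sorted for stable output."""
    return sorted(used_endpoints(source_text) - mapped_endpoints(map_text))
```

Called with the concatenated text of the scanned tree and of the mapping docs, `unmapped` returns the surface-used-but-unmapped list that the row asks for. `ui-only` surfaces carry no HTTP verb in the map, so a real script would need a second pattern for them.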