AceHack · Apr 28, 2026 · Apr 23, 2026
diff --git a/.github/workflows/memory-index-duplicate-lint.yml b/.github/workflows/memory-index-duplicate-lint.yml
@@ -0,0 +1,59 @@
+name: memory-index-duplicate-lint
+
+# Detects duplicate link targets in `memory/MEMORY.md` —
+# Amara 2026-04-23 decision-proxy + technical review action
+# item #2 (PR #219 absorb). An index with duplicate entries
+# is a discoverability defect: fresh sessions can't tell
+# which entry is authoritative; the newest-first ordering
+# invariant breaks when the same file appears twice.
+#
+# Companion to `.github/workflows/memory-index-integrity.yml`
+# (the same-commit-pairing check for memory/ changes +
+# MEMORY.md updates). That check ensures index edits happen;
+# this check ensures those edits don't create duplicates.
+#
+# Safe-pattern compliance (FACTORY-HYGIENE row #43):
+#   - SHA-pinned actions/checkout
+#   - Explicit minimum `permissions: contents: read`
+#   - No user-authored context referenced
+#   - Concurrency group + cancel-in-progress: false
+#   - runs-on: ubuntu-22.04 pinned
+#
+# See:
+#   - tools/hygiene/audit-memory-index-duplicates.sh (the tool)
+#   - docs/aurora/2026-04-23-amara-decision-proxy-technical-
+#     review.md (ferry with the proposal)
+
+on:
+  pull_request:
+    paths:
+      - "memory/MEMORY.md"
+      - "tools/hygiene/audit-memory-index-duplicates.sh"
+      - ".github/workflows/memory-index-duplicate-lint.yml"
+  push:
+    branches: [main]
+    paths:
+      - "memory/MEMORY.md"
+      - "tools/hygiene/audit-memory-index-duplicates.sh"
+      - ".github/workflows/memory-index-duplicate-lint.yml"
+  workflow_dispatch: {}
+
+permissions:
+  contents: read
+
+concurrency:
+  group: memory-index-duplicate-lint-${{ github.ref }}
+  cancel-in-progress: false
+
+jobs:
+  lint:
+    name: lint memory/MEMORY.md for duplicate link targets
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+
+      - name: run duplicate-link lint
+        shell: bash
+        run: |
+          set -euo pipefail
+          tools/hygiene/audit-memory-index-duplicates.sh --enforce
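The workflow's only substantive step calls `tools/hygiene/audit-memory-index-duplicates.sh --enforce`, whose source is not part of this diff. As a rough illustration of the check the hygiene row describes (grep link targets, tally by target, fail on any count above one), a minimal sketch might look like the following. The function names `find_duplicate_targets` and `audit_index` are hypothetical, and the real script's regex, messages, and flag handling may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the actual audit-memory-index-duplicates.sh.

# Print each `.md` link target that appears more than once in the index file.
find_duplicate_targets() {
  local index_file="$1"
  # `|| true` keeps the pipeline safe under `set -e -o pipefail`
  # when grep finds no links at all (grep exits 1 on no match).
  { grep -oE '\]\([^)]+\.md\)' "$index_file" || true; } \
    | sed -E 's/^\]\((.*)\)$/\1/' \
    | sort \
    | uniq -d
}

# Report duplicates. Returns 2 only when `--enforce` is passed (assumed to
# mirror the exit-2 convention in the hygiene table); otherwise detect-only.
audit_index() {
  local index_file="$1" enforce="${2:-}"
  local dups
  dups="$(find_duplicate_targets "$index_file")"
  if [[ -z "$dups" ]]; then
    echo "no duplicate link targets in $index_file"
    return 0
  fi
  echo "duplicate link targets in $index_file:"
  echo "$dups"
  if [[ "$enforce" == "--enforce" ]]; then
    return 2
  fi
  return 0
}
```

Under these assumptions, `audit_index memory/MEMORY.md --enforce` would return 2 when any target repeats and 0 otherwise. `uniq -d` prints one line per repeated target, so each duplicate is reported once however many times it occurs.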
diff --git a/docs/FACTORY-HYGIENE.md b/docs/FACTORY-HYGIENE.md
@@ -103,6 +103,7 @@ is never destructive; retiring one requires an ADR in
 | 56 | MD032 plus-at-line-start preflight audit (detects prose-continuation `+` followed by space that markdownlint misparses as list items) | Detect-only (landed 2026-04-24); on-touch when author edits markdown; round-cadence sweep + `--enforce` flip when baseline is green. | Dejan (devops-engineer) on cadenced + enforce-transition; author of markdown change self-administered on-touch. | factory | `tools/hygiene/audit-md032-plus-linestart.sh` scans tracked `.md` files for CommonMark-style plus-then-space list-marker lines (regex `^ {0,3}\+` followed by a single space: up to 3 leading spaces allowed, then `+`, then space) where the previous line is non-blank AND is not itself a plus-then-space marker line (so contiguous plus-space lists are not flagged). Whitespace-normalisation on the predecessor-blank check strips all whitespace classes (spaces, tabs, CR) via `[[:space:]]`, so tab-only separator lines count as blank. Path iteration uses NUL-delimited `git ls-files -z` piped into a `while read -d ''` loop and the script runs `cd` to `git rev-parse --show-toplevel` first, so paths resolve from repo root regardless of working directory. Excludes `docs/ROUND-HISTORY.md`, `docs/hygiene-history/**`, `docs/DECISIONS/**`, and self. The `--list` flag prints offending `file:lineno`; `--enforce` flips exit 2 on gap. **Why this row exists:** Otto-session 2026-04-23 hit MD032 regressions three times (Otto-35 + Otto-38 + Otto-38-again). The pattern is author-friendly in intent (prose continuation using `+`) but markdownlint-hostile (parsed as list item). Author-time detection prevents the full CI round-trip. Baseline at first fire (2026-04-24, post review-drain revision on PR #204) was ~170 gaps at repo scope — the CommonMark-aware rewrite removed the earlier file-level-skip heuristic (which masked false negatives when a file used `+` as its bullet style but still contained a prose-continuation `+`) in favour of per-line contiguous-list detection. **Classification (row #47):** **prevention-bearing** — audit runs at author-time (on-touch) and surfaces gap before commit. Ships to project-under-construction: adopters inherit audit + pattern + exclusion discipline. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/md032-plus-linestart-audit-history.md` (per-fire schema per row #44); author-time gap lands as fix-at-source (opportunistic). | `tools/hygiene/audit-md032-plus-linestart.sh` + this row's self-reference |
 | 61 | Surface-map-drift smell (wrong URL on a mapped surface fires a hygiene alarm) | Pre-call: every `gh api <path>` (or equivalent platform call) on a surface that has a mapping doc — grep the map first, use its path, otherwise record a map-gap. Post-call: every 410 / 301 / "endpoint moved" response on a mapped endpoint auto-proposes a map-update. Cadenced sweep every 5-10 rounds replays the full set of mapped endpoints against the current platform to catch silent drift (endpoint renamed without 410). | Any agent calling `gh api` (self-administered on pre-call / post-call); Dejan (devops-engineer) on the cadenced sweep; Kenji (Architect) on map-update PRs when drift lands. Bounded to surfaces with a mapping doc under `docs/research/*surface-map*.md` / `docs/AGENT-*-SURFACES.md` / `docs/HARNESS-SURFACES.md` / `docs/GITHUB-SETTINGS.md`. | factory | **Pre-call (prevention-bearing):** before invoking any `gh api` call against org / enterprise / Copilot / billing / settings surfaces, `grep -li "<surface-keyword>" <mapping-docs>` and use the path the map lists. If the map lacks the path, **file a map-gap finding** in the same audit's output — agent may still call a best-guess endpoint if confident the surface exists, but must log the gap so the next round-close sweep extends the map. **Post-call (detection-bearing):** any `410 Gone` / `301 Moved Permanently` / `"endpoint moved"` response from a mapped endpoint triggers a map-update task (write the new path to the map; note old-path + redirect-doc + drift-date in a "Map drift log" section). **Cadenced (detection-bearing):** every 5-10 rounds, replay the full set of mapped endpoints against the current platform to catch silent renames (200 OK from a stale path that silently redirects, or 404 from an endpoint removed without deprecation). **Why this row exists:** Aaron 2026-04-22 after agent invented `/orgs/.../billing/budgets` (404) for LFG budget audit despite task #195 having already produced the complete map: *"i'm supprised you got the url wrong given you mapped it"* + *"that should be a smell when that happen to a surface you already have mapped"*. Same incident revealed a second drift class — `/orgs/{org}/settings/billing/actions` (map §A.17) returned 410 with `documentation_url: https://gh.io/billing-api-updates-org`, meaning GitHub moved the endpoint between 2026-04-22 (map author-time) and 2026-04-22 (this fire, hours later). Two orthogonal failure modes compound: (a) **not-consulting** an existing map (guess without grep), (b) **consulting-but-stale** map (correct path + platform drift). **UI-only surfaces** (e.g., GitHub org budget management at `https://github.com/organizations/{org}/billing/budgets`, no REST equivalent) are legitimate map entries — the map should mark them as `ui-only` so agents know "no API path exists" before trying. **Classification (row #47):** **prevention-bearing** — the pre-call grep discipline is the prevention layer; the post-call 410 handler is a complementary detection layer; the cadenced sweep is the insurance detection layer for silent renames. See `memory/feedback_surface_map_consultation_before_guessing_urls.md`. Ships to project-under-construction: adopters inherit the smell pattern + the pre-call grep obligation + the map-update-on-410 trigger. | Pre-call: grep output shown in the audit (map-hit / map-miss). Post-call: map-update PR when 410/301 lands, with "Map drift log" row recording old-path + redirect-doc + drift-date. Cadenced: sweep output logged to `docs/hygiene-history/surface-map-drift-history.md` (per-fire schema per row #44). ROUND-HISTORY row when a drift resolves. | `memory/feedback_surface_map_consultation_before_guessing_urls.md` (authoritative) + `docs/research/github-surface-map-complete-2026-04-22.md` (primary target for GitHub surfaces) + `docs/AGENT-GITHUB-SURFACES.md` (ten-surface playbook) + `docs/HARNESS-SURFACES.md` + `docs/GITHUB-SETTINGS.md` + this row's enforcement discipline (agent-self-administered pre-call, detection scripts TBD under `tools/hygiene/audit-surface-map-drift.sh`) |
 | 62 | Skill data/behaviour split audit (skills stay routine-only; catalogs / inventories / adapter tables / worked examples offload to `docs/**.md`; event logs to `docs/hygiene-history/**.md`) | Author-time (prevention-bearing, every new or touched `SKILL.md`) via the `skill-creator` workflow's authoring checklist + cadenced detection every 5-10 rounds (same cadence as row #5 skill-tune-up) over `.claude/skills/**/SKILL.md` for mix signatures (gotcha-list > 3 items, worked-example / case-study > 20 lines, adapter / compatibility table, inventory matrix, cross-platform neutrality matrix) + opportunistic on-touch at every `SKILL.md` edit. | `skill-creator` workflow on author-time (self-check against the checklist); Aarav (skill-tune-up) on cadenced detection; all agents (self-administered) on on-touch edits. Retrospective one-shot pass over the existing roster queued in BACKLOG P1. | both | **Principle:** a skill's SKILL.md is the **behaviour layer** (the routine / procedure / decision-flow the agent walks through at invocation time). Catalogs of gotchas, inventories of what-survives / what-breaks, adapter-neutrality tables, enumerated variants, and worked-example galleries are **data**, not behaviour — they belong in `docs/<CAPITALIZED-NAME>.md`. Event logs (append-only history of each fire) belong in `docs/hygiene-history/<name>-history.md` per FACTORY-HYGIENE row #44. **Why the split matters:** (a) a routine edits differently than a catalog — the routine changes rarely, catalogs accrete continuously; bundling them creates churn the skill-diff can't cleanly attribute. (b) An agent invoking a skill needs the routine cold-loaded into context; the catalog is consultation-on-demand. Bundling inflates every invocation's token cost with data the routine doesn't always need. (c) Data is queryable under `docs/` (grep-friendly, indexable, linkable from other surfaces); under `.claude/skills/` it is invocation-local and harder to cite. **Mix signatures (trigger the audit):** a SKILL.md with ≥ 2 of — (a) "Known gotchas" section > 3 items; (b) "Worked example" / "Case study" / "In practice" section > 20 lines; (c) adapter / compatibility / variants / neutrality table; (d) what-survives / what-breaks inventory table; (e) cross-platform matrix; (f) multi-row catalog of any sort inside the SKILL.md body. **Split target:** routine stays, data moves to `docs/<CAPITALIZED-NAME>.md`, events to `docs/hygiene-history/<name>-history.md`, and the SKILL.md body carries pointers to the new data surface under a "Data surface" section. **Triggering incident:** 2026-04-22 first-pass `github-repo-transfer` SKILL.md mixed routine + S1-S7 gotcha catalog + adapter table + worked example; Aaron caught it — *"you told me you wanted to split skills into data and behavior/routines, see i remember what you tell me too"* (invoking the agent's own prior principle from `memory/feedback_text_indexing_for_factory_qol_research_gated.md`: *"seperating thing by data and behiaver is a tried and true way and you mentied it for the skills earler, works in code too lol"*). Canonical worked example after split: `.claude/skills/github-repo-transfer/SKILL.md` + `docs/GITHUB-REPO-TRANSFER.md` + `docs/hygiene-history/repo-transfer-history.md`. **Classification (row #47):** **prevention-bearing** — the `skill-creator` authoring checklist asks the split question at author-time; cadenced detection is the backup layer for skills landed before this row existed. Ships to project-under-construction: adopters inherit the three-surface pattern (behaviour / data / fire-log) + the authoring checklist + the cadenced audit. | Audit output per cadenced fire listing every `SKILL.md` + its mix-signature score + a split-or-justify recommendation, logged to `docs/hygiene-history/skill-data-behaviour-split-history.md` (per-fire schema per row #44); ROUND-HISTORY row when a skill splits; BACKLOG row if the retrospective surfaces > 3 existing mixes; `skill-edit-justification-log.md` entry when a mix is deliberate (rare; requires a stated reason). | `memory/feedback_skills_split_data_behaviour_factory_rule.md` (authoritative — to be written this tick) + `memory/feedback_text_indexing_for_factory_qol_research_gated.md` (Aaron's original principle statement) + `.claude/skills/github-repo-transfer/SKILL.md` + `docs/GITHUB-REPO-TRANSFER.md` + `docs/hygiene-history/repo-transfer-history.md` (three-surface canonical worked example) + `.claude/skills/skill-creator/SKILL.md` (authoring workflow — carries the checklist) + `.claude/skills/skill-tune-up/SKILL.md` (detection runner — gains a mix-signature check on top of its existing drift / contradiction / staleness / user-pain / bloat / BP-drift / portability-drift criteria) |
+| 63 | Memory-index duplicate-link lint (`memory/MEMORY.md` flagged if the same `.md` target appears more than once in the newest-first index) | Every pull_request + push-to-main touching `memory/MEMORY.md`, the audit tool, or the workflow file; workflow_dispatch manual run available. Enforcing in CI: the workflow passes `--enforce`, so the job exits 2 on any duplicate. | Automated via `.github/workflows/memory-index-duplicate-lint.yml`; human-maintainer or any contributor resolves on fail. | factory | `tools/hygiene/audit-memory-index-duplicates.sh` greps for link targets of the form `](foo.md)` in the supplied file (default `memory/MEMORY.md`) and tallies by target; any count > 1 fails. Catches: exact duplicate entries + old-plus-new pointer after an edit that forgot to dedupe. Does NOT catch: substantially similar descriptions of different files (judgment-based). **Why this row exists:** Amara 2026-04-23 decision-proxy + technical review (PR #219 absorb) action item #2 — her observation that `memory/MEMORY.md` had duplicate entries in an older state ("Signal-in, signal-out" + "Deletions > insertions" appearing twice each). Per-user MEMORY.md currently has 1 duplicate (`project_learning_repo_khan_style_...` appears twice), confirming the class. In-repo MEMORY.md is currently clean. **Classification (row #47):** **prevention-bearing** — CI blocks merge before the duplicate lands. Ships to project-under-construction: adopters inherit the workflow unchanged; the `memory/MEMORY.md` convention is factory-generic. Sibling to row #58 (memory-index-integrity — same-commit pairing of memory edit + MEMORY.md edit). | CI job result + annotated fail message in PR checks. Optional fire-history surface if long-term retention beyond 90-day CI log is desired. | `.github/workflows/memory-index-duplicate-lint.yml` (CI invocation) + `tools/hygiene/audit-memory-index-duplicates.sh` (the detection tool) + `docs/aurora/2026-04-23-amara-decision-proxy-technical-review.md` (Amara ferry with proposal) + row #58 sibling (memory-index-integrity) |
 
 ## Ships to project-under-construction