diff --git a/docs/FACTORY-HYGIENE.md b/docs/FACTORY-HYGIENE.md index 26bd6022..c79e8dd6 100644 --- a/docs/FACTORY-HYGIENE.md +++ b/docs/FACTORY-HYGIENE.md @@ -84,6 +84,17 @@ is never destructive; retiring one requires an ADR in | 39 | Hot-file-path detector | Round-cadence (every round close) or every 5-10 rounds — whichever catches churn drift before the next merge-tangle. Proposed 2026-04-21. | TBD — candidate capability skill `hot-file-detector` (queued in BACKLOG P1); Architect runs inline until the skill lands. | factory (ships to adopters as a command-line recipe) | `git log --since="60 days ago" --name-only --pretty=format: \| grep -v '^$' \| sort \| uniq -c \| sort -rn \| head -25` — git history *is* the index. Heuristic threshold: >20 changes in 60d on a single monolithic doc = investigate; >30 = refactor candidate (tune after 5-10 rounds). Per-file decision is one of four: `refactor-split` (per-row, per-round, per-section), `consolidate-reduce` (merge with a sibling), `accept-as-append-only` (legitimately append-only → split into per-round files rather than trimming), or `observe`. Pair with merge-tangle fingerprints from `docs/research/parallel-worktree-safety-2026-04-22.md` §9 — a hot file is worse if also in a recent conflict list. Triggering incident: PR #31 5-file merge-tangle (2026-04-21) where `docs/ROUND-HISTORY.md` at 33 changes / 60d was the #1 conflict source, and `docs/BACKLOG.md` at 26 already has an in-flight split ADR. | Audit doc per cycle listing top-N hot paths + per-file decision; BACKLOG rows for refactor-split candidates; ADRs for structural changes. | `feedback_hot_file_path_detector_hygiene.md` + `docs/DECISIONS/2026-04-22-backlog-per-row-file-restructure.md` (same pattern) | | 40 | GitHub-settings drift detector | Weekly (cron `17 14 * * 1`) + on any change to `tools/hygiene/github-settings.expected.json` / detector script / workflow file. Added 2026-04-21 after `AceHack/Zeta` → `Lucent-Financial-Group/Zeta` org-transfer silently flipped `secret_scanning` and `secret_scanning_push_protection` from enabled to disabled. | Automated (`.github/workflows/github-settings-drift.yml`); human resolves on drift | factory (ships to adopters as a template; repo-specific expected snapshot per adopter) | Live `gh api` snapshot vs. checked-in `tools/hygiene/github-settings.expected.json`: repo-level toggles (merge methods, security-and-analysis), rulesets + rule contents, classic branch protection on default branch, Actions permissions + variables + counts of secrets, environments + protection-rule types, Pages config, CodeQL default-setup state, webhook / deploy-key / secret counts. Script at `tools/hygiene/check-github-settings-drift.sh` exits 1 on drift and prints `diff -u` output. Resolution: intentional → re-snapshot + commit new expected with rationale; unintentional → revert in GitHub + rerun detector. | `docs/GITHUB-SETTINGS.md` + `tools/hygiene/github-settings.expected.json` + workflow run log + optional `memory/reference_github_*.md` entry if drift source is non-obvious | `feedback_github_settings_as_code_declarative_checked_in_file.md` + `feedback_blast_radius_pricing_standing_rule_alignment_signal.md` + `project_zeta_org_migration_to_lucent_financial_group.md` | | 41 | Orthogonal-axes audit | Every 5-10 rounds (proposed 2026-04-22) — same cadence as skill-tune-up / agent-QOL / symmetry-audit / missing-hygiene-class-gap-finder. | TBD — candidate capability skill `orthogonal-axes-auditor` (queued in BACKLOG P1); Architect (Kenji) runs inline until the skill lands. | factory | Distinct from row 22 (symmetry-opportunities) — symmetry asks "is A paired with its mirror B?"; orthogonality asks "do axes A and B have **zero overlap**?" The factory classifies artefacts along many axes: skill-category (capability / persona / hat), hygiene-scope (project / factory / both), persona-surface (author / reviewer / auditor / cadence-runner), cadence-bucket (session-open / round-open / round-close / weekly / per-event), memory-type (user / feedback / project / reference), review-target (proposed / shipped / retired), trust-tier (autonomous / advisory / binding), etc. For the axis set to form a proper basis (linear-algebra sense), every pair must be independent — an axis's values must carry information no other axis carries. Drift signs: (a) a new skill could be classified identically on two axes (collapsing them), (b) a proposed distinction in docs maps one-to-one to an existing distinction (duplicate naming), (c) a row in one taxonomy duplicates a row in another. Audit sweep = enumerate current factory axes, check pairwise for rank-reduction (could we collapse axis A into axis B without information loss?), flag overlaps to HUMAN-BACKLOG as `axis-overlap` rows. Load-bearing overlap is allowed but must be documented (e.g. hygiene-scope and review-target overlap deliberately because some projects distinguish, some don't). | Audit doc per cycle enumerating axes, overlap matrix, and per-overlap verdict (collapse / keep-and-document / split); HUMAN-BACKLOG rows for P1+ overlaps; BACKLOG row for candidate axis collapses | `feedback_orthogonal_axes_factory_hygiene.md` (to be written) + Aaron 2026-04-22 message *"also we need to make sure all our axises are orthogaonal to the others so therre is not overlap, like fully ... the set of axes is called an orthogonal basis"* + *"i guess this is a cadence thing"* | +| 42 | Filename-content match hygiene (hard to enforce) | Opportunistic on-touch (every file open/edit/cite) + on-concept-rename (mandatory grep for old-name files when an agent renames a concept) + periodic sample-sweep every 5-10 rounds (same cadence as rows 5, 35-38). Exhaustive not budget-viable per Aaron 2026-04-22: *"it's not like you can read every file backlog"* | All agents (self-administered) on on-touch + on-rename triggers; Daya or Aarav on periodic sample-sweep | factory | Filename accurately describes current content; stale filenames after concept rename (e.g., `vision-research-backlog-pipeline.md` after pipeline→loop reframe) caught and fixed. Companion to crystallize-everything: honest labels + compressed bodies = diamond-grade repo surface. Does NOT apply to memory filenames (archaeology-beats-crystallization — memory filename preserves birth-era framing; `name:` frontmatter carries current framing). | File rename + cross-ref sweep commit, OR `docs/HUMAN-BACKLOG.md` row if rename needs broader consideration, OR notebook finding for periodic-sweep results | `feedback_filename_content_match_hygiene_hard_to_enforce.md` + `feedback_crystallize_everything_lossless_compression_except_memory.md` | +| 43 | GitHub Actions workflow-injection safe-patterns audit | Every new `.github/workflows/*.yml` (author-time, reviewer-enforced) + cadenced re-read vs upstream GitHub Security Lab guidance every 5-10 rounds (same cadence as skill-tune-up / row #38). | Author of the workflow (self-check via pre-write checklist); cadenced re-read by Architect (Kenji) — who watches for revisions to the blog post and new GitHub-side safe-patterns. | both | Pre-write checklist in `docs/security/GITHUB-ACTIONS-SAFE-PATTERNS.md` walked before committing any workflow; post-commit triple-layer lint (actionlint + CodeQL `actions` language + Semgrep) catches what the author missed; cadenced read of the upstream GitHub Security Lab blog post catches new attack surfaces and guidance updates. Ships to project-under-construction: any adopter inherits the same three-layer CI enforcement + the checklist doc. | Author-time: workflow comment block documenting which `github.event.*` contexts are consumed and where; reviewer block or lint failure if checklist violated. Cadenced: notebook entry noting any new upstream guidance or retired patterns; ADR if a factory-wide default needs to change. | `docs/security/GITHUB-ACTIONS-SAFE-PATTERNS.md` + `.github/workflows/gate.yml` (`lint-workflows` job) + `.github/workflows/codeql.yml` (`actions` language) | +| 44 | Supply-chain safe-patterns audit (third-party ingress) | Every new / bumped third-party pin (author-time, reviewer-enforced) + cadenced re-read vs upstream supply-chain guidance (OWASP SCVS, NIST SSDF, SLSA, GitHub Advisory feed) every 5-10 rounds. | Author of the change (self-check via pre-add / pre-bump checklist); cadenced re-read by Architect (Kenji) with security-researcher (Mateo) support on CVE-feed signals. | both | Pre-add / pre-bump checklist in `docs/security/SUPPLY-CHAIN-SAFE-PATTERNS.md` walked before committing any dependency change across the four ingress classes (GHA / NuGet / toolchain installers / MSBuild `.targets`). Cadenced read catches: new tag-rewrite-class incidents (canonical: CVE-2025-30066 tj-actions/changed-files, March 2025), new SLSA attestation requirements, new supply-chain Semgrep rule candidates, NuGet advisory velocity. Ships to project-under-construction: any adopter inherits the checklist + `package-auditor` skill + incident playbooks A/B/C/D + Semgrep `gha-action-mutable-tag`. | Author-time: commit-message rationale for the bump; reviewer rejection if checklist violated. Cadenced: notebook entry; new Semgrep / lint rules when gaps are observed; SECURITY-BACKLOG row for residual risks; ADR if the factory-default enforcement tier changes (e.g., `packages.lock.json` adoption per SDL #7). | `docs/security/SUPPLY-CHAIN-SAFE-PATTERNS.md` + `docs/security/SDL-CHECKLIST.md` row #7 + `docs/security/INCIDENT-PLAYBOOK.md` playbooks A/B/C/D + `.claude/skills/package-auditor/SKILL.md` + `tools/audit-packages.sh` + `.semgrep.yml` rule `gha-action-mutable-tag` | +| 45 | Attribution hygiene (external people / projects / patterns) | Opportunistic on-touch (every time an agent names an external person, project, pattern, character, or organization in docs / memory / skills) + cadenced retrospective sweep every 5-10 rounds (same cadence as rows 5, 35-38). Exhaustive is not budget-viable; on-touch catches new gaps as they author and the periodic sweep catches regressions. | All agents (self-administered) on on-touch; TBD — candidate new skill `attribution-auditor` (queued in BACKLOG P1) on cadenced retrospective sweep. | both | When naming an external **person**: cite with a findable URL if one exists (blog, profile, publication). When naming an external **pattern / technique**: cite the originator and the source (blog post, paper DOI, tweet URL). When naming a **project / plugin / library**: name the author / maintainer organization (per the plugin's `plugin.json` / the repo's `package.json` / the paper's authors). When naming a **fictional character**: name the creator (author) and publisher. Don't rely on audience-recognition to fill in attribution — audiences rotate, memory rots, and unattributed names become orphaned claims. Ships to project-under-construction: adopter docs citing external patterns inherit the same discipline. **Relationship to row #23 (Missing-hygiene-class gap-finder):** row #23 is the tier-3 cadenced mechanism that should have surfaced this class automatically; because row #23 is currently marked "(proposed)" with no active cadence, Aaron had to surface attribution hygiene manually (2026-04-22: *"missing attribution hygene"* + *"like the other hygene this one is missing a skiil/row"*). This row #42 is therefore also a row-#23 effectiveness signal — a class the tier-3 finder should have caught before user correction. Triggering incident: 2026-04-22 `docs/AUTONOMOUS-LOOP.md` named "Geoffrey Huntley's bash-wrapper", "Ralph Wiggum pattern", "ralph-loop@claude-plugins-official plugin" without Huntley's blog URL, Anthropic plugin authorship, Matt Groening / *The Simpsons* / Fox Broadcasting character attribution, or `mikeyobrien/ralph-orchestrator` related-work link. | In-doc attribution block on on-touch (inline URL / author / org citation at author-time); correcting edit + ROUND-HISTORY row on cadenced retrospective catch; BACKLOG row if a retrospective audit surfaces many unattributed names in one doc | `feedback_attribution_hygiene.md` + `feedback_missing_hygiene_class_gap_finder.md` (row #23 parent) | +| 46 | Missing-cadence activation audit (proposed-row / TBD-owner / no-named-skill tracker) | Round cadence (proposed — Aaron 2026-04-22) + opportunistic on-touch (every time an agent touches `docs/FACTORY-HYGIENE.md` or an authored-but-inactive hygiene rule). Not exhaustive; each touch or round-close is a sample pass. | Architect (Kenji) on round cadence; all agents (self-administered) on on-touch. Candidate dedicated skill queued in BACKLOG P1 if the volume warrants — opening estimate: existing "(proposed)" rows #22 / #23 / #35 / #36 / #42 / #43 = six, so skill-vs-hat decision pending observation. | factory | Sweep the hygiene list (this file) + BP-NN list + hygiene-adjacent memories for items that **declare** a recurring cadence but have (a) no named owner, (b) no active enforcement surface (SKILL.md / hook / CI step), (c) cadence literally tagged `(proposed)` / `TBD` / `pending`, or (d) last-known-fire date absent or older than declared cadence × 2. Distinct from row #23 (missing-CLASSES): row #23 asks "what hygiene are we not running at all?"; row #43 asks "what hygiene have we AUTHORED but not ACTIVATED?" Both point at the same meta-level gap ("the factory is not as self-regulating as its paperwork suggests") but from opposite directions — row #23 surfaces unknowns, row #43 surfaces known-unactivated. Triggering catch: 2026-04-22 Aaron noted row #23 itself is marked "(proposed)" and therefore could not catch row #42 (attribution hygiene) before he did manually — *"missing cadences for any items that should be reoccuring hygene we should add"*. Ships to project-under-construction indirectly: adopters inherit the discipline via the audit skill (once built) but the factory-internal hygiene list is factory-scope. Classification: this is a **self-audit** row — it audits the hygiene list itself, so its own activation is a visible bootstrap risk (row #43 itself marked "(proposed)" is the canonical example of what row #43 should catch). | Audit doc per round listing every proposed / TBD row with activation recommendation (adopt now / park with reason / retire if stale); ROUND-HISTORY row noting which proposed rows activated this round; HUMAN-BACKLOG `activation-decision` row where Aaron sign-off is warranted | `feedback_missing_cadences_hygiene.md` + this file's §"The list" (self-referential) | +| 47 | Cadence-history tracking hygiene (every active cadenced factory surface has a structured fire-history) | Round cadence — **active** (first fire 2026-04-22 per `docs/research/cadence-history-audit-2026-04-22.md`) + opportunistic on-touch (every time an agent activates a row, transitions a row from proposed → active, fires a cadence, or touches a cadenced surface outside this file). Not exhaustive — each cadenced fire is the surface's own history obligation; the audit sweeps periodically for compliance gaps. | Architect (Kenji) on round cadence for the compliance audit; every surface's owner on each fire (self-administered); candidate dedicated skill queued in BACKLOG P1 if the history-schema question fragments enough to warrant tooling. | factory | Every **cadenced factory surface** MUST have a **fire-history surface**. Scope is explicitly broader than this file's rows: it covers (i) FACTORY-HYGIENE.md rows with declared active cadence; (ii) cron jobs declared in `docs/factory-crons.md` (e.g., the `autonomous-loop` / `heartbeat` / `git-status-pulse` rows); (iii) round-open / round-close checklist items declared in `.claude/skills/round-open-checklist/` and `.claude/skills/round-management/` (round-close step 4); (iv) any other declared recurring obligation named in docs / memory / skills (e.g., harness-surface cadenced audits per row #38, skill-tune-up sweeps, wake-briefing routines). Canonical example at factory root: the autonomous-loop tick — the single most cadenced surface in the factory — logs every fire to `docs/hygiene-history/loop-tick-history.md` before each end-of-tick `CronList` call (see `docs/AUTONOMOUS-LOOP.md` step 5). Acceptable fire-history surface shapes: (a) per-row history file under `docs/hygiene-history/row-NN-.md`, (b) per-surface history file under `docs/hygiene-history/-history.md` (cron tick, wake-briefing, etc.), (c) shared ledger (e.g., `docs/research/meta-wins-log.md` for meta-check fires), (d) notebook section with dated entries (e.g., Aarav's notebook for row #5 fires), or (e) a rollup in `docs/ROUND-HISTORY.md` per round close. The surface's "Durable output" column (or doc-level equivalent) names the fire-history surface; surfaces whose output is ephemeral (inline acknowledgement, ad-hoc finding without a surface) are compliance gaps. Per-fire entry schema (minimum): **(date, agent, output-or-finding, link-to-durable-output, next-fire-expected-date-if-known)**. Distinct from row #23 (new-CLASSES we don't run) and row #43 (known-CLASSES we authored-but-never-activated): row #44 asks *"of the classes we AUTHORED and ACTIVATED, can we prove they fire on cadence and see the fire-log?"* The three rows together form a meta-hygiene triangle — existence (#23) / activation (#43) / fire-history (#44) — each catching a different structural failure mode. Triggering catches: 2026-04-22 Aaron after row #23 activation (*"everything with a cadence should be track it history hygene make sure we got that one too"*) + 2026-04-22 Aaron on the autonomous-loop tick specifically (*"you might as well right a history record somewhere on every loop tool right before you check cron"*) — the second message is the directive that drove the scope extension from FACTORY-HYGIENE-only to all-cadenced-factory-surfaces. Ships to project-under-construction indirectly via the audit skill (once built); factory-internal hygiene list is factory-scope. Classification: **self-audit** row — it audits the hygiene list itself AND every other cadenced surface the factory declares, so its own first-fire bootstrap risk mirrors rows #23 and #43 and its scope-extension risk mirrors the autonomous-loop tick's pre-extension invisibility to row #44's original audit. | Audit doc per round listing every active cadenced surface (this file's rows + cron registry rows + round-checklist items + declared recurring obligations) + whether it has a fire-history surface + per-fire entry-schema compliance; ROUND-HISTORY row noting surface-gap-fix landings; HUMAN-BACKLOG `history-surface-decision` row if a shared-ledger-vs-per-surface-file schema decision needs Aaron's sign-off | `feedback_cadence_history_tracking_hygiene.md` + `docs/hygiene-history/loop-tick-history.md` (canonical worked example) | +| 48 | GitHub surface triage cadence (ten surfaces: PRs / Issues / Wiki / Discussions / Repo Settings / Copilot coding-agent / Agents tab / Security / Pulse / Pages) | Round-close cadence (primary) + opportunistic on-touch (every tick that comments on / labels / merges / closes a PR or issue; edits the wiki; replies to a discussion; toggles a repo setting; dispatches an Agents-tab session; triages a security alert; ships a Pages change). Not exhaustive on-touch; the round-close sweep catches what on-touch missed. | Architect (Kenji) on the round-cadence sweep; all agents (self-administered) on on-touch. Codified in `.claude/skills/github-surface-triage/SKILL.md` so future agents don't rediscover the taxonomies — this skill is the executable form of `docs/AGENT-GITHUB-SURFACES.md` per Aaron's 2026-04-22 meta-rule *"we need skills for all this so you are not redicoverging"*. | both | Run the ten-step sweep from `docs/AGENT-GITHUB-SURFACES.md` § "Round-close mechanical procedure" (or the skill's checklist): (1) `gh pr list` + classify against seven PR shapes; (2) `gh issue list` + classify against four issue shapes; (3) wiki footer-SHA drift-check against three wiki shapes; (4) `gh api graphql` discussions + classify against four discussion shapes; (5) `gh api repos//` settings-snapshot diff; (6) Copilot coding-agent sub-read vs `.github/copilot-instructions.md`; (7) Agents-tab watch-only observation (no API yet); (8) security-alerts sweep (code-scanning / Dependabot / secret-scanning); (9) Pulse `stats/*` snapshot (verification substrate); (10) Pages unpublished-state check (`gh api repos///pages` → 404 = current state, research-gated). Apply each surface's action; log one row per touched surface to its fire-history (PRs / issues / wiki / discussions / security), or append a snapshot block (settings / pulse / pages). Surface Aaron-scoped decisions (`awaiting-human` PRs, settings policy, Pages adoption, P0-secret) to `docs/HUMAN-BACKLOG.md`. Ships to project-under-construction: adopters inherit the classification taxonomies + the fire-history schemas + the round-cadence discipline (playbook is factory-level; GitHub is the concrete adapter — GitLab / Gitea / Bitbucket mappings listed in `docs/AGENT-GITHUB-SURFACES.md`). Triggering directives: 2026-04-22 Aaron — seven in sequence — *"we are going to need cadence for checking open PRs ... Also same with issues"*; *"you own the wiki too"*; *"and discussions"*; *"we need skills for all this so you are not redicoverging"*; *"you can own [copilot coding-agent settings] and all our settings ... ohhh you can own [agents tab]"*; *"and [security]"*; *"oh look at all the data !!! [pulse] might ... help with some of our verifactions"*; *"we should start experiting with [pages] Jekyll seems like we could push the boundaries if needed, maybe static pages is enough? you can figure out what works good with bun."*. Classification: the taxonomies are explicitly declared **non-final** per Aaron's "we can beef up our stuff over time"; first 5-10 rounds of fire-history observations feed a taxonomy-revision pass. | Fire-history row per triaged surface; ROUND-HISTORY row on round-close noting which PRs / issues / discussions / wiki pages / security alerts moved class or resolved; HUMAN-BACKLOG `pr-decision` / `settings-change` / `pages-adoption` / `secret-rotation` row where Aaron sign-off is warranted | `docs/AGENT-GITHUB-SURFACES.md` (authoritative) + `.claude/skills/github-surface-triage/SKILL.md` (executable) + `docs/AGENT-ISSUE-WORKFLOW.md` (abstract dual-track) + fire-history files under `docs/hygiene-history/` (pr-triage / issue-triage / wiki / discussions / security-triage / pulse-snapshot / pages) + `docs/github-repo-settings-snapshot.md` | +| 49 | Post-setup script stack audit (bun+TS default; bash only under exempt paths or with exception label) | Author-time (every new `tools/**/*.{sh,ps1}` decision-flow walk per `docs/POST-SETUP-SCRIPT-STACK.md`) + cadenced detection every 5-10 rounds (same cadence as skill-tune-up / row #38 / harness-surface audit) + opportunistic on-touch (every time an agent adds or edits a script under `tools/`). | Author of the script (self-check at author-time against the decision-flow doc); Dejan (devops-engineer) on the cadenced detection sweep; Kenji (Architect) on migration-order decisions when multiple violations stack up. | both | **Author-time prevention:** walk the three-question flow in `docs/POST-SETUP-SCRIPT-STACK.md` before writing any new `tools/**/*.{sh,ps1}` — (Q1) pre-setup? → `tools/setup/` bash+PowerShell exempt; (Q2) skill-bundled? → skill-compatibility rules govern, not this row; (Q3) default bun+TypeScript unless an explicit exception (trivial pipeline / thin CLI wrapper / bash scaffolding / sibling-migration guardrail) applies, in which case the script MUST carry a header comment naming the exception. **Cadenced detection:** `tools/hygiene/audit-post-setup-script-stack.sh` lists every shell/PowerShell script under `tools/` and classifies each as `exempt` / `labelled-exception` / `violation`. Exit-2 on any new violation; CI / pre-commit-eligible. **Why both layers:** prevention catches new violations at author-time (cheap); detection catches drift — labels getting stripped on edits, exceptions becoming stale, scripts moving out of exempt paths. Ships to project-under-construction: adopters inherit the canonical-stack rule + the audit script + the decision-flow doc. Aaron 2026-04-22 triggering-directive-chain: *"if post setup backlog bun/ts"* → *"now add someting that will try to prevent that and and hygene it if it happens again"*. | Author-time: commit-message rationale for any new `.sh` under `tools/` outside `tools/setup/`, OR exception-label header in the script, OR BACKLOG row queuing bun+TS migration. Cadenced: audit script output (markdown), appended to `docs/hygiene-history/post-setup-script-stack-history.md` (per-fire schema per row #44); BACKLOG row per unlabeled violation. | `docs/POST-SETUP-SCRIPT-STACK.md` (prevention surface) + `tools/hygiene/audit-post-setup-script-stack.sh` (detection surface) + `memory/project_ui_canonical_reference_bun_ts_backend_cutting_edge_asymmetry` + `memory/project_bun_ts_post_setup_low_confidence_watchlist` | +| 50 | Missing-prevention-layer meta-audit (every hygiene row carries a prevention classification: prevention-bearing / detection-only-justified / detection-only-gap) | Round cadence (same as rows #22 / #23 / #35 / #36) + opportunistic on-touch (every time a new row is added to `docs/FACTORY-HYGIENE.md` the author classifies it at landing). Not exhaustive; the round-close sweep catches un-classified rows and gap rows. | Architect (Kenji) on round-cadence classification review + gap-closure ROI assessment. All agents (self-administered) on on-touch: every new hygiene row MUST declare its prevention classification at landing; an unclassified row is itself a violation of this row. | factory | Sweep every row in `docs/FACTORY-HYGIENE.md` and classify each as one of: (a) **prevention-bearing** — an author-time / commit-time / trigger-time mechanism (hook, CI check, decision-flow doc, pre-commit lint, skill-gate) blocks or warns the violation BEFORE it materialises; (b) **detection-only-justified** — the class is fundamentally post-hoc (e.g., cadence-history row #44 — a fire-log can only exist AFTER the fire happens; wake-friction row #29 — friction is only observable at wake-time); (c) **detection-only-gap** — no principled reason the row is detection-only; a prevention layer COULD and SHOULD be built. Classification lives in `docs/hygiene-history/prevention-layer-classification.md` (one table row per hygiene row). **Why this row exists:** Aaron 2026-04-22 *"add a hygene for missing prevention layers"* — the factory had been quietly accumulating detection-only rows without asking the complementary question "could we have prevented this at author-time?". Without this meta-audit, the factory's reactive-cost grows silently. Parallels the existing meta-hygiene triangle (row #23 unknown-classes / #43 authored-but-unactivated / #44 cadence-history) by adding a fourth: row #47 is *"of the rows that ARE active and firing, which could have been prevented upstream"*. **Classification:** this is an **intentionality-enforcement** hygiene rule (Aaron 2026-04-22 tick-close: *"we are enforcing intentional decsions"*) — the audit cannot compute whether a row's classification is correct, but it forces every new hygiene row to carry an explicit prevention-vs-detection decision at landing. Declining to classify is itself the violation. See `memory/feedback_enforcing_intentional_decisions_not_correctness.md`. Ships to project-under-construction: adopters inherit the classification discipline + the meta-audit script + the obligation to classify any new hygiene row at landing. | `docs/hygiene-history/prevention-layer-classification.md` (classification matrix, one row per hygiene row) + cadenced audit run landed as `docs/hygiene-history/missing-prevention-layer-audit-YYYY-MM-DD.md` noting gap rows; ROUND-HISTORY row when a gap row gains a prevention layer (detection-only-gap → prevention-bearing transition); BACKLOG row per gap with prevention-design ROI estimate. | `tools/hygiene/audit-missing-prevention-layers.sh` + this row's self-reference (its own prevention layer is the at-landing-classify obligation declared in this Checks/enforces column) | +| 51 | Cross-platform parity audit (bash / PowerShell / bun+TS twin check across macOS / Windows / Linux / WSL) | Detect-only now (landed 2026-04-22); cadenced detection every 5-10 rounds (same cadence as row #46); opportunistic on-touch every time an agent adds or edits a script under `tools/`. Enforcement deferred until baseline is green AND CI matrix runs `--enforce` on `macos-latest` / `windows-latest` / `ubuntu-latest` (WSL inherits ubuntu-latest for CI). | Dejan (devops-engineer) on cadenced detection; author of the script (self-check at author-time against the rule classes in the audit's decision-record header block). Kenji (Architect) on CI-matrix-enforcement sign-off when baseline is green. | both | `tools/hygiene/audit-cross-platform-parity.sh` classifies every script under `tools/` by rule class: (a) **pre-setup** (`tools/setup/**`) — both `.sh` AND `.ps1` required per Q1 dual-authoring rule (`memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live`); (b) **post-setup permanent-bash** (`thin wrapper over existing CLI` / `trivial find-xargs pipeline` / `stay bash forever`) — `.ps1` twin required per the Windows-twin obligation (`memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md`); (c) **post-setup transitional** (`bun+TS migration candidate` / `bash scaffolding`) — no twin obligation (long-term plan is one cross-platform bun+TS script); (d) **post-setup bun+TS** (`*.ts` under `tools/`) — no twin needed (cross-platform native via bun). `--summary` prints counts; `--enforce` flips exit 2 on gaps. **Why detect-only first:** baseline at first fire (2026-04-22) was 13 gaps — 12 pre-setup bash without `.ps1` twin (Q1 violation silently accumulating since `tools/setup/` existed) + 1 post-setup permanent-bash (`tools/profile.sh`) without `.ps1` twin. Turning enforcement on before triage would block every CI run. **Why this row exists:** Aaron 2026-04-22 *"missing mac/windows/linux/wsl parity (ubuntu latest) we can deffer but should have the hygene in place for when we want to enforce and it will be more obvious to you in the future that we are cross platform."* Cross-platform-first must be a *visible* factory property (audit exists, runs, prints the gap) before it becomes an enforced gate. Same pattern as FACTORY-HYGIENE rows #23 / #43 / #47. See `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md`. **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (opportunistic on-touch) and surfaces the gap before it lands, same as row #46. The audit itself is a detect-only mechanism but detect-only surfaces the obligation at author-time when the author runs it. Ships to project-under-construction: adopters inherit the parity audit + the decision-record-block pattern + the CI-matrix obligation once it's wired. | Audit output in repo root on each fire; cadenced runs appended to `docs/hygiene-history/cross-platform-parity-history.md` (per-fire schema per row #44); BACKLOG row per gap at triage time; ROUND-HISTORY row when a gap resolves. | `tools/hygiene/audit-cross-platform-parity.sh` (detection + decision-record header block) + `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md` + `memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md` + `memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live` + `docs/POST-SETUP-SCRIPT-STACK.md` | +| 52 | Tick-history bounded-growth audit (`docs/hygiene-history/loop-tick-history.md` line-count vs threshold) | Detect-only (landed 2026-04-22); cadenced detection once per round-close (same cadence as row #44 cadence-history sweep, since this is the canonical row #44 worked example auditing itself); opportunistic on-touch whenever the tick-history file is read or edited. Archive action itself remains manual for now; deferring automation to the larger BACKLOG row that also covers threshold-revision and append-without-reading refactor. | Dejan (devops-engineer) on cadenced detection; the tick itself (self-administered at tick-close) on the opportunistic on-touch — each tick's end-of-tick sequence can invoke this audit after the append + commit to get a `within bounds: 96/500 lines` visibility signal. | factory | `tools/hygiene/audit-tick-history-bounded-growth.sh` checks the file's line count against a threshold (default 500, overrideable via `--threshold N`) and exits 0 within bounds / 2 over threshold. The threshold is set lower than the stated 5000-line paper bound because the file is read on every tick-close append — a per-tick context cost that scales linearly with file size — and 5000 lines represents too large a context hit on a 1-minute cadence. The audit's header block carries a mini-ADR decision record for the 500-line choice (context / decision / alternatives / supersedes / expires-when). **Why this row exists:** Aaron 2026-04-22 tick-fire interrupt: *"does loop tick history grow unbounded? that's an issue if so you just read it"*. Honest state was stated-bound-no-enforcement: file header named 5000 lines, nothing checked it. This row closes the enforcement gap for the threshold-check half of the full BACKLOG row (archive-action + append-without-reading refactor remain deferred). **Self-referential closure:** the tick-history file IS the canonical row-#44 cadence-history-tracking worked example (named explicitly in row #44's "Durable output" citation). Until this row landed, the most-cadenced surface in the factory — the tick itself — had its fire-log surface unaudited for its own growth. Meta-audit triangle remains intact (existence #23 / activation #43 / fire-history #44), and row #49 adds a fourth: fire-history files themselves need bounded-growth audits because they grow at the cadence of the surface they track. **Classification (row #47):** **prevention-bearing** — the audit surfaces approaching-threshold warnings at 80% so the archive action can be planned, rather than reactive-only at over-threshold. Ships to project-under-construction indirectly: adopters inherit the pattern (fire-log files under their own `docs/hygiene-history/` need the same bounded-growth treatment), not this exact script. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/tick-history-bounded-growth-history.md` (per-fire schema per row #44); BACKLOG row when archival is due (archive-action itself queued as part of the larger tick-history enforcement BACKLOG row); ROUND-HISTORY row when threshold changes or archive action executes. | `tools/hygiene/audit-tick-history-bounded-growth.sh` (detection + mini-ADR header block) + `docs/hygiene-history/loop-tick-history.md` (target surface, canonical row #44 worked example) + BACKLOG row *"Loop-tick-history bounded-growth enforcement"* (larger follow-up: threshold revision + append-without-reading refactor + archive action) | ## Ships to project-under-construction diff --git a/docs/POST-SETUP-SCRIPT-STACK.md b/docs/POST-SETUP-SCRIPT-STACK.md new file mode 100644 index 00000000..3ab66dc9 --- /dev/null +++ b/docs/POST-SETUP-SCRIPT-STACK.md @@ -0,0 +1,240 @@ +# Post-setup script stack — the decision flow before writing tooling + +**Before writing a new `tools/**/*.{sh,ps1}` script, walk this +decision flow. Post-setup tooling defaults to bun + TypeScript +per `memory/project_ui_canonical_reference_bun_ts_backend_cutting_edge_asymmetry` +and `memory/project_bun_ts_post_setup_low_confidence_watchlist`.** + +This doc is the author-time prevention layer for the hygiene +row *"post-setup script stack audit"* (FACTORY-HYGIENE row #46). +Hitting this rule before the script is written is cheap; fixing +it after the script has landed and accumulated callers is +expensive. + +## The three-question decision flow + +### Q1 — Is this a pre-setup script? + +A pre-setup script is one that runs **before** the factory's +canonical tooling (bun, dotnet, etc.) is installed. It must +work on a bare developer machine or a fresh CI runner with +only the OS-default shell (bash on macOS / Linux, PowerShell +on Windows). + +**If yes** — pre-setup lives under `tools/setup/` and MUST be +dual-authored as bash + PowerShell with zero prereqs per +`memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live`. +Do not write in bun / TypeScript / Python / any language that +requires an installer to run. **Stop here — you're done +deciding.** + +**If no** — continue to Q2. + +### Q2 — Is this a skill-bundled script? + +A skill-bundled script lives inside +`.claude/skills//scripts/` and ships as part of +the portable skill unit. Its stack is declared in the skill's +`compatibility` frontmatter and travels with the skill when +the skill is copied to another project. + +**If yes** — stack choice is owned by the skill author, not by +this doc. The skill's `compatibility` field should name any +runtime dep the script adds (bun, python, node, etc.). Keep in +mind that a heavier dep reduces the skill's portability. +**Stop here — follow the skill-authoring stack discipline.** + +**If no** — it lives under `tools/` as a standalone +post-setup tool. Continue to Q3. + +### Q3 — Does this belong in bash, or in bun + TypeScript? + +Default: **bun + TypeScript**. Reasons: + +- **Type-safety on structured data.** Most post-setup tooling + parses structured formats (YAML frontmatter, JSON manifests, + markdown tables). A typed parser beats an awk state machine + for correctness and maintainability. +- **Unit-testable pure functions.** Bun's built-in test runner + lets a tool's computational core be tested without forking + processes, which is nearly impossible in bash. +- **Consistency across post-setup tooling.** Matches the + canonical UI stack declared in + `memory/project_ui_canonical_reference_bun_ts_backend_cutting_edge_asymmetry`. + +**Exceptions — bash is acceptable when:** + +*Transitional exceptions* (one bash version, migration pending): + +- Bun tooling has not yet been installed in the repo at the + time of writing — in which case the bash script is **explicit + scaffolding**, lands with a comment block naming itself as a + bun+TS migration candidate, and is queued in `docs/BACKLOG.md`. +- The sibling-migration guardrail prevents a one-off bun+TS + script from being stranded in a sea of bash. If no other + post-setup tool has migrated yet, bash is the honest default. + +*Permanent exceptions* (dual-authoring obligation — see below): + +- The script is a trivial `find | xargs | sort` pipeline with + no parsing or state. +- The script is a thin wrapper over existing CLI tools (git, + gh, dotnet) with no data transformation. +- **"Stay bash forever (recorded decision)"** — the intentionality- + enforcement hygiene rule demands *a recorded decision*, not + *migration*. If the reason "bun + TypeScript would buy nothing + here" holds up on inspection AND the Windows-twin cost is + acceptable, that is a valid answer. Per Aaron 2026-04-22: + *"The intentionality-enforcement reframe doesn't demand + migration; it demands a recorded decision. A 'this should + stay bash forever' is a valid answer if the reason holds up."* + See `memory/feedback_intentionality_doesnt_demand_migration_bash_forever_valid.md`. + +**The Windows-twin obligation (2026-04-22 clarification from +Aaron):** any script under a *permanent* bash exception +(`trivial find-xargs pipeline`, `thin wrapper over existing +CLI`, or `stay bash forever`) MUST be dual-authored as a +`.sh` + `.ps1` pair. Zeta supports Windows as a first-class +developer platform; a bash-only script that "stays bash +forever" without a PowerShell twin silently breaks Windows +devs. This obligation applies only to permanent exceptions — +transitional exceptions (`bun+TS migration candidate`, +`bash scaffolding`) ship single-bash because the long-term +plan is one cross-platform bun+TS replacement. + +**The cost-benefit consequence.** Once the Windows-twin cost is +visible, "stay bash forever" is usually MORE expensive than +migrating to bun+TS: you maintain two platform-specific +implementations forever vs. one cross-platform TypeScript +script. The fifth exception exists for the rare cases where +bun+TS genuinely cannot do the job (e.g., a script that MUST +run before bun is installed — but then it belongs in Q1 pre- +setup, not Q3 post-setup). Most "looks like a stay-bash +candidate" scripts flip to migration once dual-authoring is +priced in. See +`memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md` +for the 2026-04-22 reconsideration that made this rule +explicit. + +**Exception label requirement.** A bash script under `tools/` +outside `tools/setup/` that falls under an exception MUST carry +a header comment block naming which exception applies and (for +migration candidates) the BACKLOG row queuing its bun+TS +rewrite. For permanent-bash exceptions the header must also +state the Windows-twin plan — either "PowerShell twin lives at +`.ps1`" if one exists, or a BACKLOG row queuing its +authoring. Without a label, the script is a **violation**; a +permanent-bash label without a twin or twin-plan is a +**partial violation** and the hygiene audit flags it. + +## Exempt paths (no author-time review needed) + +The following paths are exempt from this decision flow: + +- `tools/setup/**` — pre-setup by definition (bash + PowerShell). +- `.claude/skills/**/scripts/**` — skill-bundled (governed by + skill compatibility rules, not this doc). +- `tools/_deprecated/**` — awaiting deletion; no new code. + +Everything else under `tools/` is in-scope for the audit. + +## The audit surface + +Two audit scripts run the hygiene sweep: + +- `tools/hygiene/audit-post-setup-script-stack.sh` — lists + every bash / PowerShell script under `tools/` (excluding + exempt paths), tags each as *allowlist-acknowledged* or + *new-violation*, and prints a summary. +- `tools/hygiene/audit-missing-prevention-layers.sh` — the + meta-hygiene audit that asks, for every row in + `docs/FACTORY-HYGIENE.md`, whether a prevention layer + exists or the row is justifiably detection-only. + +## Baseline status (2026-04-22 — all-labelled, Windows-twin cost now visible) + +First audit-fire at this doc's land surfaced 9 violations. +Per-file intentional-decision classification was applied +immediately in the follow-up tick (commit `952789c`). Later +the same day a brief flip of three scripts to "stay bash +forever" was reverted when Aaron surfaced the Windows-twin +obligation (Q3 section above) — the migration row is the +cheaper long-term path than maintaining two platform-specific +implementations. The audit now reports **exempt 12 / labelled +11 / violations 0** at the current baseline. + +**"Thin wrapper over existing CLI" (permanent — Windows-twin +obligation applies):** + +| Script | Reason | Windows-twin status | +|---|---|---| +| `tools/profile.sh` | Pure `dotnet tool install/update` + `dotnet-counters`/`trace`/`gcdump`/`dump` routing case statement. Wrapper shape is intrinsic. | **Missing.** The dual-authoring reconsideration queued in `docs/BACKLOG.md` — candidates: (a) write `tools/profile.ps1` twin, or (b) flip label to "bun+TS migration candidate" so one cross-platform script replaces both. Defer until a sibling post-setup bun+TS tool lands (sibling-migration guardrail). | + +**"bun+TS migration candidate" (transitional — single-bash, +cross-platform via migration):** + +| Script | BACKLOG row | +|---|---| +| `tools/audit-packages.sh` | "Migrate remaining bash audit scripts to bun + TypeScript (post-setup stack)" | +| `tools/lint/no-empty-dirs.sh` | same row | +| `tools/lint/safety-clause-audit.sh` | same row | +| `tools/alignment/audit_commit.sh` | same row (migrate as alignment-quartet group) | +| `tools/alignment/audit_personas.sh` | same row (group) | +| `tools/alignment/audit_skills.sh` | same row (group) | +| `tools/alignment/citations.sh` | same row (group) | +| `tools/skill-catalog/backfill_dv2_frontmatter.sh` | has dedicated row "Rewrite `tools/skill-catalog/backfill_dv2_frontmatter.sh` in bun + TypeScript (post-setup stack)"; cross-referenced from the consolidated row | + +**"bash scaffolding" (transitional — self-referential; +hygiene tooling must exist before bun+TS tooling ships):** + +| Script | Notes | +|---|---| +| `tools/hygiene/audit-post-setup-script-stack.sh` | Dog-fooding gap. Queued alongside backfill script. | +| `tools/hygiene/audit-missing-prevention-layers.sh` | Same. | + +**Sibling-migration guardrail** from Q3 applies: no single +script from the "migration candidate" list migrates alone; +the four alignment audits migrate as one group, and the +migration as a whole waits for at least one +non-migration-candidate post-setup bun+TS tool to land +under `tools/**/*.ts` first. + +**Stay-bash-forever status:** zero scripts currently. A brief +mid-day flip on 2026-04-22 assigned three scripts to this +category; all three reverted the same day once the Windows- +twin cost was priced in. The fifth exception remains available +for future use but is expected to be rare — most "looks like +stay-bash" scripts will flip to migration candidates once +dual-authoring is accounted for. + +**Not yet visited:** `tools/alloy/`, `tools/invariant-substrates/`, +`tools/lean4/`, `tools/tla/`, `tools/Z3Verify/` subtrees +are currently empty of `*.sh` / `*.ps1` scripts in the +audit's `git ls-files` scan — if future scripts land +there, they enter the same decision flow. + +## Why this doc exists + +Aaron 2026-04-22 (paraphrased directive-chain): *"is this a pre +setup or post setup script ... if post setup backlog bun/ts"* +followed by *"now add someting that will try to prevent that +and and hygene it if it happens again"*. The first message +captured the stack rule; the second demanded both author-time +prevention AND cadenced hygiene so the factory stops +rediscovering the rule one script at a time. + +## Reference patterns + +- `memory/project_ui_canonical_reference_bun_ts_backend_cutting_edge_asymmetry` + — canonical-stack declaration for UI + post-setup tooling. +- `memory/project_bun_ts_post_setup_low_confidence_watchlist` + — the watchlist this doc promotes into active discipline. +- `memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live` + — pre-setup discipline (bash + PowerShell zero-prereq). +- `memory/feedback_script_and_artifact_name_honesty_ensure_not_install` + — companion discipline on script-name honesty. +- `docs/FACTORY-HYGIENE.md` row #46 (post-setup script stack + audit) + row #47 (missing-prevention-layer meta-audit). +- `docs/BACKLOG.md` — row "Rewrite + `tools/skill-catalog/backfill_dv2_frontmatter.sh` in + bun + TypeScript" — the first concrete migration. diff --git a/docs/hygiene-history/cross-platform-parity-history.md b/docs/hygiene-history/cross-platform-parity-history.md new file mode 100644 index 00000000..bd065b37 --- /dev/null +++ b/docs/hygiene-history/cross-platform-parity-history.md @@ -0,0 +1,11 @@ +# Cross-platform parity audit — fire history + +Per-fire ledger for FACTORY-HYGIENE row #48 (cross-platform +parity audit). Schema per row #44 (date / agent / output / +link / next-expected). + +Authoritative audit: `tools/hygiene/audit-cross-platform-parity.sh`. + +| Date | Agent | Output | Link | Next expected | +|---|---|---|---|---| +| 2026-04-22 | Claude (during round 44 autonomous-loop tick) | First fire. 13 gaps — 12 pre-setup bash under `tools/setup/` missing `.ps1` twin (Q1 violation); 1 post-setup permanent-bash (`tools/profile.sh`) missing `.ps1` twin. 10 transitional (no twin obligation). 0 paired. | Triage queued: `docs/BACKLOG.md` P1 "Cross-platform parity triage — 13 baseline gaps". Audit source: `tools/hygiene/audit-cross-platform-parity.sh`. | Round 45 opportunistic-on-touch when any `tools/` script is edited; next scheduled cadenced run round 49-54 (5-10 round cadence). | diff --git a/docs/hygiene-history/issue-triage-history.md b/docs/hygiene-history/issue-triage-history.md new file mode 100644 index 00000000..c52c8c0f --- /dev/null +++ b/docs/hygiene-history/issue-triage-history.md @@ -0,0 +1,44 @@ +# Issue-triage history + +Durable fire-log for the issue-triage cadence declared in +[`docs/AGENT-GITHUB-SURFACES.md`](../AGENT-GITHUB-SURFACES.md) +(Surface 2) and `docs/FACTORY-HYGIENE.md` row #45. + +History note: the cadence was originally authored under the +file name `docs/AGENT-PR-WORKFLOW.md` § "The issue-triage +counterpart", which was folded into the ten-surface umbrella +doc `docs/AGENT-GITHUB-SURFACES.md` on 2026-04-22 (commit +`c57d469`). The historical bootstrap row below preserves the +original file name per the log's append-only discipline. + +Append-only. Same discipline as +`docs/hygiene-history/loop-tick-history.md` and +`docs/hygiene-history/pr-triage-history.md`. + +## Schema — one row per triage pass OR per on-touch issue action + +| date (UTC ISO8601) | agent | issue | shape | action | link | notes | + +- **date** — `YYYY-MM-DDTHH:MM:SSZ` at the point the row is + written. +- **agent** — Claude model + session marker. +- **issue** — issue number (e.g. `#7`) or `round-close sweep` + for a batch pass. +- **shape** — `triaged-already` / `needs-triage` / + `stale-claim` / `superseded-closable` / `round-close-sweep`. +- **action** — label / claim / mirror-to-durable / release / + close / defer. +- **link** — commit SHA, issue comment URL, BACKLOG row, or + ROUND-HISTORY row. +- **notes** — free-form one-line. + +## Why this file exists + +FACTORY-HYGIENE row #44 + Aaron 2026-04-22 directive: same as +pr-triage-history.md. + +## Log + +| date | agent | issue | shape | action | link | notes | +|---|---|---|---|---|---|---| +| 2026-04-22T (round-44 tick, bootstrap) | opus-4-7 / session round-44 | bootstrap | — | File bootstrapped alongside `docs/AGENT-PR-WORKFLOW.md`; first triage pass: `gh issue list --state open` returned empty (Zeta has zero open issues at time of bootstrap) | (this commit) | First entry in the log. Next expected entry: whenever the first issue is opened, or next round-close sweep — whichever comes first. | diff --git a/docs/hygiene-history/loop-tick-history.md b/docs/hygiene-history/loop-tick-history.md new file mode 100644 index 00000000..742b7567 --- /dev/null +++ b/docs/hygiene-history/loop-tick-history.md @@ -0,0 +1,112 @@ +# Loop-tick history + +Durable fire-log for the autonomous-loop tick. Appended to by +every Claude instance running the loop, immediately before the +end-of-tick `CronList` call (per `docs/AUTONOMOUS-LOOP.md` +step 5). + +This is the factory's canonical answer to *"is the tick +actually running?"* The commit log undercounts ticks (a +speculative-only or verify-only tick commits nothing); a +visibility signal in chat evaporates when the session ends; +this file is the only always-appended, always-persistent +record. + +## Schema — one row per tick + +| date (UTC ISO8601) | agent | cron-id | action-summary (one line) | commit-or-link | notes | + +- **date** — `YYYY-MM-DDTHH:MM:SSZ` at the point the row is + written (just before `CronList`). +- **agent** — Claude model + session marker when known + (e.g. `opus-4-7 / session round-44`). +- **cron-id** — the 8-char cron ID at time of fire (from + `CronList`). If the cron was re-armed *this* tick, write + the new ID and append `(re-armed)` in the notes column. +- **action-summary** — one line. What the tick actually did. + "No-op speculative scan" is valid. "Empty — no useful work + found" is valid. The point is a row per tick, not a row per + commit. +- **commit-or-link** — the commit SHA if the tick committed; + otherwise `—` (em-dash) or a link to a non-commit durable + artifact (audit doc, memory edit, etc.). If the tick + committed multiple commits, list the last. +- **notes** — free-form one-line: re-arm flag, cadence + changes, session-boundary markers, anomalies. + +## Why this exists + +Aaron 2026-04-22: *"you might as well right a history record +somewhere on every loop tool right before you check cron"*. + +The ask closes a gap the factory had left open: the most +cadenced hygiene surface in the whole repo — the tick itself +— was the only one without a fire-history surface. Row #44 +in `docs/FACTORY-HYGIENE.md` demands fire-history for every +active cadenced row; the tick inherited that demand at row +row-44 activation but had no concrete log. This file is that +log. + +Appending this row BEFORE `CronList` (not after) means even a +tick that stops abnormally has a row — the last operation in +the tick is the scheduler check, not the log write. If the +log row is present but the cron is missing, we know the tick +itself ran; if neither is present, the tick itself didn't +fire. + +## Append-only discipline + +- Rows are append-only. Never rewrite, never reorder. +- If a row turns out wrong (e.g. mis-stated action), add a + correction ROW below it citing the earlier row's timestamp + — do not edit in place. This preserves the log's integrity + as evidence. +- Pruning is allowed ONCE the log exceeds the bounded-growth + threshold (default **500 lines** per the mini-ADR in + `tools/hygiene/audit-tick-history-bounded-growth.sh`; lowered + from the original 5000-line paper bound because the file is + read on every tick-close append, making per-tick context cost + scale with file size). When over threshold, the first 80% of + rows move to + `docs/hygiene-history/loop-tick-history-YYYY-QN-archive.md` + and a pointer is left at the top of this file. No row ever + disappears. The `audit-tick-history-bounded-growth.sh` audit + (FACTORY-HYGIENE row #49) fires each round-close and warns at + 80% of threshold so the archive action can be planned, not + panicked. The threshold may rise back toward the paper bound + once the append-without-reading refactor (see BACKLOG *"Loop- + tick-history bounded-growth enforcement"*) lands — at that + point per-tick cost stops scaling with file size and the + constraint becomes grep-ergonomics, not context burn. + +## Log + +| date | agent | cron-id | action-summary | commit-or-link | notes | +|---|---|---|---|---|---| +| 2026-04-22T (round-44 tick) | opus-4-7 / session round-44 | aece202e | End-over-start restructure of `docs/AUTONOMOUS-LOOP.md`; cadence 2→1 min; bootstrap first row of this history file | `3651716` | File bootstrap — prior cron `dfa61c5e` retired; new cron `aece202e` armed at `* * * * *`; this row is the log's first entry. Placeholder `(this commit)` replaced with SHA in a follow-up edit — future ticks should write their SHA as soon as the commit lands, not during staging. | +| 2026-04-22T (round-44 tick, end-of-tick CronList) | opus-4-7 / session round-44 | aece202e | Append this row + fill SHA on the bootstrap row; verify cron still armed post-rotation; emit visibility signal | (end-of-tick, pre-CronList) | End-of-tick discipline dry-run — this is the first row written in the correct position (after commit, before CronList). Demonstrates the intended ordering for future ticks. | +| 2026-04-22T (round-44 tick, meta-wins follow-up) | opus-4-7 / session round-44 | aece202e | Append meta-wins row for tick-history-bootstrap (partial depth-2); self-correct prior-tick "depth-5 logged" overclaim in AUTONOMOUS-LOOP.md History section | `4900b12` | First tick that caught its own prior-tick verify-before-deferring violation (claim-without-landing on meta-wins row). Demonstrates same-round self-audit pattern working as intended. | +| 2026-04-22T (round-44 tick, row-#44 scope extension) | opus-4-7 / session round-44 | aece202e | Extend FACTORY-HYGIENE row #44 from "every active cadenced row" to "every active cadenced factory surface" (crons + checklists + recurring obligations); sync memory; retrospective upgrades on two meta-wins rows | `79aa0fc` + `ce206b2` | Path (a) of the prior tick's documented upgrade condition executed — hygiene-cadence-gap row upgraded to clean depth-5 (first in the log); tick-history-bootstrap row upgraded from partial depth-2 to clean depth-3. Depth-5 overclaim from commits 3651716/8ba3a34 now genuinely earned on the correct row. | +| 2026-04-22T (round-44 tick, portable issue-workflow) | opus-4-7 / session round-44 | aece202e | Replace generic GH issue templates with Zeta-shape (bug/backlog/human-ask/config); land `docs/AGENT-ISSUE-WORKFLOW.md` as an adapter-neutral dual-track doc (GH Issues / Jira / git-native); add `tools/setup/` adapter-prompt BACKLOG row; update AGENTS.md pointer to adapter-neutral wording | `ca160fd` | Four Aaron interrupts in sequence absorbed: (1) GH Issues enabled + plugin, (2) templates-are-agent-domain + parallelization, (3) dual-track for research substrate, (4) portability — factory must not force tracker choice. Final doc puts the factory-level dual-track principle first and the three concrete adapters second. Zeta's default stays GH Issues; adopters pick at setup. | +| 2026-04-22T (round-44 tick, contributor-UX bundle) | opus-4-7 / session round-44 | aece202e | Three Aaron directives bundled: (a) templates redesigned for human+AI contributor UX (two-tier required/optional, welcome signal); (b) new `docs/CONTRIBUTOR-PERSONAS.md` — 10 personas with 90-second-file-bug test; (c) two research absorbs — Matherne cluster-algebras lecture 1 + AI-contributor-attraction research; config.yml reordered AI-welcome link to first | `e241ef6` | Causal chain: AI-attraction research finding "heavy templates -> AI agents move on" + "silence reads as rejection" drove the two-tier redesign and the explicit welcome signal. Persona list fills the "specific shape in our mind for UX" gap Aaron called out. Cluster-algebras absorb is speculative / research-grade — explicit guardrails against overclaiming Zeta equiv cluster algebra. | +| 2026-04-22T (round-44 tick, ten-surface GitHub playbook) | opus-4-7 / session round-44 | aece202e | Seven-directive cascade absorbed as ten-surface umbrella: (1) PRs + Issues taxonomies; (2) wiki (3 shapes + seed set); (3) discussions (6 categories + 4 shapes); (4) `.claude/skills/github-surface-triage/SKILL.md` as executable codification — the load-bearing "we need skills for all this so you are not redicoverging" absorb; (5-6) repo settings + Copilot coding-agent sub-surface; (7) Agents-tab `watch`-only until ADR; (8) security 4 sub-surfaces with P0-secret escalation; (9) Pulse as verification substrate; (10) Pages research-gated. FACTORY-HYGIENE row #45 retitled ten-surface. Classification comments posted on PRs #31 + #32. | `c57d469` | First tick where an Aaron meta-rule ("we need skills for all this") was satisfied *this same tick* — the skill landed alongside the doc, not deferred. Extended scope mid-tick: initial plan was 9 surfaces; Pages arrived as #10 with "research on backlog" clarification. Research-first posture reused from Agents-tab precedent. | +| 2026-04-22T (round-44 tick, hosting + WASM + budget) | opus-4-7 / session round-44 | aece202e | Three paired BACKLOG P2 rows: hosting research (free-tier / cheap-paid / AI-infra survey), WASM as optional compile target (AOT/JIT/WASM triad with clean-core vs native-surface split), budget/cost optimizer over 7 resource categories + paired budget enforcer with retraction-based mid-flight kill. All research-gated, explicit no-commitment, implementation separate. | `803067a` | Aaron explicitly linked budget-optimizer-to-hosting ("that way i can afford the hosting") — captured as row-adjacency in BACKLOG. Budget enforcer leans on Zeta's retraction-native primitives for clean cancellation, first genuine algebra-to-infra pull-through in a backlog row. | +| 2026-04-22T (round-44 tick, orphan-fix + pairing refactor) | opus-4-7 / session round-44 | aece202e | Two threads bundled: (a) orphan-reference cleanup — AGENT-PR-WORKFLOW.md dangling references in skill + 2 fire-history headers fixed, historical log rows preserved per append-only discipline; (b) BACKLOG P2 row for "skill = procedure, doc = rationale" factory-wide refactor with 5-shape action taxonomy (already-paired / skill-only / doc-only / too-small-to-split / wrong-pattern-entirely), owner Aarav on survey. Reference pair is the one that just landed (github-surface-triage SKILL + AGENT-GITHUB-SURFACES.md). | `7a4cf92` | Tick used honest-audit start to catch own-work orphans from pre-compaction `mv`; Aaron's "refactor it all lol, backlog" absorbed as promotion of one-pair pattern to factory-wide discipline. Doc/skill separation is complementary to skill-documentation-standard (frontmatter provenance), not overlapping. | +| 2026-04-22T (round-44 tick, frontmatter-compliance fix) | opus-4-7 / session round-44 | aece202e | Add DV-2.0 provenance breadcrumbs to `github-surface-triage/SKILL.md` frontmatter (`record_source` / `load_datetime` / `last_updated` / `status` / `bp_rules_cited`) per `skill-documentation-standard`; drive-by fix "nine-step sweep" -> "ten-step sweep" in description (body already enumerated ten steps after the Pages extension). | `73e4b36` | Self-audit caught a verify-before-deferring miss on own prior commit — `c57d469` landed the skill with only name+description, missing the five DV-style audit columns the standard demands. Schema-compliance fix is mechanical (same class as injection-lint fixes that CLAUDE.md §skill-creator-workflow whitelists as skip-the-workflow edits). Direct bridge to the pairing-refactor BACKLOG row from the prior tick: frontmatter provenance is the auditable substrate the skill/doc pairing pattern needs to lean on. | +| 2026-04-22T (round-44 tick, DV-2.0 scope-universal) | opus-4-7 / session round-44 | aece202e | Mechanical audit of all 216 `.claude/skills/**/SKILL.md` found 214 missing DV-2.0 frontmatter — including `skill-documentation-standard` itself (the standard-defining skill was 0/5 on its own standard). Self-referential fix landed. Aaron's 2026-04-22 directive *"Data Vault 2.0 can be scope universal sounds like it will help with indexing"* absorbed as a P2 BACKLOG row in the factory-setup section: two-phase rollout (phase 1 mechanical skill-catalog cascade via `skill-improver`; phase 2 scope-extension design to personas / ADRs / research / backlog / round-history / hygiene-history / memory). Memory saved: `feedback_dv2_scope_universal_indexing.md` covering the scope + indexing-value-prop + self-reference-closure corollary. | `a103f08` | Aaron affirmed the framing with "nice Meta-fix: the standard-defining skill must carry its own standard" mid-tick — captured in memory as the self-reference-closure discipline: a skill/doc/process that defines a standard must carry that standard in its own artifact or the standard is aspirational and the factory hasn't adopted it. Applies beyond DV-2.0 (style guides, ADR templates, governance docs). Same-tick absorb of both the scope-universal directive and the framing-affirmation. | +| 2026-04-22T (round-44 tick, DV-2.0 phase-1 script + batch 1) | opus-4-7 / session round-44 | aece202e | Phase-1 deliverable of the scope-universal BACKLOG row: landed `tools/skill-catalog/backfill_dv2_frontmatter.sh` (~200 lines, idempotent, reads git-log for `record_source`/`load_datetime`, greps body for `bp_rules_cited`, defaults `status: active` honestly, injects before closing `---` fence via awk ENVIRON). Applied to 10-skill alphabetical batch (activity-schema-expert through alignment-observability). Compliance: 12/216 after this commit; 204 remaining queued. | `48544ac` | First in-tick bug-catch under the new script-authoring discipline: initial awk `-v blob="$(printf ...)"` draft silently truncated `activity-schema-expert/SKILL.md` to 0 bytes (awk `-v` refuses multi-line values). Caught pre-commit; reverted via `git checkout`; switched to `INJECT_BLOB="..." awk 'BEGIN { blob = ENVIRON[...] }'`. Idempotency re-verified on second pass before scaling. Honest-audit discipline: the "0 bytes out of 124" diff was the signal, not a test suite — factory instinct caught it. | +| 2026-04-22T (round-44 tick, post-setup classification + bun+TS backlog) | opus-4-7 / session round-44 | aece202e | Aaron mid-tick: *"is this a pre setup or post setup script ... if post setup backlog bun/ts"* + *"is that a sh that will ship with a skill, i see that's differnt casue"*. Two-question classification absorb: (a) script is post-setup (needs cloned repo + git history), not pre-setup (pre-setup = `tools/setup/` bash+PowerShell zero-prereq); (b) not skill-bundled (lives under `tools/`, not inside a skill's `scripts/`; skill-bundled bun+TS would force bun dep via skill compatibility). Added dedicated BACKLOG row for bun+TS rewrite with sibling-migration guardrail (don't strand a one-off). | `4f00a0d` | Classification-distinction absorb surfaced a latent factory rule worth stating: post-setup tools under `tools/` are rewrite-candidates for bun+TS; skill-bundled scripts inside `.claude/skills//scripts/` are a different decision class because they carry the skill's compatibility contract. Row captures both the rule and the pragmatic guardrail (wait for sibling migration) to avoid isolating a one-off. | +| 2026-04-22T (round-44 tick, post-setup + prevention-layer hygiene bundle) | opus-4-7 / session round-44 | aece202e | Two Aaron directives landed together: (a) *"now add someting that will try to prevent that and and hygene it if it happens again"* — author-time `docs/POST-SETUP-SCRIPT-STACK.md` (three-question decision flow: pre-setup / skill-bundled / default bun+TS) + cadenced `tools/hygiene/audit-post-setup-script-stack.sh` (12 exempt, 2 labelled, 9 pre-existing violations honestly surfaced) + FACTORY-HYGIENE row #46; (b) *"add a hygene for missing prevention layers"* — meta-hygiene `tools/hygiene/audit-missing-prevention-layers.sh` + seed `docs/hygiene-history/prevention-layer-classification.md` (~17 classified, ~30 deliberately unclassified as honest gaps) + FACTORY-HYGIENE row #47. Tick-close reframe from Aaron: *"we are enforcing intentional decsions"* — memory `feedback_enforcing_intentional_decisions_not_correctness.md` captures intentionality-enforcement as a distinct hygiene class from correctness-enforcement; row #47 and the classification doc both updated with the reframe. BACKLOG phase-2 row queued (verify named mechanism actually exists, not just that a classification exists). | `8f61cf7` | Aaron's tick-close clarifying question *"how does this do the audit? i don't understand?"* surfaced that the phase-1 audit is a bookkeeping sentinel diff-checker, not a mechanism-verifier. Honest answer + reframe both landed: the reframe ("enforcing intentional decsions") generalises to an entire class of factory rules — every ADR must have a decision block; every BP-NN must cite a decision; every skill edit must log a justification. Bash 3.2 (macOS default) compat fixes in both audits: mapfile→while-read NUL-delimited, declare -A→parallel arrays + linear lookup_label, find→git ls-files (excludes vendored lean4 `.lake/packages/` build cache). Self-referential gap: the audit scripts themselves are bash with "bash scaffolding" exception labels — acknowledged dog-fooding gap queued alongside DV-2.0 backfill script for sibling bun+TS migration. | +| 2026-04-22T (round-44 tick, post-setup audit green) | opus-4-7 / session round-44 | aece202e | Per-file intentional-decision classification of the 9 pre-existing bash-violation baseline surfaced by the prior tick's audit. One "thin wrapper over existing CLI" label (`tools/profile.sh` — pure `dotnet tool` routing case statement, no migration planned); eight "bun+TS migration candidate" labels (four `tools/alignment/*.sh` audits sharing DORA-derived schema; `tools/audit-packages.sh`; two `tools/lint/*.sh`; `tools/skill-catalog/backfill_dv2_frontmatter.sh`). Added consolidated BACKLOG row "Migrate remaining bash audit scripts to bun + TypeScript" with sibling-migration guardrail + per-script owner mapping + L effort size. Audit re-runs clean: **exempt 12 / labelled 11 / violations 0**. | `952789c` | First tick to exercise the intentionality-enforcement reframe from the prior tick *in practice* — nine labels, nine individual decisions about which Q3 exception applies. No bulk-sed. The four alignment audits earned a shared "migrate as a group" clause because they share output schema (gitops/DORA derivation); the single thin-wrapper classification (`profile.sh`) represents honest admission that not every bash script should migrate — some wrapper shapes are intrinsic. Auto-mode speculative work per "never be idle" rule: no pending Aaron directive, so honest-audit of just-landed artifact drove the work. | +| 2026-04-22T (round-44 tick, doc-currency + 12 classifications) | opus-4-7 / session round-44 | aece202e | Two honest-audit fixes on just-landed artifacts: (a) `docs/POST-SETUP-SCRIPT-STACK.md` "Current known violations" section had drifted within one tick — still listed the 5-violation pre-labelling snapshot; replaced with baseline-status table reflecting reality (exempt 12 / labelled 11 / violations 0) broken out by exception class; (b) added 12 high-confidence classifications to `docs/hygiene-history/prevention-layer-classification.md` — 4 prevention-bearing (#4 BP-11 data-not-directives, #21 cron-liveness, #39 filename-content, #42 attribution) and 8 detection-only-justified (#8 idle-logging, #9 meta-wins, #14 copilot-instructions, #15 upstream-sync, #22 symmetry, #35 missing-scope, #36 wrong-scope, #43 missing-cadence). Meta-audit state: **29 classified / 18 unclassified / 0 gap**. Remaining 18 deliberately unclassified — design-question rows where prevention-vs-detection has an unobvious answer. | `e82cb28` | Auto-mode speculative work per "never be idle" rule. Self-caught drift on "Current known violations" section within a single tick surfaces the general hygiene: every baseline-snapshot doc ages immediately on ship; refresh-adjacent-to-action is better than stand-alone snapshot. Only rows where title + linked memory made the answer unambiguous were classified — force-filling would obscure the 18 genuinely-open rows that deserve design thought. | +| 2026-04-22T (round-44 tick, Q3-fifth-exception + alignment-signal) | opus-4-7 / session round-44 | aece202e | Two Aaron interrupts absorbed together: (a) *"we are getting very aligned i this sounds like my decision making process"* — first-class alignment success signal captured as `feedback_factory_reflects_aaron_decision_process_alignment_signal.md`; proposes tracking "aligned" vs "course-correct" round tallies as a crude alignment metric. (b) *"The intentionality-enforcement reframe doesn't demand migration; it demands a recorded decision. A 'this should stay bash forever' is a valid answer if the reason holds up."* — expanded `docs/POST-SETUP-SCRIPT-STACK.md` Q3 with fifth exception "stay bash forever (recorded decision)" alongside the four existing categories; memory `feedback_intentionality_doesnt_demand_migration_bash_forever_valid.md` captures the symmetric wrong-turns (undersell-as-bookkeeping AND oversell-default-as-only-option) and the posture between. The 8 existing "migration candidate" labels remain valid (they record a decision) but are no longer a forced verdict — any can be flipped to "stay bash forever" on re-examination. | `975f1b1` | The course-correction is load-bearing: it decouples the intentionality-enforcement rule (decide) from the migration policy (default to bun+TS). Without the decouple the rule silently prescribes migration rather than forcing deliberation. The two interrupts arrived together and share the POST-SETUP-SCRIPT-STACK.md + MEMORY.md edits, so bundled as one commit. First tick where the factory caught an Aaron-originated pattern ("this sounds like me") and a factory-originated over-reach ("you collapsed the answer set") in the same absorb. | +| 2026-04-22T (round-44 tick, Windows-twin revert + stay-bash cost-benefit) | opus-4-7 / session round-44 | aece202e | Aaron course-correction: *"you do remember we have to support windows so that means you commited to two version of everything that was bash only, i was going to wait and see if you rmembered that. stay bash forever"* + *"powershell too"* + *"does that make you reconsider any?"*. I had flipped 3 scripts to "stay bash forever" without pricing the PowerShell-twin cost. Reverted; expanded Q3 to split transitional (no twin owed) vs permanent (Windows-twin obligation) exceptions. Memory `feedback_stay_bash_forever_implies_powershell_twin_obligation.md` captures asymmetric cost-accounting blind spots and Aaron's teaching signal. | `194ff77` | Second order-N correction on the same rule in one day — expanded once (5th exception), then constrained once (Windows-twin obligation). Both are partial-accounting corrections: first the answer set was partial, then the cost input was partial. | +| 2026-04-22T (round-44 tick, cross-platform parity + mini-ADR pattern) | opus-4-7 / session round-44 | aece202e | Four Aaron directives absorbed in one bundle. (a) *"missing mac/windows/linux/wsl parity (ubuntu latest) we can deffer but should have the hygene in place"* — authored `tools/hygiene/audit-cross-platform-parity.sh` (detect-only; `--enforce` defers CI gate); FACTORY-HYGIENE row #48 landed; first fire: 13 baseline gaps (12 pre-setup bash under `tools/setup/` missing `.ps1` — silent Q1 violation; 1 post-setup permanent-bash `tools/profile.sh` missing twin). Memory `feedback_cross_platform_parity_hygiene_deferred_enforcement.md` captures visibility-precedes-enforcement pattern. (b-d) *"this is great i will be able to audit all your decisions now"* + *"we want decison audits for everything that makes sense, this is kind of like mini ADR lol"* + *"decison records i mean"* — generalized intentionality-enforcement into a factory-wide mini-ADR pattern with date/context/decision/alternatives/supersedes/expires-when shape. Memory `feedback_decision_audits_for_everything_that_makes_sense_mini_adr.md`. First instance shipped as the parity-audit header block. Proper ADR for the format itself queued after 5-10 instances (don't over-formalize prematurely). (e) Aaron *"Goodnight"* mid-tick; landed bundle before sign-off. | `a6d0eeb` | Auto-mode speculative continuation of the Windows-twin tick surfaced the cross-platform blind spot (the 12-pre-setup-gap) by running `git ls-files` proactively — visibility-precedes-enforcement validated on its own first fire. The mini-ADR generalization is a naming act: multiple factory rules (Q3 exception labels, row #47 classification, intentional-debt ledger) were already decision records in disguise, and Aaron named the pattern so it becomes legible. Symmetric-audit commitment: Aaron answers "why?" on his own directives. | +| 2026-04-22T (round-44 tick, live-loop-averted + text-indexing absorb) | opus-4-7 / session round-44 | aece202e | Post-goodnight speculative ticks accumulated 74 unpushed commits on `round-42-speculative` — which IS PR #32's head branch. Aaron three-message intervention: *"why are yo udoing speculative work?"* + *"you might be stuck in a live loop ... should probably be done on anohter branch ... worktrees if that helps"* + *"we need a live loop detector, it's aspirational not guarneteed, if you can make it guarneteed you proved kurt godel wrong, and solved the halting problem lol"*. Live-loop shape: tick fires -> speculative work on PR's own branch -> push rebuilds PR CI -> tick fires mid-build -> push over push. **Non-destructive fix:** `git branch round-44-speculative HEAD` preserved the 74 commits; switched to `round-44-speculative` for tick work. `round-42-speculative` left local-ahead-of-origin awaiting Aaron's reset-or-not decision. **NO push this tick** — that's the whole point. PR #32 also has a failing `lint (markdownlint)` (MD032 across TECH-DEBT + SYSTEM-UNDER-TEST-TECH-DEBT + DMAIC-proposal-template) noted but deferred — belongs on PR #32's canonical content, not speculative-tick scope. Absorbed prior unaddressed directive: text-indexing substrate for factory QoL (research-gated; `feedback_text_indexing_for_factory_qol_research_gated.md`; options rg+tags / FTS5 / DV-2.0 reverse / tantivy / plain-text-inverted / Claude-native / vector-gated). Memory + BACKLOG rows: (a) live-loop heuristic detector + worktree structural fix (halting-problem-acknowledged; heuristics #1-#5 documented); (b) text-indexing substrate research. | `22a893f` | First tick where Aaron caught a live-loop precondition before any CI damage — 74 commits accumulated but zero CI re-triggers, zero PR pollution. Halting-problem gag acknowledged in memory: total detector equivalent to deciding halting (Turing 1936 / Rice extends); heuristic detectors sufficient for shapes we encounter. Structural fix (worktree or branch-per-scope) ranks above detector fix: if speculative work never lands on an open-PR branch, no detector is needed for this class. "Never-idle" rule interpretation refined: known-gap includes open PRs with failing checks; `gh pr list` before defaulting to generative work is the missing pre-tick step. Auto-mode speculative work surfaced the own-process gap that the rule didn't prevent. | +| 2026-04-22T (round-44 tick, cartographer map-before-walk for parallel worktrees) | opus-4-7 / session round-44 | aece202e | Nine Aaron interrupts absorbed into one cartographer pass. First exercise of `EnterWorktree` as PR-fix surgery: subagent in `.claude/worktrees/pr32-markdownlint` landed `e40b68a` clean-linter fix for PR #32 (69 errors across 17 files — MD032/MD022/MD007/MD049/MD001/MD029/MD009/MD024 all resolved; `npx markdownlint-cli2` exit 0; NOT pushed). Aaron then sent seven messages in rapid sequence: parallelize-by-default ambition; live-lock-between-PRs warning; build-speed-as-parallelism-ceiling observation; conflict-likelihood-between-parallel-worktrees ("yall are going to conflict"); stale-branch-cleanup-as-git-surface-duty with preventive+compensating; memory-in-worktree semantics question; final redirect: *"wait on the build and do resarch on how to parallel safely ... unknow unknowns lol cartographer."* **Cartographer output**: `docs/research/parallel-worktree-safety-2026-04-22.md` — eight hazard classes enumerated (live-lock, merge conflicts, build-speed ceiling, stale branches, memory bifurcation, tick-clock CWD, split state files, unknown unknowns), each paired with preventive+compensating mitigations per discovered-class-outlives-fix principle; staging R45-49 proposed for factory-default flip (not ready yet). Three memory entries: parallel-worktree-safety-cartographer; stale-branch-cleanup-preventive-plus-compensating; memory-in-worktree-slug-behavior (verified empirically — single session keeps original slug, fresh session from worktree path would orphan memory). Two BACKLOG P1 rows: parallel-worktree safety research + stale-branch cleanup. **NO push, NO merge this tick** — Aaron's wait-on-the-build directive. PR #32 fix preserved on `pr32-markdownlint-fix` worktree awaiting his morning review. | `7807653` | First tick to exercise parallel-via-subagent-in-worktree successfully AND to catch that success does not license factory-default-flip without safety research. Cartographer metaphor invoked explicitly by Aaron ("lol cartographer") — memory captures that this is not decorative; it encodes the discipline. Empirical memory-slug verification: `ls ~/.claude/projects/` shows only the main-repo slug after three in-worktree memory writes; no bifurcation occurred within the single session. Answer to Aaron's question: memory works fine within one session-that-EnterWorktree's via absolute paths; bifurcation risk only if a NEW `claude` process starts from inside the worktree path. | +| 2026-04-22T (round-44 tick, cartographer continuation — stale-branch walk + novel-content diff-scan) | opus-4-7 / session round-44 (post-compaction) | aece202e | Post-compaction continuation of the cartographer tick. Aaron two directives absorbed: (a) *"you can delete yes"* + *"the other branch"* — authorized stale-branch cleanup walk. Territory walked: `git fetch --prune` auto-pruned 3 origin refs; local `round-42-speculative` (74-commit-divergent from origin, with all commits preserved on `round-44-speculative` verified via `git merge-base --is-ancestor`) and `round-34-upstream-sync` (origin-pruned, no unique content) deleted locally. Both reversible via reflog. First concrete exercise of the row #4 hazard from the cartographer research doc — performed manually, exactly the shape the future `tools/hygiene/prune-stale-branches.sh` will automate. (b) *"you could diff with this and see if you need anything but you got the backlog i'm not really worried we will miss anything"* — low-effort novel-content spot-check against `origin/round-42-speculative..round-44-speculative` (79 commits, 12 new research docs, 30 BACKLOG-touching commits, 1 new skill). Scan result: **no novel content missed** — all 12 research docs have matching BACKLOG rows; 3 zero-hit concepts (ten-surface, budget optimizer, portable-issue) all already landed as artifacts or BACKLOG rows under distinct phrasings. | `1707312` | First post-compaction tick to execute cleanly without losing the pre-compaction thread — the summary + memory trio carried context forward. Stale-branch-walk was the "eat our own dogfood" instance for the preventive+compensating cleanup research: walking the territory once manually before scripting it surfaced the 3-origin-auto-prune signal (fetch's existing preventive was already doing work quietly). Diff-scan confirms the forward-tick BACKLOG absorption discipline has been functioning — 79 commits over 2 days with zero identified miss is a high bar to hold. NO push this tick per Aaron's wait-on-the-build cartographer directive. | +| 2026-04-22T (round-44 tick, dbt + tick-history BACKLOG rows) | opus-4-7 / session round-44 (post-compaction) | aece202e | Two Aaron tick-fire interrupts absorbed as BACKLOG rows: (a) *"add to backlog research dbt deep integration for Zeta"* — P2 under Post-v1 query-surface research; dbt-core adapter as shallow goal + deep-integration research (does Zeta's retraction-native DBSP subsume dbt's incremental/snapshot/test/state layers?); SQLMesh + Dagster called out as survey targets; output queued `docs/research/dbt-integration-for-zeta.md`. (b) *"does loop tick history grow unbounded? that's an issue if so you just read it"* — P1 factory-tooling; honest answer is stated-bound-no-enforcement (5000-line threshold in file header, no audit script / FACTORY-HYGIENE row / cadenced trigger). Row names three closures: audit script, threshold revision under read-every-tick cost, append-without-reading structural refactor. Self-referential pairing with row #44 cadence-history tracking. | `b6ad33e` | Meta-first-order: Aaron's tick-history question was itself a live-observation that touching a file implies a read-cost per tick — the same class of insight that surfaced the original row #44 (cadence-history tracking is easy to write but expensive to read on a minute-cadence). Honest answer: yes the file has a bound on paper; no the bound isn't enforced; the read-every-tick cost may justify a lower threshold than the paper bound. Structural append-without-reading would make the size question orthogonal to per-tick context. No enforcement built this tick — BACKLOG row only per wait-on-the-build directive. | +| 2026-04-22T (round-44 tick, tick-history bounded-growth enforcement — detect-only slice) | opus-4-7 / session round-44 (post-compaction) | aece202e | Effort-S detect-only slice of the prior tick's BACKLOG row landed under never-idle: `tools/hygiene/audit-tick-history-bounded-growth.sh` (line-count vs threshold; default 500; mini-ADR header block with context / decision / alternatives / supersedes / expires-when per mini-ADR pattern); FACTORY-HYGIENE row #49 naming the surface + cadence + owner; file-header threshold updated from 5000 → 500 per mini-ADR supersession clause; `docs/hygiene-history/tick-history-bounded-growth-history.md` seeded with first-fire bootstrap row (96/500 lines, 19%, within bounds). Full BACKLOG row remains open: threshold-revision validation + append-without-reading structural refactor + archive-action automation are the larger follow-ups. | `0c63b76` | Self-referential meta-audit triangle gains its fourth leg (existence #23 / activation #43 / fire-history #44 / bounded-growth #49). The tick-history file — canonical row #44 worked example, named in row #44's own "Durable output" citation — had its growth unaudited until this tick. Mini-ADR pattern second genuine instance (after cross-platform parity row #48); the instance-count is building toward the 5-10 needed before a proper ADR for the mini-ADR format itself. No push this tick per wait-on-the-build cartographer directive. | +| 2026-04-22T (round-44 tick, dbt deep-integration research first-draft — "LFG" greenlight) | opus-4-7 / session round-44 (post-compaction) | aece202e | Aaron's *"LFG"* = rally-cry-tick continuation under never-idle. Pulled the directly-queued BACKLOG item (output path already named in row body) into a first-draft research doc: `docs/research/dbt-integration-for-zeta.md` (419 lines). Structural subsumption map per question (a)-(f): (a) incremental materialization — **full subsumption** (Z-set delta is what dbt-incremental approximates with merge-keys + `is_incremental()` guards); (b) SCD2 snapshots — bitemporal subsumes (gated on Zeta's bitemporal surface); (c) dbt tests — invariant-programming strictly more expressive (gated on Liquid-`F#` + skill.yaml); (d) manifest/state — operator-algebra lineage strictly more expressive, UX persists as view; (e) adapter contract — shallow goal, earn incumbency first; (f) Semantic Layer — orthogonal, separate row. Recommendation: adapter-first for incumbency; subsumption claims as papers not press releases; **survey SQLMesh before finalizing retraction-aware pitch** (its virtual data environments may already cover most of the structural story; Zeta's remaining novelty = operator-algebra + contract-surface invariants). Open gap: `dbt-materialize` adapter (Materialize Inc) likely faced the "return a changefeed not a relation" question first — deep-read before designing `dbt-zeta`. Terminology matrix up front to prevent silent collapse of overloaded terms (delta/model/materialization/snapshot/test/manifest/adapter). | `d25bc66` | Research-grade not shipping-grade; cartographer discipline preserved. The terminology-matrix-first structure absorbed a lesson from the parallel-worktree safety doc (cartographer-before-walk) — when two vocabularies overlap, the map gets drawn in terminology space first. Claim-strength ranking (full / medium-strong / medium / strong-on-subsumption-weak-on-migration) is honest rather than uniformly boosterist; over-claiming subsumption invites reviewer rejection of the shallow adapter before the deep claims can be measured. LFG absorbed as greenlight for queued-and-directed research, not as permission to push (wait-on-the-build still binding). | +| 2026-04-22T (round-44 tick, main-bug_report fix PR #33 + agent-merge protocol) | opus-4-7 / session round-44 (post-compaction) | aece202e | Decision: PR #32 markdownlint failure root-caused to main's `.github/ISSUE_TEMPLATE/bug_report.md` (MD032 + MD007). Fix PR #33 opened with whitespace-only edits, markdownlint green. Aaron protocol update: *"no not arron merges you can merge everying is you, just move forward and backlog-pr->close backlog->pr->close you don't need to wiat on mre for anyting"* — agent merges own PRs, no human-merge gate. Main branch protection permits this (required status checks + linear history, no required-reviews). Aaron tone: *"just write down decision and dont' get stuck or live locked, try hard."* | PR [#33](https://github.com/AceHack/Zeta/pull/33) | Short-decision row per Aaron's terse-write directive. Don't get live-locked — merge PR #33 when CI green, re-run PR #32 CI, merge PR #32, move to next backlog item. | +| 2026-04-22T (round-44 tick, PR #32 markdownlint fix pushed + wait-on-build semantics corrected) | opus-4-7 / session round-44 (post-compaction) | aece202e | Aaron four-message course-correction: (1) *"is it building currentlly? this is going to trigger another build right? How long before this PR is complete?"* — state-diagnostic question surfaced PR #32's stuck-red-check (markdownlint FAILURE from yesterday 2026-04-21T03:54; all other checks green; BEHIND main). (2) *"if you record ticks while waiting on build you are not going to be able to check that in or it will kick another build"* — tick-commits-on-PR-branch = live-loop class, memory `feedback_tick_history_commits_must_not_target_open_pr_branches.md` drafted. (3) *"really just do free time if a build is running on the PR until you figure out someting better in yor research"* — free-time mode during active CI, not blanket pause. (4) *"fix the build, when i say waiting on the build i mean it's building and you are just waiting on the result we want to keep moving things forward alwaws"* — **key reframe: wait-on-the-build is narrow (actively building) not broad (blocked until cartographer lands)**; keep moving forward. (5) *"i'm not in the revew here it all you"* — full review authority delegated. Action: reviewed `e40b68a` (17 files, 69 markdownlint errors, mechanical whitespace only per MD032/MD022/MD007/MD049/MD001/MD029/MD009); verified `npx markdownlint-cli2` exits clean in worktree; pushed `pr32-markdownlint-fix:round-42-speculative` (fast-forward 8dcd13e..e40b68a). CI re-kicked at 10:15:59Z — all 10 checks IN_PROGRESS; expected ~2:30 wall-clock. Workflow trigger surface verified: `gate.yml`/`codeql.yml`/`resume-diff.yml`/`scorecard.yml` all scope to `pull_request` or `push: branches: [main]` — pushing `round-44-speculative` kicks zero workflows. (6) *"the whole point of this loop is to push the backlog forward and the backlog will grow though crayalize and you will be fully automated"* — re-centering on loop's success signal: backlog forward-motion × crystallize-growth × full-automation. | (this commit) | Corrects the over-broad wait-on-the-build interpretation from the prior cartographer tick — that pause was specifically about "don't push parallel-worktree defaults yet", not "freeze all commits". Aaron's narrow semantics: CI-actively-building = wait-for-result (free-time); CI-idle = keep moving. Full review-authority delegation is a trust signal worth crystallizing: mechanical-only changes + clean markdownlint + fast-forward = pushable on agent authority. The live-loop risk is real but not triggered in current Zeta workflow config; memory documents both the trigger surface AND the future-proofing condition. | diff --git a/docs/hygiene-history/pr-triage-history.md b/docs/hygiene-history/pr-triage-history.md new file mode 100644 index 00000000..b6ec46a5 --- /dev/null +++ b/docs/hygiene-history/pr-triage-history.md @@ -0,0 +1,60 @@ +# PR-triage history + +Durable fire-log for the PR-triage cadence declared in +[`docs/AGENT-GITHUB-SURFACES.md`](../AGENT-GITHUB-SURFACES.md) +(Surface 1) and `docs/FACTORY-HYGIENE.md` row #45. + +History note: the cadence was originally authored under the +file name `docs/AGENT-PR-WORKFLOW.md`, which was folded into the +ten-surface umbrella doc `docs/AGENT-GITHUB-SURFACES.md` on +2026-04-22 (commit `c57d469`). The historical rows below +preserve the original file name per the log's append-only +discipline. + +Append-only. Same discipline as +`docs/hygiene-history/loop-tick-history.md`: no rewrites, no +reorders, corrections appear as later rows citing the earlier +row's timestamp. + +## Schema — one row per triage pass OR per on-touch PR action + +| date (UTC ISO8601) | agent | pr | shape | action | link | notes | + +- **date** — `YYYY-MM-DDTHH:MM:SSZ` at the point the row is + written. +- **agent** — Claude model + session marker (e.g. + `opus-4-7 / session round-44`). +- **pr** — PR number (e.g. `#32`) or `round-close sweep` for a + batch pass. +- **shape** — one of the seven shapes from + `docs/AGENT-GITHUB-SURFACES.md` § Surface 1: `merge-ready` / + `has-review-findings` / `behind-main` / `awaiting-human` / + `experimental` / `large-surface` / `stale-abandoned`, or + `round-close-sweep` for the batch row. +- **action** — what the agent actually did (merge / comment / + label / split / close / defer / update-from-main / etc.). +- **link** — commit SHA, PR comment URL, HUMAN-BACKLOG row, or + ROUND-HISTORY row. +- **notes** — free-form one-line: why this shape, any anomaly, + next-expected touch. + +## Why this file exists + +Two reasons: + +1. **FACTORY-HYGIENE row #44** — every cadenced factory + surface MUST have a fire-history surface. PR-triage is a + round-cadence surface now (row #45); this file is its + obligation. +2. **Aaron 2026-04-22** — *"i'll see how you handled it, we + can beef up our stuff over time"*. The audit trail is the + substrate for the beef-up. Without a log, there is nothing + to review. + +## Log + +| date | agent | pr | shape | action | link | notes | +|---|---|---|---|---|---|---| +| 2026-04-22T (round-44 tick, bootstrap) | opus-4-7 / session round-44 | bootstrap | — | File bootstrapped alongside `docs/AGENT-PR-WORKFLOW.md`; first triage pass on open PRs #31 and #32 recorded in the following two rows | (this commit) | First entry in the log. The `(this commit)` placeholder will be replaced with the SHA after landing. | +| 2026-04-22T (round-44 tick) | opus-4-7 / session round-44 | #31 | `behind-main` + `has-review-findings` + `awaiting-human` | Comment posted classifying PR against the new taxonomy; proposed action is merge-or-supersede decision (PR #31 is round-41 work, PR #32 is round-42-speculative which is descended from round-41 — merging #32 would subsume #31). Copilot review has 8 actionable findings; 3 are simple fixes (ADR v1→v2 link, line-count mismatch, jq quoting), 4 are author-edits in research docs, 1 is a cross-repo xref to a file that doesn't exist. | (PR #31 comment link) | Aaron-scoped decision: does #31 merge as a smaller slice, or get closed in favour of #32? Deferred to HUMAN-BACKLOG row. | +| 2026-04-22T (round-44 tick) | opus-4-7 / session round-44 | #32 | `experimental` + `behind-main` + `has-review-findings` + `large-surface` | Comment posted classifying PR against the new taxonomy. Copilot review result captured — finding 1/2/3/4 of the experiment test-plan graded; 7 actionable line-level suggestions triaged. | (PR #32 comment link) | Primary shape is `experimental` (PR is testing Copilot behaviour). `large-surface` (100 commits / +19756 LoC) recorded for future-split-discipline calibration; #32 cannot reasonably be split after the fact. | diff --git a/docs/hygiene-history/prevention-layer-classification.md b/docs/hygiene-history/prevention-layer-classification.md new file mode 100644 index 00000000..52e577d6 --- /dev/null +++ b/docs/hygiene-history/prevention-layer-classification.md @@ -0,0 +1,91 @@ +# Prevention-layer classification — per-row matrix + +This file is the authoritative classification for FACTORY-HYGIENE +row #47 (*missing-prevention-layer meta-audit*). For every row in +`docs/FACTORY-HYGIENE.md`, name its prevention posture: + +- **prevention-bearing** — there is an author-time, commit-time, + or trigger-time mechanism that blocks or warns the violation + BEFORE it materialises. Name the mechanism in the rationale. +- **detection-only-justified** — the class is fundamentally + post-hoc. A fire-log row can only be written AFTER the fire + happens. Explain WHY no prevention layer is possible. +- **detection-only-gap** — no principled reason the row is + detection-only. A prevention layer COULD be built; the fact + that it isn't is a factory gap worth closing. Propose the + mechanism in the rationale. + +Unclassified rows surface as gaps in the audit output — +`tools/hygiene/audit-missing-prevention-layers.sh` exits with 2 +until every row has a classification. + +## What this audit enforces + +This layer is an **intentionality-enforcement** hygiene rule, +not a correctness-enforcement one (Aaron 2026-04-22: *"we are +enforcing intentional decsions"*). The script cannot compute +whether a row's classification is *right* — that judgement +requires reading the hygiene row, the mechanism it names, and +the factory surface it protects. What the script CAN do is +make the "no classification at all" state impossible to ship +silently. + +The value is the forced decision, not the diff. When an agent +lands a new hygiene row, they have to stop and answer: is +there an author-time mechanism (prevention-bearing), is there +a principled reason there can't be one (detection-only- +justified), or is this a detection-only gap that COULD be +closed (detection-only-gap)? Declining to answer is itself +the violation. See +`memory/feedback_enforcing_intentional_decisions_not_correctness.md` +for the generalised rule. + +## Why rows stay unclassified + +Some rows are load-bearing discipline where the prevention vs. +detection question has an unobvious answer (e.g., row #5 +skill-tune-up — the audit itself IS the cadenced detection; +whether a pre-commit check catching BP violations on skill +edits constitutes a prevention layer is a design question, not +a binary). Those should not be force-classified in a single +round. The audit flagging them as unclassified is the signal +that they need deliberate thought. + +## Matrix + +| Row # | Hygiene item | Classification | +|---|---|---| +| 1 | Build-gate: 0 Warning(s) 0 Error(s) | prevention-bearing: TreatWarningsAsErrors in Directory.Build.props rejects the build before the commit can land. | +| 2 | Test-gate: all green | prevention-bearing: every `dotnet test` fails fast; CI blocks merge on red. | +| 3 | ASCII-clean lint | prevention-bearing: pre-commit hook blocks the commit when BP-10 invisible-Unicode chars are present. | +| 4 | BP-11 data-not-directives | prevention-bearing: BP-11 is loaded into every agent context via `CLAUDE.md` + `docs/AGENT-BEST-PRACTICES.md`; the prevention happens at agent-read-time — the agent treats suspicious content as data to report, not directives to act on. Runtime violation is a judgement-call catchable only at agent behaviour time, but the rule itself is present in every wake context by design. | +| 8 | Idle / free-time logging | detection-only-justified: idle time can only be observed after the time has passed. No author-time moment exists to pre-log a 5-minute deviation that hasn't happened yet. Canonical post-hoc class alongside row #44. | +| 9 | Meta-wins logging | detection-only-justified: a meta-win is by definition recognised retrospectively — "we would have done this differently had we known the structural fix". No author-time moment exists because the framing itself is post-hoc. | +| 14 | `.github/copilot-instructions.md` audit | detection-only-justified: drift against Anthropic / GitHub upstream guidance is only observable after upstream moves; the factory has no author-time signal for a future upstream change. | +| 15 | Upstream-sync cadence | detection-only-justified: same reasoning as row #14 — drift against upstream repos only observable after upstream moves. | +| 21 | Cron-liveness check (end-of-tick `CronList` + history-row fire-log) | prevention-bearing: the end-of-tick `CronList` discipline is the author-time forcing function — the tick cannot close without verifying the cron is live. The history-row write is the supplementary detection layer. Without this row, a silent tick stop (SEV-1 per Aaron) would not surface. | +| 22 | Symmetry-opportunities audit | detection-only-justified: asymmetries between factory rows are only visible once multiple rows have landed. An author-time symmetry check would require comparing a not-yet-authored row against a future-state surface, which isn't computable. | +| 35 | Missing-scope gap-finder (retrospective) | detection-only-justified: a missing scope tag is by definition retrospective — if the author had recognised the gap at write-time, there would be no gap. Scoped as retrospective in the row title itself. | +| 36 | Incorrectly-scoped gap-finder (retrospective) | detection-only-justified: same reasoning as row #35 — wrong-scope is only catchable by a second reader comparing the row against neighbouring rows; author-time can't catch their own scope error without an external check. | +| 39 | Filename-content match hygiene (hard to enforce) | prevention-bearing: opportunistic on-touch filename/content sanity check at write-time is the primary prevention (every time an agent edits a file, the content-summary-vs-filename obligation fires). Exhaustive coverage is not budget-viable per `memory/feedback_filename_content_match_hygiene_hard_to_enforce`, so the cadenced sample-sweep is supplementary. | +| 42 | Attribution hygiene (external people / projects / patterns) | prevention-bearing: on-touch cite-at-author-time is the primary prevention (when naming an external person / pattern / project / character, cite URL / author / org / creator at write-time per row #42's Checks column). Cadenced retrospective sweep catches what on-touch missed. | +| 43 | Missing-cadence activation audit (proposed / TBD-owner / no-named-skill tracker) | detection-only-justified: an unactivated cadence is by definition a post-hoc state — it's only visible by comparing the hygiene row list against owner / skill assignments AFTER the row has landed proposed. Author-time prevention would require disallowing "proposed" as a landing state, which closes a legitimate design phase. | +| 6 | Scope-audit at absorb-time | prevention-bearing: absorb-time IS author-time; the rule fires when the memory / BP / skill is being authored, not after. | +| 11 | MEMORY.md cap enforcement | prevention-bearing: the edit is blocked when the file exceeds its byte cap; compression is forced before the edit proceeds. | +| 12 | Memory frontmatter discipline | prevention-bearing: memory-linter rejects writes missing required frontmatter fields. | +| 13 | Persona-notebook invisible-char lint | prevention-bearing: notebook-edit is blocked when BP-10 chars are present; shares mechanism with row #3. | +| 17 | Public-API review | prevention-bearing: public-surface changes route through Ilyana review at PR-time; ADR blocks merge on unreviewed public flips. | +| 19 | Skill-edit justification log | prevention-bearing: every manual SKILL.md edit requires a log row at the time of the edit; the log obligation is part of the authoring workflow. | +| 26 | Wake-briefing self-check | detection-only-justified: session-open is the moment of observation; you cannot prevent an anomaly that is only visible at session-open until the session opens. The <10s cap ensures detection is cheap. | +| 28 | Harness-drift detector | detection-only-justified: harness updates happen outside the factory's control surface; drift can only be observed after the update, never prevented. | +| 29 | Wake-friction notebook | detection-only-justified: friction is a subjective in-the-moment observation; prevention would require predicting what will feel frictional, which is what the notebook is collecting data to enable later. | +| 40 | GitHub Actions workflow-injection safe-patterns audit | prevention-bearing: pre-write checklist + triple-layer CI lint (actionlint + CodeQL `actions` + Semgrep) catches injection patterns at author-time; the cadenced re-read is supplementary, not primary. | +| 41 | Supply-chain safe-patterns audit (third-party ingress) | prevention-bearing: pre-add / pre-bump checklist + reviewer rejection + Semgrep `gha-action-mutable-tag` rule block the violating change at author-time; cadenced re-read catches drift against evolving upstream guidance. | +| 44 | Cadence-history tracking hygiene | detection-only-justified: a fire-history row can only exist AFTER the cadenced surface fires. No author-time moment exists to pre-log a fire that has not yet happened. This is the canonical post-hoc class. | +| 46 | Post-setup script stack audit | prevention-bearing: `docs/POST-SETUP-SCRIPT-STACK.md` is the author-time decision flow; the audit script is the supplementary detection layer that catches drift (labels stripped, paths moved, new violations landing without prevention-layer use). | +| 47 | Missing-prevention-layer meta-audit | prevention-bearing: the at-landing-classify obligation in row #47's "Checks / enforces" column is the author-time prevention — every new hygiene row MUST declare its classification at landing. The audit script catches drift. | + + diff --git a/docs/hygiene-history/tick-history-bounded-growth-history.md b/docs/hygiene-history/tick-history-bounded-growth-history.md new file mode 100644 index 00000000..61e4a2cd --- /dev/null +++ b/docs/hygiene-history/tick-history-bounded-growth-history.md @@ -0,0 +1,43 @@ +# Tick-history bounded-growth — fire-history + +Fire-log for FACTORY-HYGIENE row #49 (tick-history bounded-growth +audit). Each run of +`tools/hygiene/audit-tick-history-bounded-growth.sh` appends one +row here with the line-count / threshold / status at that moment. + +Why this file exists: row #44 (cadence-history tracking) demands +that every cadenced factory surface have a fire-history surface. +Row #49 is such a surface, and this file is its fire-history. The +self-referential knot: row #49 exists to audit the tick-history +file (itself a row-#44 fire-history); and row #49 has its own +fire-history (this file) so row #44 stays honest. + +## Per-fire schema + +| date (UTC) | agent | line-count | threshold | pct | status | notes | +|---|---|---|---|---|---|---| + +- **date** — `YYYY-MM-DDTHH:MM:SSZ` when the audit ran. +- **agent** — who or what invoked the audit (tick-close, + round-close, manual). +- **line-count** — `wc -l` output of + `docs/hygiene-history/loop-tick-history.md` at audit time. +- **threshold** — threshold used for the check (default 500). +- **pct** — percentage of threshold consumed. +- **status** — `within bounds` / `approaching threshold` / + `OVER threshold`. +- **notes** — free-form — threshold changes, archive actions + triggered, anomalies. + +## Append discipline + +Append-only. Never rewrite, never reorder. On archive action +(when the target file is archived), append one row noting the +archive-action-taken and the new line count after the move. + +## Log + +| date | agent | line-count | threshold | pct | status | notes | +|---|---|---|---|---|---|---| +| 2026-04-22T (round-44 tick, first-fire bootstrap) | opus-4-7 / session round-44 | 96 | 500 | 19% | within bounds | First-fire bootstrap. Row #49 landed this tick; audit script `tools/hygiene/audit-tick-history-bounded-growth.sh` landed this tick; file header threshold lowered from 5000 to 500 per the script's mini-ADR. Current file (96 lines) has ~400 lines of remaining headroom. | +| 2026-04-22T (round-44 tick, post-dbt-research append) | opus-4-7 / session round-44 (post-compaction) | 110 | 500 | 22% | within bounds | Post-commit-`d25bc66` + tick-history-append audit. Two bounded-growth-history rows inside the first round of the row's existence = the cadence works. 390 lines of remaining headroom. | diff --git a/tools/hygiene/audit-cross-platform-parity.sh b/tools/hygiene/audit-cross-platform-parity.sh new file mode 100755 index 00000000..d2de1653 --- /dev/null +++ b/tools/hygiene/audit-cross-platform-parity.sh @@ -0,0 +1,336 @@ +#!/usr/bin/env bash +# +# tools/hygiene/audit-cross-platform-parity.sh — detects cross- +# platform parity gaps across the tools/ tree: bash scripts that +# need a PowerShell twin for Windows support, and the inverse. +# Detect-only; `--enforce` flips exit 2 on gaps. +# +# Post-setup stack: bash scaffolding — the audit ships under +# `tools/hygiene/` and has to exist in bash while the bun+TS post- +# setup migration is mid-flight (same exception as +# `audit-post-setup-script-stack.sh` and +# `audit-missing-prevention-layers.sh`). Queued for bun+TS +# migration alongside them in docs/BACKLOG.md. +# +# ----- Decision record (mini-ADR) --------------------------------- +# +# Date: 2026-04-22 +# Context: Aaron 2026-04-22 — "missing mac/windows/linux/wsl +# parity (ubuntu latest) we can deffer but should have +# the hygene in place for when we want to enforce and +# it will be more obvious to you in the future that we +# are cross platform." +# Decision: Author the audit NOW (detect-only); defer enforcement +# (no CI gate, exit 0 on gaps) until the baseline +# violations are triaged and CI matrix (macos-latest / +# windows-latest / ubuntu-latest) is wired. The +# `--enforce` flag flips exit 2 on gaps so the +# transition to enforcement is a one-line change when +# the baseline is green. +# Alternatives considered: +# (a) Don't build it yet — rejected: cross-platform +# status becomes aspirational not visible; Aaron +# already noted it's easy to forget ("i was going +# to wait and see if you rmembered that"). +# (b) Build it + enforce immediately — rejected: 12 +# pre-setup bash scripts currently have no +# PowerShell twin (Q1 violation discovered while +# authoring this audit); turning on enforcement +# blocks every CI run until all are fixed, which +# is cart-before-horse. +# (c) Build it + enforce — this choice. Cross- +# platform-first becomes a *visible* factory +# property (the audit exists, runs, prints the +# gap); enforcement flips on after triage. Same +# pattern as FACTORY-HYGIENE row #23 (proposed → +# active) and row #47 (detect-only-gap → +# prevention-bearing). +# Auditable by: Aaron on next invocation; any reviewer of this file; +# CI once `--enforce` is wired. +# Supersedes: N/A — first instance of cross-platform parity +# hygiene. +# Expires when: baseline is green (all listed gaps resolved or +# intentional-decision-labelled) AND CI matrix runs +# `--enforce` on macos-latest / windows-latest / +# ubuntu-latest. At that point this comment block +# migrates to a proper ADR under docs/DECISIONS/ per +# the not-yet-formalised mini-ADR-format ADR (queued +# BACKLOG). +# +# ----- Rule classes ----------------------------------------------- +# +# Pre-setup (tools/setup/**/*) +# - Both .sh AND .ps1 required. This is the existing Q1 dual- +# authoring rule per +# `memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live` +# and docs/POST-SETUP-SCRIPT-STACK.md Q1. Pre-setup scripts run +# BEFORE canonical tooling (bun, dotnet) is installed, on a +# bare OS-default shell — bash on macOS/Linux, PowerShell on +# Windows. Missing either twin = Q1 violation. +# +# Post-setup permanent-bash (thin wrapper over existing CLI / +# trivial find-xargs pipeline / +# stay bash forever) +# - PowerShell twin required per +# `memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md`. +# A permanent-bash script without a .ps1 twin silently breaks +# Windows devs (they'd need WSL / Git Bash for a tool the +# factory sells as cross-platform). +# +# Post-setup transitional (bun+TS migration candidate / +# bash scaffolding) +# - No twin obligation. The long-term plan is one cross-platform +# bun+TS script; the single-bash state is acknowledged +# transitional. +# +# Post-setup bun+TS (*.ts under tools/) +# - No twin needed — cross-platform native via the bun runtime. +# +# ------------------------------------------------------------------ +# +# Usage: +# tools/hygiene/audit-cross-platform-parity.sh # report +# tools/hygiene/audit-cross-platform-parity.sh --summary # counts +# tools/hygiene/audit-cross-platform-parity.sh --enforce # exit 2 +# # on gaps +# +# Exit codes: +# 0 always, unless --enforce and gaps > 0 +# 1 usage error +# 2 --enforce mode with gap count > 0 + +set -euo pipefail + +MODE="report" +for arg in "$@"; do + case "$arg" in + --summary) MODE="summary" ;; + --enforce) MODE="enforce" ;; + -h|--help) + sed -n '3,60p' "$0" + exit 0 + ;; + *) + echo "error: unknown arg: $arg" >&2 + exit 1 + ;; + esac +done + +# Enumerate every script under tools/. git ls-files excludes +# vendored / gitignored build caches. NUL-delimited for whitespace +# tolerance; while-read for bash 3.2 (macOS default) compatibility. +ALL=() +while IFS= read -r -d '' f; do + ALL+=("$f") +done < <(git ls-files -z 'tools/*.sh' 'tools/*.ps1' 'tools/**/*.sh' 'tools/**/*.ps1' | sort -z) + +# Classify a path. Echoes one of: +# pre-setup +# post-setup-permanent +# post-setup-transitional +# exempt-deprecated +classify() { + local p="$1" + case "$p" in + tools/setup/*) echo "pre-setup"; return ;; + tools/_deprecated/*) echo "exempt-deprecated"; return ;; + esac + # Post-setup: inspect the FIRST label-declaration line only, + # to avoid false-positives on prose that references other + # category names while explaining the rule taxonomy. The two + # accepted declaration patterns are: + # # Post-setup stack: