Skip to content

hygiene: machine-specific content scrubber (FACTORY-HYGIENE row #55; cadenced)#198

Merged
AceHack merged 2 commits intomainfrom
hygiene/machine-specific-content-scrubber-cadenced
Apr 23, 2026
Merged

hygiene: machine-specific content scrubber (FACTORY-HYGIENE row #55; cadenced)#198
AceHack merged 2 commits intomainfrom
hygiene/machine-specific-content-scrubber-cadenced

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented Apr 23, 2026

Summary\n\nPer Aaron 2026-04-23 Otto-27 greenlight: cadenced hygiene task scrubbing machine-specific content from in-repo tracked files.\n\nTool: tools/hygiene/audit-machine-specific-content.sh — detect-only first pass; --list + --enforce modes.\n\nFACTORY-HYGIENE row #55 landed with full schema.\n\nBaseline: 9 gaps at first fire (pre-existing; surfaces for opportunistic cleanup).\n\nComposes with PR #197 Option D in-repo-first migration policy.\n\nOtto (loop-agent PM hat) scripted; Dejan (devops-engineer) owns cadenced fire.

… row #55)

Per Aaron 2026-04-23 Otto-27: "we can have a machine
specific scrubber/lint hygene task for anyting that makes
it in by default" + "just run on a cadence."

Files added:
- tools/hygiene/audit-machine-specific-content.sh —
  scans tracked files for machine-specific patterns
  (/Users/<name>/, /home/<name>/, C:\Users\<name>,
  C:/Users/<name>). Detect-only; --list + --enforce modes.
- docs/FACTORY-HYGIENE.md row #55 — full schema per
  existing hygiene-row pattern; Dejan (devops-engineer)
  on cadenced fire; classified prevention-bearing
  (row #47 taxonomy).

Baseline at first fire: 9 gaps across .claude/skills/,
docs/ PDFs, and one legitimate anti-example reference
in memory/feedback_path_hygiene.md. All pre-existing;
this row surfaces them for opportunistic cleanup.

Exclusions (historical content preserved verbatim per
append-only discipline):
- docs/ROUND-HISTORY.md
- docs/hygiene-history/**
- docs/DECISIONS/**
- the audit script itself (patterns as examples)

Composes with Otto-27 Option D in-repo-first policy
(PR #197 migrating CURRENT-*.md files) — the scrubber
prevents machine-specific leakage as new content enters
the in-repo boundary by default.

Attribution: Otto (loop-agent PM hat) scripted the audit;
Dejan (devops-engineer) owns cadenced operation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 23, 2026 20:46
@AceHack AceHack enabled auto-merge (squash) April 23, 2026 20:46
AceHack added a commit that referenced this pull request Apr 23, 2026
…uns passed

Aaron greenlight on Option D in-repo-first policy +
cadenced scrubber.

Actions:
- submit-nuget reruns PASSED on #149/#154/#170 (GitHub
  transient confirmed)
- PR #197: CURRENT-aaron.md + CURRENT-amara.md → in-repo
- PR #198: machine-specific scrubber + FACTORY-HYGIENE
  row #55 (cadenced detect-only)

Phase 1 closure push: 3 of 5 Amara-named PRs unblocked
(rerun path). #155 needs deeper rebase. #161 likely
clean.

Amara's "mechanize failure modes" recommendation →
scrubber is the first concrete instance.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AceHack AceHack merged commit b45d594 into main Apr 23, 2026
10 checks passed
@AceHack AceHack deleted the hygiene/machine-specific-content-scrubber-cadenced branch April 23, 2026 20:50
AceHack added a commit that referenced this pull request Apr 23, 2026
…ated-branch; #149/#154 armed)

#197 merged at 20:44:41Z (CURRENT files now in-repo,
Amara-findable).

#198 rebased + pushed.

Phase 1 acceleration:
- #149 + #154 auto-merge armed (were NOT armed before;
  opened before auto-merge became session-standard)
- #149/#154/#161/#170 updated-branch via gh pr
  update-branch — brought all 4 up to date with main
- Cascading merge likely as CI completes + conversation-
  resolution satisfied

#155 deferred (DIRTY + 30 threads; bigger effort next
tick).

Amara's "merge over invent" direction manifesting in
concrete queue-drain.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new hygiene audit to detect machine-specific paths leaking into tracked repo content, and documents it as a new FACTORY-HYGIENE row.

Changes:

  • Introduces tools/hygiene/audit-machine-specific-content.sh with --list and --enforce modes.
  • Adds FACTORY-HYGIENE row #55 describing the new audit, cadence, and intended outputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
tools/hygiene/audit-machine-specific-content.sh New bash audit script to scan tracked files for machine-specific path patterns.
docs/FACTORY-HYGIENE.md Documents the new hygiene row (#55) and references the script and related artefacts.

Comment on lines +4 to +42
# Scans in-repo content for machine-specific patterns that
# should not appear in a portable factory substrate:
#
# - User-home paths: /Users/<username>/... , /home/<username>/...
# - Claude Code harness paths: ~/.claude/projects/<slug>/...
# - Other machine-name or hostname leaks
#
# Part of FACTORY-HYGIENE row #55 (machine-specific content
# scrubber) per Aaron 2026-04-23 Otto-27:
#
# > we can have a machine specific scrubber/lint hygene task
# > for anyting that makes it in by default. just run on a
# > cadence.
#
# Not a prevention-gate yet (detect-only first per row #23 +
# #47 pattern). Cadenced fire surfaces gaps; author-time
# remediation lands as opportunistic cleanup.
#
# Usage:
# tools/hygiene/audit-machine-specific-content.sh # summary
# tools/hygiene/audit-machine-specific-content.sh --list # list offending files
# tools/hygiene/audit-machine-specific-content.sh --enforce # exit 2 on any gap
#
# Exit codes:
# 0 — no machine-specific content detected (or --enforce not set and gaps found)
# 2 — gaps found and --enforce was set

set -euo pipefail

mode="${1:-summary}"

# Patterns that indicate machine-specific content leakage.
# Each pattern is a grep -E extended regex.
patterns=(
'/Users/[a-zA-Z0-9._-]+/' # macOS home paths
'/home/[a-zA-Z0-9._-]+/' # Linux home paths
'C:\\Users\\[a-zA-Z0-9._-]+' # Windows home paths
'C:/Users/[a-zA-Z0-9._-]+' # Windows home paths (forward-slash form)
)
Comment on lines +40 to +41
'C:\\Users\\[a-zA-Z0-9._-]+' # Windows home paths
'C:/Users/[a-zA-Z0-9._-]+' # Windows home paths (forward-slash form)
'docs/ROUND-HISTORY.md'
'docs/hygiene-history/'
'docs/DECISIONS/' # ADRs preserve their historical context
'tools/hygiene/audit-machine-specific-content.sh' # self (patterns are examples)
Comment on lines +55 to +80
# Get all tracked files, filter out exclusions.
tracked_files=$(git ls-files)

# Build exclusion grep
exclude_grep=""
for p in "${exclude_patterns[@]}"; do
if [[ -n "$exclude_grep" ]]; then
exclude_grep+="|"
fi
exclude_grep+="$p"
done

files_to_check=$(echo "$tracked_files" | grep -vE "$exclude_grep" || true)

# Count gaps
gap_count=0
gap_files=()

for pattern in "${patterns[@]}"; do
while IFS= read -r file; do
[[ -z "$file" ]] && continue
if grep -l -E "$pattern" "$file" > /dev/null 2>&1; then
gap_files+=("$file:$pattern")
gap_count=$((gap_count + 1))
fi
done <<< "$files_to_check"
Comment thread docs/FACTORY-HYGIENE.md
| 51 | Cross-platform parity audit (bash / PowerShell / bun+TS twin check across macOS / Windows / Linux / WSL) | Detect-only now (landed 2026-04-22); cadenced detection every 5-10 rounds (same cadence as row #46); opportunistic on-touch every time an agent adds or edits a script under `tools/`. Enforcement deferred until baseline is green AND CI matrix runs `--enforce` on `macos-latest` / `windows-latest` / `ubuntu-latest` (WSL inherits ubuntu-latest for CI). | Dejan (devops-engineer) on cadenced detection; author of the script (self-check at author-time against the rule classes in the audit's decision-record header block). Kenji (Architect) on CI-matrix-enforcement sign-off when baseline is green. | both | `tools/hygiene/audit-cross-platform-parity.sh` classifies every script under `tools/` by rule class: (a) **pre-setup** (`tools/setup/**`) — both `.sh` AND `.ps1` required per Q1 dual-authoring rule (`memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live`); (b) **post-setup permanent-bash** (`thin wrapper over existing CLI` / `trivial find-xargs pipeline` / `stay bash forever`) — `.ps1` twin required per the Windows-twin obligation (`memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md`); (c) **post-setup transitional** (`bun+TS migration candidate` / `bash scaffolding`) — no twin obligation (long-term plan is one cross-platform bun+TS script); (d) **post-setup bun+TS** (`*.ts` under `tools/`) — no twin needed (cross-platform native via bun). `--summary` prints counts; `--enforce` flips exit 2 on gaps. **Why detect-only first:** baseline at first fire (2026-04-22) was 13 gaps — 12 pre-setup bash without `.ps1` twin (Q1 violation silently accumulating since `tools/setup/` existed) + 1 post-setup permanent-bash (`tools/profile.sh`) without `.ps1` twin. Turning enforcement on before triage would block every CI run. **Why this row exists:** Aaron 2026-04-22 *"missing mac/windows/linux/wsl parity (ubuntu latest) we can deffer but should have the hygene in place for when we want to enforce and it will be more obvious to you in the future that we are cross platform."* Cross-platform-first must be a *visible* factory property (audit exists, runs, prints the gap) before it becomes an enforced gate. Same pattern as FACTORY-HYGIENE rows #23 / #43 / #47. See `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md`. **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (opportunistic on-touch) and surfaces the gap before it lands, same as row #46. The audit itself is a detect-only mechanism but detect-only surfaces the obligation at author-time when the author runs it. Ships to project-under-construction: adopters inherit the parity audit + the decision-record-block pattern + the CI-matrix obligation once it's wired. | Audit output in repo root on each fire; cadenced runs appended to `docs/hygiene-history/cross-platform-parity-history.md` (per-fire schema per row #44); BACKLOG row per gap at triage time; ROUND-HISTORY row when a gap resolves. | `tools/hygiene/audit-cross-platform-parity.sh` (detection + decision-record header block) + `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md` + `memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md` + `memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live` + `docs/POST-SETUP-SCRIPT-STACK.md` |
| 54 | Backlog-refactor cadenced audit (overlap / staleness / priority-drift / knowledge-update sweep of `docs/BACKLOG.md`) | Cadenced detection every 5-10 rounds (same cadence as rows #5 / #23 / #38 / #46 meta-audits) + opportunistic on-touch when a tick adds a new BACKLOG row and the author notices adjacent rows that may overlap. Not exhaustive; bounded passes per firing are acceptable. | Architect (Kenji) on round-cadence sweeps; `backlog-scrum-master` skill if explicitly invoked; all agents (self-administered) on on-touch overlap-spot during authoring. | factory | Read `docs/BACKLOG.md` (or a scoped slice — P0/P1 first if full scan is too large) and apply the following passes: (a) **overlap cluster** — two or more rows describing the same concern from different angles get flagged; decide merge (single consolidated row) or sharpen (two rows with clear non-overlap scope boundaries); (b) **stale retire** — rows where context has died, implementation landed without retire-action, or assumption has been falsified by newer knowledge get explicitly retired with a "retired: <reason>" marker (not silent deletion — signal-preservation still applies); (c) **re-prioritize** — priority labels (P0/P1/P2/P3) re-examined against current knowledge; any row whose priority feels wrong after re-read gets a justified move with a one-line rationale; (d) **knowledge absorb** — rows written before a newer architectural insight landed get rewording / cross-refs to the new substrate (e.g., rows predating AutoDream cadence now cite the policy; rows predating scheduling-authority sharpening now note self-schedulability); (e) **document** — ROUND-HISTORY row per fire with pre-audit and post-audit row counts + what was merged / retired / re-prioritized / updated. **Why this row exists:** the human maintainer 2026-04-23 *"we probalby need some meta iteam to refactor the backlog base on current knowledge and look for overlap, this is hygene we could run from time to time so our backlog is not just a dump"*. The BACKLOG is the triage substrate for every future tick's "what to pick up" decision; without periodic meta-audit it becomes an append-only log rather than a living triage surface. **Classification (row #50):** **detection-only-justified** — accumulated drift (overlap, staleness, priority-drift, knowledge-update-gap) is inherently post-hoc; no author-time check can prevent rows from becoming overlapping with *future* rows not yet written. **Maintainer-scope boundary:** rows with explicit maintainer framing at their priority (e.g., P0 rows the human maintainer explicitly set) stay at that priority; re-prioritization applies within the agent-owned priority space only. Ships to project-under-construction: adopters inherit the cadenced-sweep discipline + the retire-with-marker convention + the ROUND-HISTORY documentation pattern. | ROUND-HISTORY row per fire with pre/post row counts + merged/retired/re-prioritized/updated actions; `docs/hygiene-history/backlog-refactor-history.md` (per-fire schema per row #44 — date, agent, rows touched, actions taken, pre/post counts, next-fire-expected-date). | `docs/BACKLOG.md` (target surface) + governing rule in per-user memory (not in-repo; lives at `~/.claude/projects/<slug>/memory/feedback_backlog_hygiene_cadenced_refactor_look_for_overlap_not_just_dump_2026_04_23.md`) + `.claude/skills/backlog-scrum-master/SKILL.md` (dedicated runner when invoked) + `.claude/skills/reducer/SKILL.md` (Rodney's Razor applied at backlog level) + sibling meta-audit rows #5, #23, #38, #46, #50 |
| 52 | Tick-history bounded-growth audit (`docs/hygiene-history/loop-tick-history.md` line-count vs threshold) | Detect-only (landed 2026-04-22); cadenced detection once per round-close (same cadence as row #44 cadence-history sweep, since this is the canonical row #44 worked example auditing itself); opportunistic on-touch whenever the tick-history file is read or edited. Archive action itself remains manual for now; deferring automation to the larger BACKLOG row that also covers threshold-revision and append-without-reading refactor. | Dejan (devops-engineer) on cadenced detection; the tick itself (self-administered at tick-close) on the opportunistic on-touch — each tick's end-of-tick sequence can invoke this audit after the append + commit to get a `within bounds: 96/500 lines` visibility signal. | factory | `tools/hygiene/audit-tick-history-bounded-growth.sh` checks the file's line count against a threshold (default 500, overrideable via `--threshold N`) and exits 0 within bounds / 2 over threshold. The threshold is set lower than the stated 5000-line paper bound because the file is read on every tick-close append — a per-tick context cost that scales linearly with file size — and 5000 lines represents too large a context hit on a 1-minute cadence. The audit's header block carries a mini-ADR decision record for the 500-line choice (context / decision / alternatives / supersedes / expires-when). **Why this row exists:** Aaron 2026-04-22 tick-fire interrupt: *"does loop tick history grow unbounded? that's an issue if so you just read it"*. Honest state was stated-bound-no-enforcement: file header named 5000 lines, nothing checked it. This row closes the enforcement gap for the threshold-check half of the full BACKLOG row (archive-action + append-without-reading refactor remain deferred). **Self-referential closure:** the tick-history file IS the canonical row-#44 cadence-history-tracking worked example (named explicitly in row #44's "Durable output" citation). Until this row landed, the most-cadenced surface in the factory — the tick itself — had its fire-log surface unaudited for its own growth. Meta-audit triangle remains intact (existence #23 / activation #43 / fire-history #44), and row #49 adds a fourth: fire-history files themselves need bounded-growth audits because they grow at the cadence of the surface they track. **Classification (row #47):** **prevention-bearing** — the audit surfaces approaching-threshold warnings at 80% so the archive action can be planned, rather than reactive-only at over-threshold. Ships to project-under-construction indirectly: adopters inherit the pattern (fire-log files under their own `docs/hygiene-history/` need the same bounded-growth treatment), not this exact script. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/tick-history-bounded-growth-history.md` (per-fire schema per row #44); BACKLOG row when archival is due (archive-action itself queued as part of the larger tick-history enforcement BACKLOG row); ROUND-HISTORY row when threshold changes or archive action executes. | `tools/hygiene/audit-tick-history-bounded-growth.sh` (detection + mini-ADR header block) + `docs/hygiene-history/loop-tick-history.md` (target surface, canonical row #44 worked example) + BACKLOG row *"Loop-tick-history bounded-growth enforcement"* (larger follow-up: threshold revision + append-without-reading refactor + archive action) |
| 55 | Machine-specific content scrubber (cadenced audit of in-repo tracked files for user-home paths, Claude Code harness paths, Windows user-profile paths, hostname leaks) | Detect-only (landed 2026-04-23); cadenced detection once per round-close (same cadence as rows #50 / #51 / #52 meta-audits) + opportunistic on-touch when a tick migrates per-user content to in-repo. Enforcement (`--enforce` exit-2) deferred until baseline is green. | Dejan (devops-engineer) on cadenced detection + CI-enforcement sign-off when baseline is green; the migrating agent (self-administered) on on-touch — every in-repo-first migration runs the audit before committing. | factory | `tools/hygiene/audit-machine-specific-content.sh` scans all tracked files (`git ls-files`) for machine-specific patterns: `/Users/<name>/`, `/home/<name>/`, `C:\Users\<name>`, `C:/Users/<name>`. Excludes: `docs/ROUND-HISTORY.md`, `docs/hygiene-history/**`, `docs/DECISIONS/**`, and the audit script itself. `--list` prints offending files; `--enforce` flips exit 2 on any gap. **Why this row exists:** Aaron 2026-04-23 Otto-27 — *"we can have a machine specific scrubber/lint hygene task for anyting that makes it in by default. just run on a cadence."* Following the Option D in-repo-first policy shift (per-user memory migrations to in-repo became the default), machine-specific content leakage becomes a real risk — content comfortably per-user now crosses the factory's public repo boundary. Baseline at first fire (2026-04-23) was 9 gaps: `/Users/` patterns in several SKILL.md files, 2 PDFs (metadata scan), a scratch-recon doc, a parallel-worktree research doc; `C:\Users\` pattern in 1 SKILL.md + `memory/feedback_path_hygiene.md` (anti-example reference — legitimate). **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (on-touch during in-repo-first migrations) and surfaces the gap before it lands. Ships to project-under-construction: adopters inherit the audit + pattern list + exclusion-list discipline. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/machine-specific-content-audit-history.md` (per-fire schema per row #44 — date, agent, gaps count, files touched, actions taken, next-fire-expected-date); BACKLOG row per gap at triage time if cleanup doesn't fit on-touch. | `tools/hygiene/audit-machine-specific-content.sh` (detection + pattern list + exclusion list) + cross-refs: `memory/feedback_path_hygiene.md` + `memory/CURRENT-aaron.md` + `memory/CURRENT-amara.md` (in-repo-first migration boundary surfaces this audit's need) |
Comment thread docs/FACTORY-HYGIENE.md
| 51 | Cross-platform parity audit (bash / PowerShell / bun+TS twin check across macOS / Windows / Linux / WSL) | Detect-only now (landed 2026-04-22); cadenced detection every 5-10 rounds (same cadence as row #46); opportunistic on-touch every time an agent adds or edits a script under `tools/`. Enforcement deferred until baseline is green AND CI matrix runs `--enforce` on `macos-latest` / `windows-latest` / `ubuntu-latest` (WSL inherits ubuntu-latest for CI). | Dejan (devops-engineer) on cadenced detection; author of the script (self-check at author-time against the rule classes in the audit's decision-record header block). Kenji (Architect) on CI-matrix-enforcement sign-off when baseline is green. | both | `tools/hygiene/audit-cross-platform-parity.sh` classifies every script under `tools/` by rule class: (a) **pre-setup** (`tools/setup/**`) — both `.sh` AND `.ps1` required per Q1 dual-authoring rule (`memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live`); (b) **post-setup permanent-bash** (`thin wrapper over existing CLI` / `trivial find-xargs pipeline` / `stay bash forever`) — `.ps1` twin required per the Windows-twin obligation (`memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md`); (c) **post-setup transitional** (`bun+TS migration candidate` / `bash scaffolding`) — no twin obligation (long-term plan is one cross-platform bun+TS script); (d) **post-setup bun+TS** (`*.ts` under `tools/`) — no twin needed (cross-platform native via bun). `--summary` prints counts; `--enforce` flips exit 2 on gaps. **Why detect-only first:** baseline at first fire (2026-04-22) was 13 gaps — 12 pre-setup bash without `.ps1` twin (Q1 violation silently accumulating since `tools/setup/` existed) + 1 post-setup permanent-bash (`tools/profile.sh`) without `.ps1` twin. Turning enforcement on before triage would block every CI run. **Why this row exists:** Aaron 2026-04-22 *"missing mac/windows/linux/wsl parity (ubuntu latest) we can deffer but should have the hygene in place for when we want to enforce and it will be more obvious to you in the future that we are cross platform."* Cross-platform-first must be a *visible* factory property (audit exists, runs, prints the gap) before it becomes an enforced gate. Same pattern as FACTORY-HYGIENE rows #23 / #43 / #47. See `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md`. **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (opportunistic on-touch) and surfaces the gap before it lands, same as row #46. The audit itself is a detect-only mechanism but detect-only surfaces the obligation at author-time when the author runs it. Ships to project-under-construction: adopters inherit the parity audit + the decision-record-block pattern + the CI-matrix obligation once it's wired. | Audit output in repo root on each fire; cadenced runs appended to `docs/hygiene-history/cross-platform-parity-history.md` (per-fire schema per row #44); BACKLOG row per gap at triage time; ROUND-HISTORY row when a gap resolves. | `tools/hygiene/audit-cross-platform-parity.sh` (detection + decision-record header block) + `memory/feedback_cross_platform_parity_hygiene_deferred_enforcement.md` + `memory/feedback_stay_bash_forever_implies_powershell_twin_obligation.md` + `memory/feedback_preinstall_scripts_forced_shell_meet_developer_where_they_live` + `docs/POST-SETUP-SCRIPT-STACK.md` |
| 54 | Backlog-refactor cadenced audit (overlap / staleness / priority-drift / knowledge-update sweep of `docs/BACKLOG.md`) | Cadenced detection every 5-10 rounds (same cadence as rows #5 / #23 / #38 / #46 meta-audits) + opportunistic on-touch when a tick adds a new BACKLOG row and the author notices adjacent rows that may overlap. Not exhaustive; bounded passes per firing are acceptable. | Architect (Kenji) on round-cadence sweeps; `backlog-scrum-master` skill if explicitly invoked; all agents (self-administered) on on-touch overlap-spot during authoring. | factory | Read `docs/BACKLOG.md` (or a scoped slice — P0/P1 first if full scan is too large) and apply the following passes: (a) **overlap cluster** — two or more rows describing the same concern from different angles get flagged; decide merge (single consolidated row) or sharpen (two rows with clear non-overlap scope boundaries); (b) **stale retire** — rows where context has died, implementation landed without retire-action, or assumption has been falsified by newer knowledge get explicitly retired with a "retired: <reason>" marker (not silent deletion — signal-preservation still applies); (c) **re-prioritize** — priority labels (P0/P1/P2/P3) re-examined against current knowledge; any row whose priority feels wrong after re-read gets a justified move with a one-line rationale; (d) **knowledge absorb** — rows written before a newer architectural insight landed get rewording / cross-refs to the new substrate (e.g., rows predating AutoDream cadence now cite the policy; rows predating scheduling-authority sharpening now note self-schedulability); (e) **document** — ROUND-HISTORY row per fire with pre-audit and post-audit row counts + what was merged / retired / re-prioritized / updated. **Why this row exists:** the human maintainer 2026-04-23 *"we probalby need some meta iteam to refactor the backlog base on current knowledge and look for overlap, this is hygene we could run from time to time so our backlog is not just a dump"*. The BACKLOG is the triage substrate for every future tick's "what to pick up" decision; without periodic meta-audit it becomes an append-only log rather than a living triage surface. **Classification (row #50):** **detection-only-justified** — accumulated drift (overlap, staleness, priority-drift, knowledge-update-gap) is inherently post-hoc; no author-time check can prevent rows from becoming overlapping with *future* rows not yet written. **Maintainer-scope boundary:** rows with explicit maintainer framing at their priority (e.g., P0 rows the human maintainer explicitly set) stay at that priority; re-prioritization applies within the agent-owned priority space only. Ships to project-under-construction: adopters inherit the cadenced-sweep discipline + the retire-with-marker convention + the ROUND-HISTORY documentation pattern. | ROUND-HISTORY row per fire with pre/post row counts + merged/retired/re-prioritized/updated actions; `docs/hygiene-history/backlog-refactor-history.md` (per-fire schema per row #44 — date, agent, rows touched, actions taken, pre/post counts, next-fire-expected-date). | `docs/BACKLOG.md` (target surface) + governing rule in per-user memory (not in-repo; lives at `~/.claude/projects/<slug>/memory/feedback_backlog_hygiene_cadenced_refactor_look_for_overlap_not_just_dump_2026_04_23.md`) + `.claude/skills/backlog-scrum-master/SKILL.md` (dedicated runner when invoked) + `.claude/skills/reducer/SKILL.md` (Rodney's Razor applied at backlog level) + sibling meta-audit rows #5, #23, #38, #46, #50 |
| 52 | Tick-history bounded-growth audit (`docs/hygiene-history/loop-tick-history.md` line-count vs threshold) | Detect-only (landed 2026-04-22); cadenced detection once per round-close (same cadence as row #44 cadence-history sweep, since this is the canonical row #44 worked example auditing itself); opportunistic on-touch whenever the tick-history file is read or edited. Archive action itself remains manual for now; deferring automation to the larger BACKLOG row that also covers threshold-revision and append-without-reading refactor. | Dejan (devops-engineer) on cadenced detection; the tick itself (self-administered at tick-close) on the opportunistic on-touch — each tick's end-of-tick sequence can invoke this audit after the append + commit to get a `within bounds: 96/500 lines` visibility signal. | factory | `tools/hygiene/audit-tick-history-bounded-growth.sh` checks the file's line count against a threshold (default 500, overrideable via `--threshold N`) and exits 0 within bounds / 2 over threshold. The threshold is set lower than the stated 5000-line paper bound because the file is read on every tick-close append — a per-tick context cost that scales linearly with file size — and 5000 lines represents too large a context hit on a 1-minute cadence. The audit's header block carries a mini-ADR decision record for the 500-line choice (context / decision / alternatives / supersedes / expires-when). **Why this row exists:** Aaron 2026-04-22 tick-fire interrupt: *"does loop tick history grow unbounded? that's an issue if so you just read it"*. Honest state was stated-bound-no-enforcement: file header named 5000 lines, nothing checked it. This row closes the enforcement gap for the threshold-check half of the full BACKLOG row (archive-action + append-without-reading refactor remain deferred). **Self-referential closure:** the tick-history file IS the canonical row-#44 cadence-history-tracking worked example (named explicitly in row #44's "Durable output" citation). Until this row landed, the most-cadenced surface in the factory — the tick itself — had its fire-log surface unaudited for its own growth. Meta-audit triangle remains intact (existence #23 / activation #43 / fire-history #44), and row #49 adds a fourth: fire-history files themselves need bounded-growth audits because they grow at the cadence of the surface they track. **Classification (row #47):** **prevention-bearing** — the audit surfaces approaching-threshold warnings at 80% so the archive action can be planned, rather than reactive-only at over-threshold. Ships to project-under-construction indirectly: adopters inherit the pattern (fire-log files under their own `docs/hygiene-history/` need the same bounded-growth treatment), not this exact script. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/tick-history-bounded-growth-history.md` (per-fire schema per row #44); BACKLOG row when archival is due (archive-action itself queued as part of the larger tick-history enforcement BACKLOG row); ROUND-HISTORY row when threshold changes or archive action executes. | `tools/hygiene/audit-tick-history-bounded-growth.sh` (detection + mini-ADR header block) + `docs/hygiene-history/loop-tick-history.md` (target surface, canonical row #44 worked example) + BACKLOG row *"Loop-tick-history bounded-growth enforcement"* (larger follow-up: threshold revision + append-without-reading refactor + archive action) |
| 55 | Machine-specific content scrubber (cadenced audit of in-repo tracked files for user-home paths, Claude Code harness paths, Windows user-profile paths, hostname leaks) | Detect-only (landed 2026-04-23); cadenced detection once per round-close (same cadence as rows #50 / #51 / #52 meta-audits) + opportunistic on-touch when a tick migrates per-user content to in-repo. Enforcement (`--enforce` exit-2) deferred until baseline is green. | Dejan (devops-engineer) on cadenced detection + CI-enforcement sign-off when baseline is green; the migrating agent (self-administered) on on-touch — every in-repo-first migration runs the audit before committing. | factory | `tools/hygiene/audit-machine-specific-content.sh` scans all tracked files (`git ls-files`) for machine-specific patterns: `/Users/<name>/`, `/home/<name>/`, `C:\Users\<name>`, `C:/Users/<name>`. Excludes: `docs/ROUND-HISTORY.md`, `docs/hygiene-history/**`, `docs/DECISIONS/**`, and the audit script itself. `--list` prints offending files; `--enforce` flips exit 2 on any gap. **Why this row exists:** Aaron 2026-04-23 Otto-27 — *"we can have a machine specific scrubber/lint hygene task for anyting that makes it in by default. just run on a cadence."* Following the Option D in-repo-first policy shift (per-user memory migrations to in-repo became the default), machine-specific content leakage becomes a real risk — content comfortably per-user now crosses the factory's public repo boundary. Baseline at first fire (2026-04-23) was 9 gaps: `/Users/` patterns in several SKILL.md files, 2 PDFs (metadata scan), a scratch-recon doc, a parallel-worktree research doc; `C:\Users\` pattern in 1 SKILL.md + `memory/feedback_path_hygiene.md` (anti-example reference — legitimate). **Classification (row #47):** **prevention-bearing** — the audit runs at author-time (on-touch during in-repo-first migrations) and surfaces the gap before it lands. Ships to project-under-construction: adopters inherit the audit + pattern list + exclusion-list discipline. | Audit output on each fire; cadenced runs appended to `docs/hygiene-history/machine-specific-content-audit-history.md` (per-fire schema per row #44 — date, agent, gaps count, files touched, actions taken, next-fire-expected-date); BACKLOG row per gap at triage time if cleanup doesn't fit on-touch. | `tools/hygiene/audit-machine-specific-content.sh` (detection + pattern list + exclusion list) + cross-refs: `memory/feedback_path_hygiene.md` + `memory/CURRENT-aaron.md` + `memory/CURRENT-amara.md` (in-repo-first migration boundary surfaces this audit's need) |
AceHack added a commit that referenced this pull request Apr 23, 2026
…eep scope diagnosed

#198 (machine-specific scrubber) merged at 20:50:16Z.
Amara's "mechanize failure modes" recommendation landed.

#149 thread sweep:
- 2 unresolved P2 Codex findings (cross-PR dangling-ref)
- Both replied + resolved per queue-drain discipline
- #149 now has clean merge path

Thread-sweep scope across remaining Amara PRs:
- #154: 6 threads (mixed dangling-ref + name-attribution)
- #161: 11 threads
- #170: 15 threads
- #155: 30 threads (deferred)

Total 62+ threads. Two disposition classes identified:
1. Cross-PR dangling-refs (queue-drain acknowledgment;
   self-heal as queue drains)
2. Name-attribution in ADRs/config (legitimate per
   named-agents-attribution memory; bot doesn't know
   the policy)

Batch-sweep tool candidate queued: 60+ threads one-by-
one is tick-exhausting; template-based batch resolver
would drain in ~2 minutes + mechanize Amara's "failure
modes" recommendation.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 23, 2026
…s drained mechanically

Tool: tools/git/batch-resolve-pr-threads.sh (185 lines,
PR #199). Classifies review threads into dangling-ref /
name-attribution / unknown; template replies + resolve
via GraphQL. Dry-run default; --apply flag for action.
Unknown threads always left unresolved (conservative).

Patched in-tick for empty-array bug + extended pattern
matching (doesn't-exist-in-repo / point-references-to /
direct-contributor-name-attribution / etc.).

Applied results:
- #154: 5 resolved + 1 unknown
- #161: 2 resolved + 10 unknown (over 2 apply passes)
- #170: 3 resolved + 15 unknown
- #149: 2 manually resolved (Otto-29) + 9 new (bot
  re-reviewed post-update-branch — high-churn pattern)

Total: 15 threads drained this session; 135 remaining
across 5 PRs (including #155's 100).

High-churn pattern: update-branch triggers bot re-review.
Copilot-instructions.md tune could reduce noise (queued).

Attribution: Otto (loop-agent PM hat). Mechanizes Amara's
"failure modes" recommendation — 2nd instance after #198
machine-specific scrubber.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 23, 2026
…substantive remain

#154 (decision-proxy ADR + config) merged at 21:28:48Z.
Second Amara-named PR canonical. 4 of 5 original Amara
PRs merged or close (#149/#161/#170 substantive remain).

#200 MD032 regression: my Otto-37 content-fix reintroduced
'+' at line-start pattern (same as Otto-35). Replaced
with 'and'. Author-time lint rule opportunity queued.

46 unresolved threads across #149/#161/#170/#200 are
ALL substantive content findings. Tool has drained all
mechanizable classes. Content-review required for rest
per Aaron's Otto-31 Option 3.

Phase 1 merge-cadence: #196 + #154 + #197 + #198 + #199
(pending) + #200 (pending) all cleared or close.

Next-tick reprioritize candidate: Craft next module or
gap #2 linguistic-seed first term.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 23, 2026
…ORY-HYGIENE row #56)

Prevents recurring regression hit 3× in session.

Tool: tools/hygiene/audit-md032-plus-linestart.sh.
Baseline: 3 pre-existing gaps in persona notebooks.
Detect-only; --list + --enforce modes.

FACTORY-HYGIENE row #56 with full schema. Classified
prevention-bearing per row #47 taxonomy.

Pivot discipline: attempted linguistic-seed term #2
equality but #202 truth still BLOCKED/OPEN (directory
not on main yet). Pivoted to row #56 work instead of
stacking dependent substrate.

Third tool-hardening landing in session (after #198
scrubber + #199 batch-resolve). Tool-substrate
developing alongside content.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 25, 2026
)

* tools(git): batch-resolve-pr-threads.sh — classify + resolve stacked-PR thread backlog

Built to mechanize Amara's "mechanize already-discovered
failure modes" recommendation (2nd instance; first was
the machine-specific scrubber #198).

Classifies unresolved review threads into:

1. dangling-ref — cross-PR references to artifacts not
   yet on main; acceptable during stacked-PR queue-drain;
   self-heals as queue drains. Blanket-acknowledge +
   resolve template.

2. name-attribution — legitimate named-persona attribution
   in ADRs/config/collaborator registries per the
   named-agents-get-attribution memory. BP name-
   attribution rule applies to personal human names outside
   persona-scope, not to persona names in structural
   attribution contexts. Policy-pointer reply + resolve.

3. unknown — left unresolved; reported for manual review.
   Conservative default keeps substantive findings visible.

Usage:
  tools/git/batch-resolve-pr-threads.sh <pr-number>          # dry-run
  tools/git/batch-resolve-pr-threads.sh <pr-number> --apply  # resolve

Dry-run tested on #154: correctly classified 2 dangling-
ref + 3 name-attribution + 1 unknown.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* tools(git): patch batch-resolve — empty-array guard + more classification patterns

Two fixes:

1. Guarded empty-array iteration (name_ids[@] unbound when
   all unresolved threads are dangling-ref class)
2. Extended dangling-ref patterns: 'doesn't exist in-repo',
   'point protocol references', 'point references to
   existing', 'references a location', 'references a file'
3. Extended name-attribution patterns: explicit multi-word
   phrases like 'direct contributor name attribution',
   'repo convention prohibits', 'repo's standing rule'

Re-tested on #161: caught 1 more name-attribution (was 0).
#170 still has 15 unknown (likely different class; manual
review next tick).

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* tools(git): fix shellcheck on batch-resolve — disable SC2016 on literal replies; bash pattern-replace for SC2001

Two issues flagged by CI shellcheck:
- SC2016: single-quoted reply bodies contain Markdown
  backticks that must stay literal (intentional). Added
  shellcheck disable comments.
- SC2001: sed-based escape replaced with bash native
  ${variable//search/replace} pattern.

Clean local run confirms fix.

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* tools(git): harden batch-resolve per PR #199 Copilot findings

Addresses 7 of 9 substantive findings:

1. Portable repo detection via 'gh repo view --json owner,name'
   (was hard-coded Lucent-Financial-Group/Zeta; now works
   on forks / renamed orgs)

2. Full pagination handling (pageInfo + endCursor loop;
   was dropping threads past 100)

3. Full thread context fetch (comments first:50, joined
   with newline-delimiter; was only first comment)

4. Proper GraphQL body escaping via 'gh api -F body=...'
   (multipart form; was manual string-concat into mutation)

5. NUL-delimited bash pipe replaced with jq -c JSON-per-line
   + per-line jq parse (was silently dropping threads on
   tab/newline in body — test confirmed; now processes all
   24 threads on #170 correctly)

6. Explicit exit 1 on API failures (matches docstring)

7. Removed per-user-memory reference from name-attribution
   reply template — now cites in-repo memory/CURRENT-aaron.md
   + docs/EXPERT-REGISTRY.md (no dangling-ref in tool output)

8. Added "not present in-repo" + "aren't resolvable" to
   dangling-ref pattern list (conservative extension)

9. Global shellcheck disable=SC2016 with clear rationale
   (GraphQL queries + Markdown reply bodies are intentionally
   literal)

Local test: #170 classification went from 0/0/0 (broken
parsing) to 0/1/23 (correct — 1 name-attribution + 23
legit substantive findings).

Attribution: Otto (loop-agent PM hat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* tools(git): harden batch-resolve-pr-threads.sh per #199 review feedback

Subsumes the bulk of #199 PR review-thread findings into a single
hardening pass:

  - Add post-setup-script-stack exception label + BACKLOG row B-0015
    (Copilot P0 catch — script needed an exception-label header
    naming which bash exception applies and a migration BACKLOG row).
  - Strip persona names from the script header in favour of role-
    refs ("the operational-gap-assessment direction" instead of the
    persona name), per the role-refs-preferred-in-code surface
    class. Reply template still references named-agents-get-
    attribution policy by pointer rather than enumeration.
  - Strict argv: reject extra args; accept only exact `--apply`.
  - Reject pr-number 0 explicitly (regex alone admits 0).
  - Dependency probe for `gh` and `jq` up-front (clean exit 1 vs
    unhelpful 127 with set -e).
  - Use `printf '%s'` instead of `echo` for body content (echo can
    mangle leading `-n` / backslash escapes).
  - Stop suppressing stderr from `gh` (auth / rate-limit errors
    were silently swallowed).
  - Inspect GraphQL `errors` field explicitly via
    graphql_check_errors helper (gh exit code alone misses
    partial-failure responses).
  - Fail fast on null pullRequest (nonexistent / inaccessible PR
    no longer treated as zero-thread success).
  - Fix paginated `after` argument injection: build positional
    arg array instead of `${var:+-F after="$var"}` (parameter-
    expansion-quote pitfall).
  - Add per-thread comment-count truncation warning (>50 comments
    → stderr warning so a human can audit).
  - Capture extractor-jq failure: process-substitution can't
    propagate jq errors into set -e; route through a temp file
    and inspect exit code explicitly.
  - Print unknown thread IDs in summary so manual follow-up has
    a target list.
  - Sync header docstring to match implementation (newline-
    delimited JSON parse, not NUL-delimited; comment-count claim
    matches the actual fetch limit).

The dangling-ref classifier and reply-template structure are
unchanged — only hardening of the surrounding plumbing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants