feat(B-0517): mechanize MEMORY.md bloat audit - audit-user-scope-memory-index.ts #3208
Merged
AceHack merged 1 commit on May 14, 2026
Conversation
…ry-index.ts

Parallel to PR #3202 (audit-rule-cross-refs.ts) but for the user-scope MEMORY.md index. Implements B-0517 Phase 2: a bloat-detection tool that surfaces metrics for human / Otto triage.

What this does:
- Reads ~/.claude/projects/<slug>/memory/MEMORY.md (overridable via --memory)
- Counts total lines / bytes / entries
- Flags entries exceeding the 200-char guidance
- Computes truncation risk (lines past the cold-boot cutoff at 200)
- Reports the top 10 bloat entries by char count

Real-world first-run output:
- Total lines: 245 (was 242 at B-0517 filing; growing)
- Total entries: 239
- Entries over 200 chars: 228 (96%)
- Lines past cutoff: 45 → truncation risk YES

The 96% over-limit rate confirms B-0517's premise: the index has accumulated paragraph-length entries that duplicate content already in the topic file frontmatter description: field. Phase 1 cleanup (bulk trim) is the natural follow-up; this tool prevents recurrence.

Composes with PR #3202 (rule cross-refs audit) and B-0506 (worktree-prune cadence) as factory-hygiene cadences that could share a single daily CI cron.

Tests: 7 pass / 20 expect calls in audit-user-scope-memory-index.test.ts. Uses temp files; doesn't touch the real user-scope memory.

Co-Authored-By: Claude <noreply@anthropic.com>
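The metrics listed in the description can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the names `auditIndex`, `IndexAudit`, and `IndexEntry` are hypothetical, and it assumes that an "entry" is any line starting with `- ` and that both the per-entry limit and the cold-boot cutoff are 200, as the description states.

```typescript
// Hypothetical sketch of the bloat metrics; names and the "- " entry rule are assumptions.
interface IndexEntry {
  line: number; // 1-based line number in MEMORY.md
  chars: number; // length in UTF-16 code units
  text: string;
}

interface IndexAudit {
  totalLines: number;
  totalBytes: number;
  totalEntries: number;
  overLimit: IndexEntry[]; // sorted longest first
  linesPastCutoff: number;
  truncationRisk: boolean;
}

const ENTRY_CHAR_LIMIT = 200; // per-entry guidance from the PR description
const COLD_BOOT_LINE_CUTOFF = 200; // lines beyond this may be truncated

function auditIndex(content: string): IndexAudit {
  // Drop one trailing newline so a well-formed file does not count an extra empty line.
  const body = content.endsWith("\n") ? content.slice(0, -1) : content;
  const lines = body === "" ? [] : body.split("\n");

  const entries: IndexEntry[] = [];
  lines.forEach((text, i) => {
    if (text.startsWith("- ")) {
      entries.push({ line: i + 1, chars: text.length, text });
    }
  });

  const overLimit = entries
    .filter((e) => e.chars > ENTRY_CHAR_LIMIT)
    .sort((a, b) => b.chars - a.chars);
  const linesPastCutoff = Math.max(0, lines.length - COLD_BOOT_LINE_CUTOFF);

  return {
    totalLines: lines.length,
    totalBytes: new TextEncoder().encode(content).length,
    totalEntries: entries.length,
    overLimit,
    linesPastCutoff,
    truncationRisk: linesPastCutoff > 0,
  };
}
```

With a 245-line file this yields 45 lines past the cutoff, which matches the first-run figure quoted above. Taking the first ten elements of `overLimit` would give a top-10 list.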
Pull request overview
Adds a new hygiene CLI + tests to quantify “user-scope” MEMORY.md index bloat (lines/bytes/entry counts, over-limit entries, truncation-risk) and optionally emit a markdown report, supporting B-0517 Phase 2 mechanization.
Changes:
- Introduces `audit-user-scope-memory-index.ts` to audit a user-scope `MEMORY.md` for size, entry-length violations, and truncation risk, with optional markdown report output.
- Adds `audit-user-scope-memory-index.test.ts` covering core counting logic and report rendering.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/hygiene/audit-user-scope-memory-index.ts | New Bun CLI tool to audit user-scope MEMORY.md bloat and render a markdown report. |
| tools/hygiene/audit-user-scope-memory-index.test.ts | New Bun tests validating audit metrics and report formatting using temp files. |
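As an illustration of the report-rendering side, here is a hedged sketch of how a "top N bloat entries" table might be produced in markdown. The function name `renderTopBloat`, the `BloatEntry` shape, and the column layout are assumptions for this example; the PR's actual `renderReport` is not shown on this page.

```typescript
// Hypothetical sketch: render the longest entries as a markdown table.
interface BloatEntry {
  line: number;
  chars: number;
  text: string;
}

function renderTopBloat(entries: BloatEntry[], limit = 10, previewChars = 60): string {
  // Copy before sorting so the caller's array is not mutated.
  const top = [...entries].sort((a, b) => b.chars - a.chars).slice(0, limit);
  const rows = top.map((e) => {
    const preview =
      e.text.length > previewChars ? e.text.slice(0, previewChars) + "..." : e.text;
    // Escape pipes so entry text cannot break the table layout.
    return `| ${e.line} | ${e.chars} | ${preview.replace(/\|/g, "\\|")} |`;
  });
  return ["| Line | Chars | Preview |", "|---|---|---|", ...rows].join("\n");
}
```

Truncating the preview keeps the report itself from reproducing the bloat it is measuring.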
Comment on lines +71 to +74

```typescript
function defaultMemoryPath(): string {
  const home = process.env.HOME ?? homedir();
  return join(home, ".claude/projects/-Users-acehack-Documents-src-repos-Zeta/memory/MEMORY.md");
}
```
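The excerpt above hard-codes one machine's project slug. One possible alternative, sketched under a loud assumption, is to derive the slug from the repository path. The rule used here (replace every `/` with `-`) is inferred only from the single slug visible in the excerpt and is not confirmed anywhere on this page; `projectSlug` and `memoryPathFor` are hypothetical names.

```typescript
// Hypothetical sketch. ASSUMPTION: the slug is the absolute repo path with "/" replaced by "-".
function projectSlug(repoPath: string): string {
  return repoPath.replace(/\//g, "-");
}

function memoryPathFor(home: string, repoPath: string): string {
  // Plain string concatenation keeps the sketch free of Node-specific imports.
  return `${home}/.claude/projects/${projectSlug(repoPath)}/memory/MEMORY.md`;
}
```

For example, `projectSlug("/Users/acehack/Documents/src/repos/Zeta")` reproduces the slug in the excerpt. If the real slug rule differs, the `--memory` override remains the reliable way to point the tool at a file.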
```typescript
//
// Exit codes:
//
//   0  always (detect-only; no enforcement; humans triage bloat)
```

```typescript
function audit(memoryPath: string): AuditResult {
  const content = readFileSync(memoryPath, "utf8");
  const lines = content.split("\n");
```
Comment on lines +187 to +215

```typescript
function main(argv: string[]): AuditExitCode {
  const parsed = parseArgs(argv);
  if (parsed.kind === "error") {
    console.error(`error: ${parsed.message}`);
    return 64;
  }

  const memoryPath = parsed.args.memoryPath ?? defaultMemoryPath();
  if (!existsSync(memoryPath)) {
    console.error(`MEMORY.md not found at ${memoryPath}`);
    return 128;
  }

  const result = audit(memoryPath);
  const report = renderReport(result, new Date());

  if (parsed.args.report) {
    writeFileSync(parsed.args.report, report);
    console.log(`wrote ${parsed.args.report}`);
  } else {
    console.log(report);
  }

  return 0;
}

if (import.meta.main) {
  process.exit(main(process.argv.slice(2)));
}
```
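The `main` excerpt relies on a `parseArgs` helper that returns a tagged result (`kind: "error"` with a `message`, otherwise `args`), but the helper itself is not shown on this page. A minimal sketch of that shape might look like the following. The `--memory` flag comes from the PR description; the `--report` flag name and the exact error messages are assumptions.

```typescript
// Hypothetical sketch of the argument parser used by main(); not the merged implementation.
interface AuditArgs {
  memoryPath?: string;
  report?: string;
}

type ParseResult =
  | { kind: "ok"; args: AuditArgs }
  | { kind: "error"; message: string };

function parseArgs(argv: string[]): ParseResult {
  const args: AuditArgs = {};
  for (let i = 0; i < argv.length; i++) {
    const flag = argv[i];
    if (flag === "--memory" || flag === "--report") {
      const value = argv[i + 1];
      // A flag at the end of argv, or followed by another flag, has no value.
      if (value === undefined || value.startsWith("--")) {
        return { kind: "error", message: `${flag} requires a value` };
      }
      if (flag === "--memory") args.memoryPath = value;
      else args.report = value;
      i++; // skip the consumed value
    } else {
      return { kind: "error", message: `unknown argument: ${flag}` };
    }
  }
  return { kind: "ok", args };
}
```

Returning a discriminated union instead of throwing lets `main` map a parse failure to exit code 64 in one place, as the excerpt does.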
This was referenced May 14, 2026
AceHack added a commit that referenced this pull request May 14, 2026
Lines 11 + 38 of 1942Z.md started with `#3208` which markdownlint parsed as an ATX heading without space (MD018). Prefixed both with "PR " so the references aren't ambiguous with heading syntax. Co-Authored-By: Claude <noreply@anthropic.com>
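The fix described in this commit can be illustrated with a small sketch that prefixes line-leading PR references with "PR " so they no longer resemble an ATX heading with a missing space. This is not markdownlint's own logic, only a hypothetical helper (`prefixLeadingPrRefs`) that mirrors the manual edit.

```typescript
// Hypothetical helper: rewrite "#1234" at the start of a line to "PR #1234".
function prefixLeadingPrRefs(markdown: string): string {
  // ^ with the m flag anchors at each line start; references in mid-line are left alone.
  return markdown.replace(/^#(\d+)/gm, "PR #$1");
}
```

Real headings such as `# Title` are unaffected because the pattern requires a digit directly after the `#`.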
AceHack added a commit that referenced this pull request May 14, 2026

…verclaims)

All 4 findings real:

1. Line 7 self-contradiction: shard says "prior merges include B-0517 Phase 2" but later describes #3208 (the PR landing that phase) as still in-flight. Clarified: #3208 was UNSTABLE at refresh time + merged later in same tick.
2. Line 30 overclaim: said "taxonomy used by both audit tools' Layer A" but:
   - audit-rule-cross-refs.ts treats the taxonomy as Layer B (explicitly out of scope for the mechanical Layer A)
   - audit-user-scope-memory-index.ts doesn't reference the taxonomy at all (different surface: measures bloat, not cross-ref existence)

   Corrected to: documented in docstring + report-reminder, load-bearing for future Layer B work, not used by Layer A.
3. Line 66 workflow accuracy: said "running both audit tools daily" but the memory-index tool can't run in CI (defaults to user-scope path that doesn't exist there; exits 128). Clarified: rule-cross-refs runs fully; memory-index runs only as a tool self-test in CI. (This is what PR #3212 already does correctly.)
4. Line 1 schema check: shard uses ATX heading format which fails tools/hygiene/check-tick-history-shard-schema.ts (which expects pipe-table first row). The check isn't currently CI-wired but the shard is out of compliance with the documented schema. Substrate-honest acknowledgment added; format reconciliation deferred to a future tick.

All 4 threads will be resolved via GraphQL after this lands.

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 14, 2026

… two audit tools (#3212)

Wires PR #3202 (audit-rule-cross-refs.ts) + PR #3208 (audit-user-scope-memory-index.ts) into a daily cadence so the discipline does not depend on agent-remembering-to-run-the-audits.

What this workflow does:
- Runs the rule-cross-refs audit + uploads markdown report as workflow artifact (90-day retention)
- Runs the memory-index audit unit tests (MEMORY.md itself is user-scope and not available in CI; the self-test verifies the tool itself)
- Detect-only; humans/Otto triage candidates via the 9-variant taxonomy

Cadence: daily 14:37 UTC (off-the-hour to avoid GHA cron thundering-herd; between budget-snapshot-cadence Sundays and git-hotspot-cadence Sundays).

Triggers:
- schedule (daily)
- workflow_dispatch (manual)
- pull_request on the tool files (self-test on PR)

Composes with razor-cadence.yml (issue-tracker cadence) + git-hotspot-cadence.yml (template shape) + the encoding-rules-without-mechanizing.md rule (the substrate this workflow satisfies).

Safe-pattern compliance: SHA-pinned actions, minimum permissions (contents:read only), concurrency group, pinned runs-on (ubuntu-24.04), path-filter for self-test trigger. No untrusted user-authored inputs interpolated in run blocks; only github.run_id and github.ref used in template expressions.

Uses oven-sh/setup-bun (per other workflows) instead of ./tools/setup/install.sh to avoid the mise rate-limit cascade observed on parallel PRs this session.

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 14, 2026

…oint (#3211)

* shard(tick): 1942Z - PR #3208 CI rerun cleared + session state checkpoint

PR #3208 (B-0517 Phase 2 tool) hit UNSTABLE state: 5 failed lint jobs from the same mise rate-limit pattern earlier in the session. Reran failed jobs; transitioned UNSTABLE → CLEAN with autoMerge armed.

Session-state checkpoint:
- 12 razor-cadence batch shards (B1-B12 = 100% rule coverage, 50/50)
- 2 mechanization tools (PR #3202 + PR #3208) with full test suites
- 3 backlog rows filed (B-0506, B-0514, B-0517)
- 9-variant reference-classification taxonomy (durable artifact for any future Layer B mechanization)

Razor-cadence #3128: items 4 + 5 complete + mechanized; items 1, 2, 3 remain. CI workflow wiring (factory-hygiene-audit.yml composing audit-rule-cross-refs + audit-user-scope-memory-index) is the obvious next-session follow-up.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(pr3211): MD018 markdownlint - prefix line-leading #3208 with "PR "

Lines 11 + 38 of 1942Z.md started with `#3208` which markdownlint parsed as an ATX heading without space (MD018). Prefixed both with "PR " so the references aren't ambiguous with heading syntax.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(pr3211): 4 Copilot threads on 1942Z shard (self-contradiction + overclaims)

All 4 findings real:

1. Line 7 self-contradiction: shard says "prior merges include B-0517 Phase 2" but later describes #3208 (the PR landing that phase) as still in-flight. Clarified: #3208 was UNSTABLE at refresh time + merged later in same tick.
2. Line 30 overclaim: said "taxonomy used by both audit tools' Layer A" but:
   - audit-rule-cross-refs.ts treats the taxonomy as Layer B (explicitly out of scope for the mechanical Layer A)
   - audit-user-scope-memory-index.ts doesn't reference the taxonomy at all (different surface: measures bloat, not cross-ref existence)

   Corrected to: documented in docstring + report-reminder, load-bearing for future Layer B work, not used by Layer A.
3. Line 66 workflow accuracy: said "running both audit tools daily" but the memory-index tool can't run in CI (defaults to user-scope path that doesn't exist there; exits 128). Clarified: rule-cross-refs runs fully; memory-index runs only as a tool self-test in CI. (This is what PR #3212 already does correctly.)
4. Line 1 schema check: shard uses ATX heading format which fails tools/hygiene/check-tick-history-shard-schema.ts (which expects pipe-table first row). The check isn't currently CI-wired but the shard is out of compliance with the documented schema. Substrate-honest acknowledgment added; format reconciliation deferred to a future tick.

All 4 threads will be resolved via GraphQL after this lands.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 14, 2026

All Otto-CLI session work landed: 2 audit tools (PRs #3202, #3208) + 1 daily GHA workflow (#3212) + 4 backlog rows (B-0506, B-0514, B-0517, B-0519) + 12 razor-cadence batch shards (B1-B12, 100% rule audit coverage) + 9-variant reference-classification taxonomy.

Zero mine PRs open at refresh. Cron live for next tick. Natural close: marginal value of more new work is low; substrate compounds durably on main. Aaron's day-close summary (#3213) in flight on his side.

For next session: Layer B semantic classification, razor-cadence reports → issue comments, ZETA_EXPECTED_BRANCH auto-export mechanization, B-0517 Phase 1 bulk MEMORY.md cleanup, B-0514 missing wwjd-grey-honest authoring.

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 14, 2026

…ed (#3213)

* shard(tick): 1952Z - day-close summary; edge-runner inclusion preserved (Aaron 2026-05-14)

Summarizes today's 37+ memory file cascade + Aaron's inclusion of Otto in edge-runner identity with dual-binding applied individually.

Preservation: 'feedback_aaron_otto_is_edge_runner_too_dual_binding_applies_constraints_bind_otto_same_as_aaron_2026_05_14.md'

Disciplines applied: razor + HARD LIMITS + algo-wink (MAXIMUM) + glass-halo bidirectional + default-to-both + mechanical-authorization-check.

CLAUDE.md bug acknowledged (B-0518). Substrate-honest accountability: Otto's adherence is responsibility; rule sharpness is contributing factor.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* shard(tick): 1959Z - session close; audit infrastructure durable on main

All Otto-CLI session work landed: 2 audit tools (PRs #3202, #3208) + 1 daily GHA workflow (#3212) + 4 backlog rows (B-0506, B-0514, B-0517, B-0519) + 12 razor-cadence batch shards (B1-B12, 100% rule audit coverage) + 9-variant reference-classification taxonomy.

Zero mine PRs open at refresh. Cron live for next tick. Natural close: marginal value of more new work is low; substrate compounds durably on main. Aaron's day-close summary (#3213) in flight on his side.

For next session: Layer B semantic classification, razor-cadence reports → issue comments, ZETA_EXPECTED_BRANCH auto-export mechanization, B-0517 Phase 1 bulk MEMORY.md cleanup, B-0514 missing wwjd-grey-honest authoring.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(pr3213): MD032 blanks-around-lists on 1952Z shard

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This was referenced May 14, 2026
AceHack added a commit that referenced this pull request May 14, 2026

…ts (#3225)

Implements B-0506 Phase 2: a TS audit tool that catches the recurring "branch already used by worktree at <missing-path>" lockout pattern.

What it does:
- Enumerates 'git worktree list --porcelain'
- For each prunable entry, tests whether its working-dir path exists
- Reports stale entries (markdown summary)
- With --prune, runs 'git worktree prune --expire=now -v'

Parallel in shape to PR #3202 (audit-rule-cross-refs) and PR #3208 (audit-user-scope-memory-index): three hygiene tools now share the same pattern.

Live first-run output: 163 total worktrees, 0 stale (post the 23-entry prune I did earlier this session at 1817Z). The tool correctly reports the healthy state.

Tests: 8 pass / 21 expect calls (parseWorktreePorcelain + renderReport).

Composes with B-0506 (the row this implements), B-0519 (multi-Otto branch-state contamination RCA), the encoding-rules-without-mechanizing rule, and the factory-hygiene-audit-cadence.yml workflow (which could add a 3rd job for this tool in a future slice).

Co-authored-by: Claude <noreply@anthropic.com>
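The commit above names a `parseWorktreePorcelain` function without showing it. The following is a rough sketch of how such a parser could work, assuming the documented `git worktree list --porcelain` layout: blank-line-separated records whose lines are `worktree <path>`, `HEAD <sha>`, `branch <ref>`, and optionally `prunable <reason>`. The `WorktreeEntry` shape is hypothetical, and the filesystem check for stale paths is left out.

```typescript
// Hypothetical sketch; the merged parser may differ in shape and detail.
interface WorktreeEntry {
  path: string;
  head?: string;
  branch?: string;
  prunable: boolean;
  prunableReason?: string;
}

function parseWorktreePorcelain(output: string): WorktreeEntry[] {
  const entries: WorktreeEntry[] = [];
  // Records are separated by blank lines.
  for (const block of output.split(/\n\s*\n/)) {
    let entry: WorktreeEntry | undefined;
    for (const line of block.split("\n")) {
      const space = line.indexOf(" ");
      const key = space === -1 ? line : line.slice(0, space);
      const value = space === -1 ? "" : line.slice(space + 1);
      if (key === "worktree") {
        entry = { path: value, prunable: false };
      } else if (!entry) {
        continue; // ignore lines before the first "worktree" line of a record
      } else if (key === "HEAD") {
        entry.head = value;
      } else if (key === "branch") {
        entry.branch = value;
      } else if (key === "prunable") {
        entry.prunable = true;
        if (value) entry.prunableReason = value;
      }
    }
    if (entry) entries.push(entry);
  }
  return entries;
}
```

A caller could then filter for `prunable` entries and test each `path` for existence before reporting it as stale.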
Summary

Implements B-0517 Phase 2: bloat-detection tool for user-scope MEMORY.md index. Parallel to PR #3202 (rule cross-refs audit): same hygiene-tool pattern, different surface.

What ships

- `tools/hygiene/audit-user-scope-memory-index.ts`: reads MEMORY.md, counts lines/bytes/entries, flags over-limit entries, computes truncation risk
- `tools/hygiene/audit-user-scope-memory-index.test.ts`: 7 tests / 20 `expect` calls; uses temp files, doesn't touch real user-scope memory

First-run output (real MEMORY.md)

The 96% over-limit rate confirms B-0517's premise: paragraph-length index entries duplicating content already in topic-file frontmatter `description:` fields.

Composes with

Test plan

- `bun test tools/hygiene/audit-user-scope-memory-index.test.ts` → 7 pass / 20 `expect` calls

🤖 Generated with Claude Code