diff --git a/docs/backlog/P3/B-0517-memory-md-index-bloat-cleanup-cadence-2026-05-14.md b/docs/backlog/P3/B-0517-memory-md-index-bloat-cleanup-cadence-2026-05-14.md
new file mode 100644
index 0000000000..0108d8f27d
--- /dev/null
+++ b/docs/backlog/P3/B-0517-memory-md-index-bloat-cleanup-cadence-2026-05-14.md
@@ -0,0 +1,82 @@
+---
+id: B-0517
+priority: P3
+status: open
+title: "MEMORY.md index bloat cleanup + entry-length enforcement cadence"
+tier: substrate-hygiene
+effort: M
+created: 2026-05-14
+last_updated: 2026-05-14
+depends_on: []
+composes_with: [B-0006]
+tags: [memory, MEMORY.md, fast-path, index-hygiene, razor-cadence, user-scope]
+type: chore
+---
+
+# MEMORY.md index bloat cleanup + entry-length enforcement cadence
+
+## Origin
+
+Otto-CLI 2026-05-14T19:27Z razor-cadence item 5 (MEMORY.md index audit) investigation found:
+
+- **`~/.claude/projects/-Users-acehack-Documents-src-repos-Zeta/memory/MEMORY.md` has grown to 242 lines / 66KB / 237 entries** (the user-scope MEMORY.md, NOT the repo-scope one).
+- Cold-boot loads the first 200 lines / 25KB only; **~37 lines (15%) past the cutoff are unreachable at fast-path** for new sessions.
+- **Average entry size: 275 chars** — well over the 200-char "one line" guidance in the auto-memory system prompt.
+- **10+ entries exceed 500 chars** — each is a paragraph rather than an index line; the bloat is concentrated in recent additions.
+
+## Examples of over-long entries
+
+```
+620 chars: [Reverse-engineer Gates physical ECCs for minimalist memory storage...]
+613 chars: [Devil + god play within universal logic...]
+608 chars: [Dialectical viewpoint = Aaron's natural operation...]
+582 chars: [NEVER A CAGE — binding alignment force of attention...]
+575 chars: [Constraints as self-binding against acknowledged temptation...]
+```
+
+## Why this matters
+
+Per the auto-memory system: "MEMORY.md is always loaded into your conversation context — lines after 200 will be truncated, so keep the index concise." The current state silently drops the most recent ~37 entries from cold-boot visibility. The bloat compounds: each oversized entry's "tail" prose duplicates content that already lives in the topic file's frontmatter `description:` field and body.
+
+## Proposed work
+
+Two phases:
+
+### Phase 1 — bulk cleanup (one-time)
+
+For each entry exceeding 200 chars:
+
+1. Read the underlying topic file
+2. Verify the topic file's body + frontmatter `description:` contain the full detail (no information loss from trimming the index)
+3. Rewrite the MEMORY.md entry to: `- [Short Title](filename.md) — one-line hook (~50-100 chars).`
+
+Roughly 100-130 entries need trimming. Manageable in 3-5 ticks.
+
+### Phase 2 — mechanized enforcement (cadence)
+
+Add `tools/hygiene/audit-user-scope-memory-index.ts` that:
+
+- Reads the user-scope MEMORY.md
+- Reports total line count, byte count, entry count
+- Flags entries over 200 chars
+- Computes truncation risk (lines past 200)
+- Exits 0 always (detect-only); future ticks consume the report
+
+This parallels `tools/hygiene/audit-rule-cross-refs.ts` (PR #3202) — same shape, different surface. Could compose with B-0506 (worktree-prune cadence) as one of several factory-hygiene CI crons.
+
+## Composes with
+
+- B-0006 (memory-md hub compression — prior cleanup work on MEMORY.md)
+- `tools/hygiene/audit-rule-cross-refs.ts` (PR #3202 — parallel mechanization)
+- Razor-cadence #3128 daily fire (item 5 is the MEMORY.md index audit)
+- `encoding-rules-without-mechanizing.md` rule
+
+## Substrate-honest framing
+
+Cleanup is invasive (touching ~130 entries in user-scope memory) but each individual edit is mechanical: replace the long description with the topic file's frontmatter-derived hook. The risk is semantic (does the trim preserve enough triage signal in cold-boot context?). Mitigation: phase 1 is reviewable per-entry; entries can be expanded back if the trim lost a load-bearing signal.
+
+User-scope memory is not git-tracked, so this work doesn't go through PR in the usual sense. The Phase 2 mechanization tool IS git-tracked and follows the standard factory shape.
+
+## Origin tick
+
+`docs/hygiene-history/ticks/2026/05/14/1927Z.md` — this tick's shard documents the empirical observation.
diff --git a/docs/hygiene-history/ticks/2026/05/14/1927Z.md b/docs/hygiene-history/ticks/2026/05/14/1927Z.md
new file mode 100644
index 0000000000..e12b88aa92
--- /dev/null
+++ b/docs/hygiene-history/ticks/2026/05/14/1927Z.md
@@ -0,0 +1,72 @@
+# Tick 2026-05-14T19:27Z — razor-cadence item 5 investigated (MEMORY.md index bloat); B-0517 filed
+
+## Refresh (step 1)
+
+- Rate limit: 4762/5000 (healthy).
+- 14 open PRs. Mine: #3202 (mechanization tool), #3203 (1925Z shard) — both wait-ci, autoMerge armed.
+
+## Holding discipline (step 2)
+
+Razor-cadence item 4 done + mechanized. Pick next priority per never-be-idle ladder. Razor-cadence #3128 has 4 remaining items; item 5 (MEMORY.md index audit) is similar in shape to item 4 — partly mechanizable, tractable per-tick.
+
+## Speculative work (step 3) — razor-cadence item 5 investigation
+
+### Findings
+
+User-scope `MEMORY.md` (`~/.claude/projects/-Users-acehack-Documents-src-repos-Zeta/memory/MEMORY.md`) state:
+
+| Metric | Value |
+|--------|-------|
+| Lines | 242 |
+| Bytes | 66,227 (~66KB) |
+| Entries | 237 |
+| Average entry size | 275 chars |
+| Entries > 200 chars | ~100-130 |
+| Cold-boot truncation cutoff | line 200 / 25KB |
+| Lines unreachable at fast-path | ~37 (15%) |
+
+### Root cause
+
+Per the auto-memory system prompt: "Keep index entries to one line under ~200 chars; move detail into topic files." Recent additions have been adding paragraph-length entries with detail that's ALSO in the topic file's frontmatter `description:` field — duplicated content blocking the most recent ~37 entries from cold-boot fast-path.
+
+Top 5 over-long entries (>500 chars each):
+
+- 620 chars: "Reverse-engineer Gates physical ECCs for minimalist memory storage..."
+- 613 chars: "Devil + god play within universal logic..."
+- 608 chars: "Dialectical viewpoint = Aaron's natural operation..."
+- 582 chars: "NEVER A CAGE — binding alignment force of attention..."
+- 575 chars: "Constraints as self-binding against acknowledged temptation..."
+
+## Verify (step 4)
+
+Verified topic file (`feedback_aaron_reverse_engineer_gates_*.md`) has the full detail in its frontmatter `description:` and body — trimming the MEMORY.md index entry does NOT lose information.
+
+Filed **B-0517** at `docs/backlog/P3/B-0517-memory-md-index-bloat-cleanup-cadence-2026-05-14.md` documenting:
+
+- The empirical state
+- Phase 1 (one-time bulk cleanup of ~130 entries)
+- Phase 2 (mechanization via `tools/hygiene/audit-user-scope-memory-index.ts`)
+- Composition with B-0006 + PR #3202 + razor-cadence #3128
+
+## Shard (step 5)
+
+This file.
+
+## CronList (step 6)
+
+Sentinel `f970cb2d` armed.
+
+## Visibility (step 7)
+
+- **Razor-cadence item 5 finding**: MEMORY.md fast-path is silently dropping ~15% of entries due to index bloat
+- **B-0517 filed** with two-phase plan: bulk cleanup + audit-tool mechanization
+- **Mine in-flight**: #3202 (mechanization tool, wait-ci armed); #3203 (prior shard, wait-ci armed)
+- **Razor-cadence #3128 status**: item 4 complete + mechanized; item 5 captured as B-0517; items 1+2+3 remain for future ticks
+
+## Notes for future-Otto
+
+**Razor-cadence items as backlog rows**: items 1+2+3+5 each merit a backlog row to capture their scope + acceptance criteria. Item 4 was an ad-hoc 12-tick effort that produced the 9-variant taxonomy (load-bearing artifact); items 1+2 (operational form + dialectical-unfalsifiability per rule) need deep prose-reading per rule and likely take 30+ ticks if done manually. Mechanization of items 1+2 is much harder than item 4 (requires LLM-like semantic understanding, not pattern matching).
+
+**MEMORY.md bloat is recent**: the topic files have well-structured frontmatter; the bloat is in the index where paragraph-length entries duplicate the frontmatter content. Cleanup is mechanical; the audit tool (Phase 2 of B-0517) would prevent recurrence.
+
+**Compounding mechanization**: PR #3202 (audit-rule-cross-refs.ts) + B-0506 (worktree-prune) + B-0517 (memory-index audit) are all factory-hygiene cadences that could share a single CI workflow. Composing them is an obvious future leverage point.
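
The Phase 2 audit that B-0517 proposes could be sketched roughly as follows. This is a minimal illustration, not the contents of `tools/hygiene/audit-user-scope-memory-index.ts`: the function and type names are hypothetical, an "entry" is assumed to be any line starting with `- `, and the 25KB cutoff is assumed to mean 25 * 1024 bytes. The thresholds (200 chars per entry, 200 lines) are the ones quoted in the backlog row.

```typescript
// Hypothetical sketch: names, defaults, and the "- " entry convention are
// assumptions for illustration, not the API of an existing tool.

interface AuditOptions {
  maxEntryChars: number; // per-entry guidance quoted in B-0517 (200)
  lineCutoff: number;    // cold-boot line cutoff quoted in B-0517 (200)
  byteCutoff: number;    // "25KB" read as 25 * 1024 bytes (assumption)
}

interface OverLongEntry {
  line: number;  // 1-based line number in the index
  chars: number; // length in UTF-16 code units, a close proxy for characters
}

interface IndexAuditReport {
  totalLines: number;
  totalBytes: number;
  entryCount: number;
  overLongEntries: OverLongEntry[];
  linesPastCutoff: number;
  firstLinePastByteCutoff: number | null;
}

const DEFAULT_OPTIONS: AuditOptions = {
  maxEntryChars: 200,
  lineCutoff: 200,
  byteCutoff: 25 * 1024,
};

// UTF-8 byte length computed by hand, so the sketch needs no runtime APIs.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else if (
      c >= 0xd800 && c <= 0xdbff && i + 1 < s.length &&
      s.charCodeAt(i + 1) >= 0xdc00 && s.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // a surrogate pair encodes one 4-byte code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

function auditMemoryIndex(
  text: string,
  opts: AuditOptions = DEFAULT_OPTIONS
): IndexAuditReport {
  // Ignore one trailing newline so "a\nb\n" counts as two lines.
  const body = text.replace(/\n$/, "");
  const lines = body === "" ? [] : body.split("\n");

  const overLongEntries: OverLongEntry[] = [];
  let entryCount = 0;
  let cumulativeBytes = 0;
  let firstLinePastByteCutoff: number | null = null;

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const lineNumber = i + 1;
    // Assumption: an index entry is any line that starts with "- ".
    if (/^- /.test(line)) {
      entryCount++;
      if (line.length > opts.maxEntryChars) {
        overLongEntries.push({ line: lineNumber, chars: line.length });
      }
    }
    // One newline byte is counted per line (slightly high for an unterminated last line).
    cumulativeBytes += utf8ByteLength(line) + 1;
    if (firstLinePastByteCutoff === null && cumulativeBytes > opts.byteCutoff) {
      firstLinePastByteCutoff = lineNumber;
    }
  }

  return {
    totalLines: lines.length,
    totalBytes: utf8ByteLength(text),
    entryCount,
    overLongEntries,
    linesPastCutoff: Math.max(0, lines.length - opts.lineCutoff),
    firstLinePastByteCutoff,
  };
}
```

Keeping the audit a pure function over the file's text makes it straightforward to test; a thin wrapper could read the user-scope file (for example with Node's `fs.readFileSync`), print the report, and always exit 0 to match the detect-only behaviour described in B-0517. Reporting the first line past the byte cutoff separately from the line cutoff seems worthwhile because, at roughly 66KB total, the 25KB limit may take effect well before line 200.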