Merged
@@ -0,0 +1,82 @@
---
id: B-0517
priority: P3
status: open
title: "MEMORY.md index bloat cleanup + entry-length enforcement cadence"
tier: substrate-hygiene
effort: M
created: 2026-05-14
last_updated: 2026-05-14
depends_on: []
composes_with: [B-0006]
tags: [memory, MEMORY.md, fast-path, index-hygiene, razor-cadence, user-scope]
type: chore
---

# MEMORY.md index bloat cleanup + entry-length enforcement cadence

## Origin

Otto-CLI's investigation of razor-cadence item 5 (the MEMORY.md index audit) at 2026-05-14T19:27Z found:

- **`~/.claude/projects/-Users-acehack-Documents-src-repos-Zeta/memory/MEMORY.md` has grown to 242 lines / 66KB / 237 entries** (the user-scope MEMORY.md, NOT the repo-scope one).
- Cold-boot loads the first 200 lines / 25KB only; **~37 lines (15%) past the cutoff are unreachable at fast-path** for new sessions.
- **Average entry size: 275 chars** — well over the 200-char "one line" guidance in the auto-memory system prompt.
- **10+ entries exceed 500 chars** — each is a paragraph rather than an index line; the bloat is concentrated in recent additions.

## Examples of over-long entries

```
620 chars: [Reverse-engineer Gates physical ECCs for minimalist memory storage...]
613 chars: [Devil + god play within universal logic...]
608 chars: [Dialectical viewpoint = Aaron's natural operation...]
582 chars: [NEVER A CAGE — binding alignment force of attention...]
575 chars: [Constraints as self-binding against acknowledged temptation...]
```

## Why this matters

Per the auto-memory system: "MEMORY.md is always loaded into your conversation context — lines after 200 will be truncated, so keep the index concise." The current state silently drops the most recent ~37 entries from cold-boot visibility. The bloat compounds: each oversized entry's "tail" prose duplicates content that already lives in the topic file's frontmatter `description:` field and body.

## Proposed work

Two phases:

### Phase 1 — bulk cleanup (one-time)

For each entry exceeding 200 chars:

1. Read the underlying topic file
2. Verify the topic file's body + frontmatter `description:` contain the full detail (no information loss from trimming the index)
3. Rewrite the MEMORY.md entry to: `- [Short Title](filename.md) — one-line hook (~50-100 chars).`

Roughly 100-130 entries need trimming, which should be manageable in 3-5 ticks.
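
The per-entry steps above can be sketched in TypeScript as follows. This is a minimal illustration, not the actual cleanup procedure: the helper names `extractDescription` and `formatIndexEntry` are hypothetical, the frontmatter parsing assumes a single-line `description:` field between the first two `---` lines, and only the entry format and the 200-char limit come from this item.

```typescript
// Hypothetical helpers; only the entry format and the 200-char limit come from this backlog item.
const MAX_ENTRY_CHARS = 200;

/** Return the `description:` value from a topic file's frontmatter, or null if absent. */
function extractDescription(topicFile: string): string | null {
  // Assumption: frontmatter is the block between the first two `---` lines.
  const match = topicFile.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return null;
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const field = line.match(/^description:\s*(.*)$/);
    // Strip one pair of surrounding quotes, if any.
    if (field) return (field[1] ?? "").replace(/^["']|["']$/g, "").trim();
  }
  return null;
}

/** Build a trimmed index line; throw if the result is still over the limit. */
function formatIndexEntry(title: string, filename: string, hook: string): string {
  const entry = `- [${title}](${filename}) — ${hook}`;
  if (entry.length > MAX_ENTRY_CHARS) {
    throw new Error(`entry is ${entry.length} chars; limit is ${MAX_ENTRY_CHARS}`);
  }
  return entry;
}
```

A reviewer would still confirm by hand that the topic file carries the full detail (step 2); a non-null `extractDescription` result only shows that the field exists, not that it is complete.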

### Phase 2 — mechanized enforcement (cadence)

Add `tools/hygiene/audit-user-scope-memory-index.ts` that:

- Reads the user-scope MEMORY.md
- Reports total line count, byte count, entry count
- Flags entries over 200 chars
- Computes truncation risk (lines past 200)
- Exits 0 always (detect-only); future ticks consume the report

This parallels `tools/hygiene/audit-rule-cross-refs.ts` (PR #3202): same shape, different surface. It could compose with B-0506 (worktree-prune cadence) as one of several factory-hygiene CI crons.
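
As a rough sketch of what the core of such an audit might look like, the function below computes the metrics listed above from the file's text. Everything here is an assumption for illustration: the names `auditMemoryIndex`, `IndexAuditReport`, and `utf8ByteLength` are hypothetical, entries are assumed to be top-level `- ` bullets, and file reading plus the always-exit-0 CLI wrapper are left out.

```typescript
// Illustrative only; names are hypothetical, limits are the ones quoted in this item.
const LINE_CUTOFF = 200;      // lines loaded at cold boot
const ENTRY_CHAR_LIMIT = 200; // "one line" guidance per entry

interface IndexAuditReport {
  lineCount: number;
  byteCount: number;
  entryCount: number;
  overLongEntries: { line: number; chars: number }[];
  linesPastCutoff: number;
}

/** Count UTF-8 bytes without relying on any runtime-specific API. */
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

function auditMemoryIndex(content: string): IndexAuditReport {
  // Ignore one trailing newline so it is not counted as an extra empty line.
  const lines = content.replace(/\r?\n$/, "").split(/\r?\n/);
  const overLongEntries: { line: number; chars: number }[] = [];
  let entryCount = 0;
  lines.forEach((text, i) => {
    if (!text.startsWith("- ")) return; // assumption: entries are top-level bullets
    entryCount += 1;
    if (text.length > ENTRY_CHAR_LIMIT) {
      overLongEntries.push({ line: i + 1, chars: text.length });
    }
  });
  return {
    lineCount: lines.length,
    byteCount: utf8ByteLength(content),
    entryCount,
    overLongEntries,
    linesPastCutoff: Math.max(0, lines.length - LINE_CUTOFF),
  };
}
```

Keeping the audit a pure function over the file's text means the detect-only behaviour lives entirely in the wrapper: it can print the report and exit 0 regardless of what was flagged.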

## Composes with

- B-0006 (memory-md hub compression — prior cleanup work on MEMORY.md)
- `tools/hygiene/audit-rule-cross-refs.ts` (PR #3202 — parallel mechanization)
- Razor-cadence #3128 daily fire (item 5 is the MEMORY.md index audit)
- `encoding-rules-without-mechanizing.md` rule

## Substrate-honest framing

Cleanup is invasive (touching ~130 entries in user-scope memory) but each individual edit is mechanical: replace the long description with the topic file's frontmatter-derived hook. The risk is semantic (does the trim preserve enough triage signal in cold-boot context?). Mitigation: phase 1 is reviewable per-entry; entries can be expanded back if the trim lost a load-bearing signal.

User-scope memory is not git-tracked, so this work doesn't go through PR in the usual sense. The Phase 2 mechanization tool IS git-tracked and follows the standard factory shape.

## Origin tick

`docs/hygiene-history/ticks/2026/05/14/1927Z.md` — this tick's shard documents the empirical observation.
72 changes: 72 additions & 0 deletions docs/hygiene-history/ticks/2026/05/14/1927Z.md
@@ -0,0 +1,72 @@
# Tick 2026-05-14T19:27Z — razor-cadence item 5 investigated (MEMORY.md index bloat); B-0517 filed

## Refresh (step 1)

- Rate limit: 4762/5000 (healthy).
- 14 open PRs. Mine: #3202 (mechanization tool), #3203 (1925Z shard) — both wait-ci, autoMerge armed.

## Holding discipline (step 2)

Razor-cadence item 4 done + mechanized. Pick next priority per never-be-idle ladder. Razor-cadence #3128 has 4 remaining items; item 5 (MEMORY.md index audit) is similar in shape to item 4 — partly mechanizable, tractable per-tick.

## Speculative work (step 3) — razor-cadence item 5 investigation

### Findings

User-scope `MEMORY.md` (`~/.claude/projects/-Users-acehack-Documents-src-repos-Zeta/memory/MEMORY.md`) state:

| Metric | Value |
|--------|-------|
| Lines | 242 |
| Bytes | 66,227 (~66KB) |
| Entries | 237 |
| Average entry size | 275 chars |
| Entries > 200 chars | ~100-130 |
| Cold-boot truncation cutoff | line 200 / 25KB |
| Lines unreachable at fast-path | ~37 (15%) |

### Root cause

Per the auto-memory system prompt: "Keep index entries to one line under ~200 chars; move detail into topic files." Recent additions are paragraph-length entries whose detail is ALSO in the topic file's frontmatter `description:` field. This duplicated content pushes the most recent ~37 entries out of the cold-boot fast-path.

Top 5 over-long entries (>500 chars each):

- 620 chars: "Reverse-engineer Gates physical ECCs for minimalist memory storage..."
- 613 chars: "Devil + god play within universal logic..."
- 608 chars: "Dialectical viewpoint = Aaron's natural operation..."
- 582 chars: "NEVER A CAGE — binding alignment force of attention..."
- 575 chars: "Constraints as self-binding against acknowledged temptation..."

## Verify (step 4)

Verified that the topic file (`feedback_aaron_reverse_engineer_gates_*.md`) has the full detail in its frontmatter `description:` and body, so trimming the MEMORY.md index entry does NOT lose information.

Filed **B-0517** at `docs/backlog/P3/B-0517-memory-md-index-bloat-cleanup-cadence-2026-05-14.md` documenting:

- The empirical state
- Phase 1 (one-time bulk cleanup of ~130 entries)
- Phase 2 (mechanization via `tools/hygiene/audit-user-scope-memory-index.ts`)
- Composition with B-0006 + PR #3202 + razor-cadence #3128

## Shard (step 5)

This file.

## CronList (step 6)

Sentinel `f970cb2d` armed.

## Visibility (step 7)

- **Razor-cadence item 5 finding**: MEMORY.md fast-path is silently dropping ~15% of entries due to index bloat
- **B-0517 filed** with two-phase plan: bulk cleanup + audit-tool mechanization
- **Mine in-flight**: #3202 (mechanization tool, wait-ci armed); #3203 (prior shard, wait-ci armed)
- **Razor-cadence #3128 status**: item 4 complete + mechanized; item 5 captured as B-0517; items 1+2+3 remain for future ticks

## Notes for future-Otto

**Razor-cadence items as backlog rows**: items 1+2+3+5 each merit a backlog row to capture their scope + acceptance criteria. Item 4 was an ad-hoc 12-tick effort that produced the 9-variant taxonomy (load-bearing artifact); items 1+2 (operational form + dialectical-unfalsifiability per rule) need deep prose-reading per rule and likely take 30+ ticks if done manually. Mechanization of items 1+2 is much harder than item 4 (requires LLM-like semantic understanding, not pattern matching).

**MEMORY.md bloat is recent**: the topic files have well-structured frontmatter; the bloat is in the index where paragraph-length entries duplicate the frontmatter content. Cleanup is mechanical; the audit tool (Phase 2 of B-0517) would prevent recurrence.

**Compounding mechanization**: PR #3202 (audit-rule-cross-refs.ts) + B-0506 (worktree-prune) + B-0517 (memory-index audit) are all factory-hygiene cadences that could share a single CI workflow. Composing them is an obvious future leverage point.