From 47ec0005b2d1ac54b8bc35579e982f4851bdf228 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Thu, 23 Apr 2026 20:48:39 -0400
Subject: [PATCH 1/3] =?UTF-8?q?research:=20memory=20reconciliation=20algor?=
 =?UTF-8?q?ithm=20=E2=80=94=20v0=20design=20(Amara=20Determinize=20L-effor?=
 =?UTF-8?q?t=20item)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Amara's 4th ferry (PR #221 absorb) centerpiece proposal: replace
hand-maintained CURRENT-*.md distillations with generated views
over typed memory facts. Her sketch was ~40 lines of Python;
this is the design that downstream implementation follows.

476 lines covering:

- MemoryFact record schema (id / subject / predicate / object /
  source_kind / source_path / source_anchor / timestamp_utc /
  supersedes / priority / status / confidence / tags)
- 6 schema invariants (at-most-one-active-per-canonical-key +
  monotone-timestamps-on-chain + retraction-leaves-trail + ...)
- Canonical-key normalization rules (7 apply; 3 deliberately
  NOT applied to preserve distinctions)
- Reconciliation pseudocode (group by canonical key, detect
  conflicts, follow supersession chains)
- Conflict output format → CONTRIBUTOR-CONFLICTS.md rows
- Rendering rules for CURRENT-<maintainer>.md + MEMORY.md
- 5-phase incremental migration (schema adoption → generator
  prototype → mechanical backfill → cutover → LLM extraction)
- CI integration hooks composing with rows #58, #59, #12
- Worked examples (MF-2026-04-23-001 "Aaron endorses
  deterministic reconciliation"; MF-2026-04-23-004 "Aaron
  grants full GitHub access")
- 5 open questions for Phase 1 PR design decisions

Composes with:

- Otto-73 retractability-by-design foundation — MemoryFact
  status (active / superseded / retracted) is the retraction-
  native primitive at the memory substrate
- PR #222 decision-proxy-evidence — consulted_memory_ids
  can now reference MemoryFact.id directly
- PR #225 memory-reference-existence CI (row #59) — generated
  output preserves the invariant by construction
- Zeta's ZSet algebra — MemoryFact records ARE Z-set entries
  at the memory layer; same primitive, different surface

Addresses MEMORY.md cap-drift (Otto-70 snapshot-tool
surfaced 58842 bytes vs. 24976-byte cap): a generated
index can be bounded by construction (top-N most-recent,
archive the rest).

Not implementation. Research doc only. Downstream arc:
schema adoption (S) → generator prototype off-CI (S-M) →
mechanical backfill (M) → cutover with retractability (M) →
LLM-assisted extraction (L research).

Amara Determinize-stage: 3/5 (with this PR).
  ✓ Live-state-before-policy (PR #224)
  ✓ Memory reference-existence lint (PR #225)
  ✓ Memory reconciliation algorithm design (this PR)
  Remaining:
  - Generated CURRENT-*.md views (L; this doc's Phase 2)
  - Memory duplicate-title lint enforcement (partial via
    AceHack PR #12; graduates via batch-sync)

Per Aaron's Otto-73 retractability foundation: the design
itself embodies the thesis — supersession + status +
retraction make the memory layer's reconciliation
deterministic, same primitive as Zeta's data layer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 ...onciliation-algorithm-design-2026-04-24.md | 476 ++++++++++++++++++
 1 file changed, 476 insertions(+)
 create mode 100644 docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
diff --git a/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
new file mode 100644
index 00000000..83c53ab0
--- /dev/null
+++ b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
@@ -0,0 +1,476 @@
+# Memory reconciliation algorithm — design v0
+
+**Date:** 2026-04-24
+**Status:** research proposal; v0 design ready for review + incremental implementation
+**Stage:** Amara Determinize (L-effort item per PR #221 absorb)
+**Companion:** Otto-73 retractability-by-design foundation memory
+**Implementation arc:** this doc is design-only; implementation lands as separate PRs (schema adoption → migration tooling → generation tool → CI integration) across multiple rounds
+
+---
+
+## Why this exists
+
+Amara's 4th courier ferry (PR #221 absorb) proposed replacing
+hand-maintained prose-based `CURRENT-aaron.md` / `CURRENT-amara.md`
+distillations with **generated views over typed memory facts**.
+
+Her sketch was a ~40-line Python prototype. This doc is the
+design that downstream implementation follows: schema
+semantics, normalization rules, conflict detection, rendering,
+migration path from the existing prose corpus.
+
+The design also addresses the MEMORY.md cap-drift surfaced by
+Otto-70's snapshot-pinning tool (58842 bytes vs. 24976-byte
+cap per FACTORY-HYGIENE row #11). A generated index can be
+bounded by construction (emit top-N most-relevant, archive
+the rest).
+
+Composes with "deterministic reconciliation" naming (Otto-67
+endorsement): this IS the concrete reconciliation mechanism
+for the memory layer. Also composes with Zeta's retraction-
+native algebra — `MemoryFact` records with explicit
+supersession + retraction status mirror Z-set algebraic
+semantics at the memory substrate.
+
+---
+
+## Scope
+
+### In scope
+
+- Typed `MemoryFact` record schema (fields + invariants)
+- Canonical-key normalization rules (what makes two facts
+  "about the same thing")
+- Priority / supersession / status semantics
+- Conflict detection + surfacing
+- Generated rendering rules for `CURRENT-<maintainer>.md`
+  and `MEMORY.md` index
+- Migration path from existing prose memories
+- CI integration hooks
+
+### Out of scope (future work)
+
+- Actual implementation language + tool (Python, F#, shell —
+  later decision; design is language-agnostic)
+- Full backfill of the 391 existing per-user memories +
+  44 in-repo memories into typed records
+- LLM-based fact extraction (if needed for prose-to-fact
+  migration — separate research arc)
+- Multi-maintainer consensus protocols (today: one
+  human maintainer + AI maintainers. Cross-human
+  consensus can be added when roster grows)
+
+### Guardrail principles
+
+- **Don't rewrite prior prose memories.** They're source-
+  of-truth for the facts they encode; typed records
+  extract facts FROM them, don't replace them.
+- **Retractions leave trails.** Supersession is explicit +
+  dated; no silent rewrite. Honors Otto-73 retractability-
+  by-design discipline.
+- **Generated views are DERIVED, not authoritative.**
+  `CURRENT-*.md` and `MEMORY.md` become generated; the
+  typed fact corpus is the source of truth.
+- **Migration is incremental.** Land the schema first;
+  backfill mechanically where possible; retain prose for
+  facts too rich to compress.
+
+---
+
+## Schema — `MemoryFact` record
+
+### Fields
+
+| Field | Type | Required | Semantics |
+|---|---|---|---|
+| `id` | string | yes | Globally unique fact ID (e.g., `MF-2026-04-23-001`) |
+| `subject` | string | yes | Who the fact is about: `aaron` / `amara` / `otto` / `kenji` / ... / `any` (factory-generic) |
+| `predicate` | string | yes | Normalized verb: `prefers` / `delegates` / `forbids` / `endorses` / `retracted` / `supersedes` / ... |
+| `object` | string | yes | Normalized claim text |
+| `source_kind` | enum | yes | `memory` / `current` / `decision` / `backlog` / `conflict` / `verbatim-quote` |
+| `source_path` | string | yes | File path the fact was extracted from |
+| `source_anchor` | string | optional | Line number, section header, or hash for citation |
+| `timestamp_utc` | ISO8601 | yes | When the fact was authored (not when extracted) |
+| `supersedes` | string | optional | ID of fact this one supersedes (one-to-one) |
+| `priority` | int | yes | Explicit override > current view > memory > archive (4 > 3 > 2 > 1) |
+| `status` | enum | yes | `active` / `retracted` / `superseded` |
+| `confidence` | enum | optional | `verbatim` / `paraphrase` / `inference` — how tight the extraction is |
+| `tags` | list[string] | optional | Cross-cutting tags: `principle`, `authorization`, `register`, `ops`, `naming`, etc. |
+
+### Invariants
+
+1. `(subject, predicate, canonical_key(object))` is the
+   canonical key. Multiple facts with the same canonical
+   key form a version chain.
+2. At most one fact per canonical key has `status: active`
+   at any given time. Others are `superseded` or `retracted`.
+3. `supersedes` is a single-step back-pointer. Chain
+   traversal: follow `supersedes` until null.
+4. `timestamp_utc` is monotone along a supersession chain
+   (newer supersedes older).
+5. `retracted` status implies `supersedes` is set to the
+   previously-active fact (retraction creates a new
+   record, not an in-place edit).
+6. `priority` breaks ties only among simultaneously-
+   active facts (shouldn't happen under invariant 2 but
+   provides a deterministic fallback).
+
+### Canonical-key normalization
+
+`canonical_key(object)` collapses minor variations so
+facts-about-the-same-thing chain cleanly.
+
+Rules (applied in order):
+
+1. Lowercase all characters
+2. Replace whitespace sequences with single space
+3. Strip leading/trailing whitespace
+4. Remove markdown emphasis markers (`**`, `*`, `_`, backticks)
+5. Normalize smart quotes (`"` / `"` / `'` / `'`) to plain
+   ASCII (`"` / `'`)
+6. Collapse repeated punctuation (`!!!` → `!`)
+7. Strip trailing punctuation (`.`, `!`, `?`, `;`, `,`)
+
+Rules NOT applied (preserve these distinctions):
+
+- Word order — "Aaron prefers X" ≠ "X is Aaron's preference"
+  (different canonical keys; handle via separate fact
+  extraction, not normalization)
+- Synonyms — "like" vs. "prefer" (lexically distinct;
+  collapsing requires LLM-assisted normalization,
+  out of scope for v0)
+- Tense — "Aaron prefers X" vs. "Aaron preferred X"
+  (different tense = different time; preserve)
+
+### Example records
+
+```yaml
+- id: MF-2026-04-23-001
+  subject: aaron
+  predicate: endorses
+  object: deterministic reconciliation as canonical phrasing for operational closure
+  source_kind: memory
+  source_path: memory/feedback_deterministic_reconciliation_endorsed_naming_for_closure_gap_not_philosophy_gap_2026_04_23.md
+  timestamp_utc: 2026-04-23T20:45:00Z
+  supersedes: null
+  priority: 3
+  status: active
+  confidence: verbatim
+  tags: [naming, principle, vocabulary]
+
+- id: MF-2026-04-23-004
+  subject: aaron
+  predicate: grants
+  object: full GitHub access for AceHack + LFG, only restriction is don't increase spending without asking
+  source_kind: memory
+  source_path: memory/feedback_aaron_full_github_access_authorization_all_acehack_lfg_only_restriction_no_spending_increase_2026_04_23.md
+  timestamp_utc: 2026-04-23T21:30:00Z
+  supersedes: MF-2026-04-23-002   # superseding the prior Otto-23 partial grant
+  priority: 3
+  status: active
+  confidence: verbatim
+  tags: [authorization, standing, github]
+```
+
+---
+
+## Reconciliation algorithm
+
+Pseudocode (language-agnostic):
+
+```
+function reconcile(facts):
+  # Group by canonical key
+  by_key = {}
+  for f in facts:
+    k = (f.subject, f.predicate, canonical_key(f.object))
+    by_key[k].append(f)
+
+  # Per-key: pick the winner, detect conflicts
+  accepted = {}
+  conflicts = []
+  for key, group in by_key.items():
+    active = [f for f in group if f.status == "active"]
+    if len(active) == 0:
+      continue  # all retracted/superseded
+    if len(active) > 1:
+      # multiple active with same key = invariant-2 violation
+      winner = max(active, key=lambda f: (f.priority, f.timestamp_utc))
+      conflicts.append(ConflictRow(key, active, winner=winner))
+      accepted[key] = winner
+    else:
+      accepted[key] = active[0]
+
+  # Check version-chain consistency
+  for key, f in accepted.items():
+    chain = follow_supersession(f, by_key[key])
+    if chain_broken(chain):
+      conflicts.append(ConflictRow(key, chain, reason="broken chain"))
+
+  return accepted, conflicts
+```
+
+### Conflict outputs
+
+Each conflict becomes a row in `docs/CONTRIBUTOR-CONFLICTS.md`
+(the file Amara's 4th ferry noted is empty but should be used).
+Row format:
+
+```markdown
+### CONF-<YYYY-MM-DD>-<NNN>: <subject> / <predicate>
+- **Canonical key:** `<subject>::<predicate>::<normalized-object>`
+- **Conflicting facts:** [MF-..., MF-...]
+- **Winner (priority tiebreak):** MF-...
+- **Reason:** invariant-2 violation | broken chain | explicit disagreement
+- **Resolution:** pending | explicit-preference-recorded | escalated
+- **Resolution evidence:** <DP-NNN.yaml ref if proxy-reviewed>
+```
+
+Conflicts block the `CURRENT-*.md` generation if unresolved
+— this is the "explicit-not-silent" discipline Amara
+emphasized. A CI run that discovers unresolved conflicts
+fails the generation job.
+
+---
+
+## Rendering rules
+
+### `CURRENT-<maintainer>.md` generation
+
+Filter accepted facts by subject (`<maintainer>` or `any`),
+sort by `(priority DESC, timestamp DESC)`, group by
+`predicate`, render as markdown:
+
+```markdown
+# CURRENT-<maintainer>.md — generated
+
+**Last generated:** <ISO8601 UTC>
+**Source corpus:** <N> facts from memory/ + <M> facts from docs/
+**Conflicts pending:** <K>
+
+---
+
+## <predicate>
+
+- **<object>** — source: [<memory>](<source_path>), <timestamp>
+- ...
+```
+
+Header states generation-time + source-corpus-size +
+pending-conflict-count. The generator refuses to emit
+when `conflicts_pending > 0`, unless `--allow-conflicts` is
+set.
+
+### `MEMORY.md` index generation
+
+Accept facts where `source_kind == "memory"`; emit
+newest-first list of `(source_path, first-sentence-of-object, tags)`
+tuples. Cap at configurable size (default: 250 entries or 30KB,
+whichever smaller — matches the FACTORY-HYGIENE row #11 cap with
+headroom).
+
+Older entries move to dated archive files
+`memory/MEMORY-ARCHIVE-YYYY-MM.md`. Ordering + link integrity
+preserved across the archive boundary.
+
+---
+
+## Migration path from existing prose corpus
+
+### Phase 1 — Schema adoption + worked example (S)
+
+- Land this research doc (current PR)
+- Create `memory/facts/` directory seeded with 5-10
+  manually-authored `MemoryFact` records as worked
+  examples (e.g., the "Aaron endorses deterministic
+  reconciliation" record shown above)
+- Keep existing prose memories unchanged
+
+### Phase 2 — Generator prototype, off-CI (S-M)
+
+- Implement `tools/memory/reconcile.py` (or equivalent)
+  reading `memory/facts/*.yaml` + emitting
+  `memory/CURRENT-<maintainer>.md.generated` +
+  `memory/MEMORY.md.generated` (parallel output, not
+  replacing existing files yet)
+- Land the tool + a research doc comparing generated
+  output against current hand-maintained files
+- Do NOT overwrite existing files in this phase
+
+### Phase 3 — Mechanical backfill (M)
+
+- For each existing prose memory, extract 1-5
+  `MemoryFact` records mechanically (parse frontmatter
+  `description` + `verbatim` quotes)
+- Human-maintainer spot-check of backfill quality
+- Cross-link: typed records cite their source prose
+  memory via `source_path`
+
+### Phase 4 — Cutover with retractability (M)
+
+- Move existing hand-maintained `CURRENT-*.md` to
+  archive (`CURRENT-aaron-archive-2026-04.md`);
+  retractability preserves the old versions
+- Cutover the root `CURRENT-aaron.md` / `CURRENT-amara.md`
+  to generated output
+- Same for `MEMORY.md`
+- CI integration: fail if generated output drifts from
+  expected; conflict rows block generation
+
+### Phase 5 — Richer LLM-assisted extraction (L, research)
+
+- Use an LLM pass to extract additional facts from
+  prose that the mechanical parser missed
+- Careful review discipline — not auto-merge; human
+  + peer review for each LLM extraction pass
+- Establishes a richer fact-count; may surface additional
+  conflicts
+
+---
+
+## CI integration hooks
+
+### Existing surfaces this composes with
+
+- FACTORY-HYGIENE row #58 (memory-index-integrity CI) —
+  same-commit pairing of memory changes + MEMORY.md
+  updates. Generated MEMORY.md preserves this invariant
+  by construction; CI stays green.
+- FACTORY-HYGIENE row #59 (memory-reference-existence) —
+  link targets must resolve. Generated output can be
+  validated by the same tool; CI stays green.
+- AceHack PR #12 (memory-index-duplicates) — no duplicate
+  link targets. Generated output deduplicates by
+  construction; CI stays green.
+- PR #222 decision-proxy-evidence — `consulted_memory_ids`
+  can now reference `MemoryFact.id` directly for
+  tighter audit.
+
+### New CI hook for this work
+
+- `memory-reconcile-generation.yml` — on PR touching
+  `memory/facts/*.yaml` or the generator, re-run
+  generation; fail if generated output ≠ committed
+  output (similar to OpenAPI-spec-diff style check).
+
+### Ordering of hooks
+
+1. memory-index-integrity (row #58) — same-commit
+2. memory-reference-existence (row #59) — refs resolve
+3. memory-index-duplicates (AceHack #12) — no dups
+4. memory-reconcile-generation (new) — generated output
+   matches committed
+5. memory-reconcile-conflict-check (new) — no unresolved
+   conflicts
+
+Steps 4 + 5 are future work; 1-3 already cover the
+prose-layer invariants.
+
+---
+
+## Relationship to existing substrate
+
+### With Otto-73 retractability-by-design
+
+The `MemoryFact.status` field (active / superseded /
+retracted) is exactly the retraction-native primitive at
+the memory substrate. Each record is a signed delta;
+supersession chains encode history; the reconciliation
+algorithm is a deterministic fold over the deltas.
+Zeta's ZSet algebra applied to memory.
+
+### With Amara's 4 ferries
+
+Amara's 4th ferry explicitly proposed this algorithm;
+earlier ferries established the drift classes it
+addresses:
+
+- Otto-24 (PR #196) operational gap — memory-index lag
+  (NSA-001) now captured as canonical-key conflict
+  in the fact corpus
+- Otto-54 (PR #211) ZSet semantics — the algebraic
+  framework (Z-sets + retraction) that this memory
+  schema inherits
+- Otto-59 (PR #219) decision-proxy technical review —
+  `consulted_memory_ids` field needs stable memory IDs;
+  MemoryFact.id provides them
+- Otto-67 (PR #221) memory drift alignment — this is
+  the concrete algorithm her report proposed
+
+### With Zeta's core algebra
+
+`MemoryFact` records ARE Z-set entries at the memory
+layer:
+
+- `(subject, predicate, canonical_key(object))` = the Z-set
+  key
+- Priority + status + timestamp = the "weight" dimension
+  (non-integer; resembles signed-delta semantics)
+- Reconciliation = the `distinct` operator at the
+  memory level, clamping to at-most-one-active per key
+- Conflict detection = invariant violation surfacing
+  (the same discipline Zeta's algebra-owner enforces
+  for the code layer)
+
+This is not coincidence. Aaron's Otto-73 thesis:
+retractability is design at every layer of the factory.
+This doc operationalizes it at the memory layer.
+
+---
+
+## What this design is NOT
+
+- **Not a commitment to one implementation language.**
+  Python, F#, shell — later decision. Design is
+  language-agnostic.
+- **Not a requirement to migrate all 391 existing
+  per-user memories at once.** Incremental backfill,
+  prose retained as source-of-truth.
+- **Not authorization to overwrite existing
+  CURRENT-*.md files.** Cutover is Phase 4; earlier
+  phases generate `.generated` companions.
+- **Not a commitment to LLM-assisted extraction.**
+  Phase 5 is research-grade; manual + mechanical
+  parsing covers the main backfill.
+- **Not a replacement for decision-proxy-evidence
+  records.** Evidence records capture per-decision
+  context; MemoryFacts capture long-lived claims.
+  Different surfaces; they compose via ID references.
+- **Not a retraction of prose memory discipline.**
+  Prose stays; it's the source material from which
+  typed records extract. The factory's thought-layer
+  continues in prose.
+
+---
+
+## Open questions for follow-up rounds
+
+1. **Language choice** — Python (Amara's prototype),
+   F# (consistent with Zeta), shell (matches existing
+   tools/hygiene/ pattern)?
+2. **Facts directory location** — `memory/facts/` under
+   the existing memory tree, or separate surface?
+3. **Conflict-row automation boundary** — CI-generated
+   rows, or human-required fields for resolution?
+4. **Archive boundary policy** — date-based (>90 days),
+   count-based (keep 250 most-recent), relevance-scored
+   (keep most-cited), or hybrid?
+5. **Extraction granularity for mechanical backfill** —
+   one fact per memory frontmatter, or mine the body
+   for multi-fact patterns?
+
+These are Phase 1 PR design decisions, not blockers for
+the research-doc approval.
+
+---
+
+## Attribution
+
+Amara (external AI maintainer) proposed the algorithm
+in Otto-67 (PR #221 ferry). Otto (loop-agent PM hat,
+Otto-74) authored this design doc. Aaron's Otto-73
+retractability-by-design insight grounds the schema's
+supersession semantics. Kenji (Architect) queued for
+synthesis on Phase 1 scope. Downstream implementation
+follows this design across multiple PRs on the Amara
+Determinize + Govern + Assure roadmap.

From 6f895f1042da09e4ff9513312076d6800244dcb6 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 25 Apr 2026 01:08:41 -0400
Subject: [PATCH 2/3] =?UTF-8?q?drain(#226=20P0+P1=C3=972+P2=C3=973=20Codex?=
 =?UTF-8?q?):=20retraction=20semantics=20+=20cap=20consistency=20+=20smart?=
 =?UTF-8?q?-quote=20+=20pseudocode=20init=20+=20present-with-schema?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Six substantive Codex findings on memory-reconciliation algorithm doc:

P0 (line 202) — retraction semantics inconsistency:
reconcile() filtered by status == 'active' which masked the
intent. Added explicit retraction-semantics docstring:
- Facts transition via explicit FactRetracted / FactSuperseded
  events; never deleted, only marked.
- reconcile() ignores retracted/superseded for liveness but
  STILL considers them when checking version-chain integrity.
- Updated chain check to operate over ALL facts in the group
  (including retracted/superseded), not just active ones —
  chain integrity needs the full history.

P1 (line 187) — stable fact identity vs grouping key:
Distinguished fact ID (stable identity, unique) from
(subject, predicate, canonical_key) grouping tuple (which
multiple facts can share under invariant-2's collision
case). Comment makes the distinction explicit.

P1 (line 270) — MEMORY.md cap inconsistency:
Default 30KB exceeded FACTORY-HYGIENE row #11 cap (24,976
bytes). Updated to 24,000 bytes — strictly under the hard
cap with ~1KB headroom for header/annotation overhead.
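
A minimal sketch of how a generator could enforce both limits,
assuming index entries are already rendered as one markdown
line each and sorted newest-first. `bound_index` and its
parameter names are hypothetical, not part of the design doc:

```python
MAX_ENTRIES = 250      # default entry cap from the doc
MAX_BYTES = 24_000     # stays under the 24,976-byte hard cap


def bound_index(lines, max_entries=MAX_ENTRIES, max_bytes=MAX_BYTES):
    """Split newest-first index lines into (kept, archived).

    Stops at whichever limit is hit first: entry count or total
    UTF-8 size, counting one trailing newline per line.
    """
    kept, used = [], 0
    for i, line in enumerate(lines):
        size = len(line.encode("utf-8")) + 1  # +1 for "\n"
        if len(kept) >= max_entries or used + size > max_bytes:
            return kept, list(lines[i:])
        kept.append(line)
        used += size
    return kept, []
```

Whatever lands in the second list would move to the dated
archive file.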

P2 (line 130) — smart-quote example ambiguous:
Both sides showed plain ASCII ('"' / "'"). Replaced with
explicit Unicode codepoint references (U+201C/D for
double, U+2018/9 for single) so the rule is unambiguous
in plain-ASCII source.
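
For reference, a literal reading of the seven normalization
rules in their stated order might look like the sketch below.
The regexes are assumptions about what each rule means (rule 4
as written also strips underscores inside identifiers), not
the committed implementation:

```python
import re

# Rule 5: curly quotes, written as codepoints so the source
# stays unambiguous in plain ASCII.
_QUOTES = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # left/right double
    "\u2018": "'", "\u2019": "'",   # left/right single
})


def canonical_key(obj: str) -> str:
    s = obj.lower()                         # 1. lowercase
    s = re.sub(r"\s+", " ", s)              # 2. collapse whitespace
    s = s.strip()                           # 3. trim ends
    s = re.sub(r"\*\*|[*_`]", "", s)        # 4. drop emphasis markers
    s = s.translate(_QUOTES)                # 5. straighten quotes
    s = re.sub(r"([^\w\s])\1+", r"\1", s)   # 6. "!!!" -> "!"
    s = s.rstrip(".!?;,")                   # 7. trailing punctuation
    return s
```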

P2 (line 186) — pseudocode by_key[k] used before init:
Switched to defaultdict(list); added a comment noting the
equivalence to 'if k not in by_key: by_key[k] = []' for
non-Python implementers.

P2 (line 216) — CONTRIBUTOR-CONFLICTS.md 'empty' wording:
File is present and contains a schema; just unpopulated.
Updated text to 'present-with-schema-but-unpopulated; this
design starts populating it via the generator'.
---
 ...onciliation-algorithm-design-2026-04-24.md | 45 ++++++++++++++-----
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
index 83c53ab0..01120a83 100644
--- a/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
+++ b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
@@ -126,8 +126,9 @@ Rules (applied in order):
 2. Replace whitespace sequences with single space
 3. Strip leading/trailing whitespace
 4. Remove markdown emphasis markers (`**`, `*`, `_`, backticks)
-5. Normalize smart quotes (`"` / `"` / `'` / `'`) to plain
-   ASCII (`"` / `'`)
+5. Normalize smart/curly quotes (left-double U+201C, right-
+   double U+201D, left-single U+2018, right-single U+2019)
+   to plain ASCII straight quotes (`"` and `'`)
 6. Collapse repeated punctuation (`!!!` → `!`)
 7. Strip trailing punctuation (`.`, `!`, `?`, `;`, `,`)
 
@@ -180,19 +181,34 @@ Pseudocode (language-agnostic):
 
 ```
 function reconcile(facts):
-  # Group by canonical key
-  by_key = {}
+  # Group by canonical key. Use defaultdict(list) so the
+  # first append() initialises the bucket; equivalent to
+  # `if k not in by_key: by_key[k] = []` then append.
+  by_key = defaultdict(list)
   for f in facts:
+    # Stable fact identity is (id) — fact-IDs are unique.
+    # The (subject, predicate, canonical_key(object)) tuple
+    # is the *grouping* key (multiple distinct facts may
+    # share it under invariant #2's collision case below);
+    # do NOT confuse the two.
     k = (f.subject, f.predicate, canonical_key(f.object))
     by_key[k].append(f)
 
-  # Per-key: pick the winner, detect conflicts
+  # Per-key: pick the winner, detect conflicts.
   accepted = {}
   conflicts = []
   for key, group in by_key.items():
+    # Retraction semantics: a fact is "live" if its
+    # latest version (by supersession chain + timestamp)
+    # has status == "active". Status transitions to
+    # "retracted" or "superseded" via explicit
+    # FactRetracted / FactSuperseded events; we never
+    # delete records, only mark them. The reconcile()
+    # filter below ignores retracted/superseded forms but
+    # still considers them when checking chain integrity.
     active = [f for f in group if f.status == "active"]
     if len(active) == 0:
-      continue  # all retracted/superseded
+      continue  # all retracted/superseded — key not live
     if len(active) > 1:
       # multiple active with same key = invariant-2 violation
       winner = max(active, key=lambda f: (f.priority, f.timestamp_utc))
@@ -201,7 +217,9 @@ function reconcile(facts):
     else:
       accepted[key] = active[0]
 
-  # Check version-chain consistency
+  # Check version-chain consistency over ALL facts in the
+  # group (including retracted/superseded), not just active
+  # ones — chain integrity needs the full history.
   for key, f in accepted.items():
     chain = follow_supersession(f, by_key[key])
     if chain_broken(chain):
@@ -213,7 +231,9 @@ function reconcile(facts):
 ### Conflict outputs
 
 Each conflict becomes a row in `docs/CONTRIBUTOR-CONFLICTS.md`
-(the file Amara's 4th ferry noted is empty but should be used).
+(the file Amara's 4th ferry noted is present-with-schema-but-
+unpopulated; this design starts populating it via the
+generator).
 Row format:
 
 ```markdown
@@ -264,10 +284,11 @@ set.
 ### `MEMORY.md` index generation
 
 Accept facts where `source_kind == "memory"`; emit
-newest-first list of `(source_path, first-sentence-of-object, tags)`
-tuples. Cap at configurable size (default: 250 entries or 30KB,
-whichever smaller — matches the FACTORY-HYGIENE row #11 cap with
-headroom).
+newest-first list of `(source_path, first-sentence-of-object,
+tags)` tuples. Cap at configurable size (default: 250 entries
+or 24,000 bytes — strictly under the FACTORY-HYGIENE row #11
+24,976-byte hard cap, with ~1KB headroom for any header /
+index annotations the generator writes around the entry list).
 
 Older entries move to dated archive files
 `memory/MEMORY-ARCHIVE-YYYY-MM.md`. Ordering + link integrity

From 32721c01b8c69be7dda056275bed5485b4c4bce0 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 25 Apr 2026 01:18:41 -0400
Subject: [PATCH 3/3] drain(#226 P1+P2 Codex): chain-head liveness +
 chain-integrity for retired groups
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

P1 (line 210) — chain-HEAD liveness, not 'any active in group':
The reconcile filter marked a key live whenever any record
in the group had status==active. That's wrong — a key with
active(t=1) → retracted(t=2) has an earlier active record
but the HEAD of the supersession chain is retracted, so the
key is not live. Fix: `follow_supersession_to_head(group)`
walks supersedes-pointers to find the most-recent record;
liveness keyed on its status == active.

P2 (line 224) — chain integrity for fully retired groups:
The chain-integrity check looped over `accepted.items()`,
which only included keys with at least one active record.
Retired groups (all members retracted/superseded) could
have broken chains and we'd silently miss them. Fix: loop
over `by_key.items()` (all groups, including fully retired
ones). Chain integrity is independent of liveness.
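
A minimal Python sketch of the two fixes, for reviewers. The
`Fact` class, the fork tiebreak, and `has_broken_chain` are
hypothetical illustrations; the doc itself leaves
`chain_broken` undefined. Timestamps are assumed to be ISO
8601 strings in one consistent format, so string comparison
orders them:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    id: str
    status: str                  # "active" | "superseded" | "retracted"
    timestamp_utc: str           # ISO 8601, same format throughout
    supersedes: Optional[str] = None


def follow_supersession_to_head(group):
    """Return the record that no other record in the group supersedes.

    Assumption: on a forked chain (several such records) the
    newest by timestamp is taken as the head. Returns None for
    an empty group or a pure cycle.
    """
    pointed_at = {f.supersedes for f in group if f.supersedes}
    heads = [f for f in group if f.id not in pointed_at]
    return max(heads, key=lambda f: f.timestamp_utc, default=None)


def is_live(group):
    # Liveness is keyed on the chain head, not on "any active".
    head = follow_supersession_to_head(group)
    return head is not None and head.status == "active"


def has_broken_chain(group):
    # Runs on every group, live or retired: flags a dangling
    # supersedes pointer or a non-monotone timestamp on an edge.
    by_id = {f.id: f for f in group}
    for f in group:
        if f.supersedes is None:
            continue
        prev = by_id.get(f.supersedes)
        if prev is None or prev.timestamp_utc > f.timestamp_utc:
            return True
    return False
```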
---
 ...onciliation-algorithm-design-2026-04-24.md | 52 +++++++++++--------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
index 01120a83..092bd8f5 100644
--- a/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
+++ b/docs/research/memory-reconciliation-algorithm-design-2026-04-24.md
@@ -198,30 +198,38 @@ function reconcile(facts):
   accepted = {}
   conflicts = []
   for key, group in by_key.items():
-    # Retraction semantics: a fact is "live" if its
-    # latest version (by supersession chain + timestamp)
-    # has status == "active". Status transitions to
+    # Retraction semantics: a key is "live" if the HEAD
+    # of its supersession chain has status == "active".
+    # The chain head — not "any active record in the
+    # group" — determines liveness, because a key with
+    # active(t=1) → retracted(t=2) is NOT live (head is
+    # retracted) even though an earlier active record
+    # exists in the group. Status transitions to
     # "retracted" or "superseded" via explicit
     # FactRetracted / FactSuperseded events; we never
-    # delete records, only mark them. The reconcile()
-    # filter below ignores retracted/superseded forms but
-    # still considers them when checking chain integrity.
-    active = [f for f in group if f.status == "active"]
-    if len(active) == 0:
-      continue  # all retracted/superseded — key not live
-    if len(active) > 1:
-      # multiple active with same key = invariant-2 violation
-      winner = max(active, key=lambda f: (f.priority, f.timestamp_utc))
-      conflicts.append(ConflictRow(key, active, winner=winner))
-      accepted[key] = winner
-    else:
-      accepted[key] = active[0]
-
-  # Check version-chain consistency over ALL facts in the
-  # group (including retracted/superseded), not just active
-  # ones — chain integrity needs the full history.
-  for key, f in accepted.items():
-    chain = follow_supersession(f, by_key[key])
+    # delete records, only mark them.
+    chain_head = follow_supersession_to_head(group)
+    if chain_head is not None and chain_head.status == "active":
+      # Multiple active records that all map to the same
+      # canonical key (invariant-2 violation) surface as a
+      # ConflictRow; chain head is the winner.
+      siblings_active = [f for f in group
+                         if f.status == "active"
+                         and f.id != chain_head.id]
+      if siblings_active:
+        conflicts.append(ConflictRow(
+          key, [chain_head, *siblings_active], winner=chain_head))
+      accepted[key] = chain_head
+    # else: key is fully retired (chain head retracted or
+    # superseded with no successor). Don't mark live;
+    # chain integrity is still validated below.
+
+  # Check version-chain consistency over ALL grouped keys
+  # — including those whose chain head is retracted or
+  # superseded — not just `accepted`. Chain integrity is
+  # a property of the history, independent of liveness.
+  for key, group in by_key.items():
+    chain = follow_supersession_full(group)
     if chain_broken(chain):
       conflicts.append(ConflictRow(key, chain, reason="broken chain"))