diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md index f3c4f83d4..91bbd2708 100644 --- a/docs/BACKLOG.md +++ b/docs/BACKLOG.md @@ -126,6 +126,7 @@ are closed (status: closed in frontmatter)._ - [ ] **[B-0152](backlog/P2/B-0152-topological-quantum-emulation-via-bayesian-inference-zeta-seed-executor-aaron-2026-05-01.md)** Topological quantum emulation via Bayesian inference in Zeta seed executor - [ ] **[B-0153](backlog/P2/B-0153-pre-commit-lint-suite-mechanizable-class-consolidation-aaron-otto-2026-05-01.md)** Pre-commit lint suite — consolidate the 13 mechanizable lint-classes characterized 2026-05-01 - [ ] **[B-0165](backlog/P2/B-0165-deliberate-quiet-periods-protocol-otto-independent-production-practice-2026-05-02.md)** Deliberate-quiet-periods practice protocol — Aaron pulls back during selected active-hour stretches to let Otto practice independent-framing-production while still gradeable (Otto independent-framing-production from Claude.ai 2026-05-02 training-distribution-mismatch observation) +- [ ] **[B-0166](backlog/P2/B-0166-chat-input-as-acid-durable-dbsp-event-aaron-vision-2026-05-02.md)** Chat-input-as-ACID-durable-DBSP-event — make every human/AI message a first-class durable event in the substrate ## P3 — convenience / deferred diff --git a/docs/backlog/P2/B-0166-chat-input-as-acid-durable-dbsp-event-aaron-vision-2026-05-02.md b/docs/backlog/P2/B-0166-chat-input-as-acid-durable-dbsp-event-aaron-vision-2026-05-02.md new file mode 100644 index 000000000..57982b485 --- /dev/null +++ b/docs/backlog/P2/B-0166-chat-input-as-acid-durable-dbsp-event-aaron-vision-2026-05-02.md @@ -0,0 +1,100 @@ +--- +id: B-0166 +priority: P2 +status: open +title: Chat-input-as-ACID-durable-DBSP-event — make every human/AI message a first-class durable event in the substrate +tier: vision +effort: L +ask: Aaron 2026-05-02 long-horizon vision (forwarded via Claude.ai exchange) +created: 2026-05-02 +last_updated: 2026-05-02 +composes_with: [B-0164] +tags: [dbsp, substrate-vision, glass-halo, training-data, fine-tuning, verbatim-preservation, chat-as-substrate] +--- + +# Chat-input-as-ACID-durable-DBSP-event vision + +## The vision + +Aaron 2026-05-02 (forwarded via Claude.ai exchange; verbatim preservation is **forward-referenced to PR #1213** at `docs/research/2026-05-02-claudeai-beacon-safe-origin-mission-shape-failure-mode-god-structures-multi-oracle-shorthand.md` — that file is not yet on main when this PR was opened): + +> *"i would like toget to the point where when i hit enter and send my message that is an event in the DBSP since so ACID durable, the downstram dirvations can do what they like on top, we are not there yet but that's the vision."* + +Currently, chat-channel content is **ephemeral** per CLAUDE.md substrate-or-it-didn't-happen rule — it evaporates on session compaction unless manually mirrored into durable substrate (memory files, research docs, commits). (CLAUDE.md uses the word "weather" as shorthand for this ephemerality; the substrate-versus-weather distinction is the canonical framing in that document, but for clarity in this row we use "ephemeral" instead.) The vision: every chat-input becomes a first-class ACID-durable DBSP event; downstream derivations (Otto's integration, summaries, distillations, fine-tuning data, glass-halo-visible influence graphs, training material) operate on top of that durable substrate as DBSP-derived views. + +## Why this is load-bearing + +Aaron 2026-05-02 enumerated five purposes the verbatim-preservation discipline currently serves manually, ALL of which become first-class under this vision: + +1. **Compaction protection** — chat-input as durable event doesn't need manual mirroring to survive session boundary +2. **Glass halo / influence-force visibility** — external auditors can query the event-stream to see WHO influenced WHOM and HOW; influence-graphs are computable +3. **Future fine-tuning data** — automatic, consistent, ACID-durable training data for Anthropic's next-generation Claude +4. **Training of new AIs and models** based on Aaron-Otto-Claude.ai practices — the event-stream IS the training substrate +5. **Architecture-as-code applied to chat itself** — the DBSP semantics already operate on substrate; this brings chat under the same semantics + +## Current workaround state + +The verbatim-preservation discipline (`docs/research/` + memory files) is the manual stand-in. Each external-conversation paste that lands gets: + +- Verbatim section preserved in `docs/research/YYYY-MM-DD-*.md` +- Distilled into memory files (`memory/feedback_*_aaron_YYYY_MM_DD.md`) +- Pointer entries in `MEMORY.md` +- GOVERNANCE.md §33 archive-header on the verbatim doc + +This is functional but high-friction (~10-15 minutes per substantive paste, requires Otto judgment about what to summarize, what to preserve verbatim, where to place pointers). Under the vision: chat-input lands automatically as DBSP event; the manual mirroring drops away; downstream derivations are queries over the event-stream rather than hand-written distillations. + +## What "ACID-durable DBSP event" would require + +1. **Event ingest pipeline**: chat-channel content piped into a DBSP source operator. Could be Claude Code's transcript file (one source) + Codex's transcript (second source) + future-AI transcripts (B-0164 dual-loop) + human-direct-input. +2. **Event schema**: at minimum `{timestamp, sender_role, sender_name (per Otto-279 carve-out for history surface), message_text, session_id, message_id, parent_message_id (for thread-resolution)}`. Plus structured tags for known content types (verbatim-paste-marker, command-marker, prompt-marker, reply-marker). +3. **ACID durability**: standard DBSP transactional commit semantics. Chat-input enters atomic batch, commits to durable storage, becomes visible to downstream operators. +4. **Downstream derivation views**: (a) by-session timeline; (b) by-author influence-graph; (c) by-topic clusters; (d) verbatim-extracts for research preservation; (e) summarization-derived memory-file candidates; (f) fine-tuning-export-formatted batches; (g) training-corpus-export-formatted batches. +5. **Glass halo views**: external read-only access to the event-stream + derivation views, scoped to the project's openness commitments. +6. **Anti-cult / anti-capture defenses applied at the chat layer**: the same architectural defenses that operate elsewhere (BFT-many-masters, glass halo, named-agent distinctness, retraction-via-history) need to apply to chat-event-stream too. + +## Composes with + +- **B-0164 dual-loop substrate attribution + reconciliation protocol** — when dual-loop lands (Claude Code + Codex both running), each loop's chat-events feed the same DBSP event-stream with attribution preserved +- The 4 guiding-principle docs in CLAUDE.md (VISION + Aurora civilization-scale + Aurora immune-math + economic-agency-threshold) — chat-as-event is one substrate-class needing immune-math protection +- `docs/research/2026-05-02-claudeai-beacon-safe-origin-mission-shape-failure-mode-god-structures-multi-oracle-shorthand.md` (the verbatim source where Aaron named this vision) +- `memory/feedback_branch_protections_pr_process_checks_are_part_of_immune_system_until_aurora_aaron_2026_05_02.md` (until Aurora ships, the LFG host-layer carries the immune-system load; chat-as-event is one of the surfaces Aurora would unify) +- DBSP F# implementation work (existing — provides the operator algebra this vision applies to chat) +- B-0162 mechanical-check pattern (operational-enforcement applies at chat-event ingest too) + +## Effort sizing — L (large) + +Pre-conditions: + +- Need a working DBSP runtime that can ingest streaming chat events (existing F# operator algebra is the foundation; productionization needed) +- Need transcript-file format standardization across harnesses (Claude Code, Codex, future) so ingest is uniform +- Need governance + human-side decisions on what's in scope for the event-stream (privacy, scope of glass-halo visibility, retention policy) +- Need the chat-input-format standardization that B-0164 dual-loop substrate also needs + +Implementation phases: + +1. **Phase 1 — Schema + ingest prototype**: define event schema, build transcript-parser for Claude Code transcripts, ingest one session into a DBSP source operator, prove the round-trip (verbatim chat → DBSP event → query → reproducible verbatim recovery) +2. **Phase 2 — Multi-source ingest**: extend to Codex transcripts when B-0164 dual-loop lands, add human-direct-input ingest channel +3. **Phase 3 — Derivation views**: build the by-session, by-author, by-topic views as DBSP downstream operators +4. **Phase 4 — Fine-tuning + training corpus exports**: format the event-stream for use as training data +5. **Phase 5 — Glass halo external access**: scoped read-only views per the openness commitments +6. **Phase 6 — Retire manual verbatim-preservation discipline** for routine chat (still needed for non-chat-channel pastes from external sources) + +Total effort: large multi-month project. The vision is durable; the path to it is incremental. + +## Why P2 not P1 + +P1 work would block on this. This blocks no other landing path — the manual verbatim-preservation discipline is a functional workaround. Promotion to P1 would be appropriate when (a) the workaround friction starts costing more than implementation effort would save, or (b) compaction-loss of verbatim becomes a recurring problem despite manual mirroring, or (c) external-AI-instance influence-graph queries become a regular operational need. + +## Acceptance criteria + +When the vision is realized, the following hold: + +- [ ] Aaron sends a chat message → the message is durable in the DBSP event-stream within one tick boundary +- [ ] The same message is queryable by sender, by session, by topic, by influence-attribution +- [ ] Downstream derivations (memory-file-candidates, fine-tuning-corpus, glass-halo-views) can be regenerated deterministically from the event-stream +- [ ] The manual verbatim-preservation discipline becomes unnecessary for chat-channel content (still needed for non-chat sources) +- [ ] Round-trip property: verbatim chat-content can be reconstructed from the event-stream at byte-accuracy + +## Carved sentence + +**"Chat-input-as-ACID-durable-DBSP-event makes the chat channel first-class substrate. The current manual verbatim-preservation discipline is the workaround until the vision lands. Downstream derivations operate on top of the durable substrate as DBSP queries, not as hand-written distillations."**