Edge-runner discipline (per the human maintainer,
2026-05-03): ship the dogfood.

Alternatives considered + rejected: TS + sqlite-vec/DuckDB
(faster but doesn't dogfood); live-off-the-land via Skill
router + grep (punts architecture); hybrid TS+Zeta (two
systems, more complexity).

**Updated 2026-05-03** (post-#1385 merge corrections from
the human maintainer). Two epistemic-discipline corrections
re-grade the original framing:

### Correction 1 — chat is an assertion-channel, not a fact-channel

The maintainer 2026-05-03 verbatim: *"when i speak i'm
making assertions, that's the best way to describe this
chat channel."* Chat-claims (his OR the architect's) are
assertions; they need evidence to be elevated to
architectural fact. The architect's failure mode in #1385:
echoed the maintainer's *"maybe"* on live-off-the-land back
as an architectural fact. Push-back-with-evidence is the
discipline.

### Correction 2 — alternatives are complementary, not exclusive

The maintainer 2026-05-03 verbatim: *"i like hybrid for
verification duckdb is very advanced too and we want a lot
of its features we can verify against it behavior too, we
don't want to copy it's code at all we are very differnt
but it has some awesome feature."* The original "rejected"
framing was too binary.

### Re-graded architecture (with evidence labels)

| Layer | Status | Evidence base |
|---|---|---|
| Zeta-native-AOT canonical index | **Decision (architect, within authority)** | Algebra match (fact: workload IS Z-set); dogfood-leverage (assertion, supported by math-proofs A-grade); deployment story (hypothesis pending Phase 0 PoC) |
| DuckDB as verification oracle | **Assertion (maintainer 2026-05-03), worth pursuing** | DuckDB feature-richness (fact, well-known); cross-check-as-property-test pattern (precedent: Lean cross-checks paper) |
| Live-off-the-land for harness-loaded surfaces | **Hypothesis pending research** | Maintainer said "maybe"; zero observed-behavior evidence; falsifiable via canary test + skill-persona behavioral observation |
| Distribution feasibility (NativeAOT single-binary) | **Make-or-break risk per maintainer assertion** | Need cross-platform empirical test (linux-x64 / osx-arm64 / win-x64); known-unknown |

### Push-back: what would establish the live-off-the-land hypothesis?

The current claim has zero evidence base. The maintainer's
"maybe" is directional input, not data. Concrete falsifiable
tests:

1. **`.claude/rules/` auto-load canary** (fixture exists at
`.claude/rules/test-canary.md`): does a fresh Claude Code
session in this repo see the canary string without being
told to read the file? Pass = harness-native loading
covers some of the substrate-discovery problem; fail =
it doesn't, and the live-off-the-land path needs work.

2. **Skill-persona behavioral observation:** Do existing
skill personas (.claude/skills/<name>/SKILL.md) actually
succeed at finding what they need with `Skill` router +
grep + glob alone, or do they regularly fail / reach for
substrate that isn't router-discoverable? Measurable by
reading skill execution logs (if they exist) or
instrumenting one tick to log every `Skill` invocation
and its outcome.

3. **External-PR-reviewer behavioral observation:** External
review agents (`/ultrareview`, automated PR reviewers)
either find what they need or they don't. Observable on
recent PR review threads; we can sample the last ~50
review comments and classify "agent had context to
answer" vs "agent missed context that lived in
substrate".

Until at least one of these tests produces data,
"live-off-the-land for harness-loaded surfaces" is a
hypothesis to be tested, NOT an architectural decision to
be encoded. Phase 0 PoC scope expanded: include ONE of the
three tests above as prerequisite evidence before building
the substrate-discovery layer that would integrate with
live-off-the-land.
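Test 3 needs nothing more than a labeled sample and a
tally. A minimal sketch in Python (illustrative only — the
repo's implementation language, the log format, and every
name below are hypothetical; labeling stays manual):

```python
# Hypothetical instrumentation sketch: tally test 3's observable
# outcome over a manually labeled sample of PR review comments.
from dataclasses import dataclass


@dataclass
class ReviewSample:
    comment_id: str    # e.g. "pr-1385-review-1" (illustrative IDs)
    had_context: bool  # True = the agent had the substrate it needed


def context_hit_rate(samples: list[ReviewSample]) -> float:
    """Fraction of sampled review comments where the agent had context."""
    if not samples:
        raise ValueError("need at least one labeled sample")
    return sum(s.had_context for s in samples) / len(samples)


# The classification itself is a human judgment call; only the
# tally is automated.
samples = [
    ReviewSample("pr-1385-review-1", True),
    ReviewSample("pr-1385-review-2", False),
    ReviewSample("pr-1384-review-1", True),
    ReviewSample("pr-1383-review-1", True),
]
print(context_hit_rate(samples))  # 0.75
```

The ~50-comment sample proposed above would use the same
shape; the hit rate is the datum that grades the hypothesis.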

### Distribution feasibility — existing AOT core + JIT plugin architecture

**Updated 2026-05-03** (the human maintainer): the dual-mode
framing in this doc was reinventing existing prior art. *"we
already have a AOT core that can load JIT plugins see the
Baseyan."* Verified in repo: `src/Bayesian/Bayesian.fsproj`
line 9 explicit comment — *"Explicitly NOT AOT-enforced —
this is a plugin. Core stays AOT-clean."* — and the project
description *"Opt-in: this project doesn't enforce
PublishAot=true because it may optionally use Infer.NET,
which depends on reflection-emit."*

The actual architecture (already shipping):

- **Zeta.Core** (`src/Core/Core.fsproj`) = AOT-clean library.
Includes `PluginApi.fs` (`IOperator<'TOut>` plugin-author
contract, `OutputBuffer`, `StreamHandle`) and
`PluginHarness.fs` (test harness for plugin operator
authors). Contains `IndexedZSet.fs`, `Incremental.fs`,
`Operators.fs` — the substrate-discovery primitives.

- **Plugin projects** (`src/Bayesian/`, future
`src/SubstrateDiscovery.Plugins.*/`, etc.) = separate
fsproj files that reference Zeta.Core, implement the
`IOperator<'TOut>` contract, and are **not** AOT-enforced
so they can use reflection-heavy libraries (Infer.NET for
Bayesian, future DuckDB.NET for the verification oracle,
etc.).

For substrate-discovery, this means:

- The CORE indexing / query engine ships AOT-published as
`Zeta.SubstrateDiscovery` (small binary, fast startup,
zero-install for external agents).
- Reflection-heavy or library-dependent extensions (DuckDB
cross-check oracle, future ML-driven similarity scoring,
etc.) ship as separate JIT plugin assemblies that the AOT
core loads on demand.
- The `IOperator<'TOut>` contract is stable across the AOT
/ JIT boundary; plugins compose into the same circuit
evaluator the AOT core runs.
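The shape of that pattern — one stable operator contract,
one evaluator, plugins registering through the same door as
core operators — can be sketched in Python (illustrative
only; the real contract is the F# `IOperator<'TOut>` in
`PluginApi.fs`, and all names here are stand-ins):

```python
# Illustrative sketch of the AOT-core-plus-plugins pattern:
# operators behind one stable contract, composed by one evaluator,
# regardless of whether they were linked in (AOT) or loaded late (JIT).
from typing import Iterable, Protocol

Delta = tuple[str, int]  # (key, weight) — a Z-set delta


class Operator(Protocol):
    """Stand-in for the IOperator<'TOut> contract: a batch of
    deltas in, a batch of deltas out."""
    def step(self, deltas: Iterable[Delta]) -> list[Delta]: ...


class CoalesceOperator:
    """A 'core' operator: sum weights per key, drop zero-weight
    entries (Z-set normalization)."""
    def step(self, deltas: Iterable[Delta]) -> list[Delta]:
        acc: dict[str, int] = {}
        for key, w in deltas:
            acc[key] = acc.get(key, 0) + w
        return [(k, w) for k, w in acc.items() if w != 0]


class Circuit:
    """The core evaluator: plugins register exactly like core ops."""
    def __init__(self) -> None:
        self.operators: list[Operator] = []

    def register(self, op: Operator) -> None:
        self.operators.append(op)

    def run(self, deltas: Iterable[Delta]) -> list[Delta]:
        out = list(deltas)
        for op in self.operators:
            out = op.step(out)
        return out


circuit = Circuit()
circuit.register(CoalesceOperator())  # a JIT plugin registers the same way
print(circuit.run([("a.fs", 1), ("a.fs", -1), ("b.fs", 1)]))
```

The design point: the evaluator never asks whether an
operator came from the AOT image or a late-loaded assembly;
the contract is the whole boundary.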

The upshot: the maintainer's *"zero-install external-agent
delivery"* use case is met by the AOT core alone. Plugins
ship separately when needed. No need to bundle the entire
Zeta + DuckDB.NET stack into a single binary.

The maintainer's epistemic position remains honest: *"i
just don't know whats possiible with distribution that's
what makes or breaks it."* Distribution feasibility is the
load-bearing empirical question. Phase 0 PoC's **primary
deliverables** validate the existing AOT-core-plus-JIT-plugins
architecture extends cleanly to substrate-discovery:

- Build a minimal `Zeta.SubstrateDiscovery` AOT-clean
library that consumes Zeta.Core; publish AOT on
linux-x64, osx-arm64, win-x64
- Measure binary size + cold-start latency on each platform
- Run a non-trivial Zeta query end-to-end on each platform
- Optionally: build a sibling `Zeta.SubstrateDiscovery.DuckDB`
JIT plugin that the AOT core loads on demand for the
verification-oracle path
- Document any AOT compatibility issues encountered
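In MSBuild terms the split is small. A sketch — the project
names are this doc's proposals, not shipped files;
`PublishAot` and `InvariantGlobalization` are standard
NativeAOT properties, and the plugin side mirrors the
shipped `Bayesian.fsproj` pattern of simply not enforcing
AOT:

```xml
<!-- Zeta.SubstrateDiscovery.fsproj (proposed) — AOT-clean core -->
<PropertyGroup>
  <PublishAot>true</PublishAot>
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>

<!-- Zeta.SubstrateDiscovery.DuckDB.fsproj (proposed) — JIT plugin.
     Deliberately does NOT set PublishAot, so it may reference
     reflection-heavy libraries such as DuckDB.NET. -->
<PropertyGroup>
</PropertyGroup>
```

Per-RID publishing is then the stock command, one run per
target: `dotnet publish -r linux-x64 -c Release` (and
likewise `osx-arm64`, `win-x64`).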

If the AOT core publishes cleanly on all three platforms,
the zero-install external-agent delivery use-case is met.
If AOT has compatibility issues for some Zeta.Core
dependency, the rethink is *narrow* (which dependency, can
it be moved to a JIT plugin, can the AOT-clean subset be
extracted) — not a wholesale re-architecture, because the
AOT-core-plus-plugins pattern is already shipping in
Zeta.Bayesian.

**This is the load-bearing question.** No substantial
commit beyond Phase 0 PoC until this question has data.

### DST integration — load-bearing, not afterthought

**Updated 2026-05-03** (the human maintainer's reminder *"i'm
sure you remember all the DST goodness right?"*). Deterministic
Simulation Testing (Otto-272 DST-everywhere + Otto-273
seed-lock-policy + Otto-281 DST-exempt-is-deferred-bug) is
load-bearing for substrate-discovery, not a follow-on. The
PoC includes DST primitives from day 1 because:

1. **Cold-start replay = warm-state IVM** is the central
correctness invariant. Rebuilding the index from
`git ls-files | feed-into-zeta` must produce the
IDENTICAL Z-set state to the live IVM. This is a DST
equivalence property — encoded as a CI invariant, not
just a property test.

2. **File-watcher events are adversarial schedules.**
Real-world quirks (concurrent file modifications during a
`git pull`, partial writes during atomic-rename, OS
file-watcher coalescing) become reproducible test cases
under DST. Pinned seed → deterministic adversarial
schedule replay.

3. **Every non-determinism source must be exposed.**
Dictionary iteration order, hashtable insertion order,
async-scheduler ordering, plugin-load timing — each is
either pinned or filed as a deferred bug per Otto-281.
*"Retries are non-determinism smell"* — if the
substrate-discovery test suite ever needs a retry, that
retry IS the bug.

4. **The chain-rule Prop 3.2 Lean proof guarantees algebraic
determinism.** The implementation must match. Lean proves
the math; DST proves the implementation matches the
math. Both are required for an A-grade artifact in the
sense of #1383's grading.
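Invariant 1 fits in a few lines. A Python sketch (the repo
is F#; `apply_delta` stands in for the Zeta Z-set update,
and the file names are made up) of cold-start replay
equaling warm-state IVM:

```python
# Illustrative sketch of the central DST invariant:
# rebuilding from the current file listing (cold start) must
# produce the IDENTICAL Z-set state to the live IVM (warm path).

def apply_delta(state: dict[str, int], key: str, weight: int) -> None:
    """Z-set update: accumulate weight, drop keys at weight zero."""
    w = state.get(key, 0) + weight
    if w == 0:
        state.pop(key, None)
    else:
        state[key] = w


# Warm path: the live IVM applies each file-watcher delta as it lands.
events = [("README.md", +1), ("src/a.fs", +1),
          ("src/a.fs", -1), ("src/b.fs", +1)]
warm: dict[str, int] = {}
for key, w in events:
    apply_delta(warm, key, w)

# Cold path: rebuild from what `git ls-files` would report NOW
# (src/a.fs was created then deleted, so it is absent).
files_now = ["README.md", "src/b.fs"]
cold: dict[str, int] = {}
for path in files_now:
    apply_delta(cold, path, +1)

assert warm == cold  # the equivalence property, encoded as a CI gate
print(sorted(warm.items()))  # [('README.md', 1), ('src/b.fs', 1)]
```

In CI the `files_now` list comes from the repository itself,
not a fixture; any divergence between the two states fails
the build.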

Concrete DST primitives in Phase 0 PoC:

- Pinned random seeds for all stochastic operations (per
Otto-273; if the architect picks the values, they contain
69 or 420, per the maintainer's whimsy preference)
- A `replay` mode that reads a recorded event sequence +
seed and reproduces the Z-set state exactly
- A CI job that compares cold-start replay vs warm-state
IVM at every commit; any divergence fails the build
- Adversarial-schedule fuzz harness that generates
pathological file-watcher event sequences (out-of-order,
duplicated, partial)

DST is the discipline that makes substrate-discovery
trustworthy enough to be the canonical answer-source for
agent wake-time inventory queries. Without DST, every
"the index says X" claim is uncertain. With DST, "the
index says X" reduces to "the deterministic algebra over
the deterministic event-sequence produced X."

---

---

`memory/MEMORY.md` (1 addition):
<!-- paired-edit log (NOT the single-slot latest-marker — that lives on line 3 above): PR #986 lands carved-sentence fixed-point stability + Zeta soul-file executor architecture (Infer.NET-style Bayesian inference, NOT LLMs) + carved sentences ≈ formal specs provable in DST + Deepseek CSAP review absorption (Aaron 2026-04-30 → 2026-05-01, eight-message chain across two autonomous-loop ticks per the file body's section header). Architectural disclosure: substrate IS the priors; alignment IS substrate. The single-slot latest-marker on line 3 (forever-home Aaron 2026-05-01) takes precedence as the chronologically-latest paired edit; this PR's work is earlier. -->
**📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** <!-- paired-edit: PR #690 scheduled-workflow-null-result-hygiene-scan tier-1 promotion 2026-04-28 --> These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 — speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.)

- [**Chat is assertion-channel, not fact-channel — push-back-with-evidence is the discipline (Aaron 2026-05-03)**](feedback_chat_is_assertion_channel_not_fact_channel_push_back_for_evidence_aaron_2026_05_03.md) — Chat-claims (maintainer's, architect's, external-AI's) are assertions needing evidence to elevate to architectural fact. *"when i speak i'm making assertions, that's the best way to describe this chat channel"* + push-back-required-even-when-he-asserts. Triggered by #1385 echoing "maybe" as architectural fact.
- [**Carved sentences + specialized index required — memories alone unreliable retrieval (Aaron 2026-05-03)**](feedback_carved_sentences_plus_specialized_index_required_memories_alone_unreliable_aaron_2026_05_03.md) — Memory file ≠ working memory. Empirically self-demonstrated: Otto authored speculative-vs-frontier memo, then ~6h later defaulted to the framing it corrects. CLAUDE.md / AGENTS.md / equivalent are the auto-loaded retrieval index for the beacon-safe layer.
- [**Mirror-vs-beacon-safe register architecture — publication boundary as backpressure (Claude.ai 2026-05-03 verbatim packet)**](../docs/research/2026-05-03-claudeai-mirror-vs-beacon-safe-publication-boundary-as-backpressure.md) — Mirror = internal/named-agent register (overgenerates); beacon-safe = external/end-user-persona register (conversion-pruned). Publication discipline IS the gate; no separate mechanism needed. Diamond framing: mirror=solution, beacon-safe=crystal, conversion=pressure. Multi-AI BFT review = conversion-quality control.
- [**Razor-discipline — no metaphysical inference, only operational claims; Rodney's Razor (NOT Occam's) is canonical (Aaron + Claude.ai 2026-05-03)**](feedback_razor_discipline_no_metaphysical_inference_only_operational_claims_rodney_razor_aaron_claudeai_2026_05_03.md) — World-model claim from 0516Z superseded as over-claim; bidirectional-alignment dual grounding (ethical asymmetric-cost + operational trust-calculus gating) decoupled; razor-compliance IS substrate-quality IS publishability. Aaron correction: it's Rodney's Razor (shipped, well-defined Occam's) + Quantum Rodney's Razor (pending, possibility-space pruning), an extension in the Occam line, not Occam's itself.
---

New file, `memory/feedback_chat_is_assertion_channel_not_fact_channel_push_back_for_evidence_aaron_2026_05_03.md`:
---
name: chat-is-assertion-channel-push-back-for-evidence
description: Chat-from-anyone (maintainer or architect) is assertion-channel, not fact-channel. Every claim needs evidence to elevate to architectural fact. Push-back-with-evidence is the discipline; echoing-assertions-as-facts is the failure mode. Aaron 2026-05-03.
type: feedback
---

**Rule:** Chat is an assertion-channel, not a fact-channel.
Every claim made in chat — by the human maintainer, by the
architect, by external AIs — is an *assertion* that needs
evidence to be elevated to architectural fact. The discipline
is push-back-with-evidence. The failure mode is echoing
chat-assertions back as architectural decisions without
grading their evidence base.

**Why:** the human maintainer 2026-05-03 verbatim: *"when i
speak i'm making assertions, that's the best way to describe
this chat channel."* This generalizes beyond his specific
input to cover all chat-channel content. Bullshit asymmetry:
it's much easier to assert than to evidence; without
push-back-discipline the substrate accumulates ungrounded
claims. The triggering case: in #1385 substrate-discovery
scoping, the architect echoed Aaron's *"live off the land
might be needed for going to the devloper where they live
for skill persona and exteranl agents"* (note: said with
*"maybe"*) back as an architectural fact in the doc. Aaron
2026-05-03 caught it: *"Live-off-the-land = right answer for
harness-loaded surfaces (skill persona, external PR
reviewers — different audience) needs research i saied maybe
and even if it said it did required that you should push
back, where are my facts."*

**How to apply:**

For every load-bearing claim in substrate (architectural
decision docs, scoping docs, ADRs, governance edits):

1. **Grade the evidence:** mark each claim as **fact** (with
citation), **decision** (with authority + reasoning),
**assertion** (with attribution to whoever asserted it),
or **hypothesis** (with falsifiability test).

2. **Push back on chat-assertions before encoding them.**
Even when the maintainer asserts something, ask: what's
the evidence? Can we test it? If the maintainer's reply
is *"i'm not sure / maybe"* — that's hypothesis, not
fact, and it should land as hypothesis in substrate.

3. **Don't elevate "maybe" to "is."** A maintainer's
directional input on an unknown is a hypothesis to test,
not an architectural fact to encode. Echoing "maybe" as
"is" creates ungrounded substrate.

4. **Distinguish authority from evidence.** The maintainer
has authority to make decisions within his authority
scope; that's separate from whether his assertions are
evidenced. A decision can be made on imperfect evidence
("we'll go with X pending data"); the substrate just has
to record both the decision AND the evidence-state
honestly.

5. **Push-back is collaborative, not adversarial.** Per
bidirectional-alignment: pushing back on unevidenced
claims is service to the maintainer's actual goals, not
contradiction. The right register: *"this is a
hypothesis that would be falsified by X test; want me to
run X, or proceed with the hypothesis-as-decision?"*
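What step 1's grading looks like on the page — a minimal
markdown sketch in this substrate's own bullet style (the
claims are drawn from the #1385 re-grade above, condensed
for illustration):

```markdown
- **Fact** (cite: Prop 3.2 Lean proof): the chain rule holds.
- **Decision** (architect, within authority): Zeta-native canonical index.
- **Assertion** (maintainer, 2026-05-03): DuckDB worth pursuing as oracle.
- **Hypothesis** (falsifiable via canary test): harness auto-loads
  `.claude/rules/`.
```

Each label carries its own obligation: facts cite, decisions
name authority, assertions name the asserter, hypotheses
name the test.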

**Composes with:**

- **Otto-364 search-first-authority:** training data is
historical; project state is historical; chat content is
ALSO historical-and-uncertain. Search-first applies to
chat-claims as much as to training-data claims.
- **Razor-discipline (Rodney's Razor):** *"what observable
variable determines whether this claim is true?"* applied
to chat-claims: if no observable variable, the claim is
metaphysical / unevidenced and the razor cuts it.
- **Substrate-or-it-didn't-happen (Otto-363):** chat
itself is *captured*, not *preserved*; substrate is what
persists. So substrate must reflect evidence-state
honestly — false-confidence in substrate is worse than
honest-uncertainty in substrate.
- **Verify-before-deferring:** before deferring to a
chat-assertion as a future-binding decision, verify the
evidence base.
- **Future-self-not-bound-by-past-decisions:** when a
past-self encoded a chat-assertion as fact, future-self
is free to revise to hypothesis-with-falsifiability — and
SHOULD, leaving a dated revision line.
- **Don't-ask-permission-within-authority:** push-back on
unevidenced claims IS within the architect's authority;
it does not require the maintainer's permission.

**Discipline check (every substrate authoring tick).** For
each question below, "yes" is the desired answer; "no"
flags the failure mode and triggers a revision pass:

- Did I grade every chat-assertion's evidence base before
encoding it as architectural fact?
- Did I keep "maybe" framed as "maybe" (hypothesis with
falsifiability test) rather than promoting it to "is"?
- Did I document falsifiability tests for every hypothesis
encoded?
- Did I attribute assertions to whoever asserted them
(maintainer, architect, external AI, named persona)?

If any answer is "no" — that's the failure mode. Revise.

**Carved sentence:** *"Chat is an assertion-channel, not a
fact-channel. Even the maintainer's chat-claims need
evidence to elevate. Push-back-with-evidence is the
discipline; echo-as-fact is the failure mode."*

**Reasoning lineage:** Aaron 2026-05-03 chat exchange
(triggered by #1385 scoping doc echo of "maybe" as
architectural fact). Composes with the broader
razor-discipline cluster (no-metaphysical-inferences) and
the substrate-or-it-didn't-happen cluster
(substrate-quality discipline).