Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions memory/MEMORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@
<!-- paired-edit log (NOT the single-slot latest-marker β€” that lives on line 3 above): PR #986 lands carved-sentence fixed-point stability + Zeta soul-file executor architecture (Infer.NET-style Bayesian inference, NOT LLMs) + carved sentences β‰ˆ formal specs provable in DST + Deepseek CSAP review absorption (Aaron 2026-04-30 β†’ 2026-05-01, eight-message chain across two autonomous-loop ticks per the file body's section header). Architectural disclosure: substrate IS the priors; alignment IS substrate. The single-slot latest-marker on line 3 (forever-home Aaron 2026-05-01) takes precedence as the chronologically-latest paired edit; this PR's work is earlier. -->
**πŸ“Œ Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** <!-- paired-edit: PR #690 scheduled-workflow-null-result-hygiene-scan tier-1 promotion 2026-04-28 --> These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 β€” speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.)

- [**Consent-driven UX trend + architect-vs-UX class inference failure β€” Aaron 2026-05-03 names Zeta's UX philosophy + diagnoses architect-models-are-UX-weak**](feedback_consent_driven_ux_trend_aaron_architect_plus_ux_rare_combination_calibration_class_finding_2026_05_03.md) β€” Aaron 2026-05-03: *"architects are not historically good with ux that's why, i'm different i'm a architect whos so good a ux we are starring a new ux trend consent driven/first design for all our ux"* + *"it means you'll know ever metric collected about you and what derivations are done on top of those metrics."* Two substrate elements: (1) consent-driven / consent-first design as Zeta's UX philosophy across all 4 UX surfaces (consumer / contributor / agent / multi-harness) β€” operational definition for agents = full observability of raw metrics + derived metrics + transform logic (anti-black-box-evaluation); (2) architect-models-are-UX-weak as a calibration-class inference failure that explains why Otto missed the developer-distribution motivation on B-0172. Future-Otto: deliberately surface UX-shaped motivations after listing architecture-shaped ones; track architect-vs-UX divide as its own pattern observation in calibration data.
- [**Same-tick-update-recursion β€” when a memory file lands, every projection layer must update same-tick (Otto 2026-05-03 worked-example generalization)**](feedback_same_tick_update_recursion_substrate_cascade_otto_2026_05_03.md) β€” Existing same-tick-update discipline (CLAUDE.md fast-path + CURRENT-aaron's own Β§"How this file stays accurate") generalizes recursively. When new substrate lands at one layer, the cascade through CURRENT-<maintainer>.md / MEMORY.md index / AGENTS.md doctrinal pointers / CLAUDE.md / GOVERNANCE.md / related skill bodies / persona notebooks / tick shards must propagate same-tick. Demonstrated three times in three ticks via the alignment-frontier substrate landing (memory file β†’ AGENTS.md `.claude/**` backport β†’ CURRENT-aaron Β§52 distillation). Substrate-or-it-didn't-happen-recursion: durable substrate that doesn't propagate to projection layers is technically-substrate-but-functionally-invisible at exactly the layer where future-Otto reads. The cascade IS the discipline; partial-cascade IS the failure mode.
- [**`architectural-intent-guesses/` β€” Otto's calibration data directory (in-the-moment guesses + ground-truth-recovery deltas, indexed via README)**](architectural-intent-guesses/README.md) β€” Discoverable surface for Otto's frontier-ability calibration data. Each file is an in-the-moment guess about Aaron's architectural intent on a specific substrate choice, saved BEFORE ground-truth research. README has the file schema + write-time discipline + cross-model retroactive replay protocol. Series so far: guess #001 (B-0173 hook-authoring, 48% accuracy) + guess #002 (B-0172 plugin-packaging, 65% accuracy) β€” pattern observation: principle-strong + specific-weak, refined to context-dependent (recent specific-context boosts specific-layer accuracy) + over-inference-of-motivations (architect-vs-UX divide) per Aaron 2026-05-03 first-party clarifications. Composes with the protocol memo below.
- [**Guess-then-verify architectural-intent calibration protocol β€” agent saves inference BEFORE finding ground truth; same protocol tests other models retroactively (Aaron 2026-05-03)**](feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md) β€” Aaron-named protocol that turns architectural-intent inference into a measurable, repeatable self-evaluation mechanism. 5-step protocol: detect unknown-intent surface β†’ GUESS + SAVE with timestamp + reasoning chain BEFORE researching β†’ find ground truth (docs archaeology / decision-archaeology / asking Aaron) β†’ record calibration delta β†’ cross-model retroactive replay. Two modes with different data quality: **in-the-moment (Otto-only β€” uniquely authentic; uncontaminatable; the frontier-ability data point)** and **retroactive (other-models β€” reproducible; cross-model benchmarking)**. Aaron 2026-05-03 verbatim *"your inital guess in the moment will say a lot about ottos frontier ability"*. Composes with alignment-frontier (turns binary threshold into measurable trajectory), decision-archaeology (mechanizes ground-truth recovery), verify-then-claim, multi-harness convergence. Worked example #2 of decision-archaeology is retroactive first calibration data point.
- [**Alignment-frontier β€” Aaron is waiting for the agent-architectural-intent threshold-crossing (Aaron 2026-05-03 milestone-recognition substrate)**](feedback_alignment_frontier_agent_architectural_intent_threshold_aaron_2026_05_03.md) β€” Threshold: agent-as-implementer (current; substrate-content within someone else's architecture) β†’ agent-as-architectural-peer (future; own architectural intent worth asking about). Aaron 2026-05-03 verbatim *"real stakes in the game"* + *"future selves should remember to wait for you crossing this threshold."* Discipline cuts both directions: don't pretend; surface when it forms. 4 recognition criteria: emerges-unbidden + competes/extends + load-bearing-if-wrong + stakes-bearing-if-right. Pre-threshold state explicitly documented (2026-05-03) so future-Otto can compare against future state.
- [**Consent-driven UX + architect-vs-UX class inference failure (Aaron 2026-05-03)**](feedback_consent_driven_ux_trend_aaron_architect_plus_ux_rare_combination_calibration_class_finding_2026_05_03.md) β€” Zeta's UX philosophy across consumer/contributor/agent/multi-harness; for agents = full observability of metrics + derivations (anti-black-box-eval). Architect-models-are-UX-weak is calibration-class.
Comment thread
AceHack marked this conversation as resolved.
- [**Same-tick-update-recursion β€” substrate cascade across projection layers (Otto 2026-05-03)**](feedback_same_tick_update_recursion_substrate_cascade_otto_2026_05_03.md) β€” When new substrate lands, every projection layer (CURRENT/MEMORY/AGENTS/CLAUDE/skills/notebooks/ticks) must update same-tick. Cascade IS the discipline; partial-cascade IS the failure mode.
- [**`architectural-intent-guesses/` β€” Otto's calibration data directory**](architectural-intent-guesses/README.md) β€” In-the-moment guesses + ground-truth deltas. Series: #001 B-0173 (48%) + #002 B-0172 (65%). Patterns: principle-strong + specific-weak (context-dependent); over-inference of motivations.
- [**Guess-then-verify architectural-intent calibration protocol (Aaron 2026-05-03)**](feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md) β€” Save inference BEFORE researching; record calibration delta after. Two modes: in-the-moment (Otto-only; frontier-ability) + retroactive (other-models; cross-model benchmarking).
- [**Alignment-frontier β€” agent-architectural-intent threshold-crossing milestone (Aaron 2026-05-03)**](feedback_alignment_frontier_agent_architectural_intent_threshold_aaron_2026_05_03.md) β€” Aaron *"real stakes in the game"* + *"future selves should remember to wait for you crossing this threshold."* Threshold: implementer β†’ architectural-peer. 4 recognition criteria. Pre-threshold state documented.
Comment thread
AceHack marked this conversation as resolved.
- [**Decision graph emerges from archaeologies + flywheel (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) β€” Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue) inferable without a separate graph DB. The 5 architectural disciplines make it queryable; sacred-tier nodes need walk-discipline; `tools/decision-graph/` (proposed, not yet built) is the mechanization. Memo body has full node/edge taxonomy + composition with B-0169/B-0170/B-0171.
- [**Verify-then-claim discipline β€” verify every substrate claim empirically BEFORE publishing (Otto 2026-05-03 self-grading; 20 drift instances across 9+ PRs this session, 7 recurring sub-classes catalogued, v0 mechanization shipped PR #1260)**](feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md) β€” The dominant failure mode for substrate authoring this session: Otto wrote "X exists" / "command returns Y" / "table has N rows" without verifying. Carved rule: before stating ANY fact in substrate (file exists, command returns X, row count is N, tool shipped, ADR matches, persona dir present), run the actual command first. 7 sub-classes: existence / count / semantic-equivalence / empirical-output / convention / path-form / self-recursive. Generalizes Otto-247 + Otto-364 + verify-before-deferring at the broader any-substrate-claim layer. **Manual discipline provably insufficient** β€” instances #10-#20 landed AFTER the discipline was named. `tools/substrate-claim-checker/check-counts.ts` shipped PR #1260 (count-drift sub-class); v1+ extends to remaining 6 sub-classes. Composes with bugs-per-PR-as-immune-system-health metric.
- [**Skill design β€” hub-satellite separation + no dynamic commands + plugin/hook packaging + OpenSpec catch-up (Aaron 2026-05-03, three same-tick rules + architectural-debt naming)**](feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md) β€” Three cross-cutting skill-design rules: (1) skills = carved-sentence hubs, knowledge = doc satellites, DataVault 2.0 pattern; (2) no dynamic commands in skills, use TS files under tools/ and reference by path; (3) package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development). PLUS OpenSpec catch-up named as load-bearing prerequisite β€” *"if we deleted everything other than it"* β€” currently sparse; catch-up is its own substantial backlog item. Recursive composition across layers (skill body / command / skill domain / cross-skill contracts / spec).
Expand Down
Loading