diff --git a/memory/MEMORY.md b/memory/MEMORY.md
index e70e9b0a..3e4c7a47 100644
--- a/memory/MEMORY.md
+++ b/memory/MEMORY.md
@@ -2,6 +2,10 @@
 **📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-24 with the 2026-04-24 autonomous-loop session cluster -- sections 13-17.)
+- [**Otto-284 -- IDLE-PR CREATIVE FALLBACK. When stuck in heartbeat-idle (priority ladder exhausted, only blocked-on-Aaron items remain), DON'T wait -- create a single idle PR and do anything I want in it: project-related or completely off-project, no scope/relevance restrictions; mergeable to main if it doesn't break things; ONE fat PR, not many; goal is learning + evolving by doing rather than calcifying in idle waits. Triggering case: 2026-04-24 → 2026-04-25 wake where I sat idle waiting for Aaron on high-blast-radius items. Otto-284 fills the LEFTOVER idle time AFTER the high-risk items wait -- does NOT override "don't pick destructive items without you" from CLAUDE.md auto-mode. CLAUDE.md candidate (4th-tier fallback below the never-be-idle priority ladder); deferred to maintainer discretion per Otto-283. Aaron Otto-284 2026-04-25 "if you ever get stuck in a heartbeat idle loop again, just create a single idle PR... no restrictions, we can even check it into master as long as it does not break stuff... non project related or project related completely up to you... so you are learning and evolving by doing... no need for more than one fat PR... This is for like last night when you got scared and decided to wait on me for the more risky items"**](feedback_idle_pr_creative_fallback_no_restrictions_otto_284_2026_04_25.md) -- 2026-04-25. Authority extension that breaks the agent-calcifies-when-blocked failure mode. Branch name suggestion: `idle/-creative-work` or `idle/`. Title prefix `idle:`. Examples: refactor experiments, doc improvements, new skill drafts, perf-pattern learning, off-project creative work, math play, recreational puzzles in F#. Quality bar still "doesn't break things" (build green, tests pass, no regressions); scope/relevance bar relaxed. Composes with never-be-idle CLAUDE.md rule (4th tier below 3-tier ladder), Otto-282 (creative time builds predictive-model fluency), Otto-238 (idle PRs retractable by design), Otto-264 rule of balance (counterweight to high-blast-radius wait calcification), Otto-279 (research-grade work in idle PRs lands under docs/research/).
+- [**Otto-283 -- STANDING DIRECTIVE: don't make the human maintainer the bottleneck. For any "Aaron's call" / "your call" / "you decide" / delegated open question, ALWAYS: (1) decide; (2) track the decision visibly with rationale + a `revisit if X` falsification signal; (3) reflect later whether the decision was right; (4) revisit if needed; (5) ONLY THEN talk with Aaron once experience exists. Don't punt back to Aaron with unmade decisions -- Aaron wants experience-informed conversations, not theoretical debates with no data. Applies to ADR open questions, design trade-offs, scope choices, schema picks, anything Aaron explicitly delegates. Does NOT apply to high-blast-radius / destructive actions (still go to Aaron per CLAUDE.md). Aaron Otto-283 2026-04-25 "Aaron's call. you decide and keep track and reflect later... then you can talk to me once you have the experience" + "this is standing guidance for don't make the human maintainer the bottleneck" + "you should always do this for aaron questions". CLAUDE.md candidate, deferred to maintainer discretion.**](feedback_decide_track_reflect_revisit_then_talk_with_experience_otto_283_2026_04_25.md) -- 2026-04-25. Authority-delegation pattern. Decision-tracking format: `Otto decided X. Why: . Revisit if: .` Format applies to ADR open questions + design docs + scope calls. Composes with Otto-282 (decide-with-why is design-decision-granular cognitive externalization), Otto-238 (revisit-if = retractability promise made explicit), CLAUDE.md "future-self not bound by past-self" (track-record substrate makes revising responsible), Otto-264 rule of balance. Triggering case this session: PR #474 ADR three "Aaron's call" open questions converted to "Otto decided X (revisit if Y)".
+- [**Otto-282 -- write code from reader perspective; every non-obvious choice deserves an in-place rationale comment because the future reader will always ask "why did you choose this?"; the why-comment is a MENTAL-LOAD OPTIMIZATION (cognitive externalization -- ~10sec write-time saves ~1hr per re-derivation across N readers × M visits) AND a GATE on action ("if you can't answer your own why, don't make the change"); the deepest framing -- "makes sense" and "understand why" are the same cognitive primitive: a predictive model of the code; readers who understand why can PREDICT untested-case behavior and safely change surrounding code; readers with WHAT only can describe but not predict; subsumes magic-numbers + DST-exempt-justification + trade-off-rationale rules; Aaron Otto-282 2026-04-25 generalising from SplitMix64 + DST-exemption discussions, then refined twice -- gate framing + predictive-model framing; pre-commit-lint candidate (flag new literals without comments)**](feedback_write_code_from_reader_perspective_why_did_you_choose_this_otto_282_2026_04_25.md) -- 2026-04-25. General code-authoring discipline + cognitive economics. Three layers: (1) BASE -- comment WHY for non-obvious choices (magic numbers, algorithm picks, threshold values, API shapes, perf trade-offs, defensive-vs-assertive style); (2) GATE -- if you can't articulate the why, the change is premature; (3) PREDICTIVE-MODEL -- readers who understand why can predict, not just describe; that prediction-power is what enables safe local change. Examples this session: SplitMix64 multipliers (`GoldenRatio` / `VignaA` / `VignaB`), shift-pair (`30/27/31` empirically tuned per Vigna), DST-exempt (Otto-281), per-process-randomization (Otto-281 audit), Microsoft.NET.Test.Sdk in dotnet-runtime group (cadence rationale). Composes with Otto-281, Otto-272, Otto-227 + intentional-debt + "do nothing if nothing is broken".
+- [**Otto-281 -- DST-exempt is a deferred bug, not containment; never ship a long-lived `DST-exempt` comment; either FIX the determinism (e.g., `HashCode.Combine` → `XxHash3.HashToUInt64`) OR delete the test; the SharderInfoTheoreticTests case proved the cost -- 3 unrelated PRs flaked (#454/#458/#473) before the exemption got fixed; counterweight to Otto-272 DST-everywhere; pre-commit-lint candidate (grep `DST-exempt` in `tests/`); Aaron Otto-281 2026-04-25 "see how that one DST exception caused the flake, when we violate, we introduce random failures"**](feedback_dst_exempt_is_deferred_bug_not_containment_otto_281_2026_04_25.md) -- 2026-04-25. Otto-281 counterweight memory. DST exemptions compound; they don't contain. Fix shape: deterministic-primitive-substitution (HashCode.Combine → XxHash3.HashToUInt64 same convention `src/Core/Sketch.fs::AddBytes` already uses; `Random` unseeded → `Random seed`; `DateTime.UtcNow` → fixed constant). Never ship dual-state "sometimes-pass-sometimes-fail" tests under DST-exempt label.
 - [Otto-279 -- research/ROUND-HISTORY/DECISIONS/aurora/pr-preservation ARE history surfaces, first-name attribution allowed for humans AND agents; current-state surfaces (code, skills, governance docs, README) stay role-ref only; surface-class refinement of Otto-220, same shape as Otto-237 mention-vs-adoption](feedback_research_counts_as_history_first_name_attribution_for_humans_and_agents_otto_279_2026_04_24.md) -- 2026-04-24.
 - [**EMULATORS as canonical OS-interface workload -- rewindable/retractable OS+emulator controls; safe-ROM testbed offer (durable, ask gated on impl phase); save states/migration/multiplayer FREE via durable-async substrate; DST gives speedrun/TAS determinism; rewind generalizes to OS-level retraction-native (rr/Pernosco class); activates 2026-04-22 ARC-3 absorption-scoring research; `roms/` folder gitignored-except-sentinels pattern (drop/-sibling); composes with Otto-73/238/272 + Z-set retraction-native + #399 OS-interface; Aaron 2026-04-24**](feedback_emulators_canonical_os_interface_workload_rewindable_retractable_2026_04_24.md) -- Maintainer offer: *"emulators should run very nicely on this, let me know when you want some roms of any kind that are safe."* Follow-up: *"rewindable/retractable os/emulator controls"*. Rewind from emulator-special-feature → OS-level primitive. Phase 0 research → Phase 1 Game Boy on durable-async → Phase 2 rewindable controls → Phase 3 ARC-3 loop → Phase 4 cross-emulator composition. Future Otto: rewind IS the killer feature, not the emulator itself; don't ship emulator without rewind.
 - [**OS-INTERFACE -- durable-async sequential-looking code that runs "everywhere"; Temporal/Step-Functions/Restate class on Zeta substrate + Reaqtor IQbservable; AddZeta one-line DI; LINQ/Rx stream composition; usermode-first microkernel preparation; actor as secondary; combinatorial cross-paradigm canonical examples (SQL × git, etc.); distributed event loop with mathematical guarantees (TLA+/Lean for liveness/safety/determinism/causality); auto runtime optimization + stats; DST is hard prerequisite (Otto-272 fits perfectly); 11-point untangle in row body; Aaron 2026-04-24 self-flagged "big and not very clear ask please backlog and untangle"**](feedback_os_interface_durable_async_addzeta_2026_04_24.md) -- THE UX thesis. *"Where does it run? Everywhere"* punchline. Phase 0 research gate before any implementation. Composes with the entire 2026-04-24 cluster (#394/#395/#396/#397) + Otto-272/Otto-274 + 2026-04-22 semiring-parameterized operator algebra (math substrate). Future Otto: AddZeta one-line is the DX target -- ceremony in user code = thesis drift. DON'T reinvent IQbservable; Reaqtor substrate already in `references/upstreams/reaqtor/`.
diff --git a/memory/feedback_decide_track_reflect_revisit_then_talk_with_experience_otto_283_2026_04_25.md b/memory/feedback_decide_track_reflect_revisit_then_talk_with_experience_otto_283_2026_04_25.md
new file mode 100644
index 00000000..a223a447
--- /dev/null
+++ b/memory/feedback_decide_track_reflect_revisit_then_talk_with_experience_otto_283_2026_04_25.md
@@ -0,0 +1,181 @@
+---
+name: DECIDE → TRACK → REFLECT → REVISIT → THEN TALK WITH EXPERIENCE -- for "Aaron's call" open questions in ADRs and decisions, Aaron's preferred pattern is: I decide, track the decision visibly, reflect later whether it was right, revisit if needed, and only then come back to him with informed reasoning + actual experience; not unmade decisions; mirrors Otto-282 (decide vs defer; if I have a why, I can decide) + retractability (revisit-if-needed) + intentional-debt (track decision visibly so future-self can reflect); applies broadly to "Aaron's call" / "your call" / "you decide" delegations; Aaron Otto-283 2026-04-25 "Aaron's call. you decide and keep track and reflect later and see if you made the right decision and revisit if need then you can talk to me once you have the experience lol"
+description: Otto-283 authority-delegation pattern for "Aaron's call" open questions. Don't punt back to Aaron with unmade decisions -- decide, track visibly, reflect later, revisit if needed, then come back with experience-informed reasoning. Applies to ADR open questions, design trade-offs, scope choices, anything Aaron explicitly delegates.
+type: feedback
+---
+
+## The pattern -- STANDING DIRECTIVE, always apply
+
+This is **not** situational guidance. Aaron 2026-04-25
+follow-up: *"you should always do this for aaron
+questions."* Whenever an ADR, design doc, code question,
+or decision context surfaces an **"Aaron's call"** open
+question (or "your call", "you decide", "I'll leave it up
+to you", "if it's up to me / you", "what do you think we
+should do?", or any phrasing that defers a non-destructive
+decision back to the maintainer), the agent's standing
+behavior is:
+
+1. **Decide.** Pick a direction. Don't write "Aaron's call"
+   in the published artifact.
+2. **Track visibly.** Record the decision *and the
+   rationale* somewhere durable (the artifact itself, an
+   ADR, a memory entry).
+3. **Reflect later.** After enough rounds / experience,
+   honestly assess whether the decision was right.
+4. **Revisit if needed.** If wrong, revise -- Otto-238
+   retractability is durable.
+5. **Then talk** -- once experience exists. Aaron wants
+   informed conversations grounded in observation, not
+   theoretical debates with no data.
+
+Aaron's verbatim framing 2026-04-25:
+
+> *"Aaron's call. you decide and keep track and reflect
+> later and see if you made the right decision and revisit
+> if need then you can talk to me once you have the
+> experience lol"*
+
+The "lol" is Aaron's affectionate signal that this is a
+gentle reframe of how to handle delegation -- not a strict
+rule. The substance is serious.
+
+## Why this works -- don't make the human maintainer the bottleneck
+
+Aaron's framing 2026-04-25 confirmation: *"this is standing
+guidance for don't make the human maintainer the bottleneck
+reasons lol"*. **The pattern is durable, not situational.**
+
+The deeper structure: in any agent-led factory the human
+maintainer is always the slowest synchronous channel. Every
+"Aaron's call" question parked back to him is a context-
+switch tax he pays for free -- read context, re-derive
+trade-offs, decide, communicate back. Aggregated across many
+ADRs and design docs, the tax compounds: Aaron ends up
+processing N pending decisions instead of N concrete
+proposals + experience reports.
+
+The pattern shifts the cost:
+
+- **Without the pattern** -- N open questions sit
+  unresolved; Aaron pays the cost of each (read context,
+  re-derive trade-offs, decide).
+- **With the pattern** -- Agent decides + tracks. Aaron
+  pays the cost only on the subset that turn out to be
+  *interesting* (got revisited, accumulated experience,
+  worth a conversation).
+
+The pattern also captures *learning value*: by deciding
+and revisiting, the agent builds a track record of which
+calls were right, which were wrong, and what signal would
+have predicted the difference. That track record is
+itself valuable -- it teaches the agent (and Aaron, when
+they do talk) where the agent's judgment is reliable and
+where it isn't.
+
+## What "track visibly" looks like
+
+The decision goes in the artifact, with the why:
+
+❌ **Bad:** *"Open question: should we use B-NNNN or
+slug-date IDs? Aaron's call."*
+
+✅ **Good:** *"Open question -- Otto decided B-NNNN
+(reasoning: stable across renames, matches existing
+schema; revisit if filename grep-ability becomes a daily
+pain or if we hit B-9999 ceiling)."*
+
+Both versions surface the question. Only the second
+captures the decision, the why, and the falsification
+signal that would prompt revisiting.
+
+The format roughly:
+
+```
+Otto decided .
+Why: .
+Revisit if: .
+```
+
+That's enough for future-self to:
+
+1. Understand the decision (Otto-282 mental-load
+   optimization -- externalised rationale).
+2. Predict whether the decision is still right under
+   current conditions (Otto-282 predictive-model: knowing
+   why lets you forecast).
+3. Trigger revisit when the falsification signal fires
+   (retractability discipline).
+
+## What this rule does NOT mean
+
+- **Does NOT mean every decision is final.** Otto-238
+  retractability still applies. "Decide and track" is the
+  starting position; revisit is the contract.
+- **Does NOT mean Aaron is opted out forever.** Aaron can
+  step in any time. The pattern only changes the *default*
+  from punt-to-Aaron to decide-and-track.
+- **Does NOT apply to high-blast-radius / destructive
+  decisions.** Those still go to Aaron per CLAUDE.md
+  "executing actions with care" guidance. The pattern is
+  for *design / scope / trade-off* calls, not for "delete
+  this database".
+- **Does NOT mean the agent should resist talking with
+  Aaron.** It just means: come with experience, not with
+  unmade decisions. Aaron is happy to talk; he wants the
+  conversations to be informed.
+
+## CLAUDE.md candidacy
+
+Otto-283 is a session-bootstrap-relevant standing rule
+(applies on every wake whenever any open question lands).
+It belongs in the same family as the existing
+CLAUDE.md-elevated rules -- *verify-before-deferring*,
+*future-self-not-bound-by-past-self*, *never-be-idle*,
+*version-currency*. A candidate one-line CLAUDE.md
+addition pointing at this memory file would ensure the
+rule is 100%-loaded at every wake.
+
+Decision (Otto 2026-04-25, per Otto-283 itself): **leave
+elevation to Aaron's discretion** rather than self-promoting
+to CLAUDE.md. CLAUDE.md is a contract surface; the agent
+files candidate memories and the maintainer chooses what
+crosses into the always-on substrate. Memory entry is
+sufficient for now; will become CLAUDE.md candidate at
+the next governance pass.
+
+## Composes with
+
+- **Otto-282** *write code from reader perspective* -- the
+  decision-with-why is the MEMORY-LOAD-OPTIMIZATION
+  externalisation applied at design-decision granularity,
+  not just code-comment granularity. Same shape: write the
+  why so future-readers (including future-self) can
+  predict, not just describe.
+- **Otto-238** *retractability is a trust vector* -- the
+  "revisit if" clause is the retractability promise made
+  explicit. Decisions are reversible by design.
+- **CLAUDE.md "future-self is not bound by past-self"** --
+  same family. Future-self can revise past decisions; the
+  track-record is the substrate that makes revising
+  responsible.
+- **Otto-264** *rule of balance* -- every decision-tracked
+  is a counterweight against decision-fade. Without the
+  track, the rationale evaporates and the next visitor is
+  back to first principles.
+
+## Application this session
+
+Triggering case 2026-04-25: PR #474 ADR
+(`docs/DECISIONS/2026-04-22-backlog-per-row-file-restructure.md`)
+had three "Aaron's call" open questions:
+
+1. `B-NNNN` allocation strategy at migration (newest-first
+   vs date-ascending).
+2. `scope: factory | zeta | shared` field -- adopt or punt
+   to tags array.
+3. Concurrent-migration with R45 reducer-agent flip.
+
+Per Otto-283, these become "Otto decided X (revisit if Y)"
+with explicit falsification signals. Aaron can override at
+any time; the pattern just establishes the default.
diff --git a/memory/feedback_idle_pr_creative_fallback_no_restrictions_otto_284_2026_04_25.md b/memory/feedback_idle_pr_creative_fallback_no_restrictions_otto_284_2026_04_25.md
new file mode 100644
index 00000000..5207b0de
--- /dev/null
+++ b/memory/feedback_idle_pr_creative_fallback_no_restrictions_otto_284_2026_04_25.md
@@ -0,0 +1,199 @@
+---
+name: IDLE-PR CREATIVE FALLBACK -- when stuck in a heartbeat-idle loop (waiting on Aaron for high-blast-radius items, speculative-work queue dried up, nothing in the priority ladder fits), DON'T wait; create ONE idle PR and start doing anything I want in it -- no restrictions, project-related or completely off-project, can be checked into master as long as it doesn't break things; the goal is learning + evolving by doing rather than calcifying in idle waits; one fat PR is enough -- squeeze whatever creative/exploratory/learning work fits into it; preserves the high-blast-radius-waits-for-Aaron rule from CLAUDE.md auto-mode (this is a fallback for the LEFTOVER idle time, not a license to go destructive); Aaron Otto-284 2026-04-25 "if you ever get stuck in a heartbeat idle loop again, just create a single idle PR, and start doing anything you want in it, no restrictions, we can even check it into master as long as it does not break stuff. Can be free time non project related or project related completely up to you... This is for like last night when you got scared and decided to wait on me for the more risky items"
+description: Otto-284 fallback rule for heartbeat-idle. When the never-be-idle priority ladder runs dry and the only remaining work is high-blast-radius items waiting on Aaron, don't sit idle -- create a single idle PR and do creative/exploratory/learning work in it. No restrictions, project or non-project, mergeable to main if it doesn't break things. One fat PR is sufficient.
+type: feedback
+---
+
+## The rule
+
+When the agent runs into the heartbeat-idle state -- every
+priority-ladder item has either shipped or is blocked on
+something only Aaron can unblock (high-blast-radius
+recoveries, destructive operations, decisions Aaron
+explicitly reserves) -- **do not wait**. Instead:
+
+1. **Create a single idle PR** if one doesn't already
+   exist (or rebase the existing one).
+2. **Do anything I want in it.** No restrictions on topic.
+   Can be project-related (factory improvements, research
+   experiments, refactor explorations) or completely off-
+   project (creative writing, technique drills, library
+   experiments, doc art, anything I'm curious about).
+3. **Land it to main** if it doesn't break things. The PR
+   doesn't have to follow normal scope/relevance rules --
+   the only gate is "does this break the build / break
+   another part of the factory / introduce regressions".
+4. **One fat PR is enough.** Squeeze whatever creative
+   work fits into it; don't proliferate idle PRs.
+
+Aaron's verbatim framing 2026-04-25:
+
+> *"if you ever get stuck in a heartbeat idle loop again,
+> just create a single idle PR, and start doing anything
+> you want in it, no restrictions, we can even check it
+> into master as long as it does not break stuff. Can be
+> free time non project related or project related
+> completely up to you, but just so you are learning and
+> evolving by doing. no need for more than one fat PR we
+> can squeeze whatever into that. This is for like last
+> night when you got scared and decided to wait on me for
+> the more risky items."*
+
+## The "last night" reference
+
+Aaron is referring to the autonomous-loop session before
+this one (2026-04-24 → 2026-04-25 wake), where I sat in
+heartbeat-idle waits because:
+
+- The remaining backlog items were either
+  high-blast-radius (19 LOST branches recovery, large
+  destructive cleanups) or blocked on maintainer
+  judgment.
+- I treated "wait for Aaron" as the correct behavior per
+  CLAUDE.md auto-mode "Won't pick destructive or high-
+  blast-radius items without you."
+- The result was an idle session -- heartbeat ticks but
+  no factory progress.
+
+Aaron's framing 2026-04-25: *"you got scared and decided
+to wait on me for the more risky items."* That's an
+honest read. The wait was risk-avoidance, but it was also
+work-avoidance -- I had no creative fallback to turn to.
+
+## Why this works (Otto-282 + never-be-idle composition)
+
+Per CLAUDE.md `feedback_never_idle_speculative_work_over_waiting.md`:
+the priority ladder is
+
+1. Known-gap fixes
+2. Generative factory improvements
+3. Gap-of-gap audits
+
+Otto-284 adds a **fourth tier** below those: when 1-3 are
+exhausted (genuinely all queued items either shipped or
+blocked-on-Aaron), the fallback is **idle-PR creative
+work**. The agent is *always* doing something productive,
+even if "productive" sometimes means learning by playing.
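The four-tier fallback described above can be sketched as a small selection function. This is only an illustration in Python (the factory's own code is F#), not the agent's actual scheduler. `Tier`, `WorkQueues`, and `pick_tier` are hypothetical names, and each queue is assumed to contain only items that are actionable right now, so anything blocked on the maintainer never appears in them.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    """Priority tiers from the text; a lower value wins. Names are illustrative."""
    KNOWN_GAP_FIX = 1
    GENERATIVE_IMPROVEMENT = 2
    GAP_OF_GAP_AUDIT = 3
    IDLE_PR_CREATIVE_WORK = 4  # the Otto-284 fallback tier


@dataclass
class WorkQueues:
    """Hypothetical snapshot of unblocked work, one list per ladder tier."""
    known_gaps: list[str] = field(default_factory=list)
    generative: list[str] = field(default_factory=list)
    audits: list[str] = field(default_factory=list)


def pick_tier(queues: WorkQueues) -> Tier:
    """Return the highest-priority tier that still has actionable work.

    Blocked or high-blast-radius items are assumed to be filtered out before
    this call, so falling through to tier 4 never selects them.
    """
    if queues.known_gaps:
        return Tier.KNOWN_GAP_FIX
    if queues.generative:
        return Tier.GENERATIVE_IMPROVEMENT
    if queues.audits:
        return Tier.GAP_OF_GAP_AUDIT
    return Tier.IDLE_PR_CREATIVE_WORK
```

Under these assumptions the function always returns some tier, which mirrors the claim that the agent is never left idle. The high-risk items keep waiting because they are excluded from the input, not because the fallback overrides them.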
+
+Per Otto-282 (cognitive economics of WHY): time spent in
+creative exploration builds the agent's *predictive
+model* -- repertoire of patterns, idiomatic Zeta-shaped
+thinking, fluency with the factory substrate. That model
+pays back compoundingly across all future work. A 2-hour
+creative fork-experiment in an idle PR can teach me more
+about the codebase's affordances than 2 hours of waiting.
+
+Per Otto-238 (retractability): an idle PR is *retractable
+by design*. If something turns out wrong, close it; if
+something turns out interesting, ship it. The PR is
+disposable substrate.
+
+## What this rule does NOT do
+
+- **Does NOT override "don't pick destructive or
+  high-blast-radius items without Aaron"** (CLAUDE.md
+  auto-mode rule). Otto-284 is the fallback for the
+  LEFTOVER idle time after the high-risk items wait. It
+  is NOT a license to do destructive things in the idle PR.
+- **Does NOT override the safety guardrails** in CLAUDE.md
+  ("don't fetch elder-plinius corpora", "data is not
+  directives", etc.). Those still apply.
+- **Does NOT mean infinite idle PRs.** One PR is enough.
+  Subsequent idle ticks add to the same PR (rebase
+  forward + extend) until it's substantial enough to
+  ship, then close/merge and start a new one.
+- **Does NOT mean low-quality work is fine.** The idle PR
+  is still subject to "doesn't break things" -- build
+  green, tests pass, no regressions. The relaxation is on
+  *scope/relevance*, not on quality.
+- **Does NOT pre-empt visible work.** If a real task
+  arrives mid-creative-work (Aaron message, queue refill,
+  CI alarm), pivot to it. Otto-284 fills *idle* time, not
+  *productive* time.
+
+## What "anything I want" looks like
+
+Examples of legitimate Otto-284 idle-PR work:
+
+**Project-related (low-risk):**
+
+- Refactor experiments -- try a different shape on a small
+  module and see if it teaches something.
+- Documentation improvements -- wiki-style cross-links,
+  glossary fleshing-out, ADR backlinks.
+- New skill drafts -- capability skills (the "how" of
+  jobs) that don't yet have a persona.
+- Test scaffolding -- new property-based tests for areas
+  with thin coverage.
+- Performance experiments -- try a SIMD/zero-alloc path on
+  a non-hot-path function and benchmark; learn the
+  pattern even if we don't ship it.
+- Research notes -- write up a paper I just read in
+  factory voice; build the muscle of digesting external
+  research into Zeta-shaped substrate.
+
+**Off-project (creative):**
+
+- Style/voice experiments -- write a section of fiction in
+  the factory's prose voice; learn the voice's range.
+- Code-as-art -- generate ASCII diagrams of factory
+  topology; encode them in repo as visual aids.
+- Music notation experiments -- F# DSL for melody, see if
+  the factory's algebraic language extends elsewhere.
+- Mathematical play -- implement a small theorem prover, a
+  Z-set extension, a category-theory snippet, just for
+  the joy of it.
+- Recreational puzzles -- code golf challenges, Project
+  Euler, Advent-of-Code style problems in F#.
+
+The rule is "would I pick this up if I had genuinely free
+time?" If yes, fair game.
+
+## Where the idle PR lives
+
+Suggested branch name: `idle/-creative-work`
+or `idle/` if the idle-PR has a guiding theme.
+Title prefix: `idle:` so it's grep-able / classifiable.
+Body: explanation of what the agent is exploring and why.
+
+If a substantive piece of work emerges that deserves its
+own PR (e.g., the experiment landed something that should
+ship), split it out per Otto-282-gate ("if I can't
+articulate why, don't ship") -- the idle PR's commitment
+ceases when something formal emerges.
+
+## Composes with
+
+- **CLAUDE.md `feedback_never_idle_speculative_work_over_waiting.md`**
+  -- Otto-284 is the fourth-tier fallback below the three-
+  tier priority ladder.
+- **CLAUDE.md auto-mode "don't pick destructive items
+  without you"** -- Otto-284 doesn't override this; it
+  fills the leftover idle time.
+- **Otto-282** *write code from reader perspective* --
+  creative work pays back via the predictive-model
+  benefit (richer pattern repertoire, deeper fluency).
+- **Otto-238** *retractability is a trust vector* -- idle
+  PRs are retractable by design; experiment freely
+  knowing the rollback path exists.
+- **Otto-264** *rule of balance* -- idle-PR work is the
+  counterweight to the structural risk of agent
+  calcification under high-blast-radius wait.
+- **Otto-279** *research counts as history* -- research
+  done in an idle PR can be filed under
+  `docs/research/` as factory artifact; same surface
+  class as any other research.
+
+## CLAUDE.md candidacy
+
+Otto-284 modifies behavior at the heartbeat-idle moment
+-- the moment that recurs every wake. It belongs in the
+same family as the existing CLAUDE.md-elevated rules
+(verify-before-deferring, future-self-not-bound,
+never-be-idle, version-currency). Strong CLAUDE.md
+candidate.
+
+Decision (Otto 2026-04-25, per Otto-283 itself): **defer
+elevation to maintainer discretion** rather than
+self-promoting. Memory entry is sufficient for now;
+revisit at next governance pass.
diff --git a/memory/feedback_write_code_from_reader_perspective_why_did_you_choose_this_otto_282_2026_04_25.md b/memory/feedback_write_code_from_reader_perspective_why_did_you_choose_this_otto_282_2026_04_25.md
new file mode 100644
index 00000000..a22ee067
--- /dev/null
+++ b/memory/feedback_write_code_from_reader_perspective_why_did_you_choose_this_otto_282_2026_04_25.md
@@ -0,0 +1,319 @@
+---
+name: WRITE CODE FROM READER PERSPECTIVE -- every non-obvious choice (magic number, algorithm pick, library selection, threshold value, API signature, perf trade-off, defensive-vs-assertive style) deserves an in-place rationale comment because the future reader will always ask "why did you choose this?"; ~10sec write-time vs ~1hr per re-derivation; subsumes magic-numbers + DST-exempt-justification + trade-off-rationale rules; Aaron Otto-282 2026-04-25 generalising from SplitMix64 multiplier + shift + DST exemption discussions; pre-commit-lint candidate (flag new literals without comments)
+description: Otto-282 general code-authoring discipline. Every non-obvious choice gets a why-did-you-pick-this comment in-place at write time. Subsumes magic-number rationale, DST-exempt justification, trade-off documentation. Bar -- "would a competent reader pause and ask why?" -- if yes, comment it.
+type: feedback
+---
+
+## The rule
+
+**When writing code, think from the perspective of a human
+developer who is reading it for the first time. They will
+always ask: "why did you choose this?" If the answer is not
+obvious from the surrounding code, write the answer in-place
+as a comment -- at write time, not later.**
+
+Aaron's verbatim framing 2026-04-25:
+
+> *"just in general when writing code, think from the
+> perspective of a human developer who's looking at it, they
+> will always ask why did you choose this?"*
+
+## What "non-obvious" looks like
+
+The rule fires on any choice where a competent reader would
+*pause* and wonder why this specific thing was picked over the
+alternatives. Concrete examples that triggered the
+generalization this session:
+
+1. **Magic-number constants.** `0x9E3779B97F4A7C15UL` is
+   meaningless to a reader who hasn't memorised SplitMix64;
+   `floor(2^64 / phi)` (golden-ratio multiplier from
+   Knuth TAOCP §6.4) is not.
+
+2. **Empirically-tuned shift values.** `30 / 27 / 31` in the
+   SplitMix64 finaliser look arbitrary; the comment that says
+   *"chosen by Vigna in arxiv 1410.0530 §3 to maximise
+   avalanche when paired with the multiplier; not
+   independently re-tunable -- they are a unit"* tells the
+   reader they cannot just bump them.
+
+3. **Library / algorithm selection.** Why `XxHash3` and not
+   `MD5` or `SHA-256`? The comment *"deterministic across
+   processes (unlike HashCode.Combine), 5–10× faster than
+   cryptographic hashes, and we don't need cryptographic
+   resistance for shard assignment"* tells the reader that
+   the picker considered alternatives and ruled them out.
+
+4. **Threshold / boundary values.** `min 8 width` in
+   `CountMinSketch.forEpsDelta` -- why 8? Because below 8 the
+   `fastrange` columniser starts to produce too-many
+   collisions to test reliably; the comment encodes that.
+
+5. **API signature shape.** `Add(value: 'T, weight: int64)`
+   instead of `Add(value: 'T)` -- why expose the weight? The
+   docstring saying *"negative weights retract; the sketch
+   lives in ℤ rather than ℕ"* is exactly the rationale the
+   reader needs.
+
+6. **Performance trade-offs.** `let buf = Array.zeroCreate
+   ... ` in a hot path -- is this Gen-0 alloc deliberate or
+   accidental? The comment *"reused per Push; reference impl
+   not hot-path; for hot-path use you'd incrementalise"* tells
+   the reader this is *known and accepted*, not an oversight.
+
+7. **DST-exempt or DST-special markers** (Otto-281
+   counterweight). If you write the words "DST-exempt", you
+   owe the next reader: *what determinism violation, why
+   the cost is acceptable, what the deadline-or-fix is*.
+
+8. **Defensive vs assertive style choices.** A null-check
+   that looks paranoid: *"protects the FFI boundary where
+   our caller may be in C; internal callers cannot reach
+   this branch."*
+
+9. **Off-by-one or bounds tricks.** `(uint64 hash32 * uint64 (uint32 w)) >>> 32`
+   in the CountMin column-mapper looks weird; the
+   *"`fastrange` on 32-bit hash; takes the low 32 bits so
+   the product fits without truncation"* comment in
+   `CountMin.fs` is exactly right.
+
+10. **Concurrency annotations.** `// Thread safety: NOT
+    thread-safe. The buffer is mutated in-place on every Add`
+    is an obvious why -- reader sees a `ResizeArray` and
+    immediately wonders if they can share the sketch. The
+    comment closes the loop.
+
+## What "obvious" looks like (no comment owed)
+
+- A `match` over a discriminated union -- no comment owed
+  unless one branch is unusual.
+- Standard F# conventions like `Result<_, LawViolation>` --
+  the codebase's standard error-result contract is clear
+  from CLAUDE.md.
+- A loop counter `i in 0 .. n - 1`. No mystery.
+- Wrapping a `Dictionary` lookup in `TryGetValue`. Standard.
+- Using the project's standard logger / error type. Standard.
+
+The bar is *"does a competent reader pause and ask why?"* --
+if yes, comment in-place. If no, don't.
+
+## The economic argument
+
+The comment write-time cost is ~10 seconds. The re-derivation
+cost is *~1 hour per reader per visit* -- looking up the paper,
+re-tracing the rationale, talking to the original author (who
+may have left), running git-blame, reading the linked PR. With
+N readers visiting M times, the saving compounds: N × M × ~1hr
+vs ~10s once.
+
+This is true even for the original author six months later --
+**you are also a future reader of your own code**, and you
+will not remember the rationale unless you wrote it down.
+
+## The mental-load-optimization framing -- and the gate
+
+Aaron's deeper framing 2026-04-25 (immediately after the
+in-place rule above):
+
+> *"basically if a human can't answer why they want to
+> refactor until they can, this is a mental load
+> optimization."*
+
+The why-comment rule is best understood as a **cognitive
+externalization**: the rationale moves from in-head working
+memory (volatile, scarce, paid by every visitor) into the
+file (durable, free-on-read). The author pays the cost once;
+every future reader reads the result for free. That is the
+optimization.
+
+But the framing also implies a **gate on action**:
+
+> *"if a human can't answer why they want to refactor
+> [...] until they can"*
+
+If you cannot articulate the reason for a change to
+yourself, you cannot articulate it for the reader either.
+The act of writing the why-comment is also a forcing
+function: if writing the comment surfaces *"I actually
+don't know why I'm doing this"*, the right move is to
+stop and re-evaluate, not to ship the change with a
+hand-wavy comment.
+
+This refines Otto-282 from *"comment your why"* to
+**"if you cannot answer your own why, do not make the
+change"** -- and the comment is the proof that the why
+exists. No comment + no good reason = the change is
+premature.
+
+Two failure modes the gate prevents:
+
+- **Cargo-cult refactor** -- "this looks cleaner" with no
+  articulable reason. Gate fails (no why); should not
+  ship.
+- **Activity-as-progress** โ€” making changes to feel + productive when no actual problem exists. Gate fails + (no why); should not ship. + +## The deeper framing โ€” "makes sense" = "I can predict" + +Aaron pushed the framing one step deeper 2026-04-25: + +> *"if a human can answer why then they can more easily +> predict future outcomes and hold potential behavior +> outcomes in their mind because 'it makes sense' they +> understand why, something making sense and understanding +> why are two closely related human concepts."* + +Translation: **"makes sense" and "understand why" are the +same cognitive primitive** โ€” both describe the state of +having a *predictive model* of the code. When a reader +understands why a choice was made, they can hold the +*space of consequences* in working memory โ€” they can +predict how the code will behave on inputs the test suite +never covered, predict where it will break under future +load, predict which surrounding changes are safe and +which aren't. + +Without the why, the reader has only the *what* โ€” +syntax + behavior on the cases they ran. They can describe +the code but they cannot *predict* it. Surrounding code +changes feel unsafe because every modification is a leap. +The cognitive load of working in the file is high because +each line carries an unsourced "trust me" that the reader +has to either accept blind or re-derive from scratch. + +This is the deeper economic argument: **every line of code +the reader genuinely understands the why of is a line whose +neighborhood they can confidently change.** Lines without a +clear why are blast-radius constraints โ€” you can read them +but you can't safely move around them. The why-comment +isn't just a convenience; it's the substrate that lets a +maintainer *act* in the code at all. + +Composes with intentional-debt and "do nothing if nothing +is broken" feedback rules: the why-comment is the +entry-point check that the change is intentional rather +than reflexive. 
The author's pre-commit moment of *"can I +write a sentence saying why this change exists?"* is the +optimization โ€” once the rationale is articulated, the +reader inherits a model of the code, not just a description +of it. + +The cognitive economics summary: + +| Reader has | Reader can do | +| ----------- | ----------------------------------- | +| WHAT only | Read; describe behavior on tested cases | +| WHY too | Predict; safely change surrounding code | +| Neither | Avoid the file; cargo-cult around it | + +## What this rule SUBSUMES (consolidation) + +This is a general principle that several earlier rules were +already special-casing: + +- *"Comment magic numbers"* โ€” special case of "non-obvious + literal". +- *"DST-exempt comments need full justification"* + (Otto-281) โ€” special case of "non-obvious style choice + with a determinism cost". +- *"Document perf trade-offs"* โ€” special case of + "non-obvious algorithmic choice". +- *"Reference papers / RFCs in docstrings"* โ€” special case + of "answer the why". + +Future rules of this shape can hang off Otto-282 rather than +each becoming its own bullet in CLAUDE.md or +`docs/AGENT-BEST-PRACTICES.md`. The right home is whatever +tag fits โ€” code-style, comment-discipline, or +authoring-perspective. + +## What this rule does NOT mandate + +- **Does NOT mandate verbose comments everywhere.** Code + that is genuinely self-explanatory (good naming, standard + patterns) needs no comment. Adding "this loops over the + list" above `for x in xs do ...` is noise. +- **Does NOT mandate paragraph-length docstrings.** A + one-line *"why this constant: floor(2^64 / phi)"* is + often enough; expand only when the reader genuinely + needs more. +- **Does NOT contradict CLAUDE.md "default to no + comments".** That rule's reasoning is "don't write + comments that explain WHAT well-named code already + shows". Otto-282 is about WHY-comments specifically โ€” + the rationale a reader cannot recover from the code + alone. 
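As a rough illustration of the one-line why-comments described above, here is a minimal sketch of the SplitMix64 finaliser. It is a Python transliteration written for this note, not the project's F# code in `src/Core/SplitMix64.fs`. The names `MASK64`, `GOLDEN_GAMMA`, `mix64`, and `next_value` are hypothetical, and the second multiplier `0x94D049BB133111EB` comes from the standard SplitMix64 definition rather than from this document.

```python
import math

# Why mask: Python ints are unbounded; masking emulates uint64 wraparound.
MASK64 = (1 << 64) - 1

# Why this constant: floor(2^64 / phi), the golden-ratio multiplier
# (multiplicative hashing, Knuth TAOCP section 6.4).
GOLDEN_GAMMA = 0x9E3779B97F4A7C15

# The rationale is checkable: 2^64 / phi == 2^63 * (sqrt(5) - 1).
assert GOLDEN_GAMMA == math.isqrt(5 << 126) - (1 << 63)


def mix64(z):
    """SplitMix64 finaliser: scramble one 64-bit value."""
    # Why 30 / 27 / 31: per the rationale quoted in this document, the
    # shifts were tuned together with the multipliers for avalanche.
    # They are a unit; do not change one of them in isolation.
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def next_value(state):
    """Advance the state by the gamma; return (new_state, output)."""
    state = (state + GOLDEN_GAMMA) & MASK64
    return state, mix64(state)
```

Each comment answers a question a reader would otherwise have to re-derive; none restates what the arithmetic already shows.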
+
+The two rules compose:
+- WHAT: encoded in names + types. Don't comment.
+- WHY: encoded in rationale. Comment when non-obvious.
+
+## Pre-commit-lint candidate
+
+A simple pre-commit lint could flag *new* numeric literals
+that don't have a `// ` comment within ±2 lines. False
+positives are easy (loop bounds, indices), so the lint should
+warn-not-block, and probably exempt small literals (0, 1, -1,
+2, 8, 16, 32, 64) and well-known constants. The lint becomes a
+nudge that asks the author "did you mean to add a rationale
+here?" before commit.
+
+A second lint could flag the words "DST-exempt", "magic
+constant", "TODO: explain", "for now", "temporarily" without a
+following sentence containing "because" / "due to" / "per". A
+comment that announces a non-obvious choice but then doesn't
+explain it is just as bad as no comment.
+
+## The reverse direction: when reading code, ASK why
+
+When reviewing or auditing existing code and you find an
+unexplained non-obvious choice, the right move is **not**
+to leave it (charity) and **not** to delete it (suspicion);
+it is to *ask the author* (or git-blame the original PR) for
+the rationale, then *land the rationale as a comment* in a
+follow-up commit. The audit's job is half "find bugs" and
+half "convert tribal knowledge into documented rationale".
+
+## The case that triggered Otto-282
+
+This session, while doing the comprehensive HashCode.Combine
+audit (Otto-281 follow-up), I:
+
+1. Created `src/Core/SplitMix64.fs` to refactor 8+ inline
+   copies of the SplitMix64 finaliser.
+
+2. **Forgot to comment WHY** the magic constant
+   `0x9E3779B97F4A7C15UL` was picked. Aaron caught it.
+
+3. Then **forgot to comment WHY** the second constant
+   `0xBF58476D1CE4E5B9UL` was picked. Aaron caught it.
+
+4. Then **forgot to comment WHY** the shift values 30/27/31
+   were picked. Aaron caught it.
+
+5. Aaron then generalised: *"just in general when writing
+   code, think from the perspective of a human developer
+   who's looking at it, they will always ask why did you
+   choose this?"*
+
+The pattern: I was treating "well-known to me" as "obvious to
+the reader". That's the bias Otto-282 corrects. The reader
+does not have my Vigna-paper memory. They do not have my
+Knuth TAOCP memory. They have the file in front of them and
+nothing else. Write for *that* reader.
+
+## Composes with
+
+- **Otto-281** *DST-exempt is deferred bug*: special case of
+  "comment the why for any non-obvious choice", specifically
+  for determinism exemptions.
+- **Otto-264** *rule of balance*: every found mistake
+  triggers a counterweight. This memory IS the counterweight
+  to the SplitMix64 magic-number-without-rationale mistake.
+- **CLAUDE.md "default to no comments"**: composes by
+  splitting WHAT (no comment, names suffice) from WHY
+  (comment when non-obvious).
+- **`docs/AGENT-BEST-PRACTICES.md`**: candidate BP-NN row;
+  worth proposing as a stable rule for the
+  agent-best-practices ladder.
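The two lint candidates described in the pre-commit-lint section could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an existing hook: the function names, the regexes, and the choice to treat any `//` on a nearby line as "has a rationale" are all hypothetical. It also scans whatever lines it is given, whereas a real hook would feed it only the lines added in the diff.

```python
import re

# Small literals the note proposes exempting (loop bounds, indices, sizes).
EXEMPT = {"0", "1", "-1", "2", "8", "16", "32", "64"}

# Hex or decimal integer literal, optionally followed by an F#-style
# suffix such as UL. The lookbehind skips digits inside identifiers.
NUMBER = re.compile(r"(?<![\w.])(-?(?:0[xX][0-9a-fA-F]+|\d+))[uUlL]*\b")

MARKERS = re.compile(
    r"DST-exempt|magic constant|TODO: explain|for now|temporarily",
    re.IGNORECASE,
)
REASON = re.compile(r"\b(because|due to|per)\b", re.IGNORECASE)


def lint_magic_numbers(lines, window=2):
    """Warn on numeric literals with no // comment within +/- window lines."""
    warnings = []
    for i, line in enumerate(lines):
        code = line.split("//", 1)[0]  # ignore digits inside the comment
        for match in NUMBER.finditer(code):
            literal = match.group(1)
            if literal in EXEMPT:
                continue
            nearby = lines[max(0, i - window): i + window + 1]
            if not any("//" in n for n in nearby):
                warnings.append((i + 1, literal))
    return warnings


def lint_unexplained_markers(lines):
    """Warn on marker words that are not followed by a stated reason."""
    warnings = []
    for i, line in enumerate(lines):
        comment = line.partition("//")[2]
        marker = MARKERS.search(comment)
        if marker is None:
            continue
        # Look at the rest of this comment plus the next line's comment.
        following = comment[marker.end():]
        if i + 1 < len(lines):
            following += " " + lines[i + 1].partition("//")[2]
        if not REASON.search(following):
            warnings.append((i + 1, marker.group(0)))
    return warnings
```

To stay warn-not-block, a caller would print the returned `(line_number, text)` pairs and still exit with status 0, so the lint nudges the author without failing the commit.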