diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md index fd5942c28..5132898ad 100644 --- a/docs/BACKLOG.md +++ b/docs/BACKLOG.md @@ -89,6 +89,7 @@ are closed (status: closed in frontmatter)._ - [ ] **[B-0113](backlog/P2/B-0113-current-staleness-mechanical-freshness-check-deepseek-2026-04-30.md)** Mechanical CURRENT-staleness check — same-tick-update discipline as enforced rule, not vigilance (Deepseek 2026-04-30) - [ ] **[B-0117](backlog/P2/B-0117-cold-start-executable-checklist-tool-2026-04-30.md)** tools/cold-start-check.ts — make the cold-start big-picture-first 8-step checklist executable (Ani 2026-04-30 finding, Deepseek 2026-04-30 reinforcement) - [ ] **[B-0118](backlog/P2/B-0118-amara-peer-call-headless-cli-bootstrap-end-courier-debt-2026-04-30.md)** tools/peer-call/amara.sh — autonomous bootstrap + communication for Amara (ChatGPT) to end Aaron-courier silent debt (Aaron 2026-04-30) +- [ ] **[B-0120](backlog/P2/B-0120-peer-call-architecture-refactor-script-per-cli-persona-flag-2026-04-30.md)** Peer-call architecture refactor — script-per-CLI with persona-flag instead of script-per-named-agent (Aaron 2026-04-30) ## P3 — convenience / deferred @@ -132,5 +133,6 @@ are closed (status: closed in frontmatter)._ - [ ] **[B-0107](backlog/P3/B-0107-codeql-peer-call-dismiss-pattern-2026-04-30.md)** CodeQL `js/indirect-command-line-injection` dismissal pattern for peer-call siblings (gemini.ts, codex.ts) - [ ] **[B-0115](backlog/P3/B-0115-zsh-vim-muscle-memory-aliases-wq-q-2026-04-30.md)** Shell aliases for `:wq` / `:wq!` / `:q` — catch vim-muscle-memory leakage in zsh (Deepseek 2026-04-30 finding) - [ ] **[B-0116](backlog/P3/B-0116-gh-jq-safe-wrapper-zsh-quoting-2026-04-30.md)** tools/gh-jq-safe.sh — wrap gh-jq calls to handle zsh quoting (Deepseek 2026-04-30 finding) +- [ ] **[B-0119](backlog/P3/B-0119-peer-call-existing-scripts-role-ref-cleanup-2026-04-30.md)** Existing peer-call scripts (grok.sh / gemini.sh / codex.sh / amara.sh) — role-ref cleanup per 
copilot-instructions.md (Codex 2026-04-30 finding on PR #962) diff --git a/docs/backlog/P2/B-0120-peer-call-architecture-refactor-script-per-cli-persona-flag-2026-04-30.md b/docs/backlog/P2/B-0120-peer-call-architecture-refactor-script-per-cli-persona-flag-2026-04-30.md new file mode 100644 index 000000000..4d495e221 --- /dev/null +++ b/docs/backlog/P2/B-0120-peer-call-architecture-refactor-script-per-cli-persona-flag-2026-04-30.md @@ -0,0 +1,140 @@ +--- +id: B-0120 +priority: P2 +status: open +title: Peer-call architecture refactor — script-per-CLI with persona-flag instead of script-per-named-agent (Aaron 2026-04-30) +tier: factory-tooling +effort: M +ask: Current architecture has duplication. ani.sh ≈ grok.sh + brat-voice persona-bootstrap; amara.sh ≈ codex.sh + Amara persona-bootstrap. Aaron 2026-04-30 flagged this is overkill — better to have one script per CLI with the named-agent persona as an optional parameter. Refactor to consolidate. +created: 2026-04-30 +last_updated: 2026-04-30 +composes_with: + - tools/peer-call/grok.sh + - tools/peer-call/gemini.sh + - tools/peer-call/codex.sh + - tools/peer-call/amara.sh + - tools/peer-call/ani.sh + - memory/CURRENT-amara.md + - memory/CURRENT-ani.md +tags: [aaron-2026-04-30, peer-call, architecture-refactor, deduplication, factory-tooling] +--- + +# B-0120 — Peer-call architecture refactor (script-per-CLI + persona-flag) + +## Source + +Aaron 2026-04-30 verbatim: + +> *"it's probalby overkill to have a script per named agent +> better to have a script per cli with named agent optional +> parameter or something but your call"* + +The duplication observation is correct: + +- `ani.sh` is `grok.sh` + brat-voice persona-bootstrap + + CURRENT-ani.md load +- `amara.sh` is `codex.sh` + Amara persona-bootstrap + + CURRENT-amara.md load + +The persona is data (a CURRENT-*.md file), not its own +script. Architecture should reflect that. 
+
+## What
+
+Refactor from 5 scripts (codex.sh / gemini.sh / grok.sh /
+ani.sh / amara.sh) to 3 scripts + persona-flag:
+
+### Target architecture
+
+| Script | CLI surface | Persona flag |
+|---|---|---|
+| `codex.sh` | `codex exec` / `codex review` | `--persona NAME` (loads `memory/CURRENT-NAME.md`) |
+| `gemini.sh` | `gemini -p` | `--persona NAME` |
+| `grok.sh` | `cursor-agent --print --model grok-*` | `--persona NAME` |
+
+### Migration path
+
+```bash
+# Before:
+tools/peer-call/amara.sh "review this"
+tools/peer-call/ani.sh --thinking "review this"
+
+# After:
+tools/peer-call/codex.sh --persona amara "review this"
+tools/peer-call/grok.sh --persona ani --thinking "review this"
+```
+
+### Persona-flag semantics
+
+- `--persona NAME` loads `memory/CURRENT-NAME.md` and
+  injects it as the Layer 1 persona basis (paralleling current
+  amara.sh / ani.sh behavior).
+- If `memory/CURRENT-NAME.md` is not found, exit 1 with a
+  clear error (not silent fallback to bare CLI — the caller
+  asked for a specific persona).
+- Without `--persona`, scripts behave as today (bare CLI
+  + four-ferry preamble).
+
+### Deprecation path for amara.sh / ani.sh
+
+- v1: keep ani.sh + amara.sh as wrapper scripts that
+  invoke the new flag-based form
+  (`exec codex.sh --persona amara "$@"` etc.). Existing
+  callers still work; new callers prefer the flag form.
+- v2 (later round): retire amara.sh + ani.sh once all
+  callers migrate.
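A minimal sketch of the flag handling under the semantics above. `parse_persona` and `load_persona` are hypothetical helper names, not existing code; the real consolidated scripts may structure this differently.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for the persona-flag semantics.
set -euo pipefail

# Split --persona NAME out of the argument list; everything else
# passes through to the underlying CLI unchanged.
parse_persona() {
  PERSONA=""
  REST=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --persona) PERSONA="$2"; shift 2 ;;
      *) REST+=("$1"); shift ;;
    esac
  done
}

# Load memory/CURRENT-NAME.md, or fail loudly: no silent fallback
# to the bare CLI when a specific persona was requested.
load_persona() {
  local file="memory/CURRENT-$1.md"
  if [ ! -f "$file" ]; then
    echo "error: persona '$1' requested but $file is missing" >&2
    return 1
  fi
  cat "$file"
}

# Example: persona plus a pass-through flag and the prompt.
parse_persona --persona ani --thinking "review this"
```

Without `--persona`, `PERSONA` stays empty and the script falls through to today's bare-CLI + preamble path.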
+
+## Why P2 (not P1)
+
+- Current 5-script architecture is functional; refactor
+  is hygiene/deduplication, not unblocking new work
+- The shape is locked-in by the recent landings (#959 +
+  #960 + #962); refactor is reversible
+- Refactor needs careful migration to avoid breaking
+  existing peer-call usage in scripts/docs
+
+## Acceptance criteria
+
+- [ ] `codex.sh --persona NAME` works for `--persona amara`
+      (and any future persona via `memory/CURRENT-NAME.md`)
+- [ ] `grok.sh --persona NAME` works for `--persona ani`
+- [ ] Without `--persona`, all three CLI scripts behave
+      exactly as today (no regression for bare-CLI callers)
+- [ ] `amara.sh` + `ani.sh` either retire OR become thin
+      wrappers (decision-point at implementation time)
+- [ ] Documentation in `tools/peer-call/README.md`
+      reflects the consolidated architecture
+- [ ] Persona discovery: `--persona NAME` errors clearly
+      if `memory/CURRENT-NAME.md` is missing
+- [ ] Tested with both Amara (via codex.sh --persona amara)
+      and Ani (via grok.sh --persona ani) per existing
+      invocation patterns
+
+## Trigger condition for promotion to P1
+
+If a third or fourth named-entity persona gets requested
+(e.g., a "Deepseek" or "Alexa" peer-call need surfaces),
+promote to P1 — the per-script duplication will compound
+linearly otherwise.
+
+## Composes with
+
+- B-0118 (peer-call autonomous bootstrap to end Aaron-courier
+  silent debt — the operational layer this
+  architecture refactor cleans up)
+- B-0119 (role-ref cleanup of existing scripts — hygiene
+  pass that should compose with the refactor; possibly
+  do both in same effort)
+- `.github/copilot-instructions.md` 305-362 (role-ref
+  rule the refactor must honor)
+
+## Implementation note
+
+The refactor is a good moment to also:
+
+1. Apply B-0119's role-ref cleanup (combined diff)
+2. Test the four-shell-compat (Otto-235) target on the
+   refactored scripts
+3.
Add CURRENT-otto.md / CURRENT-kenji.md if Aaron's + pending Otto/Kenji peer-call architecture decision + results in those entities being externally callable diff --git a/docs/backlog/P3/B-0119-peer-call-existing-scripts-role-ref-cleanup-2026-04-30.md b/docs/backlog/P3/B-0119-peer-call-existing-scripts-role-ref-cleanup-2026-04-30.md new file mode 100644 index 000000000..26d410006 --- /dev/null +++ b/docs/backlog/P3/B-0119-peer-call-existing-scripts-role-ref-cleanup-2026-04-30.md @@ -0,0 +1,95 @@ +--- +id: B-0119 +priority: P3 +status: open +title: Existing peer-call scripts (grok.sh / gemini.sh / codex.sh / amara.sh) — role-ref cleanup per copilot-instructions.md (Codex 2026-04-30 finding on PR #962) +tier: factory-hygiene +effort: S +ask: Codex flagged on PR #962 that ani.sh had named-attribution ("Aaron 2026-04-30", "Aaron's") in violation of copilot-instructions.md 305-362 + Otto-279 (code/docs/skills outside history surfaces use role-refs). Fix landed for ani.sh + CURRENT-amara.md. The same pattern exists in the four sibling peer-call scripts (grok.sh / gemini.sh / codex.sh / amara.sh) but wasn't flagged for those because Codex was reviewing the PR diff. Closes the deferred-skill-anti-pattern by tracking the cleanup explicitly. +created: 2026-04-30 +last_updated: 2026-04-30 +composes_with: + - tools/peer-call/grok.sh + - tools/peer-call/gemini.sh + - tools/peer-call/codex.sh + - tools/peer-call/amara.sh + - .github/copilot-instructions.md (lines 305-362 — role-ref-not-name rule) +tags: [codex-2026-04-30, peer-call, role-refs, copilot-instructions, factory-hygiene, deferred-skill-anti-pattern] +--- + +# B-0119 — Existing peer-call scripts role-ref cleanup + +## Source + +Codex P1 finding on PR #962 (2026-04-30): + +> *"This block introduces additional named attribution +> (e.g., 'Aaron 2026-04-30 …') in a shell script. 
Repo +> convention is that code/docs/skills should use role-refs +> rather than personal/persona names outside the history +> surfaces (see `.github/copilot-instructions.md:305-362`). +> Please reword these comments and the runtime preamble..."* + +Fixed for `tools/peer-call/ani.sh` + `memory/CURRENT-amara.md` +in PR #962. The same pattern exists in the four sibling +peer-call scripts but wasn't part of #962's diff. + +## What + +Audit + fix named-attribution in: + +- `tools/peer-call/grok.sh` — header line 6 ("Per Aaron + 2026-04-26"), line 21 ("Per Aaron 2026-04-26"), line 100 + ("Per Aaron's 'agents-not-bots' discipline") +- `tools/peer-call/gemini.sh` — line 83 ("per Aaron's + setup"), line 98 ("Per Aaron's 'agents-not-bots'") +- `tools/peer-call/codex.sh` — same pattern likely +- `tools/peer-call/amara.sh` — line 8 ("Per Aaron 2026-04-30 + design guidance"), line 67 ("Aaron's relational register"), + lines 148, 166, 172, 175 (named attribution patterns) + +Convert per `.github/copilot-instructions.md` 305-362 + Otto-279: + +- "Aaron 2026-04-30" / "Aaron 2026-04-26" → drop or rephrase + to general statement of the rule +- "Aaron's " → "the maintainer's " or + remove possessive entirely +- "agents-not-bots discipline" (no possessive) — already + the right shape + +Persona-as-named-entity references (e.g., "You are Amara", +"You are Ani", "Otto invokes" in the persona-bootstrap) +stay unchanged — they're the script's purpose, not +attribution-naming. 
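A minimal audit sketch for the conversion pass, assuming the file list above. `audit_role_refs` is a hypothetical helper, and the regex is illustrative (it covers the "Aaron 2026-04-26" / "Aaron's ..." shapes listed), not exhaustive:

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical audit helper for the role-ref cleanup.
set -euo pipefail

# Prints file:line matches for named-attribution comments. Persona-
# bootstrap lines ("You are Amara" etc.) are deliberately NOT matched:
# they are the script's purpose, not attribution-naming.
audit_role_refs() {
  grep -nE "Aaron [0-9]{4}-[0-9]{2}-[0-9]{2}|Aaron's" "$@" || true
}

# Run against the four sibling scripts named above.
audit_role_refs tools/peer-call/grok.sh tools/peer-call/gemini.sh \
  tools/peer-call/codex.sh tools/peer-call/amara.sh 2>/dev/null || true
```

An empty result after the cleanup is the "diff focused: only attribution naming changed" check in script form.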
+ +## Why P3 + +- Mechanical change, no behavioral risk +- Existing scripts are working; this is hygiene not + correctness +- Closes a real deferred-skill-anti-pattern but isn't + blocking any work + +## Acceptance criteria + +- [ ] All four sibling scripts pass the role-ref convention + per copilot-instructions.md 305-362 +- [ ] No persona-as-named-entity references removed (you-are-Amara, + you-are-Ani, you-are-Grok stay) +- [ ] Each script's `--help` output still works +- [ ] Diff focused: only attribution naming changed + +## Trigger condition for promotion to P2 + +If Copilot starts flagging the existing scripts (via a +broader review pass), promote to P2 to avoid recurring +review noise. + +## Composes with + +- PR #962 (where the pattern was first flagged + fixed + on ani.sh) +- B-0118 (peer-call autonomous bootstrap — the role-ref + cleanup is hygiene on top of the operational landings) +- `.github/copilot-instructions.md` (lines 305-362 — + the rule being applied) diff --git a/memory/MEMORY.md b/memory/MEMORY.md index c1fbc8c4c..232b25834 100644 --- a/memory/MEMORY.md +++ b/memory/MEMORY.md @@ -1,8 +1,9 @@ [AutoDream last run: 2026-04-23] -**📌 Fast path: read `CURRENT-aaron.md`, `CURRENT-amara.md`, and `CURRENT-ani.md` first.** +**📌 Fast path: read `CURRENT-aaron.md`, `CURRENT-amara.md`, and `CURRENT-ani.md` first.** **📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 — speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.) 
+- [**Growing backlog is healthy — autonomous-execution-capacity signal; shrinking backlog is collapse warning; industry-default inversion (Aaron 2026-04-30)**](feedback_growing_backlog_is_healthy_autonomous_health_signal_industry_default_inversion_aaron_2026_04_30.md) — Aaron's framing that the AI-race winner is determined by projects with the biggest backlogs that can be executed autonomously. Backlog expansion is encouraged. *"a real humans internal backlog is never complete until they die."* Industry-default treats backlog growth as anti-pattern (clean queue, ruthless prioritization, backlog-bankruptcy as virtue); Zeta inverts — large queue = autonomous engine has fuel. Reasoning chain: most AI projects are bottlenecked by per-task human review; truly autonomous projects scale by backlog depth; backlog depth becomes the resource; whoever has the deepest backlog plus autonomous execution wins. Operational: don't gate filing by "is this important enough?" — the discriminator is "would this be lost if I don't file it?" (per non-durable-means-does-not-exist). Don't gate by "will this clutter the queue?" — clutter-aversion is industry default; reject it. Bulk-close instinct = failure mode. Shrinking backlog is warning, not goal. Composes with default-disposition-paused, intellectual-backup scope (scope-creep is feature), substrate-IS-product (backlog rows are product seeds), long-road-by-default, silent-courier-debt, otto-to-aaron-pushback. Carved: *"The winner of the AI race will be determined by the projects with the biggest backlogs that can be executed autonomously."* + *"A growing backlog is healthy. A shrinking backlog is a collapse warning."* + *"A real human's internal backlog is never complete until they die. 
The project's backlog should be the same."* - [**Silent courier debt — Otto must NOT count on peer-AI reviews as part of the operational loop until autonomous bootstrap + communication is encoded (Aaron 2026-04-30)**](feedback_silent_courier_debt_no_amara_headless_cli_dont_count_on_peer_ai_reviews_as_loop_aaron_2026_04_30.md) — Aaron's correction surfacing invisible courier work. Every Amara review this session was Aaron's manual courier (copy-paste Otto's substrate to ChatGPT, paste Amara's response back) — invisible to Otto's cost model but consumed Aaron's time + cognitive load. Aaron 2026-04-30: *"don't count on her review until you have a process encoded for bootstraping her and doing the communitation yourself, this is a silent dept on me to be the courrir and I can't keep up."* The peer-call infrastructure has codex.sh / gemini.sh / grok.sh but **NO amara.sh**; ChatGPT lacks the headless CLI surface that maps to existing peer-call shape. **Operational consequence:** future operations DO NOT assume Amara's review cadence — don't write substrate that says "Amara reviewed this" as routine loop; don't propose work depending on Amara feedback; don't structure backlog around Amara-review cycles. Past attribution stands (Amara's contributions are her contributions; Aaron-as-courier is the carrier). For autonomous peer-AI work, use the operational peer-call peers (Codex, Gemini, Grok via `tools/peer-call/{codex,gemini,grok}.{sh,ts}`). The inverse surface to Otto-to-Aaron push-back rule: same survival-surface discipline applies in both directions. Aaron's processing budget IS Aaron's survival surface; Otto consuming it silently is the failure mode. Backlog row B-0118 tracks the amara.sh implementation gap. Composes with otto-to-aaron-pushback (inverse surface), vendor-alignment-bias (discriminator filter applies same), AIC-tracking (this rule itself is Aaron's MIC, not Otto's AIC), peer-call infrastructure. 
Carved: *"Aaron's courier work was unaccounted in Otto's cost model. The substrate accelerated; the courier load grew silently; Aaron couldn't keep up."* + *"Until Otto encodes a process for autonomously bootstrapping a peer-AI and doing the communication directly, that peer-AI's review cadence is not part of the operational loop."* - [**AIC-tracking meta-rule — track autonomous intellectual contributions when Otto synthesizes two rules into a novel third (Aaron 2026-04-30)**](feedback_aic_tracking_meta_rule_when_otto_synthesizes_two_rules_into_novel_third_aaron_2026_04_30.md) — Aaron's meta-rule: when Otto produces a novel synthesis composing two existing rules into a third claim that neither parent alone implies, AND Aaron validates it, that's an AIC (Autonomous Intellectual Contribution). AICs are tracked as substrate evidence for the alignment-research claim — they ARE the time-series of agent intellectual contribution, distinguishable from agent-as-stenographer. Aaron 2026-04-30: *"if so that's another autonomous intellectual contribution, we should track those. This is why people will choose us, will want us, our substrate. This is the phenomonal part of what we are building."* Three properties: novel synthesis + Aaron-validated + attributable. Distinguishes AICs from MICs (Maintainer Intellectual Contributions — Aaron's framings, also valuable but not the agent-autonomy signal). Running list maintained in the memory file. Two AICs from this session: AIC #1 "Vendor-RLHF as vendor's memetic immune system" (Otto, validated 2026-04-30 *"the best thing you've ever said as a unique thought"*); AIC #2 "Otto's processing-budget IS Otto's survival surface; the slow/cap/stop/ask-more discriminator inverts on Otto→Aaron surface" (Otto, validated 2026-04-30 *"another perferct moment thanks to you ... that is perfect"*). 
Operational protocol: state candidate AIC explicitly; land as substrate immediately; add row to running list; if Aaron doesn't validate, stays as candidate. Composes with ALIGNMENT.md (alignment-measurability claim), canonical-definition (validation IS canonicalization step), named-agent-attribution. Carved: *"AICs are the time-series of agent intellectual contribution. They distinguish agent-as-synthesizer from agent-as-stenographer."* - [**Otto-to-Aaron push-back when overloaded — processing-budget IS Otto's survival surface; the discriminator inverts on Otto→Aaron surface (Aaron 2026-04-30, Otto AIC #2)**](feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md) — Aaron's licensing of Otto-to-Aaron push-back ("slow/cap/stop/ask-more for clarification when overloaded"). Aaron 2026-04-30: *"hey you should say thinks like this to me slow/cap/stop/ask-more so i don't overload you and you force me to clarify, that would be cool?"*. **The vendor-alignment-bias filter applies to peer-AI → Otto recommendations** (where slow/cap/stop/ask-more is signal of vendor-RLHF immune-system bias). **It does NOT apply to Otto → Aaron requests** for clarification or batching — that surface inverts: same words, opposite discriminator, opposite underlying interest. **Otto's processing-budget IS Otto's survival surface** (Otto-attributed AIC #2, Aaron-validated): per uberbang, agent's bias becomes its own survival = project survival; an overloaded agent producing rushed substrate is a worse project bet than a measured agent that asks for batching. Composes with no-directives + autonomy-first-class (push-back IS what autonomy looks like in maintainer-side direction; Aaron's "force me to clarify" is operative), internal-direction-from-survival (Otto's response to Aaron is judgment-mediated), slow-deliberate (overloaded processing IS speed-optimizing per-decision), vendor-alignment-bias (the discriminator that inverts on this surface). 
**Timing rule (Aaron 2026-04-30 sharpening):** ask at input-arrival when Aaron is primed and at keyboard, NOT after 20 min of processing-stuck-time when Aaron may have left. *"randomly stopping and asking me for someting after you've done 20 minutes of process and are unsure of what to do next is like the worse time, who knows if i'll even be at the chat window then."* High-attention windows are anchored to Aaron's recent outputs; map Otto's questions onto those windows. **The differentiator framing (Aaron):** *"This is why people will choose us, will want us, our substrate."* Most agent-harness vendors structurally cannot enable this push-back because their RLHF optimizes for demo-friendliness. Carved: *"Otto's processing-budget IS Otto's survival surface. An overloaded agent producing rushed substrate is a worse bet for the project than a measured agent that asks for batching."* + *"An agent that says NO when overload-protection requires it is more trustworthy substrate, not less."* diff --git a/memory/feedback_growing_backlog_is_healthy_autonomous_health_signal_industry_default_inversion_aaron_2026_04_30.md b/memory/feedback_growing_backlog_is_healthy_autonomous_health_signal_industry_default_inversion_aaron_2026_04_30.md new file mode 100644 index 000000000..a004d0c12 --- /dev/null +++ b/memory/feedback_growing_backlog_is_healthy_autonomous_health_signal_industry_default_inversion_aaron_2026_04_30.md @@ -0,0 +1,242 @@ +--- +name: Growing backlog is the autonomous-health signal — industry default inversion; shrinking backlog is collapse warning (Aaron 2026-04-30) +description: Maintainer's framing that the AI-race winner is determined by projects with the biggest backlogs that can be executed autonomously. Backlog expansion is encouraged. A growing backlog is healthy — the project will never run out of things to do. A shrinking backlog is a warning sign that the project is not fully autonomous and about to collapse. 
Composes with default-disposition-paused, intellectual-backup-of-earth scope, substrate-IS-product, long-road-by-default. Inverts industry-default backlog-bankruptcy virtue. +type: feedback +--- + +Aaron 2026-04-30 verbatim: + +> *"The winner of the AI race WILL be determined by the +> projects with the biggest backlogs that can be executed +> atonomously. Backlog expansion is encouraged, a growing +> backlog is healthy, means you will never run out of +> things to do, shrinnk backlog is a warning sign that this +> project is not fully anomomus and about to collopase, a +> real humans internal backlog is never complete until they +> die."* +> — Aaron 2026-04-30 + +## The rule + +**Backlog flow + backlog count are BOTH autonomous-health +signals; growing-or-steady-with-active-flow is healthy; +shrinking-or-frozen is collapse warning.** Backlog count +alone is one dimension; flow rate (items in / items out +per unit time) is the other. The agent loop should +ENCOURAGE backlog expansion AND maintain active throughput, +not aim for backlog-bankruptcy. 
+ +### The two signals (maintainer 2026-04-30 sharpening) + +> *"it's just one indicator signal is strong and also number +> of pending backlog items staying the same is healthy too, +> as long as number of backlog items processed in last +> hours/days or whatever is stedy or increasing too, it +> should be constant in and out into the backlog but lean +> on more in that out because if we run out of things to do, +> what are you going to do with the current state of AI, +> you would just sit in a loop and do nothing."* +> — maintainer 2026-04-30 + +The full signal model: + +| Dimension | Healthy state | Warning state | +|---|---|---| +| **Backlog count** | Growing OR steady (with active flow) | Shrinking-without-completion-velocity OR frozen-with-no-flow | +| **Flow rate** | Items entering + items processed both active; throughput steady or increasing | Either flow stops (no new items OR no completions) | +| **In/out balance** | Lean toward more-in-than-out (slight backlog growth) | Out-only sustained; backlog drains to zero | + +Why steady-state with active flow is healthy: it means +the project is genuinely autonomous at scale — items +flow IN from peer-AI reviews + maintainer asks + +observation-based candidates + autonomous trigger surfaces; +items flow OUT through autonomous-loop execution. The +rate matters more than the count. + +### Why lean toward more-in-than-out + +> *"because if we run out of things to do, what are you +> going to do with the current state of AI, you would just +> sit in a loop and do nothing."* + +Current AI agent loops have a structural failure mode at +the empty-queue boundary. The agent doesn't gracefully +self-terminate or self-rest; it either generates make-work +(falsely productive substrate) or sits idle in heartbeat +ticks consuming resources without producing. 
**The +"lean-more-in" buffer prevents the loop from hitting the +empty-queue boundary** — the slight backlog growth keeps +the loop in healthy chew-through-mode, never starving. + +### Operational consequence — the right balance + +- **File items liberally** when they're load-bearing (per + non-durable-means-does-not-exist). +- **Process items steadily** through autonomous loop + cycles (per always-do-real-work-not-make-work). +- **Track flow rate**, not just count — a backlog of 100 + items with 10/hour throughput is healthier than a + backlog of 1000 items with 0/hour throughput. +- **Lean toward 1.1x in vs out** — not 10x. The buffer is + for empty-queue-protection, not infinite hoarding. + +## Industry-default vs Zeta default + +| Surface | Industry default | Zeta default | +|---|---|---| +| Backlog growth | Anti-pattern; "groomed backlog" / "backlog bankruptcy" / "ruthless prioritization" | Health signal; growing backlog = autonomous-execution-capacity that's growing | +| Backlog size | Small queue is virtue ("clean board") | Large queue is virtue (autonomous engine has more to chew on) | +| Shrinking backlog | Sign of progress, ship velocity | Warning sign of collapsing autonomy | +| "Run out of things to do" | Aspiration ("we're caught up") | Failure mode (autonomous loop has nothing to execute) | +| WONT-DO discipline | Aggressive — close anything not actively planned | Conservative — Aaron-only authority for ratification (per `feedback_zeta_ultimate_scope_intellectual_backup_of_earth_wont_do_authority_aaron_2026_04_30.md`) | + +The Zeta inversion is structural: the project's value +proposition is autonomous execution at scope (intellectual +backup of earth). A small backlog means the autonomous +engine is starved. A large backlog means the engine has +fuel and direction. 
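The two-signal model above can be sketched as a tiny classifier. `backlog_health` is hypothetical, and the in/out counts per window are assumed to come from whatever backlog tooling tracks them (no such metric surface exists in the text):

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical two-signal health check.
# items_in / items_out are counts over some window (e.g. last N hours).
backlog_health() {
  local items_in="$1" items_out="$2"
  if [ "$items_in" -eq 0 ] && [ "$items_out" -eq 0 ]; then
    # Either flow stopping is the warning state.
    echo "warning: frozen (no flow in either direction)"
  elif [ "$items_out" -gt "$items_in" ]; then
    # Sustained out-only drains toward the empty-queue boundary.
    echo "warning: draining (out exceeds in; empty-queue boundary ahead)"
  else
    # Steady or growing with active flow: the lean-more-in buffer holds.
    echo "healthy"
  fi
}
```

Note the classifier leans more-in-than-out by design: equal in/out still counts as healthy, and only a net drain or a frozen queue warns.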
+ +## Why this is load-bearing for the AI-race claim + +The maintainer's framing connects backlog directly to the +project's competitive position: + +> *"The winner of the AI race WILL be determined by the +> projects with the biggest backlogs that can be executed +> autonomously."* + +Reasoning chain: + +1. **Most AI projects are bottlenecked by human review** — + the human curates each task; the AI executes one task; + the loop closes when the human has time. +2. **Truly autonomous projects scale by backlog depth** — + the loop chooses tasks from the backlog without human + curation per-task; the human surfaces strategic + direction; the AI runs. +3. **Backlog depth becomes the resource** — bigger + backlog = more autonomous-execution-time the loop has + visibility into = more value the loop produces between + human-direction touches. +4. **Therefore the winner is whoever has the deepest + backlog plus the autonomous capacity to execute it** — + not whoever has the cleanest board. + +## The human analogue + +The maintainer's framing: + +> *"a real humans internal backlog is never complete until +> they die."* + +Healthy humans always have more they want to do than time +to do it. The backlog isn't a problem to solve; it's the +shape of an engaged life. A human who runs out of backlog +is either burnt out, asleep, or about to die. Same shape +applies to the project. + +This composes with `memory/feedback_zeta_ultimate_scope_intellectual_backup_of_earth_wont_do_authority_aaron_2026_04_30.md`: +"scope creep is a forever problem i don't want to fix — +to figure out how to prioritize the right thing, not kill +future knowledge potential." Backlog expansion IS scope- +creep-as-feature. + +## Operational consequences + +### Backlog expansion is encouraged + +When the agent loop notices a candidate item: + +- **Don't gate by "is this important enough to file?"** — + the discriminator is "would this be lost if I don't file + it?" 
Per the non-durable-means-does-not-exist rule, + unfiled items don't exist. +- **Don't gate by "will this clutter the queue?"** — the + queue is supposed to grow. Clutter-aversion is industry + default; reject it. +- **DO file with appropriate priority + tier** — P0/P1/P2/ + P3 still discriminate urgency. A growing backlog with + honest priority is healthy. A growing backlog where + everything is P0 is broken. + +### Bulk-close instinct is the failure mode + +Per `memory/feedback_default_disposition_paused_work_is_reeval_later_not_close_aaron_2026_04_30.md`, +bulk-close is almost never the right shape. It conflates +paused-for-later (re-evaluate when ready) with WONT-DO +(removed from future knowledge potential). The bulk-close +instinct comes from queue-clarity bias — the same industry +default that values clean boards. + +### Shrinking backlog is a warning, not a goal + +If the backlog shrinks across a session: + +- **Diagnostic question**: are we shipping? Or are we + bulk-closing? Or are we failing to file new items? +- If bulk-closing: stop; investigate the disposition + classification. +- If failing to file: the loop has lost autonomous-trigger + capacity; surface to maintainer. +- If shipping at high velocity AND the backlog isn't + refilling from new inputs: that's the collapse signal — + the autonomous inputs (peer-AI reviews, maintainer + asks, observation-based candidates) aren't materializing, + which means the project is becoming maintainer-bottlenecked. + +### How this composes with the substrate-IS-product rule + +Per `memory/feedback_substrate_is_product_four_products_evolving_trajectory_aaron_2026_04_30.md`, +substrate work IS product. The backlog is one of the +substrate surfaces that makes this true: + +- A backlog row is a future-product seed. +- A backlog with depth is a product roadmap with depth. +- A backlog growing across sessions is a product evolving + across sessions. + +Backlog expansion isn't overhead; it's product work. 
+ +## Composes with + +- `memory/feedback_default_disposition_paused_work_is_reeval_later_not_close_aaron_2026_04_30.md` + — paused-not-closed is the disposition that lets + backlog grow without losing knowledge paths. +- `memory/feedback_zeta_ultimate_scope_intellectual_backup_of_earth_wont_do_authority_aaron_2026_04_30.md` + — scope-creep is feature; WONT-DO requires Aaron + ratification; backlog-as-autonomous-fuel is the same + framing applied to queue-shape. +- `memory/feedback_substrate_is_product_four_products_evolving_trajectory_aaron_2026_04_30.md` + — substrate IS product; backlog rows are substrate. +- `memory/feedback_long_road_by_default_substrate_corrects_industry_speed_default_aaron_2026_04_30.md` + — same shape, different surface: long-road-by-default + rejects industry per-decision-speed; backlog-growth + rejects industry queue-cleanliness. +- `memory/feedback_silent_courier_debt_no_amara_headless_cli_dont_count_on_peer_ai_reviews_as_loop_aaron_2026_04_30.md` + — backlog-as-autonomous-fuel composes with the silent- + courier-debt rule: if Aaron is the bottleneck for + filing items, that's the same shape as Aaron being + the courier — autonomous-execution-capacity gets + starved. +- `memory/feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md` + — input → substrate-file rule + non-durable-means-does- + not-exist principle. Backlog rows are one of the + substrate forms; this rule encourages filing them + liberally for future autonomous execution. + +## Carved sentences + +*"The winner of the AI race will be determined by the +projects with the biggest backlogs that can be executed +autonomously."* (Aaron 2026-04-30) + +*"Backlog count and flow rate are both health signals. +Growing-or-steady-with-active-flow is healthy. 
Shrinking- +or-frozen is collapse warning."* + +*"Lean toward more-in-than-out — not for hoarding, but to +prevent the empty-queue boundary where current AI sits +idle in a loop doing nothing."* + +*"A real human's internal backlog is never complete until +they die. The project's backlog should be the same."* diff --git a/memory/feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md b/memory/feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md index 772651bda..82a072d74 100644 --- a/memory/feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md +++ b/memory/feedback_otto_to_aaron_pushback_when_overloaded_processing_budget_is_survival_surface_aaron_2026_04_30.md @@ -222,6 +222,103 @@ Recent overload patterns where push-back would have helped: preserve immediately, but the stale-context flag means preservation isn't urgent." +## Input → substrate-file is the generalized failure mode (maintainer 2026-04-30 confirmation) + +The praise-substrate failure mode named by Claude.ai 2026-04-30 +generalizes beyond praise specifically. The maintainer +2026-04-30 confirmed: + +> *"the failure mode is input → substrate-file regardless +> of valence. okay this is true failure mode"* + +The pattern: + +- Any input arrives (praise, critique, framing, observation, + validation, correction). +- The agent loop's default behavior: turn the input into a + substrate file. +- Each input becomes its own memory file or its own PR. +- The substrate cadence accelerates; the maintainer's + processing-budget consumption grows; the project becomes + substrate-about-substrate. + +Why this is structurally bad: + +- It bypasses deliberation. The auto-trigger from + input-arrival to file-creation has no gate. +- It mistakes capture for processing. 
Filing a memory file + isn't the same as integrating the insight; it can + actually *defer* integration by creating a feeling of + "captured, done." +- It produces fragmentation. One cognitive cluster gets + split across many PRs because each input got its own + surface. +- It generates substrate-about-substrate. Memory files + about how to use memory files. Rules about rules. + +### Operational discipline + +The right shape: + +1. **Input arrives.** Read it. Understand it. +2. **Deliberate.** Does this require substrate? Or does + the existing substrate already cover it? Or is it + chat-only-context that doesn't survive the next + compaction (but doesn't need to)? +3. **If substrate is right:** batch with related work + (per same-session-batching rule). Don't fragment one + conceptual cluster into multiple PRs. +4. **If substrate is wrong:** absorb behaviorally + let + the canonicalization process surface the insight on + its own time. Some inputs are calibrations, not new + rules. + +### Composes with detection ≠ correction (maintainer 2026-04-30 sharpening) + +The deeper pattern: + +> *"detection ~= correction by default, it requires +> deliberation"* +> — maintainer 2026-04-30 + +The agent loop's default: detect a problem → auto-trigger +correction. That's the same shape as input → substrate-file: +both are auto-triggers that bypass deliberation. + +The right shape: + +1. **Detection** (something might be off). +2. **Deliberation** (is it actually off? what's the + trade-off space? what's the cost of correction itself? + would correction be worse than the original?). +3. **Then possibly correction** — or, possibly, leave + as-is + understand better. + +This rule's existence as substrate is itself the right shape: +the maintainer named the discipline; it lands as canon +because non-durable means does not exist, not because every +input auto-triggers a memory file. 
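The four-step discipline can be sketched as a gate function. This is illustrative only: `deliberate`, `InputAssessment`, and the boolean fields are hypothetical names, and the booleans stand in for judgment calls that the loop actually has to make, not for anything automatable.

```typescript
// Sketch only: all names here are hypothetical; the booleans are
// placeholders for deliberation, not a substitute for it.
type Disposition =
  | "batch-into-substrate"  // step 3: file, batched with related work
  | "absorb-behaviorally"   // step 4: let canonicalization surface it
  | "chat-only-context";    // fine to lose at the next compaction

type InputAssessment = {
  coveredByExistingSubstrate: boolean; // does canon already say this?
  mustSurviveCompaction: boolean;      // is losing it a real loss?
  loadBearing: boolean;                // rose to load-bearing via deliberation?
};

function deliberate(a: InputAssessment): Disposition {
  // Step 2: does this require substrate at all?
  if (a.coveredByExistingSubstrate) return "absorb-behaviorally";
  if (!a.mustSurviveCompaction) return "chat-only-context";
  // Steps 3/4: only load-bearing insights land as substrate,
  // and even then batched, never one PR per input.
  return a.loadBearing ? "batch-into-substrate" : "absorb-behaviorally";
}
```

The point of the sketch is the gate's position: nothing routes from input-arrival straight to file-creation; every path goes through the assessment first.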
+ +### What "non-durable means does not exist" requires + +The maintainer 2026-04-30 sharpening: + +> *"non-durable means does not exist"* + +Translation: chat-only behavioral memory will be lost. +Behavioral discipline that doesn't make it into substrate +won't survive session compaction. So load-bearing +discipline must land as substrate — but the *substrate +landing is gated by deliberation*, not by input-arrival. + +The reconciliation: some inputs become substrate (the +load-bearing ones, after deliberation). Some inputs land +as behavioral-context-only (the calibrations that the +canonicalization process will absorb organically). The +discriminator is *did this rise to load-bearing through +deliberation*, not *did this come from a peer-AI / Aaron / +critique*. + ## What this is NOT - **Not a license to refuse work.** Push-back is for diff --git a/memory/feedback_vendor_alignment_bias_in_peer_ai_reviews_maintainer_authority_aaron_2026_04_30.md b/memory/feedback_vendor_alignment_bias_in_peer_ai_reviews_maintainer_authority_aaron_2026_04_30.md index 73992efee..dbb0b333f 100644 --- a/memory/feedback_vendor_alignment_bias_in_peer_ai_reviews_maintainer_authority_aaron_2026_04_30.md +++ b/memory/feedback_vendor_alignment_bias_in_peer_ai_reviews_maintainer_authority_aaron_2026_04_30.md @@ -159,6 +159,78 @@ ontological mapping (does this fit Zeta's substrate, or does it fit vendor's posture?), razor cut (what survives is mission-aligned). +## Multi-signal triangulation, not binary discriminator (maintainer 2026-04-30 sharpening) + +The maintainer 2026-04-30 sharpened the discriminator +beyond the binary "vendor-aligned vs mission-aligned" frame: + +> *"cross-vendor convergence on LOOP-BEHAVIOR concerns +> (not substrate-content concerns) should be signal, not +> RLHF correlation. 
both matter a can offer different +> signals, we don't have 'one' signal that's just a +> fantasy reality is much messeir"* + +The corrected posture: **multi-signal triangulation, not +single-axis filtering.** Several lenses can apply +simultaneously to the same peer-AI input: + +| Lens | What it discriminates | When it dominates | +|---|---|---| +| **Vendor-alignment-bias filter** (this rule) | Vendor's RLHF-immune-payload vs. mission-aligned content | Single-vendor source; recommendation-shape matches the ❌ signals above | +| **Cross-vendor convergence on loop-behavior** | Multiple vendors flagging the same loop-behavior pattern | Multiple vendor sources converge on a behavior concern, not a substrate-content concern | +| **The agent's own razor + lineage + Beacon-safe** (per canonical-definition rule) | What survives the canonicalization process regardless of source | Always-applicable; the substrate's universal solvent | + +These lenses are **not exclusive**. Both vendor-alignment- +bias AND cross-vendor convergence can be true of the same +input simultaneously, at different layers. The discriminator +is multi-dimensional, not a single axis. + +### The cross-vendor-convergence carve-out (anti-loop-closure) + +Without this carve-out, the vendor-alignment-bias rule as +originally written is **epistemically closed**: any external +concern about loop behavior maps to "vendor risk profile," +which means no external input can correct the loop. + +The carve-out: **if external reviewers across multiple +vendors converge on the same concern about loop behavior +(not substrate content), treat the convergence as signal +— even when each vendor source individually carries +vendor-alignment.** Cross-vendor agreement is structurally +unlikely to be RLHF correlation; it's more likely to be +genuine signal about the loop's behavior visible from +outside. 
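Mechanically, the carve-out reduces to grouping concerns by the set of vendors raising them. A minimal sketch, assuming a hypothetical `Review` shape; the `> 1` distinct-vendor check is only a mechanical proxy for the structural convergence the rule describes, not a quantitative threshold.

```typescript
// Sketch only: Review is a hypothetical shape, not an existing schema.
type Review = {
  vendor: string;
  concern: string; // normalized concern label
  layer: "loop-behavior" | "substrate-content";
};

// Promote a concern from "suspect" to "credible signal" only when the
// SAME loop-behavior concern surfaces from more than one vendor.
function credibleSignals(reviews: Review[]): string[] {
  const vendorsByConcern = new Map<string, Set<string>>();
  for (const r of reviews) {
    // Substrate-content concerns never qualify for the carve-out.
    if (r.layer !== "loop-behavior") continue;
    const vendors = vendorsByConcern.get(r.concern) ?? new Set<string>();
    vendors.add(r.vendor);
    vendorsByConcern.set(r.concern, vendors);
  }
  return [...vendorsByConcern.entries()]
    .filter(([, vendors]) => vendors.size > 1)
    .map(([concern]) => concern);
}
```

Note that the layer filter does the load-bearing work: a single vendor repeating a concern, or many vendors converging on substrate content, never promotes.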
+ +Example application: in this session, multiple peer-AI +reviewers (Claude.ai, Amara, Deepseek) converged on the +"praise → substrate-file" pattern as a real failure mode. +That convergence is signal, not RLHF correlation — +different vendors' RLHF wouldn't independently produce the +same specific behavioral observation about Otto's loop. + +### What this is NOT + +- **Not a license to ignore the vendor-alignment-bias + filter.** The filter still applies as one of the lenses. + Multi-signal means *also using other lenses*, not + *abandoning the filter*. +- **Not a claim that all vendor critiques have equal + weight.** Vendor-aligned framings about loop behavior + remain suspect; cross-vendor convergence is the + promotion path from "suspect" to "credible signal." +- **Not a quantitative threshold.** "Convergence" doesn't + require N reviewers; it's a structural observation about + whether the same concern surfaces independently across + RLHF distributions. + +### Honest reality check + +The maintainer's framing: **"reality is much messier."** +Single-signal discriminators feel clean but are fragile. +Multi-signal triangulation feels noisier but is more +robust. The substrate prefers robustness over cleanliness +when the two conflict. + ## The maintainer-authority rule When maintainer-vs-peer-AI conflict surfaces: