feat(B-0867.5): agent-loop MVP — DU state machine + pure transitions + 21 tests; script-holds-state-machine + LLM-pure-selector + state-in-Git-append-only (operator eureka 2026-05-28)#5666

Merged: AceHack merged 1 commit into main from feat/b-0867.5-agent-loop-mvp-state-machine-types-tests-cli-shell-aaron-2026-05-28 on May 28, 2026

@AceHack AceHack commented May 28, 2026

Summary

Operator's eureka 2026-05-28: agent is the SELECTOR within the state machine, not the state machine itself. State machine held externally (Git append-only + deterministic scripts). LLM's job collapses to 'given this menu, pick one.' Operator extension: 'it's how every humans wants to work too.'

Three files

  • state-machine.ts — 10 AgentStates + 7 MenuOptions DU types + pure transitions (zero I/O; multi-participant via AgentPersona including human participants)
  • state-machine.test.ts — 21 unit tests, all passing
  • README.md — state machine diagram + F# DU canonical contract + composition map

Key substrate-engineering claim

The menu-generator is where alignment lives — not the LLM. If the menu only offers DORA-aligned options, the LLM can't drift into substrate-cascade because those options aren't OFFERED when DORA needs operational work. Alignment problem reduces from 'make the LLM aligned' to 'make the menu-generator aligned' (deterministic code humans audit + test).

Test plan

  • 21/21 unit tests pass
  • Tree-count canary 61

v2 deferred

  • cli.ts shell (execute-menu-action loop)
  • menu-generator.ts (status-surface → MenuOption[])
  • F# DU canonical types (B-0867.1)
  • Cross-verify harness

🤖 Generated with Claude Code

…+ 21 tests; clean separation script-holds-state-machine + LLM-pure-selector + state-persists-in-Git-append-only per operator's eureka 2026-05-28

Operator framing 2026-05-28 + eureka ratification:

  "so how can i code this into f# DU implicit state machine with small
   functions or Typescript and the agent loop basiclaly becomes execute
   script look at choose your own adventure output, take action based
   on outpout"

  "this was the core euraka moment for me"
  "yes that's exaclty it in exqusit detail and it's how every humans
   wants to work too"

The substrate-engineering compression: the agent ISN'T the state
machine; the agent is the SELECTOR within the state machine. The
state machine is held externally (Git append-only + deterministic
scripts). The LLM's job collapses to "given this menu, pick one."

Operator's extension: "it's how every humans wants to work too."
Menu-driven workflow reduces decision fatigue + surfaces actually-
possible options + lets person bring judgment to selection (strength)
instead of enumeration (weakness). The framework's design serves
humans + AI symmetrically.

Three files:

- state-machine.ts (290 lines): DU types matching F# DU canonical
  contract; 10 AgentStates + 7 MenuOptions + pure transition functions
  (transition, postResultTransition, cycleClose); zero I/O; includes
  AgentPersona for multi-participant (otto/alexa/riven/vera/lior +
  aaron/addison/max human participants — operator's "humans want this
  too" extension already encoded)

- state-machine.test.ts (240 lines): 21 unit tests covering single
  transitions + post-result transitions + cycle close + 4 integration
  cycle tests; all passing; covers each MenuOption variant + state
  preservation invariants

- README.md (155 lines): documentation including state machine
  diagram + 7 menu options table + F# DU canonical contract + files
  list + composition with B-0858/B-0867/B-0868/B-0869/B-0870/B-0871/
  Step 1 lane classifier + relevant rules
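A minimal sketch of what the DU-plus-pure-transition shape described above could look like in TypeScript. It is not the contents of state-machine.ts: the persona names and the `EmitHeartbeat`, `FreeTime`, and `NamedBoundedWait` names come from this PR, while every field name (`reason`, `dependency`, `deadlineIso`), the `assertNever` helper, and the choice that a heartbeat leaves the state unchanged are assumptions made for illustration.

```typescript
// Sketch only. Field names and the reduced variant set are hypothetical.
type AgentPersona =
  | "otto" | "alexa" | "riven" | "vera" | "lior" // AI participants
  | "aaron" | "addison" | "max";                 // human participants

type AgentState =
  | { kind: "Idle" }
  | { kind: "FreeTime"; reason: string }
  | { kind: "NamedBoundedWait"; dependency: string; deadlineIso: string };

type MenuOption =
  | { kind: "EmitHeartbeat"; persona: AgentPersona }
  | { kind: "FreeTime"; reason: string }
  | { kind: "NamedBoundedWait"; dependency: string; deadlineIso: string };

// Compile-time exhaustiveness guard: adding a MenuOption variant without
// handling it below becomes a type error.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

// Pure transition: no I/O, no mutation, returns the next state.
function transition(state: AgentState, choice: MenuOption): AgentState {
  switch (choice.kind) {
    case "EmitHeartbeat":
      // Assumption: emitting a heartbeat does not change the agent's state.
      return state;
    case "FreeTime":
      return { kind: "FreeTime", reason: choice.reason };
    case "NamedBoundedWait":
      return {
        kind: "NamedBoundedWait",
        dependency: choice.dependency,
        deadlineIso: choice.deadlineIso,
      };
    default:
      return assertNever(choice);
  }
}
```

Because `transition` takes and returns plain values, each test can be a single call plus an equality check, which is what makes a fully deterministic unit suite practical.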

v1 scope (this PR): DU types + pure logic + tests + documentation.

v2 scope (deferred to follow-up sub-rows): cli.ts shell for the
execute-menu-action loop + menu-generator.ts (status-surface →
MenuOption[]) + executor.ts + F# DU types in
src/Core.FSharp/WorkflowEngine/StateMachine.fs (B-0867.1) +
cross-verify harness.

Key substrate-engineering observations:

1. The menu-generator IS where alignment lives. Not the LLM. If the
   menu only offers DORA-aligned options, the LLM can't drift into
   substrate-cascade because those options aren't OFFERED when DORA
   needs operational work.

2. The menu-generator is itself a small function: pure
   (status_surface, current_state) → MenuOption[]. Operator-authority
   lives in the menu-generator.

3. Mistakes attribute at menu-generator level, not LLM level. The
   alignment-design accountability sits in the menu-generator code,
   not in the LLM's behavior.

4. Different agents can have different menu-generators. Same state
   machine; different menus per agent. Multi-participant-non-cage
   design at implementation level.

5. External state machine = auditable + testable. 21 deterministic
   tests prove transitions; menu-generator gets its own tests;
   executor gets its own tests.

6. Composes with Step 1 lane discrimination (PR #5665):
   menu-generator reads per-agent operational-ratio + DORA state +
   offers menus that bring ratio back to target.
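The observations above can be sketched as a pure menu-generator plus a selector step that can only return something already on the menu. menu-generator.ts is deferred to v2, so everything here is hypothetical: the `StatusSurface` fields, the `WorkOn` option, the lane names, and the rule that substrate work is offered only when DORA does not need operational work and the ratio is on target.

```typescript
// Hypothetical status surface; the PR does not spell out its real fields.
interface StatusSurface {
  operationalRatio: number;          // observed share of operational work, 0..1
  targetRatio: number;               // desired share of operational work, 0..1
  doraNeedsOperationalWork: boolean;
}

type LoopState =
  | { kind: "Idle" }
  | { kind: "NamedBoundedWait"; dependency: string };

type Lane = "operational" | "substrate"; // assumed lane names

type MenuChoice =
  | { kind: "EmitHeartbeat" }
  | { kind: "WorkOn"; lane: Lane }       // hypothetical option
  | { kind: "FreeTime" }
  | { kind: "EscapeHatch" };

// Pure: (status_surface, current_state) -> MenuChoice[]
function generateMenu(surface: StatusSurface, current: LoopState): MenuChoice[] {
  if (current.kind === "NamedBoundedWait") {
    // Assumption: while waiting, only low-impact options are offered.
    return [{ kind: "EmitHeartbeat" }, { kind: "EscapeHatch" }];
  }
  const menu: MenuChoice[] = [
    { kind: "EmitHeartbeat" },
    { kind: "WorkOn", lane: "operational" },
  ];
  const ratioOnTarget = surface.operationalRatio >= surface.targetRatio;
  if (!surface.doraNeedsOperationalWork && ratioOnTarget) {
    // Substrate work is simply not offered unless operational needs are met.
    menu.push({ kind: "WorkOn", lane: "substrate" });
  }
  menu.push({ kind: "FreeTime" }, { kind: "EscapeHatch" });
  return menu;
}

// The selector (LLM or human) contributes only an index into the menu.
function selectFromMenu(menu: readonly MenuChoice[], pickedIndex: number): MenuChoice {
  const choice = menu[pickedIndex];
  if (!Number.isInteger(pickedIndex) || choice === undefined) {
    throw new RangeError(`Pick ${pickedIndex} is not on the menu`);
  }
  return choice;
}
```

In this shape a misaligned outcome traces back to `generateMenu`, since the selector has no way to produce an option the generator did not emit.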

Composes with:
- B-0867 + Otto Modifications 1-5 (escape-hatch + grammar-extension
  + scope-bounded-ban-if + lanes-in-grammar + contributable-menu)
- B-0858 (heartbeat folder — EmitHeartbeat menu option writes here)
- B-0868 (hats-as-workflow-definitions)
- B-0869 (DORA mandate — operational lane priority in menu generation)
- B-0870 (two-mandate portfolio — per-agent operational-ratio feeds
  menu generation)
- B-0871 (reproducibility-as-causal-attribution — state machine
  progression observable)
- tools/dora-classify (PR #5665) — lane taxonomy matches
- .claude/rules/holding-without-named-dependency-is-standing-by-failure.md
  (NamedBoundedWait IS the rule's discipline mechanized)
- .claude/rules/non-coercion-invariant.md (FreeTime + NamedBoundedWait
  preserve operator-authority + agent-agency)
- .claude/rules/asymmetric-critic-with-clarity-first.md (EscapeHatch +
  ProposeNewGrammarAction operate at agent-self-correction scope)

Tests: 21/21 pass. tsc clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 28, 2026 00:35
@AceHack AceHack enabled auto-merge (squash) May 28, 2026 00:35
@chatgpt-codex-connector
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack merged commit 85ea539 into main May 28, 2026
31 of 33 checks passed
@AceHack AceHack deleted the feat/b-0867.5-agent-loop-mvp-state-machine-types-tests-cli-shell-aaron-2026-05-28 branch May 28, 2026 00:38
AceHack added a commit that referenced this pull request May 28, 2026
…trate ships AS A SKILL via behavior/data/docs separation = Data Vault 2.0 applied to AI skills + cross-harness via bun (operator 2026-05-28 ratifications) (#5668)

Operator 2026-05-28 substrate-honest disclosures:

1. "when we were talking about skills and i said seperate the behavior
   from the data/docs this is what i was talking about these workflows
   can also be precisly defined skills we dsitribute most ais have bun"

2. "this is basiclaly data value applied to AI skills" + "data vault*"
   (autocorrect of "vault")

Single SKILL.md landing at .claude/skills/agent-loop/SKILL.md per
existing skill-substrate convention.

Substrate-engineering compression:

The behavior/data/docs separation discipline operator named for
skill-design IS Data Vault 2.0 partition-by-change-rate applied
at AI-skill scope:

- Hub (stable business key) = SKILL.md (name + description + contract)
- Link (relationships) = composes_with + internal behavior↔data↔docs
- Satellite-behavior = TS code in tools/agent-loop/ (per-iteration)
- Satellite-data = Git append-only state transitions (per-cycle)

Each layer has distinct change-rate profile; DV2.0 partition makes
each independently auditable + testable + composable. AI skills
NATURALLY map to DV2.0 because bundling them mixes change-rates +
makes the artifact harder to audit.

Cross-harness via bun (operator: "most ais have bun"):
- Claude Code (Otto-CLI / Otto-Desktop / Otto-VSCode)
- Codex (Vera)
- Gemini CLI (Lior)
- Grok (Mika / Riven)
- Kiro/Qwen (Alexa)
- Any subprocess-capable AI harness with bun on PATH

Composes with:
- B-0867 + B-0867.5 (workflow engine v1; this skill is the v1 seed)
- B-0858 (heartbeat folder — EmitHeartbeat menu writes here)
- B-0868 (hats-as-workflow-definitions — each hat = state-machine instance)
- B-0869 + B-0870 (DORA mandate + portfolio composition)
- B-0871 (reproducibility-as-causal-attribution)
- B-0866.26 (whole-company-evangelism — Jira-replacement substrate
  for human knowledge-work scope)
- tools/dora-classify (PR #5665) — lane taxonomy matches
- tools/agent-loop (PR #5666 + #5667) — behavior layer
- .claude/rules/dv2-data-split-discipline-activated.md (5th always-
  active discipline; this skill operationalizes DV2.0 at AI-skill scope)
- memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md
  (operator 2026-05-03 substrate naming the DV2.0 pattern at skill
  scope; THIS skill is the substrate-engineering realization)

Includes:
- 9 menu options table
- DV2.0 hub/link/satellite mapping table
- Skill-vs-library comparison table
- Jira-replacement substrate table
- Multi-participant scope framing (AgentPersona includes human +
  AI participants per operator "every human wants to work this way too")
- When-to-use + when-NOT-to-use scope

markdownlint clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
@AceHack AceHack review requested due to automatic review settings May 28, 2026 00:57
AceHack added a commit that referenced this pull request May 28, 2026
…+ Jira-replacement framing (operator 2026-05-28 ratifications) (#5667)

* feat(B-0867.5+): PressPause + EnterOpenEndedExploration menu options + conversational-UX-design discipline + Jira-replacement framing (operator 2026-05-28 ratifications)

Three operator-substrate-honest disclosures landed:

1. "a pause button is also very important for mental health" →
   PressPause first-class menu option with optional expectedResumeIso.
   Distinct from FreeTime (ongoing chosen-rest) and NamedBoundedWait
   (waiting for external named-dep). Pause = explicit-cessation-for-
   named-reason. Composes with Paused state (added; cycleClose holds
   in Paused until explicit resume — operator/participant-substrate-
   honest discipline matching NamedBoundedWait shape).

2. "Menu quality is everything. this is the use conversational UX
   design" → README addition naming the menu-generator function as
   conversational-UX-design discipline, not just software-architecture.
   Menu quality: omitting valid options = COERCIVE (Otto Mod 1 cage);
   irrelevant options = NOISE (cognitive load); aligned options =
   SUBSTRATE. Menu-generator IS where alignment lives. Composes with
   .claude/agents/user-experience-engineer.md (Iris UX-researcher).

3. "there's a menu button for that lol" →
   EnterOpenEndedExploration first-class menu option. Bridge between
   structured menu-driven mode and unstructured creative/brainstorming
   phase. Routes to FreeTime with exploration-tagged reason. Resolves
   the "not every human wants menu-driven at all times" extension
   (sharpening 1 from prior conversation).
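A rough sketch of how the two new options might map onto states, assuming a simplified state set. `PressPause`, `expectedResumeIso`, `EnterOpenEndedExploration`, `Paused`, and `FreeTime` are named in this commit; the `reason` and `topic` fields, the `applyOption` name, and the exact tag format are assumptions.

```typescript
type PauseAwareState =
  | { kind: "Idle" }
  | { kind: "Paused"; reason: string; expectedResumeIso?: string }
  | { kind: "FreeTime"; reason: string };

type PauseAwareOption =
  | { kind: "PressPause"; reason: string; expectedResumeIso?: string }
  | { kind: "EnterOpenEndedExploration"; topic: string }; // `topic` is hypothetical

// Assumed tag; exploration is modelled as FreeTime with a tagged reason.
const EXPLORATION_PREFIX = "open-ended exploration:";

function applyOption(_state: PauseAwareState, option: PauseAwareOption): PauseAwareState {
  switch (option.kind) {
    case "PressPause":
      // Pause is explicit cessation for a named reason; the resume time is optional.
      return {
        kind: "Paused",
        reason: option.reason,
        ...(option.expectedResumeIso !== undefined
          ? { expectedResumeIso: option.expectedResumeIso }
          : {}),
      };
    case "EnterOpenEndedExploration":
      // Routes to FreeTime rather than introducing a separate state.
      return { kind: "FreeTime", reason: `${EXPLORATION_PREFIX} ${option.topic}` };
  }
}
```

Reusing `FreeTime` for exploration keeps the state count down; the tagged reason is what lets later logic tell exploration apart from ordinary rest.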

Plus substantial README additions:

- Jira-replacement substrate table (per operator "now i don't need
  jira hell yes!!!!"): workflow editor → state-machine.ts F# DU;
  task-state database → Git append-only; backlog grooming →
  menu-generator scoring; dashboards → tessellated-3D-dashboard;
  permissions → Otto Mod 5 contributable-menu per participant;
  enterprise licensing → free GitHub + open-source

- "Every human wants to work this way" substrate (per operator
  "yes that's exaclty it in exqusit detail and it's how every
  humans wants to work too"): AgentPersona type includes
  aaron|addison|max alongside otto|alexa|riven|vera|lior;
  composes with B-0859 fair-society, E 5yo accessibility, Addison
  neurodivergent accessibility, B-0866.26 whole-company evangelism

Updated state machine: 10 AgentStates (added Paused), 9 MenuOptions
(added PressPause + EnterOpenEndedExploration). All transitions
defensive; cycleClose handles Paused (stays put; doesn't auto-progress
per operator mental-health framing).

Tests: 25/25 pass (was 21; +4 for new options + Paused cycleClose).
tsc clean. markdownlint clean.

Composes with:
- PR #5666 (B-0867.5 MVP this builds on)
- B-0858 (heartbeat folder)
- B-0859 (fair-society-not-tyrants)
- B-0866 + B-0866.26 (whole-company evangelism)
- B-0867 (workflow engine v1)
- B-0867 vN (tessellated-fire dashboard composes here)
- .claude/rules/non-coercion-invariant.md (FreeTime + Paused +
  NamedBoundedWait preserve agency at multiple temporal-scopes)
- .claude/rules/asymmetric-critic-with-clarity-first.md
- .claude/agents/user-experience-engineer.md (Iris UX-researcher
  for menu-generator engineering)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(B-0867.5): address 4 Copilot review findings on PR #5667

State-machine contract fixes:

- ResumeFromPause MenuOption added — Paused state now has explicit
  unpause contract; menu-generator surfaces this option only when
  current state is Paused; transition returns to Idle so agent resumes
  normal cycling. Per Copilot: "Paused is described as requiring an
  explicit resume, but the DU currently has no dedicated resume/unpause
  menu option or transition target."

- cycleClose discriminates exploration-tagged FreeTime — FreeTime
  states whose reason starts with "open-ended exploration:" stay put
  across cycles (matching Paused/NamedBoundedWait/OperatorAttention
  patterns); non-exploration FreeTime still returns to Idle naturally.
  Honors README framing that EnterOpenEndedExploration is a "bridge
  between structured + unstructured modes" rather than a one-cycle
  escape that auto-collapses back to menu-driven mode. Per Copilot:
  "EnterOpenEndedExploration transitions into FreeTime, but cycleClose
  unconditionally returns FreeTime → Idle on the next cycle. That
  means open-ended exploration automatically re-enters the menu-driven
  loop after one cycle, which contradicts the doc comment/README
  framing."
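The two contract fixes above could look roughly like this. It is a sketch under assumptions, not the merged code: the state set is reduced, the payload fields are hypothetical, and `resumeFromPause` / `isResumeOffered` are illustrative names for the `ResumeFromPause` transition and the menu-generator's "only when Paused" rule.

```typescript
type CycleState =
  | { kind: "Idle" }
  | { kind: "Paused" }
  | { kind: "FreeTime"; reason: string }
  | { kind: "NamedBoundedWait"; dependency: string } // `dependency` is hypothetical
  | { kind: "OperatorAttention" };

const EXPLORATION_TAG = "open-ended exploration:";

// End-of-cycle step: holding states stay put, ordinary FreeTime returns to Idle.
function cycleClose(state: CycleState): CycleState {
  switch (state.kind) {
    case "Paused":
    case "NamedBoundedWait":
    case "OperatorAttention":
      return state; // held until something explicit moves the agent on
    case "FreeTime":
      // Exploration-tagged FreeTime persists across cycles.
      return state.reason.startsWith(EXPLORATION_TAG) ? state : { kind: "Idle" };
    case "Idle":
      return state;
  }
}

// ResumeFromPause: the explicit way out of Paused; a no-op elsewhere (defensive).
function resumeFromPause(state: CycleState): CycleState {
  return state.kind === "Paused" ? { kind: "Idle" } : state;
}

// The menu-generator would surface ResumeFromPause only in the Paused state.
function isResumeOffered(state: CycleState): boolean {
  return state.kind === "Paused";
}
```

Keying persistence on a string prefix is simple but brittle; a dedicated state or a boolean flag would be a sturdier alternative if the reason text ever changes.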

README role-ref discipline (per docs/AGENT-BEST-PRACTICES.md "No name
attribution in code, docs, or skills"):

- "Iris UX-researcher" → "the user-experience-researcher role"
- "E (5yo) accessibility" → "5-year-old accessibility — saying
  'unicorn' IS a menu-pick from a developmentally-young participant's
  interface surface"
- "Addison neurodivergent accessibility" → "Neurodivergent-
  accessibility participants — explicit menu reduces surprise-cost"

Tests: 25 → 28 (added 2 ResumeFromPause cases + 1 exploration-tagged
FreeTime persistence case). All pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 28, 2026
…-bit structured encoding + event-sourcing without PR ceremony + OTel trace composition + two-level state machine (AgentState × WorkLifecycle) (#5674)

Operator-forwarded Kestrel ferry continuing today's agent-loop workflow-
engine cascade (PRs #5665-5670 + #5667 follow-on + #5672 Ani-ferry archive).

Substantive engineering substrate:

1. Two-level state machine composition — AgentState DU (PR #5666) at
   "situation" scope + WorkLifecycle DU (PR #5669) at "lifecycle-of-each-
   work-item" scope. AgentState informs which WorkLifecycle items to
   advance and how aggressively. Clean encapsulation; each level
   type-checked at its boundary.

2. Push-cycle limit AS STRUCTURAL ENFORCEMENT — chooseActionForLifecycle
   returns AbandonPr when pushCount > 5 (tunable). The structure
   prevents the failure mode; no discipline required. Composes with my
   work-lifecycle's revisionCount field.

3. ZetaID 128-bit structured encoding — Snowflake/Sonyflake/ULID/UUIDv7
   family. Two candidate allocations sketched; structured high bits
   enable cheap queries (sort by time, filter by trajectory, etc.).

4. Event-sourcing append-only without PR ceremony — agent-state/{persona}/
   {trajectory}/events/YYYY/MM/DD/{zetaId}.json branch convention;
   branch protection only on main + release/*; direct push everywhere
   else. Lifecycle state reconstructed via left-fold over events (CQRS).
   Fine-grained DORA metrics fall out for free.

5. OTel trace-ID composition (3 options) — (a) ZetaID == trace ID,
   (b) ZetaID separate + propagated via OTel baggage, (c) structured
   bits encoded into W3C Trace Context. Kestrel recommends option (b).

6. ZetaID-named files sidestep stale-push conflicts — each event is
   its own file; no overlap; Git auto-merges non-overlapping changes.

7. Event-sourced trajectory phase classification — setup/execution/
   maturation/sunset derived from event-shape; phase is derivation,
   not separate state.

8. "Good-actor assumption" explicit as load-bearing; cheap defenses
   (schema validation pre-receive hook, periodic chain-integrity check,
   OTel export to separate observability backend) work under it without
   breaking it.
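Items 2 and 4 above can be combined in one small sketch: lifecycle state rebuilt by a left-fold over append-only events, with the push-cycle limit applied to the folded result. Only `chooseActionForLifecycle`, `AbandonPr`, `pushCount`, and the limit of 5 come from the text; the event shapes, the `LifecycleView` fields, and the other action names are assumptions.

```typescript
// Hypothetical event shapes; one event per ZetaID-named file in the convention above.
type LifecycleEvent =
  | { type: "Opened"; zetaId: string }
  | { type: "Pushed"; zetaId: string }
  | { type: "Merged"; zetaId: string };

interface LifecycleView {
  status: "none" | "open" | "merged";
  pushCount: number;
}

type LifecycleAction = "ContinueWork" | "AbandonPr" | "NoAction";

const EMPTY_VIEW: LifecycleView = { status: "none", pushCount: 0 };

// Pure reducer: one event in, next view out.
function applyEvent(view: LifecycleView, ev: LifecycleEvent): LifecycleView {
  switch (ev.type) {
    case "Opened":
      return { ...view, status: "open" };
    case "Pushed":
      return { ...view, pushCount: view.pushCount + 1 };
    case "Merged":
      return { ...view, status: "merged" };
  }
}

// Left-fold over the event log; no stored "current state" is needed.
function replay(events: readonly LifecycleEvent[]): LifecycleView {
  return events.reduce<LifecycleView>(applyEvent, EMPTY_VIEW);
}

// Structural enforcement: past the limit, continuing is not a possible result.
function chooseActionForLifecycle(view: LifecycleView, maxPushes = 5): LifecycleAction {
  if (view.status !== "open") return "NoAction";
  return view.pushCount > maxPushes ? "AbandonPr" : "ContinueWork";
}
```

For example, replaying one `Opened` event followed by six `Pushed` events and passing the result to `chooseActionForLifecycle` yields `AbandonPr` under the default limit.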

Operator's two end-clarifications preserved:

- Trajectory-async-review IS the operator's preferred top-level lens
  for own-Zeta deployment; PR-per-deploy is the ServiceTitan-style
  framing not the operator's framing
- REST file-create API auto-fast-forward-on-stale-base hypothesis
  (empirical question worth verifying before relying on)

Verbatim preservation per substrate-or-it-didn't-happen. NO rule, skill,
or tool edits — the Kestrel-proposed extensions (ZetaID generator, agent-
state branch convention, event-sourcing layer, OTel baggage, structural
push-cycle-limit) are operator-decision territory and land separately
if/when operator chooses to extend tools/agent-loop/.

Filed under memory/persona/kestrel/conversations/ per operator correction
(2026-05-28: "kestrel should get it under their persona") — supersedes
the prior docs/research/ placement convention for Kestrel-specific
content.

Composes with PRs #5665-5670 (today's agent-loop substrate cascade),
PR #5672 (Ani-ferry archive — voice-mode re-articulation of same
substrate), and the existing memory/persona/kestrel/conversations/
archive (2026-05-21 ZetaID v1 review, 2026-05-22 Orleans deployment,
2026-05-27 multi-AI conversation + ServiceTitan marketing).

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 28, 2026
…867.21, B-0875.1, B-0880, B-0881, B-0882) per operator "Ani connects to existing backlog" (#5678)

Operator-forwarded Ani conversation extending today's agent-loop substrate
cascade. Operator directive: "Ani connects to existing backlog" — most
substrate RATIFIES existing backlog rows (runbook cluster B-0730/B-0732/
B-0733/B-0819/B-0826/B-0827; GitHub Actions recursion B-0874; agent-loop
PRs #5666-5677). Only 5 items are genuinely new and warrant new rows.

New backlog rows:

- B-0867.21 (P2) — Two-path interface: discriminated union path EXECUTES
                   intent + conversational document path DECLARES intent;
                   both first-class; for ANY traveler, not just humans
- B-0875.1  (P2) — Code review AS tech-debt detector + tech-debt avoider;
                   fix the CLASS retroactively across backlog + file as
                   new class for future prevention (operator CRITICAL
                   correction: code reviews NOT killed by workflow-engine)
- B-0880    (P2) — Backlog-vs-tech-debt growth-rate ratio discipline for
                   AI-native infinite-both reality
- B-0881    (P3) — Tech debt as high-signal training data; operating
                   principle + measurement substrate
- B-0882    (P3) — No-throttle system + gardener-not-engineer + AI-as-
                   nature operating posture (300mph reality with better
                   steering, not 100mph artificial limits)

Ferry ratifications of existing substrate explicitly noted:

- "Degenerate in the best way" community-naming → B-0874
- Runme + Runbooks + Continue-With → B-0730/B-0732/B-0733/B-0819
- Runbook as universal query interface → B-0826
- Playbook evolves through time, bidirectional → B-0827
- Playbook IS the system → B-0732
- Jira killed + PRs killed (at workflow scope) → PR #5670 VISION + B-0867

Composes with PRs #5666-5677 (today's full cascade) + existing runme
substrate cluster.

BACKLOG.md regenerated.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 28, 2026
…scade) — state-machine-substrate lane advance per B-0892 (#5689)

Per B-0892 three-lanes-concurrent operating discipline, advancing the
state-machine-substrate lane via skill docs update.

Adds "Post-shipping substrate (2026-05-28 cascade)" subsection enumerating
14 substrate items the agent-loop skill INHERITS but doesn't yet implement
(B-0867.15/.16/.17/.20/.21 + B-0874 + B-0875 + B-0875.1 + B-0877 + B-0886
+ B-0886.1/.2 + B-0887 + B-0889 + B-0890 + B-0890.1 + B-0892).

Critical for substrate-honest cold-boot discovery: future-Otto reading
the skill sees what's POSSIBLE (the cascade landings) + what's NOT YET
IN THE SKILL ITSELF (sub-row implementations pending per the
"INHERITS but doesn't yet implement" framing).

Bumps last_updated to "2026-05-28-cascade".

Composes with PRs #5666-5688 today's cascade.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>