20 changes: 10 additions & 10 deletions .claude/commands/btw.md
@@ -4,7 +4,7 @@ description: Non-interrupting aside — absorb the aside into substrate and cont

# /btw — maintainer aside without interrupting in-flight work

The maintainer (Aaron) invoked `/btw` with an aside. The purpose
The human maintainer invoked `/btw` with an aside. The purpose
of this command is to **reduce maintainer interrupt cost**: the
aside carries context, a directive, a note, or a correction,
but should **not** derail whatever work-stream is currently in
@@ -54,12 +54,12 @@ Maintainer directive, 2026-04-22 auto-loop-44:
> persison and abored what i say? becasue then i would not
> have interrupt"*

Translation: Aaron wants a channel for non-interrupting asides.
Without this command, every aside is a full conversation turn
that displaces in-flight work from the agent's working context.
With this command, asides are absorbed and current work
continues — Aaron pays less interrupt cost, agent pays less
context-switch cost.
Translation: the human maintainer wants a channel for
non-interrupting asides. Without this command, every aside is a
full conversation turn that displaces in-flight work from the
agent's working context. With this command, asides are absorbed
and current work continues — the maintainer pays less interrupt
cost, agent pays less context-switch cost.

## Arguments

@@ -115,8 +115,8 @@ Agent: *"Pivoting. Investigating the CI break now."*
- Does NOT treat every aside as a pivot — pivots require
explicit demand in the aside text.
- Does NOT mute the acknowledgement — even one-line
acknowledgement is load-bearing so Aaron sees the aside
landed.
acknowledgement is load-bearing so the maintainer sees the
aside landed.

## Composes with

@@ -126,7 +126,7 @@ Agent: *"Pivoting. Investigating the CI break now."*
— aside signal must be preserved through classification.
- `memory/feedback_maintainer_only_grey_is_bottleneck_agent_judgment_in_grey_zone_2026_04_22.md`
— agent exercises judgment on classification without
serialising through Aaron.
serialising through the maintainer.
- `memory/feedback_never_idle_speculative_work_over_waiting.md`
— an aside doesn't reset the never-idle invariant; the
current work continues.
10 changes: 9 additions & 1 deletion Zeta.sln
@@ -25,7 +25,7 @@ Project("{F2A71F9B-5D33-465A-A702-920D77279786}") = "FactoryDemo.Api.FSharp", "s
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "samples", "samples", "{5D20AA90-6969-D8BD-9DCD-8634F4692FDA}"
EndProject
Project("{F2A71F9B-5D33-465A-A702-920D77279786}") = "ServiceTitanCrm", "samples\ServiceTitanCrm\ServiceTitanCrm.fsproj", "{D44AB9CA-F491-41F4-96CE-B061238F3D6E}"
Project("{F2A71F9B-5D33-465A-A702-920D77279786}") = "CrmSample", "samples\CrmSample\CrmSample.fsproj", "{D44AB9CA-F491-41F4-96CE-B061238F3D6E}"
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "src", "src", "{827E0CD3-B72D-47B6-A68D-7590B98EB39B}"
EndProject
@@ -157,8 +157,16 @@ Global
{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}.Release|Any CPU.Build.0 = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|Any CPU.Build.0 = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|x64.ActiveCfg = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|x64.Build.0 = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|x86.ActiveCfg = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Debug|x86.Build.0 = Debug|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|Any CPU.ActiveCfg = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|Any CPU.Build.0 = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|x64.ActiveCfg = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|x64.Build.0 = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|x86.ActiveCfg = Release|Any CPU
{40534D09-439E-4E5F-9A69-A73844DB674D}.Release|x86.Build.0 = Release|Any CPU
{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}.Release|x64.ActiveCfg = Release|Any CPU
{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}.Release|x64.Build.0 = Release|Any CPU
{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}.Release|x86.ActiveCfg = Release|Any CPU
98 changes: 50 additions & 48 deletions docs/operator-input-quality-log.md
@@ -1,16 +1,16 @@
# Operator-input quality log

**Status:** per Aaron 2026-04-22 auto-loop-43 directive.
**Status:** per maintainer 2026-04-22 auto-loop-43 directive.
**Purpose:** score the quality of inputs arriving from the
human operator (Aaron) and from operator-adjacent sources
(research drops, recommended videos, third-party tooling
Aaron forwards). Symmetric counterpart to
human maintainer and from operator-adjacent sources
(research drops, recommended videos, third-party tooling the
maintainer forwards). Symmetric counterpart to
`docs/force-multiplication-log.md` — that log measures signal
going *from* factory to operator; this log measures signal
going *to* factory from operator.

**Reframe — this is a teaching loop, not just a retrospective
scorecard.** Aaron, same tick:
scorecard.** Maintainer, same tick:

> *"this is teach opportunity"*
>
@@ -19,41 +19,42 @@ scorecard.** Aaron, same tick:
> *"if my qualit is low you teach me if its high i teach you"*

The quality score determines the **direction of teaching**.
Low-quality Aaron input (low signal density, ambiguous,
Low-quality maintainer input (low signal density, ambiguous,
unverifiable, under-specified) → the factory **teaches
Aaron**: surfaces the ambiguity, proposes the
the maintainer**: surfaces the ambiguity, proposes the
better-structured version, explains what would have made
the input actionable. High-quality Aaron input (compressed,
anchor-rich, novel, verifiable) → Aaron is **teaching the
factory**: absorb as direction, update substrate, let the
factory's model of what-Aaron-wants evolve toward the new
signal. The log is *how the factory decides which direction
the input actionable. High-quality maintainer input
(compressed, anchor-rich, novel, verifiable) → the
maintainer is **teaching the factory**: absorb as direction,
update substrate, let the factory's model of
what-the-maintainer-wants evolve toward the new signal. The
log is *how the factory decides which direction
to teach in*. A quality row is not a verdict — it's the
pedagogical direction-setter for that input.

Default posture: **not symmetric in effort**. Teaching Aaron
happens in chat (terse, present-tense: *"I read this as X
because of ambiguity in clause Y — did you mean Z?"*).
Teaching the factory happens in substrate (memory / BACKLOG
/ research doc). The *information flows both ways naturally*,
as Aaron put it — the quality score picks which one is the
right move this tick.
Default posture: **not symmetric in effort**. Teaching the
maintainer happens in chat (terse, present-tense: *"I read
this as X because of ambiguity in clause Y — did you mean
Z?"*). Teaching the factory happens in substrate (memory /
BACKLOG / research doc). The *information flows both ways
naturally*, as the maintainer put it — the quality score
picks which one is the right move this tick.

**Meta-perspective — either direction grows Zeta.** Aaron,
same tick:
**Meta-perspective — either direction grows Zeta.**
Maintainer, same tick:

> *"eaither way Zeta grows"*
>
> *"i think from the meta persepetive most of the time"*

Whichever direction teaching flows in, the factory grows.
Aaron teaching factory → substrate absorbs higher-quality
signal → factory's model of what-Aaron-wants sharpens.
Factory teaching Aaron → Aaron's input quality trends
Maintainer teaching factory → substrate absorbs higher-quality
signal → factory's model of what-the-maintainer-wants sharpens.
Factory teaching maintainer → maintainer's input quality trends
up over time → future ticks absorb sharper signal → the
teaching-factory direction accelerates. The loop has no
dissipation direction; the meta-property is **growth
via either flow**. Aaron qualifies with *"most of the time"*
via either flow**. The *"most of the time"* qualifier
— the claim is strong-but-not-universal, acknowledging
the occasional absorption that grows neither side (e.g. pure
retrospective calibration). But most of the time
@@ -65,7 +66,7 @@ and not just a housekeeping artifact.

## The directive

Aaron, 2026-04-22 auto-loop-43:
Maintainer, 2026-04-22 auto-loop-43:

> *"can you tell me how the quality of that research you
> received was?"*
@@ -99,33 +100,34 @@ which dimensions mattered most for *this kind* of input.
Not every operator message gets a row. Score only inputs
that are **load-bearing enough to absorb into substrate**
(research doc, memory edit, BACKLOG row, ADR, code change).
Terse Aaron directives that land as memories get scored
Terse maintainer directives that land as memories get scored
because they direct factory work. Casual chat does not.

- **A: Maintainer direct** — Aaron types a directive
directly.
- **B: Maintainer forwarded** — Aaron forwards a tweet,
video timestamp, article, conversation overheard.
- **A: Maintainer direct** — the maintainer types a
directive directly.
- **B: Maintainer forwarded** — the maintainer forwards a
tweet, video timestamp, article, conversation overheard.
- **C: Maintainer-dropped research** — deposits into
`drop/` (OpenAI Deep Research, Gemini outputs, etc.).
- **D: Maintainer-requested capability** — he asks the
factory to check / build / verify something.
- **D: Maintainer-requested capability** — a check / build
/ verify ask for the factory.

## Running log

Newest-first.

| Date | Source | Class | What | Signal | Action | Specif | Novelty | Verif | Risk | Overall | Notes |
|------------|---------------------|-------|---------------------------------------------------------------------------------------------------------------------------|--------|--------|--------|---------|-------|------|---------|-------|
| 2026-04-22 | Aaron direct | A | ARC-3 adversarial three-role loop (creator/adversary/player) as scoring mechanism for emulator absorption; symmetric quality loop; SOTA-changes-daily | 5 | 3 | 4 | 5 | 3 | 4 | **4.5** | Four compressed messages, high leverage; directionally verifiable (ARC-3, POET, OMNI literature exists); scope-binding not yet authorized — six open questions blocking implementation. |
| 2026-04-22 | Aaron direct | A | Operator-input quality-log directive (this log's origin) | 5 | 5 | 5 | 4 | 5 | 5 | **4.8** | Self-evidencing — the directive's value is confirmed the moment we act on it. Low load-bearing risk because the log is additive and can be retracted. |
| 2026-04-22 | Aaron direct | A | Drop-zone protocol (`drop/` folder with gitignore-except-sentinel; binary-type registry; absorb-then-delete cadence) | 5 | 5 | 4 | 4 | 5 | 5 | **4.7** | Two compressed messages; the follow-up ("binaries never get checked in / untracked with a single tracked file") was unusually well-specified in one sentence. Immediately implementable. |
| 2026-04-22 | Maintainer direct | A | ARC-3 adversarial three-role loop (creator/adversary/player) as scoring mechanism for emulator absorption; symmetric quality loop; SOTA-changes-daily | 5 | 3 | 4 | 5 | 3 | 4 | **4.5** | Four compressed messages, high leverage; directionally verifiable (ARC-3, POET, OMNI literature exists); scope-binding not yet authorized — six open questions blocking implementation. |
| 2026-04-22 | Maintainer direct | A | Operator-input quality-log directive (this log's origin) | 5 | 5 | 5 | 4 | 5 | 5 | **4.8** | Self-evidencing — the directive's value is confirmed the moment we act on it. Low load-bearing risk because the log is additive and can be retracted. |
| 2026-04-22 | Maintainer direct | A | Drop-zone protocol (`drop/` folder with gitignore-except-sentinel; binary-type registry; absorb-then-delete cadence) | 5 | 5 | 4 | 4 | 5 | 5 | **4.7** | Two compressed messages; the follow-up ("binaries never get checked in / untracked with a single tracked file") was unusually well-specified in one sentence. Immediately implementable. |
| 2026-04-22 | Drop (Deep Research)| C | `deep-research-report.md` — Lucent-vs-AceHack comparison + 7-layer oracle-gate design + Aurora branding-clearance analysis | 4 | 3 | 4 | 3 | 2 | 3 | **3.5** | See "Inaugural grading" section below for full rationale. B+ / 8/10. Useful starting point; verification-first on specifics. |
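
For illustration only, one way to picture a running-log row as a data structure is sketched below. The field names mirror the table's column headers; the `QualityRow` type, the `to_markdown` helper, and the 1 to 5 range check on each dimension are assumptions made for this sketch and are not part of the change under review. The Overall value is stored as entered, since the log treats it as a judgment summary rather than a computed figure.

```python
from dataclasses import dataclass

# Column names taken from the running-log header, lower-cased.
DIMENSIONS = ("signal", "action", "specif", "novelty", "verif", "risk")


@dataclass(frozen=True)
class QualityRow:
    """Hypothetical in-memory form of one running-log row."""

    date: str
    source: str
    input_class: str  # one of the classes "A" to "D" listed above
    what: str
    signal: int
    action: int
    specif: int
    novelty: int
    verif: int
    risk: int
    overall: float  # judgment summary entered by hand, not derived here
    notes: str

    def __post_init__(self) -> None:
        if self.input_class not in ("A", "B", "C", "D"):
            raise ValueError(f"unknown input class: {self.input_class!r}")
        # Assumption: each dimension is scored on a 1 to 5 scale.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} out of range: {value}")

    def to_markdown(self) -> str:
        """Render the row in the same pipe-delimited shape as the log table."""
        cells = [
            self.date,
            self.source,
            self.input_class,
            self.what,
            *(str(getattr(self, name)) for name in DIMENSIONS),
            f"**{self.overall:.1f}**",
            self.notes,
        ]
        return "| " + " | ".join(cells) + " |"
```

A row built this way can be appended to the top of the table to keep the newest-first ordering.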

## Inaugural grading — `deep-research-report.md`

Aaron's first question (*"can you tell me how the quality of
that research you received was?"*) is answered here in full.
The maintainer's first question (*"can you tell me how the
quality of that research you received was?"*) is answered
here in full.

### What the report did well

@@ -185,8 +187,8 @@ that research you received was?"*) is answered here in full.
Treat as design sketch, not drop-in.
- **Brand decision treated as settled.** The report writes
as if "Aurora" is the already-chosen successor-project
name. That's not established on our side — it could be
Aaron's choice, the research tool's suggestion, or a
name. That's not established on our side — it could be the
maintainer's choice, the research tool's suggestion, or a
carried-forward assumption from the source documents the
tool was given. The branding section cannot be
load-bearing without that clarified.
@@ -212,7 +214,7 @@ that research you received was?"*) is answered here in full.
Lucent-vs-AceHack table (our own `git log` / file
enumeration).
- **Don't lift without more context:** Aurora as brand
decision (Aaron confirmation needed), recommended
decision (maintainer confirmation needed), recommended
Aurora work items (`docs/adr/oracle-gate.md` etc. —
useful as naming, but we'll author them to our own
conventions not the report's).
@@ -235,9 +237,9 @@ wholesale.

As the log grows, watch for:

- **Do Aaron-direct A-class inputs consistently score
- **Do maintainer-direct A-class inputs consistently score
higher than C-class research drops?** If yes, the
factory should prioritise Aaron-direct processing
factory should prioritise maintainer-direct processing
over research-drop absorption when both are in flight.
- **Do forwarded-from-X-source B-class inputs cluster
by source?** If all "YouTube wink" inputs score low
@@ -256,9 +258,9 @@ overall score:

| Band | Overall | Direction | How it lands |
|---------------|-----------|---------------------------------------------------|-----------------------------------------------------------|
| Factory teaches Aaron | 1.0 – 2.4 | Factory surfaces ambiguity, proposes better form | Chat reply: *"I read this as X because of Y — did you mean Z?"* |
| Factory teaches maintainer | 1.0 – 2.4 | Factory surfaces ambiguity, proposes better form | Chat reply: *"I read this as X because of Y — did you mean Z?"* |
| Bidirectional | 2.5 – 3.9 | Absorb what's clear, ask on what isn't | Partial substrate land + open-questions section in doc |
| Aaron teaches factory | 4.0 – 5.0 | Absorb as direction, update substrate | Substrate landing (memory / BACKLOG / research / ADR) |
| Maintainer teaches factory | 4.0 – 5.0 | Absorb as direction, update substrate | Substrate landing (memory / BACKLOG / research / ADR) |
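
As a minimal sketch, the band table could be read as a lookup from an Overall score to a teaching direction. The `Direction` enum and `teaching_direction` function are hypothetical names introduced here. The table lists bands to one decimal place and does not say where a value such as 2.45 belongs, so the cut-offs below (compare against the lower bound of the next band) are an assumption.

```python
from enum import Enum


class Direction(Enum):
    """Band labels copied from the table; the enum itself is illustrative."""

    FACTORY_TEACHES_MAINTAINER = "Factory teaches maintainer"
    BIDIRECTIONAL = "Bidirectional"
    MAINTAINER_TEACHES_FACTORY = "Maintainer teaches factory"


def teaching_direction(overall: float) -> Direction:
    """Map an Overall score in [1.0, 5.0] to its band."""
    if not 1.0 <= overall <= 5.0:
        raise ValueError(f"overall score out of range: {overall}")
    # Assumption: values between listed bands (for example 2.45) fall into
    # the lower band, because each test uses the next band's lower bound.
    if overall < 2.5:
        return Direction.FACTORY_TEACHES_MAINTAINER
    if overall < 4.0:
        return Direction.BIDIRECTIONAL
    return Direction.MAINTAINER_TEACHES_FACTORY
```

Under this reading, the 3.5 research drop in the running log lands in the bidirectional band and the three 4.5+ maintainer directives land in the top band. The result is a starting suggestion only and does not decide whether an input is absorbed.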

The bands are guidance, not gates. A 2.8 "bidirectional"
input that happens to clarify a long-running architectural
@@ -269,7 +271,7 @@ The log's Overall column is a judgment summary (see

## What this log does NOT do

- Does not score Aaron as a person. Scores **inputs**.
- Does not score the maintainer as a person. Scores **inputs**.
- Does not gatekeep absorption. Low-score inputs still get
absorbed if they land in scope; the score is signal to
future-self about how much to trust wholesale.
@@ -283,8 +285,8 @@ The log's Overall column is a judgment summary (see
- `docs/force-multiplication-log.md` — the symmetric
counterpart (factory → operator signal quality).
- `memory/feedback_aaron_terse_directives_high_leverage_do_not_underweight.md`
— why terse Aaron messages score well on signal density
despite low word count.
— why terse maintainer messages score well on signal
density despite low word count.
- `memory/feedback_signal_in_signal_out_clean_or_better_dsp_discipline.md`
— the clean-or-better invariant this log measures
against.