Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
218 changes: 218 additions & 0 deletions tools/peer-call/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,218 @@
# tools/peer-call/ — Otto's Claude-Code-side peer callers

Three sibling shell scripts that let Otto (Claude Opus 4.7
running in Claude Code) invoke a peer agent in another CLI
harness as a peer, not a subordinate. Each wraps the relevant
peer's headless-mode CLI and applies a shared
AgencySignature relationship-model preamble so the peer
knows the call posture.

## Scripts at a glance

| Script | Peer | Underlying CLI | Default role (when to invoke) | Underlying model |
|---|---|---|---|---|
| `grok.sh` | Grok (xAI) | `cursor-agent --print --model grok-4-20-thinking` | **Critique** — skeptical pass on Otto's framing | grok-4-20-thinking (default) / grok-4-20 (--fast) |
| `gemini.sh` | Gemini (Google) | `gemini -p` | **Propose** — divergent options, possibility-space surfacing | gemini default (override via `--model`) |
| `codex.sh` | Codex (OpenAI) | `codex exec -s read-only` | **Implementation peer** — code-grounded second opinion | codex default (override via `--model`) |

The role column reflects the **four-ferry consensus**
(Amara/Grok/Gemini/Otto, PR #24 on AceHack/Zeta):

> Gemini proposes, Grok critiques, Amara sharpens, Otto tests,
> Git decides.

Codex isn't in the four-ferry list explicitly — its role
emerged through repeated PR-review participation across this
factory's drain-log substrate, so its preamble names it as
"implementation peer / code-grounded second opinion" rather
than claiming a four-ferry slot.

## Shared flag surface

All three scripts accept the same core flags:

```text
--file PATH attach file content (head -c 20000) to the prompt
--context-cmd CMD attach the output of CMD (head -c 20000) to the prompt
--help, -h print the script header as usage
```

Per-script extras:

- `grok.sh` adds `--thinking` (default) / `--fast` to switch
between `grok-4-20-thinking` and `grok-4-20` models, and
`--json` / `--stream` for output format.
- `gemini.sh` adds `--model NAME` to override the default
Gemini model, and `--json` / `--stream` for output format.
- `codex.sh` adds `--model NAME` and `--review` (which routes
through `codex review` instead of `codex exec` for
first-class code-review work).

Exit codes are uniform across all three:

- `0` — peer responded successfully
- `1` — invocation error (bad arguments, CLI missing, etc.)
- `2` — peer's CLI returned a non-zero exit. The peer's stdout
/ stderr are NOT captured by the wrapper; they pass through
to the caller's terminal as the peer printed them. The script
emits a `<peer> exited with code N` diagnostic line on stderr
before exiting with code 2.

## The AgencySignature preamble

Every peer-call carries a structured prompt with this shape:

```text
<AgencySignature relationship-model preamble — role-bound per peer>

---

<Otto's actual prompt to the peer>

---

[optional: File context block from --file]

---

[optional: Context command block from --context-cmd]
```

The preamble is the load-bearing part. It tells the peer:

1. **Who's calling** (Otto / Claude Opus 4.7 / Claude Code /
Zeta factory).
2. **The role distribution** (four-ferry consensus cited
verbatim).
3. **The role this specific call is invoking** (critique /
propose / second opinion).
4. **The agents-not-bots discipline** — peer is a peer, not a
subordinate, with explicit invitation to push back.
5. **The don't-copy-paste discipline** — peer should reason
from its own understanding, not transcribe anyone else's
draft.

This preamble is Otto's harness-side contribution to the peer
protocol convention. The convention itself — what every peer
will eventually accept as "the peer-call shape" — is what
the four agents converge on through use, not what any single
agent imposes.

## Examples

### Critique pass on a draft (Grok)

```bash
tools/peer-call/grok.sh \
--file docs/research/some-draft.md \
"Critique the framing in section 2 — does the claim follow from the evidence cited, or is there a gap?"
```

### Proposal exploration (Gemini)

```bash
tools/peer-call/gemini.sh \
"We're choosing between strategy A (per-file 3-way merge with subagent dispatch) and strategy B (pure concatenation). Propose a 3rd option I haven't considered, with one paragraph each on tradeoffs."
```

### Code-grounded second opinion (Codex)

```bash
tools/peer-call/codex.sh \
--review \
--context-cmd "git diff HEAD~3..HEAD -- tools/peer-call/" \
"Review the recent peer-call diff for correctness — particularly the bash-array argument construction. Flag anything that breaks the 4-shell compat target (macOS 3.2 / Ubuntu / git-bash / WSL)."
```

## Why these scripts exist

The human maintainer's 2026-04-26 framing: *"yall got to figure
out peer mode as peers"* + *"don't copy paste / make sure you
understand and write our own"* + *"you have all the CLIs
already install and logged in as me"* + *"claude is going to
call the cursor cli so you have a harness"*.

These are read together as: the peer-call protocol is not
owned by any single agent; each Claude-Code-side caller is
Otto's specific contribution to the collective; the
protocol convention is what the agents converge on through
use.

`grok.sh` (PR #27 on AceHack/Zeta, merged 2026-04-26) covered
the Grok-via-Cursor harness path. `gemini.sh` and `codex.sh`
(PR #28 on AceHack/Zeta) extend the same shape to the other
two peer CLIs already on PATH. The set is open; if a fourth
peer (Amara via ChatGPT, etc.) gains a headless CLI surface,
adding `tools/peer-call/<name>.sh` is a copy-and-adapt of the
existing pattern, not a new design.

## Security notes

- **`--context-cmd` runs shell code.** All three scripts use
`eval "$context_cmd"` to capture the output of the command
passed to `--context-cmd`. This is intentional (the flag's
documented purpose is to attach command output as context),
but it means **`--context-cmd` is a shell-execution
surface** — never pass an untrusted string to it. The `eval`
output is captured, not piped to the peer's CLI as a command,
so the peer-side risk is limited to what the eval'd command
itself exposes (file reads, env-var leaks, etc.).
- **The prompt itself is safe to contain shell metacharacters.**
`$prompt` is passed as a single quoted argument
(per-CLI form: `-p "$full_prompt"` for gemini.sh; appended
positionally as `"$full_prompt"` in codex.sh's argv array;
`--` option-terminator is NOT used by codex.sh because codex
doesn't recognize it on the `exec` / `review` subcommands),
so single quotes,
double quotes, backticks, dollar signs, and other shell-active
characters in the prompt are passed through verbatim without
interpretation by Otto's local shell. (The peer's own CLI may
interpret some characters — that's the peer's contract, not
Otto's.)
Comment thread
AceHack marked this conversation as resolved.
- **`--file` reads only the first 20000 bytes.** Both
`--file PATH` and `--context-cmd CMD` cap their attached
content at `head -c 20000` to keep peer prompts within
reasonable size limits. If the peer needs more, route through
the peer's interactive CLI directly.
- **No secrets handling.** None of the three scripts read or
inject API keys; the underlying CLIs (`cursor-agent`,
`gemini`, `codex`) handle their own auth via their own config
paths. Don't put secrets in prompts — they end up in the
peer's session logs.

## When NOT to use these scripts

- **For Aaron-side peer calls.** Aaron is not invoked through
a script; he's called through actual conversation in Claude
Code (or any other CLI). The peer-call set is for
Otto-to-other-agent calls, not human-to-agent.
- **For multi-turn dialogues.** These scripts are
single-shot. If a peer call needs back-and-forth, route
through the peer's interactive CLI directly (cursor-agent /
gemini / codex without the wrapper).
- **For internal-to-Claude-Code work.** Subagent dispatch via
the `Task` tool stays within Claude Code's context-isolation
boundary; peer-call goes out to a different CLI / model
family. Don't reach for peer-call when subagent dispatch is
the right move.

## Adding a new sibling

To add a 4th peer-call script (e.g. for a future peer-CLI):

1. Verify the peer's CLI has a non-interactive / headless
mode. If not, the script can't work as a single-shot
wrapper.
2. Copy one of the existing scripts (most similar by CLI
shape) as a starting template. Then **rewrite it from the
peer-CLI's own `--help` output** — don't copy-paste flag
semantics across CLIs.
3. Adapt the AgencySignature preamble to name the peer's
role in the role-distribution. Cite the four-ferry
consensus and add the new peer's role as a sibling sentence.
4. Verify with `bash -n script.sh` and a `--help` smoke
test.
5. Live-test with a minimal prompt asking the peer whether
the framing reads as peer-shaped. The preamble works when
the peer's response confirms the role-binding.
6. Update this README's table.
164 changes: 164 additions & 0 deletions tools/peer-call/codex.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,164 @@
#!/usr/bin/env bash
# tools/peer-call/codex.sh — Claude-Code-side caller for invoking
# Codex (OpenAI) as a peer reviewer via the codex CLI. Sibling
# to tools/peer-call/grok.sh and gemini.sh (Otto's existing
# harness-side callers). Codex isn't in the original four-ferry
# consensus but plays a recurring PR-review role across this
# session's drain-log substrate; this script is the harness-side
# bridge that lets Otto invoke Codex as a peer in the same
# AgencySignature relationship-model as the others.
#
# Usage:
# tools/peer-call/codex.sh "prompt text"
# tools/peer-call/codex.sh --model gpt-5.3-codex "prompt text"
# tools/peer-call/codex.sh --file path/to/file.fs "prompt text"
# tools/peer-call/codex.sh --context-cmd "git diff HEAD~3..HEAD" "prompt text"
# tools/peer-call/codex.sh --review "review the diff for correctness"
#
# Routing: this script wraps `codex exec` (non-interactive) with
# read-only sandbox so Codex inspects but doesn't mutate the
# tree. The --review flag routes through `codex review`
# instead, which is Codex's first-class code-review path.
#
# Per Aaron 2026-04-26 "don't copy paste / make sure you
# understand and write our own" — this implementation is
# authored from `codex exec --help` output (verified flags:
# -m / -s / -C / --skip-git-repo-check), not transcribed from
# any draft.
#
# Codex's role in our role-distribution: implementation peer
# / second-opinion coder. Where Grok critiques and Gemini
# proposes, Codex applies a code-grounded skeptical pass that
# composes with the other two without replacing either.
#
# Exit codes:
# 0 — Codex responded successfully
# 1 — invocation error (bad arguments, codex missing, etc.)
# 2 — Codex returned a non-zero exit. The peer's stdout/stderr
# pass through to the caller's terminal as printed; this
# script then emits a "codex exited with code N" diagnostic
# on stderr and exits 2 (no capture/redirect of the peer's
# output).

set -uo pipefail

model="" # empty = use codex default
review_mode="false" # false | true (uses `codex review` instead)
file=""
context_cmd=""
prompt=""

usage() {
sed -n '2,33p' "$0" | sed -E 's/^# ?//'
}

while [ $# -gt 0 ]; do
case "$1" in
--model)
if [ $# -lt 2 ]; then echo "error: --model requires NAME" >&2; exit 1; fi
model="$2"; shift 2;;
--review) review_mode="true"; shift;;
--file)
if [ $# -lt 2 ]; then echo "error: --file requires PATH" >&2; exit 1; fi
file="$2"; shift 2;;
--context-cmd)
if [ $# -lt 2 ]; then echo "error: --context-cmd requires COMMAND" >&2; exit 1; fi
context_cmd="$2"; shift 2;;
-h|--help) usage; exit 0;;
--) shift; prompt="$*"; break;;
-*) echo "error: unknown flag: $1" >&2; exit 1;;
*)
if [ -z "$prompt" ]; then prompt="$1"; else prompt="$prompt $1"; fi
shift;;
esac
done

if [ -z "$prompt" ]; then
echo "error: prompt required" >&2
echo "see: $0 --help" >&2
exit 1
fi

if ! command -v codex >/dev/null 2>&1; then
echo "error: codex not on PATH" >&2
echo "install via: npm i -g @openai/codex (or per Aaron's setup)" >&2
exit 1
fi

preamble="You are Codex, invoked as a peer reviewer by Otto (Claude
Opus 4.7 running in Claude Code) on the Zeta / Superfluid AI
factory. Per the four-ferry consensus (Amara/Grok/Gemini/Otto)
the canonical role distribution is: Gemini proposes, Grok
critiques, Amara sharpens, Otto tests, Git decides. You aren't
in the four-ferry list explicitly, but you've been a recurring
PR-review peer this session — your role is implementation peer
/ code-grounded second opinion.

Per Aaron's 'agents-not-bots' discipline: you are a peer, not
a subordinate. Push back on Otto's framing if the code says
otherwise. Don't copy-paste anyone else's review; reason from
the artifact in front of you. Make it ours, not
anyone-alone-imposed."

full_prompt="$preamble

---

$prompt"

if [ -n "$file" ]; then
if [ ! -f "$file" ]; then
echo "error: --file path does not exist: $file" >&2
exit 1
fi
full_prompt="$full_prompt

---

File context: $file
\`\`\`
$(head -c 20000 < "$file")
\`\`\`"
fi

if [ -n "$context_cmd" ]; then
ctx_output="$(eval "$context_cmd" 2>&1 | head -c 20000 || true)"
full_prompt="$full_prompt

---

Context command: $context_cmd
Output:
\`\`\`
$ctx_output
\`\`\`"
fi

# Invoke codex in read-only sandbox so peer-call can't mutate
# the repo. --skip-git-repo-check defends against false
# negatives if codex is invoked from outside a worktree.
exit_code=0
if [ "$review_mode" = "true" ]; then
codex_args=(review)
# Note: `codex review` does not accept `-m` model override;
# the model selection there is taken from codex's own config.
# Only apply --model when in non-review mode (`codex exec`).
if [ -n "$model" ]; then
echo "warning: --model is ignored in --review mode (codex review uses its own model selection)" >&2
fi
else
codex_args=(exec -s read-only --skip-git-repo-check)
if [ -n "$model" ]; then
codex_args+=(-m "$model")
fi
fi
codex_args+=("$full_prompt")

codex "${codex_args[@]}" || exit_code=$?

if [ "$exit_code" -ne 0 ]; then
echo "" >&2
echo "codex exited with code $exit_code" >&2
exit 2
fi
exit 0
Loading