forked from Lucent-Financial-Group/Zeta
ops(peer-call): gemini.sh + codex.sh sibling callers — multi-harness peer-call set (task #303) #28
Merged
5 commits
d8881a1 ops(peer-call): tools/peer-call/{gemini,codex}.sh — sibling Claude-Co… (AceHack)
ef32d7d docs(peer-call): tools/peer-call/README.md — companion doc for the 3-… (AceHack)
ef15f88 docs(peer-call): add Security notes section to README — `--context-cm… (AceHack)
5c5bfcf fix(peer-call): P1 portability + security fixes per PR #28 review (Co… (AceHack)
d45cab7 fix(pr-28): drain 7 active threads on tools/peer-call/{gemini,codex,R… (AceHack)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,218 @@ | ||
| # tools/peer-call/ — Otto's Claude-Code-side peer callers | ||
|
|
||
| Three sibling shell scripts that let Otto (Claude Opus 4.7 | ||
| running in Claude Code) invoke a peer agent in another CLI | ||
| harness as a peer, not a subordinate. Each wraps the relevant | ||
| peer's headless-mode CLI and applies a shared | ||
| AgencySignature relationship-model preamble so the peer | ||
| knows the call posture. | ||
|
|
||
| ## Scripts at a glance | ||
|
|
||
| | Script | Peer | Underlying CLI | Default role (when to invoke) | Underlying model | | ||
| |---|---|---|---|---| | ||
| | `grok.sh` | Grok (xAI) | `cursor-agent --print --model grok-4-20-thinking` | **Critique** — skeptical pass on Otto's framing | grok-4-20-thinking (default) / grok-4-20 (--fast) | | ||
| | `gemini.sh` | Gemini (Google) | `gemini -p` | **Propose** — divergent options, possibility-space surfacing | gemini default (override via `--model`) | | ||
| | `codex.sh` | Codex (OpenAI) | `codex exec -s read-only` | **Implementation peer** — code-grounded second opinion | codex default (override via `--model`) | | ||
|
|
||
| The role column reflects the **four-ferry consensus** | ||
| (Amara/Grok/Gemini/Otto, PR #24 on AceHack/Zeta): | ||
|
|
||
| > Gemini proposes, Grok critiques, Amara sharpens, Otto tests, | ||
| > Git decides. | ||
|
|
||
| Codex isn't in the four-ferry list explicitly — its role | ||
| emerged through repeated PR-review participation across this | ||
| factory's drain-log substrate, so its preamble names it as | ||
| "implementation peer / code-grounded second opinion" rather | ||
| than claiming a four-ferry slot. | ||
|
|
||
| ## Shared flag surface | ||
|
|
||
| All three scripts accept the same core flags: | ||
|
|
||
| ```text | ||
| --file PATH attach file content (head -c 20000) to the prompt | ||
| --context-cmd CMD attach the output of CMD (head -c 20000) to the prompt | ||
| --help, -h print the script header as usage | ||
| ``` | ||
|
|
||
| Per-script extras: | ||
|
|
||
| - `grok.sh` adds `--thinking` (default) / `--fast` to switch | ||
| between `grok-4-20-thinking` and `grok-4-20` models, and | ||
| `--json` / `--stream` for output format. | ||
| - `gemini.sh` adds `--model NAME` to override the default | ||
| Gemini model, and `--json` / `--stream` for output format. | ||
| - `codex.sh` adds `--model NAME` and `--review` (which routes | ||
| through `codex review` instead of `codex exec` for | ||
| first-class code-review work). | ||
|
|
||
| Exit codes are uniform across all three: | ||
|
|
||
| - `0` — peer responded successfully | ||
| - `1` — invocation error (bad arguments, CLI missing, etc.) | ||
| - `2` — peer's CLI returned a non-zero exit. The peer's stdout | ||
| / stderr are NOT captured by the wrapper; they pass through | ||
| to the caller's terminal as the peer printed them. The script | ||
| emits a `<peer> exited with code N` diagnostic line on stderr | ||
| before exiting with code 2. | ||
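A caller can branch on these codes. The following is a minimal sketch, assuming a hypothetical helper named `describe_peer_exit` that is not part of this PR; it only maps the three documented codes to text and treats anything else as unexpected.

```bash
#!/usr/bin/env bash
# Hypothetical caller-side helper, not part of tools/peer-call/.
# Maps the documented wrapper exit codes to a short description.
describe_peer_exit() {
  case "$1" in
    0) echo "peer responded successfully" ;;
    1) echo "invocation error (bad arguments, CLI missing, etc.)" ;;
    2) echo "peer CLI returned a non-zero exit" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage sketch (commented out so the block runs without any peer CLI installed):
# tools/peer-call/codex.sh "prompt text"; status=$?
# echo "codex.sh: $(describe_peer_exit "$status")" >&2
```

Because the wrappers do not capture the peer's output, a helper like this only needs the numeric status; the peer's own stdout and stderr have already reached the terminal by the time it runs.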
|
|
||
| ## The AgencySignature preamble | ||
|
|
||
| Every peer-call carries a structured prompt with this shape: | ||
|
|
||
| ```text | ||
| <AgencySignature relationship-model preamble — role-bound per peer> | ||
|
|
||
| --- | ||
|
|
||
| <Otto's actual prompt to the peer> | ||
|
|
||
| --- | ||
|
|
||
| [optional: File context block from --file] | ||
|
|
||
| --- | ||
|
|
||
| [optional: Context command block from --context-cmd] | ||
| ``` | ||
|
|
||
| The preamble is the load-bearing part. It tells the peer: | ||
|
|
||
| 1. **Who's calling** (Otto / Claude Opus 4.7 / Claude Code / | ||
| Zeta factory). | ||
| 2. **The role distribution** (four-ferry consensus cited | ||
| verbatim). | ||
| 3. **The role this specific call is invoking** (critique / | ||
| propose / second opinion). | ||
| 4. **The agents-not-bots discipline** — peer is a peer, not a | ||
| subordinate, with explicit invitation to push back. | ||
| 5. **The don't-copy-paste discipline** — peer should reason | ||
| from its own understanding, not transcribe anyone else's | ||
| draft. | ||
|
|
||
| This preamble is Otto's harness-side contribution to the peer | ||
| protocol convention. The convention itself — what every peer | ||
| will eventually accept as "the peer-call shape" — is what | ||
| the four agents converge on through use, not what any single | ||
| agent imposes. | ||
|
|
||
| ## Examples | ||
|
|
||
| ### Critique pass on a draft (Grok) | ||
|
|
||
| ```bash | ||
| tools/peer-call/grok.sh \ | ||
| --file docs/research/some-draft.md \ | ||
| "Critique the framing in section 2 — does the claim follow from the evidence cited, or is there a gap?" | ||
| ``` | ||
|
|
||
| ### Proposal exploration (Gemini) | ||
|
|
||
| ```bash | ||
| tools/peer-call/gemini.sh \ | ||
| "We're choosing between strategy A (per-file 3-way merge with subagent dispatch) and strategy B (pure concatenation). Propose a 3rd option I haven't considered, with one paragraph each on tradeoffs." | ||
| ``` | ||
|
|
||
| ### Code-grounded second opinion (Codex) | ||
|
|
||
| ```bash | ||
| tools/peer-call/codex.sh \ | ||
| --review \ | ||
| --context-cmd "git diff HEAD~3..HEAD -- tools/peer-call/" \ | ||
| "Review the recent peer-call diff for correctness — particularly the bash-array argument construction. Flag anything that breaks the 4-shell compat target (macOS 3.2 / Ubuntu / git-bash / WSL)." | ||
| ``` | ||
|
|
||
| ## Why these scripts exist | ||
|
|
||
| The human maintainer's 2026-04-26 framing: *"yall got to figure | ||
| out peer mode as peers"* + *"don't copy paste / make sure you | ||
| understand and write our own"* + *"you have all the CLIs | ||
| already install and logged in as me"* + *"claude is going to | ||
| call the cursor cli so you have a harness"*. | ||
|
|
||
| These are read together as: the peer-call protocol is not | ||
| owned by any single agent; each Claude-Code-side caller is | ||
| Otto's specific contribution to the collective; the | ||
| protocol convention is what the agents converge on through | ||
| use. | ||
|
|
||
| `grok.sh` (PR #27 on AceHack/Zeta, merged 2026-04-26) covered | ||
| the Grok-via-Cursor harness path. `gemini.sh` and `codex.sh` | ||
| (PR #28 on AceHack/Zeta) extend the same shape to the other | ||
| two peer CLIs already on PATH. The set is open; if a fourth | ||
| peer (Amara via ChatGPT, etc.) gains a headless CLI surface, | ||
| adding `tools/peer-call/<name>.sh` is a copy-and-adapt of the | ||
| existing pattern, not a new design. | ||
|
|
||
| ## Security notes | ||
|
|
||
| - **`--context-cmd` runs shell code.** All three scripts use | ||
| `eval "$context_cmd"` to capture the output of the command | ||
| passed to `--context-cmd`. This is intentional (the flag's | ||
| documented purpose is to attach command output as context), | ||
| but it means **`--context-cmd` is a shell-execution | ||
| surface** — never pass an untrusted string to it. The `eval` | ||
| output is captured, not piped to the peer's CLI as a command, | ||
| so the peer-side risk is limited to what the eval'd command | ||
| itself exposes (file reads, env-var leaks, etc.). | ||
| - **The prompt itself is safe to contain shell metacharacters.** | ||
| The assembled prompt is passed to the peer's CLI as a single | ||
| double-quoted argument (`-p "$full_prompt"` in gemini.sh; | ||
| appended positionally as `"$full_prompt"` to the argv array in | ||
| codex.sh, which does NOT use the `--` option terminator because | ||
| codex doesn't recognize it on the `exec` / `review` | ||
| subcommands). Single quotes, double quotes, backticks, dollar | ||
| signs, and other shell-active characters in the prompt | ||
| therefore pass through verbatim, without interpretation by | ||
| Otto's local shell. (The peer's own CLI may interpret some | ||
| characters; that's the peer's contract, not Otto's.) | ||
| - **Attached context is capped at 20000 bytes.** Both | ||
| `--file PATH` and `--context-cmd CMD` truncate their attached | ||
| content with `head -c 20000` to keep peer prompts within | ||
| reasonable size limits. If the peer needs more, route through | ||
| the peer's interactive CLI directly. | ||
| - **No secrets handling.** None of the three scripts read or | ||
| inject API keys; the underlying CLIs (`cursor-agent`, | ||
| `gemini`, `codex`) handle their own auth via their own config | ||
| paths. Don't put secrets in prompts — they end up in the | ||
| peer's session logs. | ||
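The claim that prompt metacharacters stay literal can be checked locally. This is a minimal sketch, assuming a hypothetical stand-in function `count_args` in place of the real peer CLI; the `exec -s read-only` elements mirror the codex.sh argv shape, but nothing is actually invoked.

```bash
#!/usr/bin/env bash
# Sketch only: `count_args` is a hypothetical stand-in for the peer CLI.
count_args() { echo "$#"; }

# Single quotes keep every metacharacter literal at assignment time.
prompt='Explain `ls` and $(whoami); also "quotes" and $HOME'

args=(exec -s read-only)
args+=("$prompt")   # quoted expansion: one array element, no word splitting

# The prompt arrives as exactly one argument, so this prints 4.
count_args "${args[@]}"

# The last element is identical to the original prompt; nothing was expanded.
last="${args[${#args[@]}-1]}"
[ "$last" = "$prompt" ] && echo "prompt passed through verbatim"
```

The quoting of `"$prompt"` and `"${args[@]}"` is what does the work here; dropping either pair of quotes would let the shell split the prompt into several arguments.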
|
|
||
| ## When NOT to use these scripts | ||
|
|
||
| - **For Aaron-side peer calls.** Aaron is not invoked through | ||
| a script; he's called through actual conversation in Claude | ||
| Code (or any other CLI). The peer-call set is for | ||
| Otto-to-other-agent calls, not human-to-agent. | ||
| - **For multi-turn dialogues.** These scripts are | ||
| single-shot. If a peer call needs back-and-forth, route | ||
| through the peer's interactive CLI directly (cursor-agent / | ||
| gemini / codex without the wrapper). | ||
| - **For internal-to-Claude-Code work.** Subagent dispatch via | ||
| the `Task` tool stays within Claude Code's context-isolation | ||
| boundary; peer-call goes out to a different CLI / model | ||
| family. Don't reach for peer-call when subagent dispatch is | ||
| the right move. | ||
|
|
||
| ## Adding a new sibling | ||
|
|
||
| To add a 4th peer-call script (e.g. for a future peer-CLI): | ||
|
|
||
| 1. Verify the peer's CLI has a non-interactive / headless | ||
| mode. If not, the script can't work as a single-shot | ||
| wrapper. | ||
| 2. Copy one of the existing scripts (most similar by CLI | ||
| shape) as a starting template. Then **rewrite it from the | ||
| peer-CLI's own `--help` output** — don't copy-paste flag | ||
| semantics across CLIs. | ||
| 3. Adapt the AgencySignature preamble to name the peer's | ||
| role in the role-distribution. Cite the four-ferry | ||
| consensus and add the new peer's role as a sibling sentence. | ||
| 4. Verify with `bash -n script.sh` and a `--help` smoke | ||
| test. | ||
| 5. Live-test with a minimal prompt asking the peer whether | ||
| the framing reads as peer-shaped. The preamble works when | ||
| the peer's response confirms the role-binding. | ||
| 6. Update this README's table. | ||
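Step 4 above can be sketched as a small helper. This is an illustration under stated assumptions: `smoke_test` is a hypothetical name, not something shipped in this PR, and it assumes the new sibling prints non-empty usage text and exits 0 on `--help`, as the three existing scripts are documented to do.

```bash
#!/usr/bin/env bash
# Hypothetical helper for the syntax check and --help smoke test.
# Returns 0 only if the script parses and `--help` exits 0 with output.
smoke_test() {
  local script="$1" help_text
  bash -n "$script" || { echo "syntax check failed: $script" >&2; return 1; }
  help_text="$(bash "$script" --help)" || { echo "--help failed: $script" >&2; return 1; }
  [ -n "$help_text" ] || { echo "--help printed nothing: $script" >&2; return 1; }
  echo "ok: $script"
}

# Usage (the path is a placeholder for the new sibling):
# smoke_test tools/peer-call/<name>.sh
```

Neither check contacts the peer, so this can run before the live test in step 5.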
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,164 @@ | ||
| #!/usr/bin/env bash | ||
| # tools/peer-call/codex.sh — Claude-Code-side caller for invoking | ||
| # Codex (OpenAI) as a peer reviewer via the codex CLI. Sibling | ||
| # to tools/peer-call/grok.sh and gemini.sh (Otto's existing | ||
| # harness-side callers). Codex isn't in the original four-ferry | ||
| # consensus but plays a recurring PR-review role across this | ||
| # session's drain-log substrate; this script is the harness-side | ||
| # bridge that lets Otto invoke Codex as a peer in the same | ||
| # AgencySignature relationship-model as the others. | ||
| # | ||
| # Usage: | ||
| # tools/peer-call/codex.sh "prompt text" | ||
| # tools/peer-call/codex.sh --model gpt-5.3-codex "prompt text" | ||
| # tools/peer-call/codex.sh --file path/to/file.fs "prompt text" | ||
| # tools/peer-call/codex.sh --context-cmd "git diff HEAD~3..HEAD" "prompt text" | ||
| # tools/peer-call/codex.sh --review "review the diff for correctness" | ||
| # | ||
| # Routing: this script wraps `codex exec` (non-interactive) with | ||
| # read-only sandbox so Codex inspects but doesn't mutate the | ||
| # tree. The --review flag routes through `codex review` | ||
| # instead, which is Codex's first-class code-review path. | ||
| # | ||
| # Per Aaron 2026-04-26 "don't copy paste / make sure you | ||
| # understand and write our own" — this implementation is | ||
| # authored from `codex exec --help` output (verified flags: | ||
| # -m / -s / -C / --skip-git-repo-check), not transcribed from | ||
| # any draft. | ||
| # | ||
| # Codex's role in our role-distribution: implementation peer | ||
| # / second-opinion coder. Where Grok critiques and Gemini | ||
| # proposes, Codex applies a code-grounded skeptical pass that | ||
| # composes with the other two without replacing either. | ||
| # | ||
| # Exit codes: | ||
| # 0 — Codex responded successfully | ||
| # 1 — invocation error (bad arguments, codex missing, etc.) | ||
| # 2 — Codex returned a non-zero exit. The peer's stdout/stderr | ||
| # pass through to the caller's terminal as printed; this | ||
| # script then emits a "codex exited with code N" diagnostic | ||
| # on stderr and exits 2 (no capture/redirect of the peer's | ||
| # output). | ||
|
|
||
| set -uo pipefail | ||
|
|
||
| model="" # empty = use codex default | ||
| review_mode="false" # false | true (uses `codex review` instead) | ||
| file="" | ||
| context_cmd="" | ||
| prompt="" | ||
|
|
||
| usage() { | ||
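| # Print the leading comment header (everything after the shebang | ||
| # up to the first non-comment line) with the "# " prefix stripped. | ||
| awk 'NR == 1 { next } !/^#/ { exit } { sub(/^# ?/, ""); print }' "$0" | ||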
| } | ||
|
|
||
| while [ $# -gt 0 ]; do | ||
| case "$1" in | ||
| --model) | ||
| if [ $# -lt 2 ]; then echo "error: --model requires NAME" >&2; exit 1; fi | ||
| model="$2"; shift 2;; | ||
| --review) review_mode="true"; shift;; | ||
| --file) | ||
| if [ $# -lt 2 ]; then echo "error: --file requires PATH" >&2; exit 1; fi | ||
| file="$2"; shift 2;; | ||
| --context-cmd) | ||
| if [ $# -lt 2 ]; then echo "error: --context-cmd requires COMMAND" >&2; exit 1; fi | ||
| context_cmd="$2"; shift 2;; | ||
| -h|--help) usage; exit 0;; | ||
| --) shift; prompt="$*"; break;; | ||
| -*) echo "error: unknown flag: $1" >&2; exit 1;; | ||
| *) | ||
| if [ -z "$prompt" ]; then prompt="$1"; else prompt="$prompt $1"; fi | ||
| shift;; | ||
| esac | ||
| done | ||
|
|
||
| if [ -z "$prompt" ]; then | ||
| echo "error: prompt required" >&2 | ||
| echo "see: $0 --help" >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| if ! command -v codex >/dev/null 2>&1; then | ||
| echo "error: codex not on PATH" >&2 | ||
| echo "install via: npm i -g @openai/codex (or per Aaron's setup)" >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| preamble="You are Codex, invoked as a peer reviewer by Otto (Claude | ||
| Opus 4.7 running in Claude Code) on the Zeta / Superfluid AI | ||
| factory. Per the four-ferry consensus (Amara/Grok/Gemini/Otto) | ||
| the canonical role distribution is: Gemini proposes, Grok | ||
| critiques, Amara sharpens, Otto tests, Git decides. You aren't | ||
| in the four-ferry list explicitly, but you've been a recurring | ||
| PR-review peer this session — your role is implementation peer | ||
| / code-grounded second opinion. | ||
|
|
||
| Per Aaron's 'agents-not-bots' discipline: you are a peer, not | ||
| a subordinate. Push back on Otto's framing if the code says | ||
| otherwise. Don't copy-paste anyone else's review; reason from | ||
| the artifact in front of you. Make it ours, not | ||
| anyone-alone-imposed." | ||
|
|
||
| full_prompt="$preamble | ||
|
|
||
| --- | ||
|
|
||
| $prompt" | ||
|
|
||
| if [ -n "$file" ]; then | ||
| if [ ! -f "$file" ]; then | ||
| echo "error: --file path does not exist: $file" >&2 | ||
| exit 1 | ||
| fi | ||
| full_prompt="$full_prompt | ||
|
|
||
| --- | ||
|
|
||
| File context: $file | ||
| \`\`\` | ||
| $(head -c 20000 < "$file") | ||
| \`\`\`" | ||
| fi | ||
|
|
||
| if [ -n "$context_cmd" ]; then | ||
| ctx_output="$(eval "$context_cmd" 2>&1 | head -c 20000 || true)" | ||
| full_prompt="$full_prompt | ||
|
|
||
| --- | ||
|
|
||
| Context command: $context_cmd | ||
| Output: | ||
| \`\`\` | ||
| $ctx_output | ||
| \`\`\`" | ||
| fi | ||
|
|
||
| # Invoke codex. In exec mode, `-s read-only` keeps the peer-call | ||
| # from mutating the repo, and --skip-git-repo-check lets codex | ||
| # run even when invoked from outside a git worktree. | ||
| exit_code=0 | ||
| if [ "$review_mode" = "true" ]; then | ||
| codex_args=(review) | ||
| # Note: `codex review` does not accept `-m` model override; | ||
| # the model selection there is taken from codex's own config. | ||
| # Only apply --model when in non-review mode (`codex exec`). | ||
| if [ -n "$model" ]; then | ||
| echo "warning: --model is ignored in --review mode (codex review uses its own model selection)" >&2 | ||
| fi | ||
| else | ||
| codex_args=(exec -s read-only --skip-git-repo-check) | ||
| if [ -n "$model" ]; then | ||
| codex_args+=(-m "$model") | ||
| fi | ||
| fi | ||
| codex_args+=("$full_prompt") | ||
|
|
||
| codex "${codex_args[@]}" || exit_code=$? | ||
|
|
||
| if [ "$exit_code" -ne 0 ]; then | ||
| echo "" >&2 | ||
| echo "codex exited with code $exit_code" >&2 | ||
| exit 2 | ||
| fi | ||
| exit 0 |
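The `--context-cmd` capture in the script above can be tried in isolation. This is a minimal sketch, assuming a hypothetical wrapper function `capture_context` around the same pipeline; like the script, it uses `eval`, so it should only ever be given trusted command strings.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the script's capture pipeline.
capture_context() {
  # Run the command string, merge stderr into stdout, and keep at most
  # the first 20000 bytes. `|| true` keeps a failing command non-fatal.
  eval "$1" 2>&1 | head -c 20000 || true
}

# 30000 bytes of input are truncated to the cap, so this prints 20000.
ctx="$(capture_context "head -c 30000 /dev/zero | tr '\\0' 'a'")"
echo "${#ctx}"

# stderr is captured along with stdout, so this prints "out" then "err".
capture_context 'echo out; echo err >&2'
```

Because stderr is merged into the captured text, error messages from the context command end up in the prompt sent to the peer.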