storybookjs · valentinpalkovic · May 15, 2026 · May 15, 2026 · May 18, 2026 · May 18, 2026
diff --git a/.agents/skills/verify-recipe-author/SKILL.md b/.agents/skills/verify-recipe-author/SKILL.md
@@ -0,0 +1,192 @@
+---
+name: verify-recipe-author
+description: Generate the Playwright recipe spec for a PR-verify-pr-generate prompt bundle. Reads `.verify-output/<runId>/prompt-bundle.json`, dispatches the OMC executor agent (model=opus), and pipes the raw agent reply into `verify-pr-author` (stdin mode). The TypeScript core owns extraction, deny-regex, header-comment provenance, the file write to `.verify-recipes/pr-<#>.spec.ts`, scoped lint, the single retry, and `.verify-output/<runId>/result.json`. Trigger after `yarn verify-pr-generate`.
+allowed-tools: Agent, Bash, Read, Write, Edit
+---
+
+# Verify Recipe Author
+
+Consumes a prompt bundle emitted by `yarn verify-pr-generate --pr <#>` and produces the per-PR Playwright recipe spec for human review. Authoring only — never executes the spec.
+
+This skill is invoked **after** `yarn verify-pr-generate --pr <#>` succeeds. The bun script does the deterministic I/O (gh fetch, triage, prompt assembly, bundle write); this skill **only** dispatches the agent and pipes its raw reply into the `verify-pr-author` CLI. Extraction, deny-regex, provenance, file write, lint, the single retry, and `result.json` all live in TypeScript core — the skill never does them itself.
+
+> **Paths are repo-root-relative.** Every path below is written relative to
+> the repository root, denoted `$REPO_ROOT`. Resolve it once at runtime with
+> `REPO_ROOT="$(git rev-parse --show-toplevel)"` (works from any clone,
+> worktree, or CI checkout) and substitute it wherever `$REPO_ROOT` appears.
+> Never hardcode an absolute machine path — it breaks on every other
+> clone/worktree/CI runner.
+
+The full design and acceptance criteria live in `$REPO_ROOT/.omc/plans/pr-verify-v3-agent-generated-recipes.md` (§Lane C, §D6, §D8, §D9). Read the plan if anything below is ambiguous.
+
+## Inputs
+
+No args required. The skill discovers the most recent bundle automatically. The caller may optionally pass an explicit bundle path as the skill argument.
+
+1. **Auto-discover (default)**: list `$REPO_ROOT/.verify-output/`, pick the directory with the lexicographically largest name (ISO timestamps sort correctly), then read `prompt-bundle.json` inside it.
+2. **Explicit path**: if the user passed an absolute path to a `prompt-bundle.json`, read that file directly.
+
+Bundle shape (see `scripts/verify-pr-generate.ts` for the canonical emitter):
+
+```jsonc
+{
+  "version": 1,
+  "prNumber": 12345,
+  "runId": "...",
+  "outputSpecPath": "/abs/path/.verify-recipes/pr-12345.spec.ts",
+  "force": false,
+  "prompt": "<full assembled prompt>",
+  "metadata": {
+    "agentModel": "claude-opus-4-7[1m]",
+    "referenceSpecs": ["..."],
+    "triageGlobs": ["..."],
+    "generatedAt": "<ISO>"
+  }
+}
+```
+
+The `<runId>` is the basename of the bundle's parent directory; derive it from the bundle path, not from the `runId` field.
+
+## Runbook
+
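+Follow these steps in order. Stop on any non-success outcome, with `result.json` recording the status per §Failure Modes (the CLI writes it, except for the Step 2 pre-flight collision, which the skill writes itself).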
+
+### Step 1 — Read the bundle
+
+`Read` the bundle JSON. Capture `prNumber`, `runId` (from the parent dir), `outputSpecPath`, `force`, `prompt`, and `metadata`.
+
+### Step 2 — Pre-flight collision check (D9, TOCTOU re-guard)
+
+Re-check whether `bundle.outputSpecPath` already exists. The bun script enforced D9 at bundle-emit time; the skill re-checks because the user may have created the file between the two steps.
+
+- If the file exists and `bundle.force === false` → write `result.json` with `{ status: "collision", specPath: <path>, attempts: 0 }` and stop. (This mirrors the CLI's own `collision` status / exit 1; the pre-flight only exists to skip a wasted agent dispatch — the CLI re-enforces D9 regardless.)
+- Otherwise proceed.
+
+> **One owner.** After dispatch, the TypeScript core
+> (`scripts/verify-pr-author.ts` → `scripts/verify/recipe-author-core.ts`)
+> owns spec-body extraction, deny-regex, header-comment provenance, the
+> file write, scoped lint, post-write regex checks, the single retry, and
+> `result.json`. The skill does **not** extract fences, run deny-regex, or
+> write the spec itself. Steps 3–6 below are the rest of the runbook.
+
+### Step 3 — Dispatch the agent (attempt 1)
+
+```
+Agent({
+  description: "Generate PR recipe spec",
+  subagent_type: "oh-my-claudecode:executor",
+  model: "opus",
+  prompt: bundle.prompt
+})
+```
+
+The bundle's `prompt` already contains the full authoring contract,
+reference specs, PR diff, and fence-marker instruction
+(`<<<SPEC_START>>>` … `<<<SPEC_END>>>`). Capture the agent's full raw
+reply as `$REPLY` (do not parse or edit it).
+
+### Step 4 — Pipe the raw reply to `verify-pr-author` (stdin mode)
+
+```bash
+printf '%s' "$REPLY" | node "$REPO_ROOT/scripts/verify-pr-author.ts" --bundle <abs-bundle-path> --dispatch-mode stdin
+```
+
+The CLI performs extraction, deny-regex, provenance, file write, scoped
+lint (`scripts/verify/lint-invocation.ts`), post-write regex checks, and
+writes `result.json`. Exit codes:
+
+- `0` — success. CLI wrote the spec and `result.json`. Go to Step 6.
+- `75` — retryable failure (lint, post-write regex, **or a first
+  deny-regex hit** — the CLI asks the agent to self-correct). The CLI
+  emitted a framed retry block on stdout. Go to Step 5.
+- `1` — terminal failure (collision, extract-failed, or any gate
+  exhausted on the final attempt). CLI already wrote `result.json` with
+  the failure status. Print the failure line (Step 6) and stop.
+
+Exit 75 is the sole retry sentinel; any other non-zero exit is terminal.
+The skill never decides retryability — the CLI does.
+
+### Step 5 — Retry once (on exit 75)
+
+Parse stdout for the framed retry block:
+
+```
+===VERIFY_PR_AUTHOR_RETRY_BEGIN===
+<retryMessage payload — already categorized and capped at 5 errors>
+===VERIFY_PR_AUTHOR_RETRY_END===
+```
+
+Assemble the retry prompt and re-dispatch the agent (same
+`subagent_type` and `model`):
+
+```
+<bundle.prompt>
+
+[RETRY]
+<retryMessage>
+```
+
+Pipe the new raw reply back through the CLI in retry mode:
+
+```bash
+printf '%s' "$REPLY2" | node "$REPO_ROOT/scripts/verify-pr-author.ts" --bundle <abs-bundle-path> --dispatch-mode stdin --retry-of <runId>
+```
+
+The CLI enforces `MAX_RECIPE_ATTEMPTS` (read from
+`scripts/verify/recipe-author-core.ts`; currently 2) and will **not**
+re-emit exit 75 on the retry call. Expected exits:
+
+- `0` — success. Go to Step 6.
+- `1` — terminal failure (any gate exhausted on attempt 2). CLI wrote
+  `result.json` with `attempts: 2` and the terminal status. Print the
+  failure line and stop.
+
+### Step 6 — Print actionable next-step lines
+
+`result.json` is already written by the CLI — do **not** write it from
+the skill. On success print:
+
+```
+[verify-recipe-author] spec written: <abs spec path>
+[verify-recipe-author] result.json: <abs result.json path>
+[verify-recipe-author] attempts: <n>
+[verify-recipe-author] Next: review the spec, then run `yarn verify-pr --recipe-spec <spec path>`
+```
+
+On a terminal exit-1, print instead:
+
+```
+[verify-recipe-author] FAILED: <status> — see <abs result.json path>
+```
+
+## Failure Modes
+
+`result.json` is written by the CLI, not the skill. `status` is the exact
+`RecipeAuthorStatus` union from `scripts/verify/recipe-author-core.ts` —
+do not invent values. On attempt 1 in stdin mode, lint / post-write-regex
+/ **first deny-regex hit** all return `retry-requested` (CLI exit 75) so
+the agent can self-correct; the terminal status below is what lands when
+attempts are exhausted (CLI exit 1).
+
+| Cause | terminal `status` | Exit | Retried once first? |
+|---|---|---|---|
+| `outputSpecPath` exists and `force === false` | `collision` | 1 | no |
+| No parseable body between fence markers | `extract-failed` | 1 | no (terminal immediately) |
+| Deny-regex hit | `deny-regex-hit` | 1 | **yes** (attempt-1 → `retry-requested`/exit 75) |
+| Scoped lint failed | `lint-failed` | 1 | yes (attempt-1 → `retry-requested`/exit 75) |
+| Post-write regex check failed (listener-before-goto OR attach) | `regex-failed` | 1 | yes (attempt-1 → `retry-requested`/exit 75) |
+| All gates pass | `spec-written` | 0 | n/a |
+
+## Notes
+
+- This skill runs inside Claude Code; it uses `Agent`, `Read`, `Write`, `Bash`, and `Edit` tools.
+- Paths in invocations are repo-root-relative (`$REPO_ROOT`, resolved via `git rev-parse --show-toplevel`; see the note near the top); resolve `$REPO_ROOT` to an absolute path before invoking. Lint commands run in the `code` directory via `yarn --cwd`.
+- Max attempts = `MAX_RECIPE_ATTEMPTS` (currently 2). Read the value from `scripts/verify/recipe-author-core.ts` — do not hardcode.
+- The skill **never executes** the generated spec. The human review gate (Phase-1 lethal-trifecta breaker) is preserved.
+- A first deny-regex hit is retried **once** in stdin mode (the CLI emits `retry-requested` / exit 75 so the agent can self-correct, e.g. eval #36); only an exhausted deny hit is the terminal `deny-regex-hit`. The deny-regex remains a security gate — the single self-correction attempt does not weaken it (every attempt is re-checked; a persistent hit still terminates).
+- Cap retry feedback at 5 errors (R3).
+- The `runId` is the basename of the parent directory of the bundle; do not invent a new one.
+
+## Phase-2 follow-up
+
+This skill currently couples generation to a running Claude Code session via the `Agent` tool dispatch. Phase-2 CI activation will require migrating to a direct Anthropic SDK call (`@anthropic-ai/sdk`) with an `ANTHROPIC_API_KEY` env var, replacing the `Agent` dispatch with a standalone API call so the workflow at `.github/workflows/verify-pr.yml` can run unattended. Tracked as a follow-up in the plan's ADR §Follow-ups.
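
The control flow of Steps 4 and 5 in the SKILL.md above can be sketched as a few pure functions. This is a minimal illustration, not the code in `scripts/verify-pr-author.ts`: the helper names (`extractRetryMessage`, `buildRetryPrompt`, `nextAction`) and the `NextAction` type are hypothetical. Treating an exit-75 reply with no retry frame as terminal is an assumption of this sketch, since the runbook does not say what to do in that case.

```typescript
// Sketch only: helper names are hypothetical, not taken from verify-pr-author.ts.
const RETRY_BEGIN = '===VERIFY_PR_AUTHOR_RETRY_BEGIN===';
const RETRY_END = '===VERIFY_PR_AUTHOR_RETRY_END===';
const EXIT_RETRY = 75; // the runbook's sole retry sentinel

type NextAction =
  | { kind: 'done' } // exit 0: go to Step 6 and print the success lines
  | { kind: 'retry'; retryPrompt: string } // exit 75 on attempt 1: go to Step 5
  | { kind: 'failed' }; // anything else: print the FAILED line and stop

/** Returns the payload between the frame markers, or null if the frame is missing. */
function extractRetryMessage(stdout: string): string | null {
  const begin = stdout.indexOf(RETRY_BEGIN);
  if (begin === -1) return null;
  const end = stdout.indexOf(RETRY_END, begin + RETRY_BEGIN.length);
  if (end === -1) return null;
  return stdout.slice(begin + RETRY_BEGIN.length, end).trim();
}

/** Builds the Step 5 prompt: original prompt, blank line, [RETRY], then the payload. */
function buildRetryPrompt(bundlePrompt: string, retryMessage: string): string {
  return `${bundlePrompt}\n\n[RETRY]\n${retryMessage}`;
}

/** Maps one CLI invocation (exit code plus stdout) to the runbook's next step. */
function nextAction(
  exitCode: number,
  stdout: string,
  bundlePrompt: string,
  isRetryCall: boolean
): NextAction {
  if (exitCode === 0) return { kind: 'done' };
  if (exitCode === EXIT_RETRY && !isRetryCall) {
    const message = extractRetryMessage(stdout);
    // Assumption: exit 75 without a parseable frame is treated as terminal.
    if (message === null) return { kind: 'failed' };
    return { kind: 'retry', retryPrompt: buildRetryPrompt(bundlePrompt, message) };
  }
  // Any other non-zero exit, or a 75 on the retry call, is terminal.
  return { kind: 'failed' };
}
```

Keeping the decision in one function mirrors the runbook's rule that the skill never decides retryability itself: it only reads the exit code the CLI chose and forwards the framed payload unchanged.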
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -17,6 +17,13 @@ parameters:
     default: ''
     description: The PR number
     type: string
+  ghIsFork:
+    default: 'false'
+    description: >
+      'true' when the triggering PR head is a fork (untrusted). SECURITY:
+      gates save_cache so a fork pipeline cannot poison the project-global
+      cache that trusted merged/daily pipelines restore.
+    type: string
   workflow:
     default: skipped
     description: Which workflow to run
@@ -44,7 +51,7 @@ jobs:
       - run:
           name: Generate config
           command: |
-            yarn dlx jiti ./scripts/ci/main.ts --workflow=<< pipeline.parameters.workflow >>
+            yarn dlx jiti ./scripts/ci/main.ts --workflow=<< pipeline.parameters.workflow >> --is-fork=<< pipeline.parameters.ghIsFork >>
       - continuation/continue:
           configuration_path: .circleci/config.generated.yml
 workflows:

diff --git a/.claude/skills/verify-recipe-author/SKILL.md b/.claude/skills/verify-recipe-author/SKILL.md
@@ -0,0 +1 @@
+@../../../.agents/skills/verify-recipe-author/SKILL.md
diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1,19 @@
+.env
+.env.*
+**/.env
+**/.env.*
+~/.ssh/
+~/.aws/
+~/.config/gcloud/
+~/.azure/
+~/.docker/config.json
+~/.kube/config
+.npmrc
+.pypirc
+**/*-service-account.json
+**/*.pem
+**/*.key
+~/.git-credentials
+.verify-output/
+node_modules/
+.nx/
diff --git a/.github/actions/agentic-pr-prepare/README.md b/.github/actions/agentic-pr-prepare/README.md
@@ -0,0 +1,142 @@
+# agentic-pr-prepare
+
+Universal infrastructure setup for agentic workflows running under
+`pull_request_target`: actor-permission gate, base + PR-head manual clones,
+toolchain install, sandbox-runtime (srt) install + sha-pin verification,
+srt-settings JSON, egress smoke-test, and trusted-harness sync.
+
+This is **half 1 of 2** of the split `verify-pr.yml` infrastructure. The
+companion is `agentic-pr-publish`.
+
+## Caller contract
+
+The composite **cannot** declare these — the caller workflow MUST:
+
+1. Trigger on `pull_request_target` (composite `uses: ./.github/actions/...`
+   resolves against the **base ref** under PRT, which is load-bearing for
+   trust — never lift this to a trigger that resolves against PR-head).
+2. Declare a `permissions:` block. Verify-PR needs at least:
+   ```yaml
+   permissions:
+     pull-requests: write
+     issues: write
+     statuses: write
+     contents: write  # side-branch screenshot push (drop if not needed)
+   ```
+3. Declare a `concurrency:` block. Single-PR:
+   ```yaml
+   concurrency:
+     group: verify-${{ github.event.pull_request.number }}
+     cancel-in-progress: true
+   ```
+   With `strategy.matrix`, include the matrix dim in the key:
+   `verify-${{ pr-num }}-${{ matrix.target }}` (matrix-concurrency footgun).
+4. Pass `srt-sha256` **inline** with every call. The composite has **no
+   default**, so a chore-bump PR must clear the heightened workflow-review
+   bar instead of flipping a composite default with a single approval.
+
+## Inputs
+
+| Name                   | Required | Default                          | Purpose                                                                                |
+|------------------------|----------|----------------------------------|----------------------------------------------------------------------------------------|
+| `github-token`         | yes      | —                                | Base + PR-head manual clones.                                                          |
+| `base-ref`             | yes      | —                                | `github.event.pull_request.base.ref`.                                                  |
+| `base-sha`             | yes      | —                                | `github.event.pull_request.base.sha`.                                                  |
+| `pr-head-sha`          | yes      | —                                | `github.event.pull_request.head.sha`.                                                  |
+| `repo`                 | yes      | —                                | `github.repository`.                                                                   |
+| `srt-version`          | no       | `0.0.51`                         | Pinned `@anthropic-ai/sandbox-runtime` version.                                        |
+| `srt-sha256`           | **yes**  | — (no default by design)         | sha256 of the resolved `srt` shim at `srt-version`. Bump via `_srt-sha-probe.yml`.     |
+| `srt-allowed-domains`  | no       | localhost + registries + CDNs    | Newline list. Caller may extend.                                                       |
+| `srt-allow-write-paths`| no       | `$PR_HEAD_DIR`, `$SANDBOX_TMPDIR`, `/tmp`, `$HOME/.cache`, … | Newline list; env vars expanded at composite runtime.            |
+| `srt-deny-read-paths`  | no       | `$HOME/.ssh`, `$HOME/.aws`, …    | Newline list.                                                                          |
+| `srt-deny-write-paths` | no       | `$GITHUB_WORKSPACE`, `$GITHUB_WORKSPACE/.git` | Newline list.                                                              |
+| `sync-files`           | no       | (empty)                          | Newline-delimited `src:dst` pairs (paths relative). H2 path-validated.                 |
+| `sync-trees`           | no       | (empty)                          | Newline-delimited tree paths (relative). H2 path-validated.                            |
+| `provenance-secret`    | no       | (empty → per-run random)         | Optional caller-supplied. M2: written to file, not `$GITHUB_ENV`.                      |
+| `install-code-deps`    | no       | `true`                           | Pass-through to `setup-node-and-install`.                                              |
+
+### Path-input safety (H2)
+
+`sync-files` and `sync-trees` reject `..`, a leading `/`, and extra `:`; each
+path is resolved with realpath and asserted to lie under `$PR_HEAD_DIR`. A
+symlink at the destination is refused before `cp --no-dereference` / `cp -aT`.
+
+### srt-settings JSON emission (H3)
+
+allowWrite / denyRead / denyWrite / allowedDomains arrays are emitted via
+`jq -R . | jq -s .` so PR-controllable strings cannot inject JSON keys.
+
+## Outputs
+
+| Name                       | Purpose                                                                                          |
+|----------------------------|--------------------------------------------------------------------------------------------------|
+| `pr-head-dir`              | Absolute path to untrusted PR-head workspace clone.                                              |
+| `srt-settings-path`        | Absolute path to `srt-settings.json`.                                                            |
+| `diff-path`                | Absolute path to captured `pr.diff`.                                                             |
+| `provenance-secret-path`   | M2: path to file (mode 0600) holding the per-run provenance secret. NOT in `$GITHUB_ENV`.        |
+
+## Side-effects
+
+Writes to `$GITHUB_ENV` (so subsequent caller steps in the same job see them):
+
+- `PR_HEAD_DIR` — absolute path to PR-head workspace
+- `SRT_SETTINGS` — absolute path to srt-settings.json
+- `CLAUDE_CODE_TMPDIR` — absolute path to sandbox scratch tmpdir
+
+Does **NOT** write `VERIFY_PROVENANCE_SECRET` to `$GITHUB_ENV`. Trusted task
+steps read it from the file named by the `provenance-secret-path` output (`cat "$PROVENANCE_SECRET_PATH"` in the worked example).
+
+## Worked example
+
+```yaml
+- name: Prepare agentic environment
+  id: prep
+  uses: ./.github/actions/agentic-pr-prepare
+  with:
+    github-token: ${{ secrets.GITHUB_TOKEN }}
+    base-ref: ${{ github.event.pull_request.base.ref }}
+    base-sha: ${{ github.event.pull_request.base.sha }}
+    pr-head-sha: ${{ github.event.pull_request.head.sha }}
+    repo: ${{ github.repository }}
+    srt-version: '0.0.51'
+    srt-sha256: '36de38197ac22991c8c9edead4d6184914c8b786e040ecf27bdcf26abd166338'
+    sync-files: |
+      .verify-recipes/_util.ts:.verify-recipes/_util.ts
+    sync-trees: |
+      scripts/verify
+    provenance-secret: ${{ secrets.VERIFY_PROVENANCE_SECRET }}
+
+- name: Your task
+  env:
+    PROVENANCE_SECRET_PATH: ${{ steps.prep.outputs.provenance-secret-path }}
+  run: |
+    VERIFY_PROVENANCE_SECRET="$(cat "$PROVENANCE_SECRET_PATH")" \
+      yarn your-thing
+```
+
+## Pre-existing architectural debt (C1 — NOT fixed by this composite)
+
+`verify-result.json` (the file the verdict is read from) lives at
+`$PR_HEAD_DIR/.verify-out-trusted/verify-result.json` — inside srt's
+`allowWrite` set. A malicious PR-added unit test running inside srt can
+forge it. The split documented here does NOT make C1 worse; the file stays at
+its current path so the legitimate writer (`verify-pr.ts`, which itself
+runs INSIDE srt) keeps working.
+
+**The architectural fix requires** one of:
+
+1. **Process-split** — orchestrator OUTSIDE srt, only Playwright + dev-server
+   spawns wrapped. **Attempted 2026-05-14, failed**: srt uses bubblewrap with
+   a fresh network namespace per invocation, so localhost IPC between
+   orchestrator (outside) and dispatcher (inside) breaks. Reviving requires
+   shared host netns (loses egress policy on dispatcher), host-network bridge
+   / Unix socket, or moving dispatcher outside srt (loosens trust on
+   PR-modified framework code).
+2. **HMAC-bound verdict** — `verify-pr.ts` HMAC-signs the JSON with the
+   provenance secret; trusted bash verifies. Requires scrubbing the secret
+   from orchestrator env before spawning Playwright + auditing
+   `/proc/<pid>/environ` reachability inside srt.
+
+Until that lands, the verdict is trustworthy ONLY when paired with the
+side-channel signals (PR comment, telemetry, GitHub run conclusion) that an
+attacker would also have to forge. Tracked as a separate follow-up.
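
Option 2 in the README above (the HMAC-bound verdict) could look roughly like the following. This is a sketch under stated assumptions, not code from `verify-pr.ts`: the `SignedVerdict` envelope and the `signVerdict` / `verifyVerdict` names are hypothetical, and HMAC-SHA256 via Node's built-in `node:crypto` is one reasonable choice rather than a documented decision. It does not address the harder part the README calls out, namely keeping the secret unreachable from code running inside srt.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Sketch only: the envelope shape and function names are hypothetical.
interface SignedVerdict {
  payload: string; // the verdict JSON, serialized once and signed as-is
  mac: string; // hex-encoded HMAC-SHA256 over `payload`
}

type VerifyResult = { ok: true; verdict: unknown } | { ok: false };

/** Writer side: serialize the verdict and bind it to the provenance secret. */
function signVerdict(verdict: unknown, secret: string): SignedVerdict {
  const payload = JSON.stringify(verdict);
  const mac = createHmac('sha256', secret).update(payload).digest('hex');
  return { payload, mac };
}

/** Trusted side: recompute the MAC and parse the payload only if it matches. */
function verifyVerdict(signed: SignedVerdict, secret: string): VerifyResult {
  const expected = createHmac('sha256', secret).update(signed.payload).digest();
  const actual = Buffer.from(signed.mac, 'hex');
  // timingSafeEqual throws on unequal lengths, so check the length first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return { ok: false };
  }
  return { ok: true, verdict: JSON.parse(signed.payload) };
}
```

Signing the exact serialized string, rather than re-serializing on the verifying side, avoids false mismatches from key ordering or whitespace differences between writer and verifier.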