Open
39 commits
5a22a8e
feat(scripts): pr verify harness PoC MVP
valentinpalkovic May 8, 2026
8a7626c
feat(scripts): pr verify harness v2 — raw playwright + security harde…
valentinpalkovic May 11, 2026
064bd40
feat(scripts): add --port flag to verify-pr harness
valentinpalkovic May 11, 2026
175cfc0
feat(scripts): pr verify harness v3 — agent-generated recipes
valentinpalkovic May 11, 2026
7f3c95e
feat(scripts): pr verify harness v4 — ci activation + sdk dispatch
valentinpalkovic May 11, 2026
478ba47
feat(scripts): pr verify harness v5 — pin action SHAs for activation
valentinpalkovic May 11, 2026
51ca947
fix(scripts): correct MODEL_ID_MAP to use valid public Anthropic IDs
valentinpalkovic May 11, 2026
e3aae89
fix(scripts): pr verify harness — install deps + robust post-comment
valentinpalkovic May 11, 2026
2a1aa10
fix(scripts): pr verify harness — install bun on CI runner
valentinpalkovic May 11, 2026
a256f3a
fix(scripts): pr verify harness — upload artifact path glob
valentinpalkovic May 11, 2026
7b660d7
debug(scripts): pr verify harness — locate verify-output dir
valentinpalkovic May 11, 2026
0993b55
fix(scripts): pr verify harness — include hidden files in artifact
valentinpalkovic May 11, 2026
1ead1ab
feat(scripts): pr verify harness v5 — Dockerfile + image build
valentinpalkovic May 11, 2026
449b4c5
fix(scripts): pr verify harness — manual git clone for PR head checkout
valentinpalkovic May 11, 2026
32f2846
fix(scripts): pr verify harness — install xz-utils for Node tarball e…
valentinpalkovic May 11, 2026
6308bf3
fix(scripts): pr verify harness — idempotent user creation in Dockerfile
valentinpalkovic May 11, 2026
a7b1139
fix(scripts): pr verify harness — rename YARN_BIN env to HARNESS_YARN…
valentinpalkovic May 11, 2026
712d6be
fix(scripts): pr verify harness — copy eslint-plugin-local-rules port…
valentinpalkovic May 11, 2026
5593f74
fix(scripts): pr verify harness — chown workdir before yarn install
valentinpalkovic May 11, 2026
75745ce
fix(scripts): pr verify harness — compile cli packages before sandbox…
valentinpalkovic May 11, 2026
d3bcded
fix(scripts): pr verify harness — use 'cli' nx project name for cli-s…
valentinpalkovic May 11, 2026
499be3f
fix(scripts): pr verify harness — verify dispatcher.js, not top-level…
valentinpalkovic May 11, 2026
cfa4fc7
fix(scripts): pr verify harness — recompile core before sandbox dist …
valentinpalkovic May 11, 2026
6978e5f
revert(scripts): pr verify harness — drop Dockerfile + supply-chain s…
valentinpalkovic May 11, 2026
b3d229b
feat(scripts): pr verify harness v6 — local-first pipeline + target d…
valentinpalkovic May 11, 2026
526c91c
feat(workflow): pr verify harness v6 — runner-based verify-pr.yml
valentinpalkovic May 11, 2026
dc87af5
chore(docs): pr verify harness v6 README + RUNBOOK + SECURITY rewrite
valentinpalkovic May 11, 2026
28c551e
fix(workflow): pr verify harness v6 — force NX_NO_CLOUD on Verify PR …
valentinpalkovic May 11, 2026
95067cf
fix(workflow): pr verify harness v6 — compile every NX project for in…
valentinpalkovic May 11, 2026
53a89c7
fix(workflow): pr verify harness v6 — compile core before run-many fo…
valentinpalkovic May 11, 2026
29ac5e8
fix(workflow): pr verify harness v6 — clone PR head outside base chec…
valentinpalkovic May 11, 2026
a3976bf
fix(scripts): pr verify harness v6 — install playwright chromium + fi…
valentinpalkovic May 11, 2026
9665521
fix(workflow): pr verify harness v6 — auto-create verified-by-harness…
valentinpalkovic May 11, 2026
bfc3fad
fix(core): universal-store — suppress unhandled-rejection on follower…
valentinpalkovic May 11, 2026
33af576
feat(workflow): pr verify harness v6 — push screenshots to side branc…
valentinpalkovic May 11, 2026
47e15da
feat(workflow): pr verify harness v6 — single-round pivot (author + e…
valentinpalkovic May 12, 2026
73d7721
fix(workflow): pr verify harness v6 — move PR_HEAD_DIR off invalid jo…
valentinpalkovic May 12, 2026
6e159ed
fix(workflow): pr verify harness v6 — pin base checkout to branch ref…
valentinpalkovic May 12, 2026
18f650c
Core: Use UndoIcon for Review-changes clear button
valentinpalkovic May 11, 2026
246 changes: 246 additions & 0 deletions .agents/skills/verify-recipe-author/SKILL.md
@@ -0,0 +1,246 @@
---
name: verify-recipe-author
description: Generate the Playwright recipe spec from a `verify-pr-generate` prompt bundle. Reads `.verify-output/<runId>/prompt-bundle.json`, dispatches the OMC executor agent (model=opus), runs deny-regex, writes `.verify-recipes/pr-<#>.spec.ts` with header-comment provenance, lints with one retry, emits `.verify-output/<runId>/result.json`. Trigger after `yarn verify-pr-generate`.
allowed-tools: Agent, Bash, Read, Write, Edit
---

# Verify Recipe Author

Consumes a prompt bundle emitted by `yarn verify-pr-generate --pr <#>` and produces the per-PR Playwright recipe spec for human review. Authoring only — never executes the spec.

This skill is invoked **after** `yarn verify-pr-generate --pr <#>` succeeds. The bun script does the deterministic I/O (gh fetch, triage, prompt assembly, bundle write); this skill drives the agent dispatch, deny-regex, lint, retry, and the final write to `.verify-recipes/pr-<#>.spec.ts`.

The full design and acceptance criteria live in `/Users/valentinpalkovic/Projects/storybook/.omc/plans/pr-verify-v3-agent-generated-recipes.md` (§Lane C, §D6, §D8, §D9). Read the plan if anything below is ambiguous.

## Inputs

No args required. The skill discovers the most recent bundle automatically. The caller may optionally pass an explicit bundle path as the skill argument.

1. **Auto-discover (default)**: list `/Users/valentinpalkovic/Projects/storybook/.verify-output/`, pick the directory with the lexicographically largest name (ISO timestamps sort correctly), then read `prompt-bundle.json` inside it.
2. **Explicit path**: if the user passed an absolute path to a `prompt-bundle.json`, read that file directly.
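
The auto-discovery rule can be sketched as below. This is a minimal illustration, not the skill's actual implementation: `pickLatestRunDir` and `findLatestBundle` are hypothetical names, and the sketch assumes every run directory name is an ISO-style timestamp so that plain string comparison orders them chronologically.

```ts
import { readdirSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: return the lexicographically largest name,
// or undefined when the list is empty.
export function pickLatestRunDir(names: string[]): string | undefined {
  const sorted = [...names].sort(); // default sort compares strings by code unit
  return sorted.length > 0 ? sorted[sorted.length - 1] : undefined;
}

// Hypothetical helper: resolve prompt-bundle.json inside the newest run directory.
export function findLatestBundle(verifyOutputDir: string): string | undefined {
  const dirs = readdirSync(verifyOutputDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name);
  const latest = pickLatestRunDir(dirs);
  return latest ? join(verifyOutputDir, latest, "prompt-bundle.json") : undefined;
}
```

Keeping the selection logic in a pure function makes the ordering assumption easy to check without touching the filesystem.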

Bundle shape (see `scripts/verify-pr-generate.ts` for the canonical emitter):

```jsonc
{
"version": 1,
"prNumber": 12345,
"runId": "...",
"outputSpecPath": "/abs/path/.verify-recipes/pr-12345.spec.ts",
"force": false,
"prompt": "<full assembled prompt>",
"metadata": {
"agentModel": "claude-opus-4-7[1m]",
"referenceSpecs": ["..."],
"triageGlobs": ["..."],
"generatedAt": "<ISO>"
}
}
```

The `<runId>` is the parent directory of the bundle — derive it from the bundle path, not from a field.
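
As a sketch of the two points above, the bundle shape can be typed and the run ID derived from the path. The field names are taken from the example bundle; `PromptBundle` and `runIdFromBundlePath` are hypothetical names used only for illustration, and the canonical shape remains whatever `scripts/verify-pr-generate.ts` emits.

```ts
import { basename, dirname } from "node:path";

// Mirrors the documented example bundle; not the canonical type.
export interface PromptBundle {
  version: number;
  prNumber: number;
  runId: string;
  outputSpecPath: string;
  force: boolean;
  prompt: string;
  metadata: {
    agentModel: string;
    referenceSpecs: string[];
    triageGlobs: string[];
    generatedAt: string;
  };
}

// Hypothetical helper: the run ID is the name of the directory that holds
// the bundle, so it is read from the path rather than from bundle.runId.
export function runIdFromBundlePath(bundlePath: string): string {
  return basename(dirname(bundlePath));
}
```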

## Runbook

Follow these steps in order. Stop and emit `result.json` per §Failure Modes on any non-success outcome.

### Step 1 — Read the bundle

`Read` the bundle JSON. Capture `prNumber`, `runId` (from the parent dir), `outputSpecPath`, `force`, `prompt`, and `metadata`.

### Step 2 — Pre-flight collision check (D9, TOCTOU re-guard)

Re-check whether `bundle.outputSpecPath` already exists. The bun script enforced D9 at bundle-emit time; the skill re-checks because the user may have created the file between the two steps.

- If the file exists and `bundle.force === false` → write `result.json` with `{ status: "collision-aborted", specPath: <path>, attempts: 0 }` and stop.
- Otherwise proceed.
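
A minimal sketch of this re-check, assuming the result payload shown above. `preflightCollision` is a hypothetical name, and the `exists` parameter is injected only so the logic can be exercised without a real file.

```ts
import { existsSync } from "node:fs";

type CollisionCheck =
  | { proceed: true }
  | {
      proceed: false;
      result: { status: "collision-aborted"; specPath: string; attempts: 0 };
    };

// Hypothetical helper: abort only when the spec already exists and force is off.
export function preflightCollision(
  outputSpecPath: string,
  force: boolean,
  exists: (path: string) => boolean = existsSync
): CollisionCheck {
  if (exists(outputSpecPath) && !force) {
    return {
      proceed: false,
      result: { status: "collision-aborted", specPath: outputSpecPath, attempts: 0 },
    };
  }
  return { proceed: true };
}
```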

### Step 3 — Dispatch the agent (attempt 1)

```
Agent({
description: "Generate PR recipe spec",
subagent_type: "oh-my-claudecode:executor",
model: "opus",
prompt: bundle.prompt
})
```

The bundle's `prompt` already contains the full authoring contract, reference specs, PR diff, and fence-marker instruction (`<<<SPEC_START>>>` … `<<<SPEC_END>>>`).

### Step 4 — Extract spec body

Parse the agent's reply for the text strictly between `<<<SPEC_START>>>` and `<<<SPEC_END>>>` (exclusive of the markers, trimmed).

- If both markers are present and the body is non-empty → continue to Step 5.
- Otherwise → treat as a fence-miss. On attempt 1, jump to Step 9 with a retry message asking the agent to re-emit the spec between the fence markers and nothing else. If attempt 2 is also a fence-miss → write `result.json` `{ status: "agent-emitted-no-spec", attempts: 2 }` and stop.
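
The extraction rule might look like the sketch below. `extractSpecBody` is a hypothetical name, and using the first occurrence of each marker is an assumption of this sketch; the text above does not say how repeated markers are handled.

```ts
const SPEC_START = "<<<SPEC_START>>>";
const SPEC_END = "<<<SPEC_END>>>";

// Returns the trimmed text between the markers, or null on a fence-miss
// (a marker is missing, the end marker precedes the start, or the body is empty).
export function extractSpecBody(reply: string): string | null {
  const start = reply.indexOf(SPEC_START);
  if (start === -1) return null;
  const bodyStart = start + SPEC_START.length;
  const end = reply.indexOf(SPEC_END, bodyStart);
  if (end === -1) return null;
  const body = reply.slice(bodyStart, end).trim();
  return body.length > 0 ? body : null;
}
```

A `null` return corresponds to the fence-miss branch; any non-null value continues to Step 5.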

### Step 5 — Deny-regex (security gate, NO retry)

Run the deny-regex pure function from `/Users/valentinpalkovic/Projects/storybook/scripts/verify/recipe-deny.ts` via Bash, executed from the repo root:

```bash
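# Read the body from the temp file (path passed via env var) rather than argv.
DENY_INPUT="/Users/valentinpalkovic/Projects/storybook/.verify-output/<runId>/.deny-input.txt" bun -e "import('/Users/valentinpalkovic/Projects/storybook/scripts/verify/recipe-deny.ts').then(async (m) => { m.assertNoDeniedPatterns(await Bun.file(process.env.DENY_INPUT).text()); })"
```

Pass the spec body via a temp file to avoid shell-escaping pitfalls: write the body to `.verify-output/<runId>/.deny-input.txt` before running the one-liner, which reads it back. The function throws on a hit.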

- On any throw → write `result.json` `{ status: "deny-regex-failed", error: <message>, attempts: <current> }` and stop. **Do not retry — deny hits are security blockers.**
- On exit 0 → continue.

### Step 6 — Prepend header-comment provenance (D8)

Build a JSON-pretty block from `bundle.metadata` and `bundle.prNumber`, then prepend it as a block comment to the spec body:

```ts
/**
* verify-pr-generate: AUTO-GENERATED — review and commit alongside the PR
* {
* "generatedAt": "<metadata.generatedAt>",
* "agentModel": "<metadata.agentModel>",
* "prNumber": <bundle.prNumber>,
* "referenceSpecs": [ "<path>", ... ],
* "triageGlobs": [ "<glob>", ... ]
* }
*/
<spec body>
```

The JSON body inside the comment uses 2-space indentation. Every line of the embedded JSON begins with ` * ` to keep the block-comment well-formed.
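
A sketch of how the header could be assembled, assuming the field order shown above. `buildProvenanceHeader` is a hypothetical name. One caveat this sketch handles on its own initiative: glob strings such as `**/*.ts` contain `*/`, which would close the block comment early, so the slash is written as the JSON escape `\/` (still valid JSON that parses back to `/`).

```ts
interface ProvenanceInput {
  prNumber: number;
  metadata: {
    generatedAt: string;
    agentModel: string;
    referenceSpecs: string[];
    triageGlobs: string[];
  };
}

// First header line, copied from the Step 6 example.
const TITLE = " * verify-pr-generate: AUTO-GENERATED — review and commit alongside the PR";

// Hypothetical helper: build the block comment that is prepended to the spec body.
export function buildProvenanceHeader({ prNumber, metadata }: ProvenanceInput): string {
  const json = JSON.stringify(
    {
      generatedAt: metadata.generatedAt,
      agentModel: metadata.agentModel,
      prNumber,
      referenceSpecs: metadata.referenceSpecs,
      triageGlobs: metadata.triageGlobs,
    },
    null,
    2 // 2-space indentation, as required above
  );
  // Assumption of this sketch: escape "*/" so it cannot terminate the comment.
  const safe = json.replace(/\*\//g, "*\\/");
  const jsonLines = safe.split("\n").map((line) => ` * ${line}`);
  return ["/**", TITLE, ...jsonLines, " */"].join("\n");
}
```

The spec file content would then be the returned header, a newline, and the spec body.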

### Step 7 — Write the file

`Write` the assembled content (header + spec body) to `bundle.outputSpecPath` (absolute path from the bundle).

### Step 8 — Pipe to `verify-pr-author` (D4-α sentinel-exit-75 contract)

Lint, deny-regex, post-write regex checks, header-comment provenance, retry-message
authoring, and result.json emission all live in TypeScript core. The skill's job is
strictly to dispatch the agent and pipe its raw reply into the author CLI.

#### 8a. Dispatch the agent (attempt 1)

```
Agent({
description: "Generate PR recipe spec",
subagent_type: "oh-my-claudecode:executor",
model: "opus",
prompt: bundle.prompt
})
```

Capture the agent's full reply as `<reply>`.

#### 8b. Pipe to `verify-pr-author --bundle <bundle-path> --dispatch-mode stdin`

```bash
printf '%s' "$REPLY" | node /Users/valentinpalkovic/Projects/storybook/scripts/verify-pr-author.ts --bundle <abs-bundle-path> --dispatch-mode stdin
```

The CLI runs the deny-regex, header-comment provenance, file write, scoped lint
(`scripts/verify/lint-invocation.ts`), and post-write regex checks. Exit codes:

- `0` — success. CLI has already written the spec and `result.json`. Skip to Step 10.
- `1` — non-retryable error (deny-regex hit, collision, IO error). CLI has written
`result.json` with the failure status. Print the failure line and stop.
- `75` — retryable lint/structural failure. The CLI has emitted a framed retry
block on stdout (see Step 9). Continue to Step 9.

Treat **exit 75** as the sole retry sentinel. Any other non-zero exit is terminal.
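
The exit-code contract can be sketched as a small classifier plus a wrapper that feeds the reply to the CLI on stdin. `classifyAuthorExit` and `runAuthorCli` are hypothetical names. The wrapper uses Node's `spawnSync` with the `input` option instead of a shell pipe, which is a choice of this sketch rather than something the runbook prescribes.

```ts
import { spawnSync } from "node:child_process";

type AuthorOutcome = "success" | "retry" | "terminal";

// 0 is success, 75 is the sole retry sentinel, anything else is terminal.
export function classifyAuthorExit(exitCode: number): AuthorOutcome {
  if (exitCode === 0) return "success";
  if (exitCode === 75) return "retry";
  return "terminal";
}

// Hypothetical wrapper: pass the agent reply on stdin and classify the result.
export function runAuthorCli(
  scriptPath: string,
  bundlePath: string,
  reply: string
): { outcome: AuthorOutcome; stdout: string } {
  const proc = spawnSync(
    "node",
    [scriptPath, "--bundle", bundlePath, "--dispatch-mode", "stdin"],
    { input: reply, encoding: "utf8" }
  );
  // status is null when the process was killed by a signal or failed to
  // start; this sketch treats that as terminal.
  return { outcome: classifyAuthorExit(proc.status ?? 1), stdout: proc.stdout ?? "" };
}
```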

### Step 9 — Retry once (attempt 2, sentinel-exit-75)

On exit 75 from Step 8b, parse stdout for the framed retry block:

```
===VERIFY_PR_AUTHOR_RETRY_BEGIN===
<retryMessage payload — already categorized and capped at 5 errors>
===VERIFY_PR_AUTHOR_RETRY_END===
```

Extract the lines strictly between the BEGIN/END markers and assemble the retry
prompt as:

```
<bundle.prompt>

[RETRY]
<retryMessage>
```
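
Parsing the framed block and assembling the retry prompt might be sketched as follows. `parseRetryMessage` and `assembleRetryPrompt` are hypothetical names, and the sketch assumes each marker sits alone on its own line.

```ts
const RETRY_BEGIN = "===VERIFY_PR_AUTHOR_RETRY_BEGIN===";
const RETRY_END = "===VERIFY_PR_AUTHOR_RETRY_END===";

// Returns the lines strictly between the markers, or null if the frame is absent.
export function parseRetryMessage(stdout: string): string | null {
  const lines = stdout.split(/\r?\n/);
  const begin = lines.indexOf(RETRY_BEGIN);
  if (begin === -1) return null;
  const end = lines.indexOf(RETRY_END, begin + 1);
  if (end === -1) return null;
  return lines.slice(begin + 1, end).join("\n");
}

// Original prompt, a blank line, the [RETRY] tag, then the retry message.
export function assembleRetryPrompt(bundlePrompt: string, retryMessage: string): string {
  return `${bundlePrompt}\n\n[RETRY]\n${retryMessage}`;
}
```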

Dispatch the agent again with the assembled retry prompt (same `subagent_type` and
`model`). Capture the new reply as `<reply2>`, then pipe it back through the CLI in
retry mode:

```bash
printf '%s' "$REPLY2" | node /Users/valentinpalkovic/Projects/storybook/scripts/verify-pr-author.ts --bundle <abs-bundle-path> --dispatch-mode stdin --retry-of <runId>
```

The CLI enforces `RECIPE_RETRY_POLICY.maxAttempts` (currently 2) — it will not
re-emit exit 75 on the retry call. Expected exits:

- `0` — success. Skip to Step 10.
- `1` — terminal failure (lint exhausted, regex-check exhausted, deny-regex hit).
CLI has written `result.json` with `attempts: 2` and the failure status. Print
the failure line and stop.

### Step 10 — Emit `result.json` on success

Write `/Users/valentinpalkovic/Projects/storybook/.verify-output/<runId>/result.json`:

```jsonc
{
"version": 1,
"status": "spec-written",
"specPath": "<abs path>",
"attempts": 1 | 2,
"lint": "clean",
"regex": { "listenerBeforeGoto": true, "attachPattern": true },
"agentModel": "<bundle.metadata.agentModel>",
"generatedAt": "<ISO now>"
}
```

### Step 11 — Print actionable next-step lines

Include these lines (one per line) in the agent's final reply:

```
[verify-recipe-author] spec written: <abs spec path>
[verify-recipe-author] result.json: <abs result.json path>
[verify-recipe-author] attempts: <n> | lint: clean
[verify-recipe-author] Next: review the spec, then run `yarn verify-pr --recipe-spec <spec path>`
```

## Failure Modes

Always write `result.json` to `.verify-output/<runId>/result.json` before stopping. Schema is the same as Step 10 with the `status` field swapped.

| Cause | `result.json.status` | Retry? |
|---|---|---|
| `outputSpecPath` exists and `force === false` | `collision-aborted` | no |
| Agent reply lacks fence markers / empty body | `agent-emitted-no-spec` | yes (1 retry) |
| `assertNoDeniedPatterns` throws | `deny-regex-failed` | **NO — security block** |
| Lint exit non-zero | `lint-failed` | yes (1 retry) |
| Post-write regex check failed (listener-before-goto OR attach pattern) | `regex-check-failed` | yes (1 retry, fed back as lint-equivalent) |
| All gates pass | `spec-written` | n/a |
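
The table can be encoded as data, as in the sketch below. `ResultStatus`, `RETRYABLE`, and `shouldRetry` are hypothetical names. The attempt cap is passed in as a parameter because the real value lives in `scripts/verify/recipe-retry-policy.ts`.

```ts
type ResultStatus =
  | "collision-aborted"
  | "agent-emitted-no-spec"
  | "deny-regex-failed"
  | "lint-failed"
  | "regex-check-failed"
  | "spec-written";

// Retry column of the table; deny-regex hits are never retried.
const RETRYABLE: Record<ResultStatus, boolean> = {
  "collision-aborted": false,
  "agent-emitted-no-spec": true,
  "deny-regex-failed": false,
  "lint-failed": true,
  "regex-check-failed": true,
  "spec-written": false,
};

// A retry happens only for retryable statuses while attempts remain.
export function shouldRetry(status: ResultStatus, attempt: number, maxAttempts: number): boolean {
  return RETRYABLE[status] && attempt < maxAttempts;
}
```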

Print the failure cause + the `result.json` path on stop:

```
[verify-recipe-author] FAILED: <status> — see <abs result.json path>
```

## Notes

- This skill runs inside Claude Code; it uses `Agent`, `Read`, `Write`, `Bash`, and `Edit` tools.
- All paths in invocations are absolute. Lint commands run from `code/` via `yarn --cwd`.
- Max attempts = `RECIPE_RETRY_POLICY.maxAttempts` (currently 2). Read the value from `scripts/verify/recipe-retry-policy.ts` — do not hardcode.
- The skill **never executes** the generated spec. The human review gate (Phase-1 lethal-trifecta breaker) is preserved.
- Deny-regex hits are not retried — they are security signals, not lint nits.
- Cap retry feedback at 5 errors (R3).
- The `runId` is the basename of the parent directory of the bundle; do not invent a new one.

## Phase-2 follow-up

This skill currently couples generation to a running Claude Code session via the `Agent` tool dispatch. Phase-2 CI activation will require migrating to a direct Anthropic SDK call (`@anthropic-ai/sdk`) with an `ANTHROPIC_API_KEY` env var, replacing the `Agent` dispatch with a standalone API call so the workflow at `.github/workflows/verify-pr.yml` can run unattended. Tracked as a follow-up in the plan's ADR §Follow-ups.
1 change: 1 addition & 0 deletions .claude/skills/verify-recipe-author/SKILL.md
@@ -0,0 +1 @@
@../../../.agents/skills/verify-recipe-author/SKILL.md
19 changes: 19 additions & 0 deletions .dockerignore
@@ -0,0 +1,19 @@
.env
.env.*
**/.env
**/.env.*
~/.ssh/
~/.aws/
~/.config/gcloud/
~/.azure/
~/.docker/config.json
~/.kube/config
.npmrc
.pypirc
**/*-service-account.json
**/*.pem
**/*.key
~/.git-credentials
.verify-output/
node_modules/
.nx/