try: storybookjs/storybook#34767 — UndoIcon for Review-changes clear#13
valentinpalkovic wants to merge 39 commits
Single-template (react-vite/default-ts), single-story
(example-button--primary) PR verification entry script with 6 helpers
under scripts/verify/.
Flow: compile core -> symlink code/core/dist into NX-cached sandbox ->
boot Storybook on :6006 -> Playwright capture via SbPage from
code/e2e-tests/util.ts -> emit verify-result.json + iframe-clipped
screenshot under .verify-output/<runId>/.
Helpers:
- core.ts: types, run-path math, computeVerdict, pruneOldRuns(10)
- symlink.ts: lifted EPERM/EEXIST cp fallback from
scripts/tasks/sandbox-parts.ts:43-79 + net-new dangling-symlink heal
- sandbox.ts: multi-base resolveSandboxDir (code/sandbox, sandbox,
../storybook-sandboxes, STORYBOOK_SANDBOX_ROOT override),
snapshot/restore, sanitizeResolutions
- sync.ts: yarn nx compile core (run from repoRoot) + symlink dist
- boot.ts: cross-platform port preflight, idempotent SIGINT/SIGTERM
handlers, dual wait-on iframe.html + index.html (uses
node:child_process.spawn per repo lint policy)
- capture.ts: page.on('pageerror'/'console') registered before goto,
iframe-clipped screenshot
Run via `yarn verify-pr` (uses bun for native TS exec — node
strip-types rejects transitive enums in cli/projectTypes.ts).
Verification:
- V-1 sanity: verdict=verified, ~8s wall-time (well under 90s SLO)
- V-2 regression: VERIFY_HARNESS_TEST sentinel detected at compile,
exit 1
Plan: .omc/plans/pr-verify-poc-mvp.md
Research: .omc/research/research-20260508-prverify/report.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
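The helper list names `pruneOldRuns(10)` without showing its retention logic. A minimal sketch of how such a helper might pick directories to delete is below. The function name `selectRunsToPrune` and the run-ID format are hypothetical, and the sketch assumes run IDs sort lexicographically in creation order (for example, timestamp-based IDs); the PR does not confirm either.

```typescript
// Hypothetical sketch: which run directories a pruneOldRuns(keep) call might delete.
// Assumption: run IDs sort lexicographically in creation order (e.g. timestamps).
function selectRunsToPrune(runIds: readonly string[], keep: number): string[] {
  if (!Number.isInteger(keep) || keep < 0) {
    throw new RangeError(`keep must be a non-negative integer, got ${keep}`);
  }
  // Sort a copy newest-first so the caller's array is not mutated.
  const newestFirst = [...runIds].sort().reverse();
  // Everything past the first `keep` entries is eligible for deletion.
  return newestFirst.slice(keep);
}
```

Returning the list instead of deleting in place keeps the selection testable without touching the file system; the actual removal would be a separate step.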
…ning
Pivot from custom Chromium launch (capture.ts) to spawning `bun x playwright test` against committed specs under `.verify-recipes/`. Trace artifacts are produced by Playwright's built-in tracing API and replayable via `npx playwright show-trace`. Schema bumped to v2 with per-test results, attached pageErrors/consoleErrors, and trace paths sourced from the Playwright JSON report contract.
Adds Phase-1 security hardening: `.claude/settings.json` deny rules (local), `.dockerignore` for credential exclusion, `SECURITY.md` with phase-gated threat model and isolation matrix, and a gated `.github/workflows/verify-pr.yml` (if: false) scaffolding the Phase-2 container/proxy shape.
Recipe-local `RecipePage` (`.verify-recipes/_util.ts`) reimplements only the subset of `SbPage` needed for verify recipes, because Playwright's Node worker processes cannot strip the non-erasable TS enums reached transitively from `code/e2e-tests/util.ts`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Allow overriding the Storybook port (default 6006) so the harness can run alongside other processes that already occupy 6006. baseURL, preflightPort, bootStorybook, and the --resync alive-check are all threaded through the resolved port. Validates that the value parses as an integer in 1..65535.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
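The validation rule (integer in 1..65535, default 6006) can be sketched as follows. This is an illustration under stated assumptions: the name `resolvePort` is hypothetical, and the commit does not say whether the override arrives through a flag or an environment variable, so the sketch simply takes the raw string.

```typescript
// Hypothetical sketch of the port validation described above.
// Assumption: an absent or empty value falls back to the default of 6006.
const DEFAULT_PORT = 6006;

function resolvePort(raw: string | undefined): number {
  if (raw === undefined || raw === '') return DEFAULT_PORT;
  // Require digits only, so inputs like "6006abc", "60.5", or "-1" are rejected.
  if (!/^\d+$/.test(raw)) {
    throw new Error(`Invalid port "${raw}": expected an integer in 1..65535`);
  }
  const port = Number(raw);
  if (port < 1 || port > 65535) {
    throw new Error(`Invalid port "${raw}": expected an integer in 1..65535`);
  }
  return port;
}
```

Checking the string shape before converting avoids `Number.parseInt` quietly accepting trailing garbage.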
Adds a two-step flow on top of the v2 raw-Playwright runner:
  yarn verify-pr-generate --pr <#>    # emit prompt bundle
  Skill: verify-recipe-author         # dispatch executor, write spec
  yarn verify-pr --recipe-spec ...    # run committed spec
The generator script does deterministic I/O only:
- gh pr fetch
- triage routing (19 path globs in scripts/verify/recipes/triage-table.ts mapping addon/manager/csf-tools/builder/framework/renderer changes to reference specs under code/e2e-tests/)
- per-file 500-line cap with 20-file total cap, sorted triage-matched first
- prompt-bundle emission
The script never dispatches an agent and never writes the final spec.
The verify-recipe-author skill (.agents/skills/verify-recipe-author/SKILL.md with redirect at .claude/skills/...) consumes the bundle and then:
- dispatches the oh-my-claudecode:executor subagent (model=opus)
- runs a security deny-regex guard (recipe-deny.ts: child_process, fs.unlink/rm, process.exit, eval, node: imports)
- prepends a header-comment provenance block to the agent output
- writes .verify-recipes/pr-<#>.spec.ts
- lints via yarn --cwd code lint:js:cmd with one categorized retry (recipe-retry-policy.ts: maxAttempts=2, errorCategories=[listener-before-goto, attach-pattern, imports])
- runs post-write regex checks for the listener-before-goto and testInfo.attach pattern invariants
- emits result.json
Spec-name collision = fail unless --force; the human-review gate from v2's SECURITY.md is preserved (the skill never executes its output).
The authoring guide at .verify-recipes/_recipe-authoring-guide.md is the agent's contract: import surface, listener-before-goto rule, attach pattern, RecipePage API, what to avoid, story URL routing, and per-change assertion shapes.
Verification: structural ACs (V3-6, V3-7, V3-9, V3-10) pass via grep against the new files; AC-V3-1 (generator exit 0 + bundle written + next-step printed) and AC-V3-5 (committed spec runs end-to-end via verify-pr, schema v2 verdict emitted with trace.zip and per-test attachments) ran clean against PR storybookjs#34737 (manager-api/modules/stories.ts); AC-V3-3/V3-4 (listener-before-goto + attach-pattern regex) and AC-V3-8 (deny-regex aborts on child_process) verified directly.
Phase-1 security model unchanged: spec-review gate is the lethal-trifecta breaker; the bun script + skill make that gate easy to apply, but never substitute for it. Phase-2 CI activation will require migration to a direct Anthropic SDK call with API-key handling, tracked in the SECURITY.md / README roadmap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
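The deny-regex guard is described only by the constructs it rejects (child_process, fs.unlink/rm, process.exit, eval, node: imports). A minimal sketch of that kind of check is below. The pattern list, the names `DENY_PATTERNS` and `findDeniedConstructs`, and the exact regular expressions are assumptions for illustration; the real recipe-deny.ts may differ.

```typescript
// Hypothetical sketch of a deny-regex guard over agent-authored spec source.
// The categories come from the commit message; the regexes are assumptions.
const DENY_PATTERNS: ReadonlyArray<{ name: string; pattern: RegExp }> = [
  { name: 'child_process', pattern: /\bchild_process\b/ },
  { name: 'fs.unlink/rm', pattern: /\bfs\s*\.\s*(unlink|rm)\w*\s*\(/ },
  { name: 'process.exit', pattern: /\bprocess\s*\.\s*exit\s*\(/ },
  { name: 'eval', pattern: /\beval\s*\(/ },
  { name: 'node: import', pattern: /from\s+['"]node:|require\(\s*['"]node:/ },
];

// Returns the names of every denied construct found; an empty array means "pass".
function findDeniedConstructs(source: string): string[] {
  return DENY_PATTERNS.filter(({ pattern }) => pattern.test(source)).map(({ name }) => name);
}
```

A text-level guard like this is easy to bypass with obfuscation, which is consistent with the message's point that the human spec-review gate, not the regex, is the actual safeguard.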
Adds a headless authoring path (yarn verify-pr-author) that consumes the
v3 prompt bundle and dispatches Claude directly via @anthropic-ai/sdk
(single-block prompt caching on guide + canonical smoke). Skill and CI
script share scripts/verify/recipe-author-core.ts so they cannot drift.
Three lanes:
- Lane A — scripts/verify/agent-dispatch.ts (SDK + MODEL_ID_MAP +
transport retry + stub mode + DEBUG redaction), recipe-author-core.ts
(TOCTOU -> dispatch -> deny-regex -> D8 header -> lint -> regex ->
categorize + retry), verify-pr-author.ts CLI with --dispatch-mode
{sdk|stdin} and --retry-of (D4-α EX_TEMPFAIL=75 sentinel),
recipe-retry-policy.ts extension (categorizeEslintViolations +
formatRetryMessage), three stub fixtures, @anthropic-ai/sdk 0.65.0
exact-pinned.
- Lane B — .github/workflows/verify-pr.yml flipped from if:false to
label-gated (ci:verify) + !draft + actor-permission-action; Generate
bundle + Author recipe steps added on bare runner with
ANTHROPIC_API_KEY scoped to Author recipe env only; spec-runner
container keeps --network=none and never sees the key; proxy.sock
mount removed (Envoy deferred to v5). SECURITY.md Phase-2 section +
README two-paths section.
- Lane C — scripts/verify/lint-invocation.ts wrapper (eslint via
require.resolve('eslint/package.json') + bin/eslint.js, --no-eslintrc
--no-ignore --resolve-plugins-relative-to repo-root); D3-E dedicated
recipe eslintrc (parserOptions.project:false, non-typed recommended,
argsIgnorePattern:'none'); SKILL.md Step 8 rewritten for the D4-α
retry contract.
Verification (10 acceptance criteria):
- AC-V4-2/4/5/6/7a/7b/8/9/10 PASS end-to-end against the existing v3
bundle. AC-V4-1 and AC-V4-3b gated on a live ANTHROPIC_API_KEY (CI
verification mandatory; local optional). AC-V4-3a passes 9/9
buildAnthropicRequest shape checks.
- AC-V4-7a SHA-256 parity: stdin + sdk paths produce byte-identical
specs (D8 header generatedAt pinned to bundle.metadata.generatedAt).
- AC-V4-9 redaction: dispatch-request.json contains no x-api-key /
authorization / sk-ant- substrings.
- AC-V4-10 retry: stdin attempt 1 exits 75 with framed retry block +
result.partial.json; stdin --retry-of <runId> attempt 2 exits 0 with
attempts=2.
scripts/verify-pr.ts (runner) untouched (frozen this increment). Envoy
credential-injector and author_association gating deferred to v5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
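The AC-V4-9 redaction check (no `x-api-key` / `authorization` / `sk-ant-` substrings in dispatch-request.json) amounts to a substring scan over the serialized request. A minimal sketch follows; the function name `findSecretMarkers` is hypothetical, and the case-insensitive match is an assumption, since the commit does not say how the check compares strings.

```typescript
// Hypothetical sketch of the AC-V4-9 redaction assertion.
// The three markers come from the commit message; case-insensitivity is an assumption.
const SECRET_MARKERS = ['x-api-key', 'authorization', 'sk-ant-'] as const;

// Returns every marker that still appears in the serialized debug dump.
function findSecretMarkers(serialized: string): string[] {
  const haystack = serialized.toLowerCase();
  return SECRET_MARKERS.filter((marker) => haystack.includes(marker));
}
```

In use, a caller would read the dumped JSON as text and fail the check when the returned array is non-empty.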
Resolve placeholder SHAs for the four third-party actions in .github/workflows/verify-pr.yml to commit SHAs of their latest stable releases. Required activation gate before the harness can fire in CI.
- prince-chrismc/check-actor-permissions-action: v3.0.2
- actions/checkout: v6.0.2
- actions/upload-artifact: v7.0.1
- actions/github-script: v9.0.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous map pointed every entry at claude-opus-4-5-20250929, which returns 404 from the Anthropic API. Update to current public IDs:
- claude-opus-4-7[1m] / claude-opus-4-7 → claude-opus-4-7
- claude-opus-4-6 → claude-opus-4-6
- claude-opus-4-5 → claude-opus-4-5-20251101 (correct snapshot)
Update MODEL_MAX_TOKENS keys to match.
Verified live AC-V4-1 (spec written) and AC-V4-3b (cache_read_input_tokens=4358 >= 1024) against PR storybookjs#34761.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
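The alias-to-ID mapping listed in this commit can be pictured as a small lookup table. The sketch below uses the aliases and IDs exactly as the commit message states them; the shape of `MODEL_ID_MAP`, the helper name `resolveModelId`, and the choice to throw on an unknown alias are assumptions, not the contents of agent-dispatch.ts.

```typescript
// Hypothetical sketch of the corrected alias map; entries are taken from the commit message.
const MODEL_ID_MAP: Readonly<Record<string, string>> = {
  'claude-opus-4-7[1m]': 'claude-opus-4-7',
  'claude-opus-4-7': 'claude-opus-4-7',
  'claude-opus-4-6': 'claude-opus-4-6',
  'claude-opus-4-5': 'claude-opus-4-5-20251101',
};

// Assumption: an unknown alias fails fast instead of being passed through to the API.
function resolveModelId(alias: string): string {
  // Own-property check so inherited keys such as "toString" are not treated as aliases.
  if (!Object.prototype.hasOwnProperty.call(MODEL_ID_MAP, alias)) {
    throw new Error(`Unknown model alias: ${alias}`);
  }
  return MODEL_ID_MAP[alias] as string;
}
```

Failing locally on an unmapped alias would surface a bad entry before the request is sent, rather than as a 404 from the API as happened with the previous map.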
Two activation-blocking bugs surfaced by A4 label-fire test on valentinpalkovic/storybook fork:
1. Generate bundle step failed with "Couldn't find the node_modules state file" because the workflow never ran yarn install after checkout. Add the standard `./.github/actions/setup-node-and-install` composite step between Checkout and Fetch PR diff.
2. Post PR comment hard-failed with ENOENT on `.verify-output/latest/verify-result.json`. The harness writes timestamped dirs and never creates a `latest` symlink, so the path was wrong on every run, not just failures. Replace with a sort-newest-first scan of `.verify-output/*/verify-result.json` and degrade gracefully when no verdict exists (workflow failed before harness ran), so the comment always posts a useful status.
Remaining gap: `Run harness in container` step references `verify-harness:pinned-sha` which has no Dockerfile in repo and is not built anywhere in the workflow. Tracked as next activation gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
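The second fix (scan newest-first, degrade gracefully when no verdict exists) can be sketched as a pure selection function. The names `pickNewestVerdict` and `VerdictLookup` are hypothetical, the existence check is injected as a callback so the sketch has no file-system dependency, and it assumes the timestamped directory names sort lexicographically in time order.

```typescript
// Hypothetical sketch of the newest-first verdict lookup described in fix 2.
// Assumption: run directory names are timestamps that sort lexicographically.
type VerdictLookup =
  | { found: true; path: string }
  | { found: false; message: string };

function pickNewestVerdict(
  runDirs: readonly string[],
  hasResult: (dir: string) => boolean, // e.g. a wrapper around an fs existence check
): VerdictLookup {
  const newestFirst = [...runDirs].sort().reverse();
  const hit = newestFirst.find((dir) => hasResult(dir));
  if (hit === undefined) {
    // Degrade gracefully: the comment step can still post a status.
    return { found: false, message: 'No verdict produced' };
  }
  return { found: true, path: `.verify-output/${hit}/verify-result.json` };
}
```

Returning a tagged result instead of throwing lets the PR comment step render a "no verdict" status in the same code path as a real verdict.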
A4 label-fire test on fork run #25673185333 failed at Generate bundle with "command not found: bun". The verify-pr-generate and verify-pr yarn scripts (in package.json:40,42) invoke bun directly. The composite setup-node-and-install action provisions Node/Yarn but not Bun, so add oven-sh/setup-bun pinned to v2.2.0 between Node setup and Fetch PR diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upload-artifact v7 emitted "No files were found with the provided path: .verify-output/*/" on A4 run #25673778823 despite the dir existing. The trailing-slash dir-glob isn't accepted as a file pattern in v7. Replace with the directory path, which uploads the whole tree. Add explicit `if-no-files-found: warn` so future glob drift surfaces as a warning rather than silent zero artifacts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A4 run #25674121554 confirmed .verify-output/ exists with the prompt bundle inside, but upload-artifact v7 silently skipped it because the default include-hidden-files: false rejects dot-prefixed paths. Set include-hidden-files: true. Drop the temporary debug step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the v5-0 gap where the `Run harness in container` workflow step
referenced `verify-harness:pinned-sha` with no Dockerfile in repo. The
harness can now produce verdicts in CI.
Implementation follows the ralplan-approved design at
`.omc/plans/v5-0-dockerfile.md` (4 iterations to consensus APPROVE from
Architect + Critic under DELIBERATE mode).
What lands:
- `scripts/verify/Dockerfile` — multi-stage build pinned by SHA digest
(Playwright v1.58.2-jammy base + Bun 1.3.0-slim via `COPY --from=`).
Pre-bakes node_modules + code/core/dist + react-vite/default-ts sandbox
so the runtime container can satisfy `--read-only` + `--network=none`.
Corepack is bypassed — yarn invoked directly via `node $YARN_BIN`.
Bakes `HEAD_SHA` for runtime drift detection.
- `scripts/verify/harden-build-context.sh` +
`scripts/verify/strip-lifecycle-scripts.mjs` — supply-chain hardening
that runs on the bare runner before `docker build`. Overlays trusted
`.dockerignore` / `.yarnrc.yml` / `.yarn/releases/` from base-sha,
strips lifecycle scripts from every workspace `package.json`,
normalises `packageManager`, deletes head-supplied `.npmrc`,
diff-asserts `Dockerfile` byte-identity. Walker is hardened with
symlink-skip, max-depth, 1 MB file-size cap, 60s timeout, and
prototype-chain hygiene.
- `.github/workflows/verify-pr.yml` — adds `Checkout PR head`,
`Spec precheck`, `Harden build context`, `Build harness image`
(with per-PR cache scope), `Smoke test image` (digest fail-closed),
`Run harness in container` (named container), and `Mirror tmpfs
output` (no `|| true` on the load-bearing copy).
- `.github/actions/verify-spec-precheck/action.yml` — extracted
composite action so v5-1's first-time-use UX can swap internals
without touching the workflow shape.
- `scripts/verify/core.ts` — adds `writeRegressionResult()` helper plus
optional `regressionReason` / `inContainer` / `imageDigest` /
`headSha` fields (schemaVersion unchanged).
- `scripts/verify-pr.ts` — honours `VERIFY_HARNESS_IN_CONTAINER=1` at
every sandbox-prep call site; rejects `--resync` in-container;
asserts `HEAD_SHA` via the new helper, warn-and-skip when
`VERIFY_HARNESS_EXPECTED_HEAD_SHA` is unset (laptop dev mode).
- `scripts/verify/playwright.config.ts` — chromium-only projects.
- `renovate.json` — tracks Playwright + Bun digests on weekly schedule.
- `scripts/verify/SECURITY.md` § Image-build provenance — documents
every supply-chain control plus the residual `GITHUB_TOKEN`-in-buildx
risk and its v5-1 job-split mitigation.
- `scripts/verify/RUNBOOK.md` — diagnosis playbook for failure signals.
- `scripts/verify/__tests__/` — four integration tests covering the
short-circuit, sandbox-root env, head-sha assertion, and hadolint.
Known residual risk (documented in SECURITY.md, deferred to v5-1):
`GITHUB_TOKEN` remains in the buildx daemon's process env on the
build step. The mitigation stack (lifecycle-script stripping,
`enableScripts: false`, `.npmrc` purge, corepack bypass, per-PR cache
scope, Dockerfile byte-identity) defends `yarn install` against
head-controlled code execution. v5-1 splits into prep + harness jobs
with `permissions: {}` on the harness job to eliminate this surface.
TODOs flagged in source:
- `timeout-minutes: 30` is a placeholder; AC-V5-0-2 cold-build
measurement is required before final lock-in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
actions/checkout's submodule-foreach cleanup pass aborts with exit 128 on this repo because of orphan gitlinks under `.external/` that have no matching entries in any (missing) `.gitmodules` file. The base-sha checkout escapes this because it doesn't pass `persist-credentials: false` (the cleanup phase that runs `git submodule foreach` is gated on needing to scrub credentials). The PR-head checkout did set the flag for the v5-0 untrusted-context posture and hit the gitlinks/no-modules mismatch. Replace `actions/checkout@v6.0.2` for the PR-head step with a manual `git clone --no-tags --no-checkout --filter=blob:none` followed by a single-sha fetch and checkout. Strip the cached credential helper + rewrite the remote URL to drop the token afterwards. Net posture is equivalent to `persist-credentials: false` and `.git/` is excluded from the docker build context by `.dockerignore`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…xtraction
Build failure on fork firetest run 25684557942: playwright:v1.58.2-jammy ships Node 24.13.0 (not 22.22.1), so the conditional Node re-install always fires; but the base image is missing `xz-utils`, so `tar -xJf` on the .tar.xz tarball aborts with "xz: Cannot exec: No such file or directory". Add an apt-get install of xz-utils + ca-certificates inside the same RUN block, gated on the same version-mismatch conditional so a future Playwright base that already ships Node 22.22.1 skips the apt fetch entirely (resolves the apt-vs-probe trade-off in OQ-V5-0-E).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build failure on fork firetest PR #3 run 25685301642: playwright base image ships `pwuser` at UID/GID 1000, so the unconditional `groupadd --gid 1000` aborts with "GID '1000' already exists". Guard the group and user creates with getent/id probes so the layer is idempotent across base-image variants that may or may not ship the user. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_BIN
Yarn 4 treats any YARN_<KEY> env var as a config setting, so 'YARN_BIN' was being parsed as a 'bin' config key and rejected with 'Unrecognized or legacy configuration settings found: bin'. Rename the variable to HARNESS_YARN_BIN throughout the Dockerfile and matching docs.
…al target
scripts/package.json depends on eslint-plugin-local-rules via portal:
specifier. Yarn install fails ('Manifest not found') unless the portal
target directory is present in the build context. Add a COPY for the
whole scripts/eslint-plugin-local-rules/ directory in stage 2 so yarn
can resolve the portal manifest and 'yarn lint' has the rule files at
runtime.
WORKDIR /opt/verify-harness/repo runs as root, leaving the directory root-owned even though COPY --chown=1000:1000 sets file ownership. Yarn 4 runs as uid 1000 (USER 1000:1000) and fails the link step with EACCES while creating node_modules/. Add an explicit chown of the workdir before the USER switch so yarn can create node_modules and persist the link tree.
… task
scripts/utils/cli-step.ts resolves dist/bin/index.js for both cli-storybook and create-storybook at module-eval time. Stage 3 only compiled 'core', so the sandbox task failed with MODULE_NOT_FOUND. Expand the nx target list to compile core + cli-storybook + create-storybook before the sandbox bootstrap step runs.
…torybook
The nx project name in code/lib/cli-storybook/project.json is 'cli' (package name '@storybook/cli'), not 'cli-storybook'. The previous nx run-many list silently dropped the unknown target, so the cli package was never compiled and the sandbox task still failed with MODULE_NOT_FOUND for cli-storybook/dist/bin/index.js.
… index.js
code/core/dist has no top-level index.js; the bundle is split into per-entry-point subdirectories (preview-api/, manager-api/, etc.) plus the bin script at dist/bin/dispatcher.js (declared in core package.json#bin). Update the sandbox dist sanity check to verify the dispatcher bin file instead.
…overlay
'yarn task sandbox' runs run-registry -> publish, which packs and republishes packages through verdaccio. The publish step can churn code/core/dist, so by the time stage 3.d tries to cp it the directory no longer exists. Re-run 'nx compile core' immediately before the cp to guarantee the freshly-built artifact from the PR head is in place, and fail loudly with a directory listing if it is somehow still missing.
…caffolding
Eleven v5-0 firetest rounds confirmed the Dockerfile architecture is asymmetrically over-engineered: ~70% of the complexity (digest pins, harden-build-context overlay, lifecycle-script stripping, Verdaccio publish pipeline, BuildKit cache scope, smoke-test sentinel) addresses supply-chain threats that `enableScripts: false` + lockfile + `.npmrc` purge already mitigate. Runtime isolation, the threat the doc actually calls out for CI/CD, was weakly addressed (no `--cap-drop ALL` / `--network=none` / `--read-only` / `--tmpfs` in places it would matter). BuildKit also proved fragile: `code/core/dist` kept disappearing between stages 6 and 7 despite multiple recompile attempts.
v6 drops the container and accepts the same isolation profile as the existing Storybook PR CI (ephemeral GitHub Actions runner). The recipe author step keeps `ANTHROPIC_API_KEY` scope-limited to one step; the committed-spec runner remains the lethal-trifecta breaker.
If the threat model later expands to processing third-party PRs at scale with adversarial recipe authors, sandbox-runtime (bubblewrap on Linux) wraps just the playwright step in ~10 lines of config per Anthropic's "Securely deploying AI agents" doc. Do not reintroduce the full Docker + Verdaccio stack.
Deletes:
- scripts/verify/Dockerfile
- scripts/verify/harden-build-context.sh
- scripts/verify/strip-lifecycle-scripts.mjs
- scripts/verify/SECURITY.md
- scripts/verify/__tests__/dockerfile-lint.test.ts
- scripts/verify/__tests__/head-sha-assertion.test.ts
- scripts/verify/__tests__/in-container-shortcircuit.test.ts
Strips v5-0 additions from .dockerignore (preserves the pre-existing sensitive-path entries) and removes the Dockerfile-pin rules from renovate.json.
The workflow yml + verify-pr.ts still reference the container path; both get rewritten in the subsequent v6 commits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ispatch
Rewrites scripts/verify-pr.ts around two execution targets selected per recipe via a `// @verify-target:` header comment (scanned in the first 30 lines of the spec):
internal-ui (default)
  Builds code/storybook-static once via `yarn storybook:ui:build`, then serves it on the requested port via `yarn http-server`. The fast path for fixes that exercise the monorepo's own UI against the PR head's compiled packages: no sandbox bootstrap, no verdaccio publish, no docker.
sandbox:<template>
  Pre-existing sandbox flow: snapshotSandbox, sanitizeResolutions, syncCorePackage (symlink code/core/dist into the sandbox), then bootStorybook. Use only when a fix is template-specific (rare).
Also:
- Adds a positional <PR#> argument so `yarn verify-pr 34762` resolves to `.verify-recipes/pr-34762.spec.ts` automatically. The explicit `--recipe-spec <path>` flag still works and takes precedence.
- Drops every `VERIFY_HARNESS_IN_CONTAINER` short-circuit, the `/opt/verify-harness/HEAD_SHA` runtime assertion, and the `VERIFY_HARNESS_EXPECTED_HEAD_SHA` env-var contract. The container paths no longer exist.
- Drops the imageDigest / inContainer / headSha fields from VerifyResult writes and from writeRegressionResult's options. The fields stayed optional in the schema for backward-compat readers but are no longer populated.
- Widens VerifyResult.template from the `'react-vite/default-ts'` literal to `string` so the field can carry `'internal-ui'` and arbitrary sandbox templates.
- Switches the root `verify-pr` script from `bun scripts/verify-pr.ts` to `node ./scripts/verify-pr.ts`. verify-pr.ts no longer imports any of the non-erasable enum chain from cli-storybook, so Node 22.22.1's native TS strip is sufficient. The Playwright runner still spawns `bun x playwright test` internally because the recipe specs live under .verify-recipes/ and load through Playwright's own worker process, not through verify-pr.ts.
- --resync now only applies to sandbox-target recipes (the internal-ui build is fast enough that --resync would add no value); the script exits with an actionable error if --resync is passed for an internal-ui recipe.
- New: scripts/verify/target.ts (header parser, default = internal-ui).
- New: scripts/verify/internal-ui.ts (storybook:ui:build + http-server boot, waitOn :port/index.html).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
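The `// @verify-target:` header convention (scanned in the first 30 lines, defaulting to internal-ui) lends itself to a small parser. The sketch below illustrates that convention only; the names `parseVerifyTarget` and `VerifyTarget`, the exact regular expression, and the decision to throw on an unrecognized value are assumptions, not the contents of scripts/verify/target.ts.

```typescript
// Hypothetical sketch of a `// @verify-target:` header parser.
// From the commit: scan the first 30 lines; default to internal-ui.
type VerifyTarget =
  | { kind: 'internal-ui' }
  | { kind: 'sandbox'; template: string };

const HEADER_SCAN_LINES = 30;
const SANDBOX_PREFIX = 'sandbox:';

function parseVerifyTarget(specSource: string): VerifyTarget {
  const head = specSource.split(/\r?\n/).slice(0, HEADER_SCAN_LINES);
  for (const line of head) {
    const match = /^\s*\/\/\s*@verify-target:\s*(\S+)\s*$/.exec(line);
    if (!match) continue;
    const value = match[1] ?? '';
    if (value === 'internal-ui') return { kind: 'internal-ui' };
    if (value.startsWith(SANDBOX_PREFIX) && value.length > SANDBOX_PREFIX.length) {
      return { kind: 'sandbox', template: value.slice(SANDBOX_PREFIX.length) };
    }
    // Assumption: an unknown value is an error rather than a silent fallback.
    throw new Error(`Unrecognized @verify-target value: ${value}`);
  }
  // No header within the scan window: use the default target.
  return { kind: 'internal-ui' };
}
```

A spec starting with `// @verify-target: sandbox:react-vite/default-ts` would select the sandbox flow for that template, while a spec with no header, or with the header below line 30, would fall through to internal-ui.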
Replaces the v5-0 Docker pipeline (harden build context, buildx, image
build, smoke test, docker run with --network=none / --cap-drop ALL /
--read-only / tmpfs, docker cp mirror) with a single shell step that
runs the harness directly on the GitHub Actions runner.
Verify step (only when `verify-spec-precheck` reports the committed spec
exists at the PR head):
set -euo pipefail
yarn install --immutable
yarn nx run-many -t compile -p core,cli,create-storybook
yarn verify-pr --recipe-spec ".verify-recipes/pr-${PR_NUMBER}.spec.ts"
The internal-ui target (default) builds code/storybook-static once and
serves it via http-server. Sandbox targets follow the pre-existing
snapshot + sanitize + sync + boot flow. The recipe header chooses.
New: a `Read verdict` step parses `pr-head/.verify-output/*/verify-result.json`
and an `Apply verified-by-harness label` step adds the label to the PR
when the verdict is `verified`. The PR comment script renders the same
two-state not-applicable message and a verified-vs-regression block,
but drops the imageDigest reference (no longer populated by v6).
Permissions: pull-requests + issues (label add) + statuses. The
ANTHROPIC_API_KEY remains scoped to the `Author recipe` step only;
nothing downstream of that step ever sees the secret. The committed
spec under review is still the lethal-trifecta breaker — the runner
executes only what was committed and reviewed at the PR head.
Workflow surface dropped:
- Harden build context (./scripts/verify/harden-build-context.sh)
- Set up Docker Buildx
- Build harness image (docker/build-push-action + cache-to/from)
- Capture image digest
- Smoke test image
- Run harness in container
- Mirror tmpfs output to runner workspace
Net delta: -78 lines, +33 lines (-45 net) on verify-pr.yml.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites the harness documentation around the v6 local-first
architecture. The Docker / Verdaccio / image-build-provenance sections
are dropped; new sections cover the per-recipe target dispatch
(`internal-ui` vs `sandbox:<template>`) and the runner-native CI
workflow.
scripts/verify/README.md
* Architecture diagram updated to list target.ts + internal-ui.ts.
* Flag table adds the positional <PR#> sugar and clarifies that
`--resync` and `--restore-sandbox` only apply to sandbox-target
recipes.
* `verify-result.json` example uses `template: "internal-ui"`.
* Prerequisites section calls out Node 22 (native TS-strip) as the
primary runtime; Bun is needed only by the Playwright runner.
* Side-effects section narrows to the sandbox target.
* CI section documents the new yaml shape.
* Drops the "Running inside the verify-harness container" section
in its entirety.
scripts/verify/RUNBOOK.md
* Full rewrite around the two flows: local AI fix-loop +
GH Actions runner. Drops every Docker / buildx / harden-script /
smoke-sentinel signal. Adds signals specific to v6:
bootInternalUi timeout, --resync rejected, sandbox bootstrap
missing, not-applicable verdict, label-step skipped, github-script
verdict-read failure.
scripts/verify/SECURITY.md (recreated)
* Brief threat-model note (~70 lines instead of 250+). Restates the
eight load-bearing controls (committed-spec review, scoped API
key, deny-regex, lint gate, provenance header, actor permission,
label gate, repo-wide deny rules) and explains why v6 dropped the
container without weakening the trifecta breakers. Notes the
sandbox-runtime path as the next-step option if the threat model
expands.
.verify-recipes/_recipe-authoring-guide.md
* New §12 "Target selection (v6)" documenting the
`// @verify-target:` header convention. Renumbers existing §12
"Output budget" to §13.
.verify-recipes/example-smoke.spec.ts
* Adds the explicit `// @verify-target: internal-ui` header as the
canonical baseline example.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…step
The Verify PR step runs 'yarn nx run-many -t compile -p core,cli,create-storybook' inside the pr-head/ checkout. When the workflow runs on a fork without Nx Cloud org access (e.g. the v6 firetest on valentinpalkovic/storybook), nx aborts with:
NX Cloud: Workspace is unable to be authorized. Exiting run.
This Nx Cloud organization is disabled.
The Verify step doesn't need distributed cache for correctness: a clean compile against the PR head is the whole point. Force-disable Nx Cloud (NX_NO_CLOUD=true + empty access token) on the step's env block.
Upstream storybookjs/storybook CI is unaffected: other workflow steps (Generate bundle, Author recipe) that already rely on Nx Cloud auth continue to use it; only the Verify step opts out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ternal-ui
The Verify PR step previously ran:
yarn nx run-many -t compile -p core,cli,create-storybook
This is sufficient for sandbox-target recipes (the sandbox already has
its own node_modules) but not for the internal-ui target. The
internal-ui build invokes 'yarn storybook:ui:build' in code/, which
loads code/.storybook/main.ts, which imports @storybook/react-vite
plus every addon (addon-onboarding, addon-themes, addon-docs,
addon-designs, addon-vitest, addon-a11y, addon-mcp) and transitive
renderer + builder packages. None of those are compiled by the
narrower filter, so .storybook/main.ts evaluation fails with
ERR_MODULE_NOT_FOUND.
Drop the project filter and let nx compile every project. Slower per
run but correct for the default target. Sandbox-target recipes are
unaffected — the same compile output is reused under the
syncCorePackage symlink path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r eslint-plugin topo
eslint-plugin's prebuild script (code/lib/eslint-plugin/scripts/...) imports from 'storybook/csf' via jiti. nx run-many parallelises 42 projects without enforcing the compile-order edge (the repo lacks an explicit dependsOn: ['^compile'] for that target), so eslint-plugin runs before core finishes and the import resolves upward through the parent base checkout's node_modules/storybook symlink → which has no dist/csf yet → ERR_MODULE_NOT_FOUND.
Compile core first explicitly, then run-many for everything else. nx caches the core build so the second pass is a no-op for that target.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verify Harness comments:
- Verdict with vision-check evidence. Vision reasoning: Recipe produced no screenshots, so visible evidence cannot be verified. Replay: Screenshots
- Verdict with reason and compile output (last 4KB). Replay
- Verdict. Replay: Screenshots
- No verdict produced: the workflow failed before the harness ran (likely recipe-author dispatch, deny-regex, or lint). See run log for details.
Cherry-pick of upstream b779fb7 (storybookjs#34767) onto fork's next so the v6 single-round verify harness can author + run a recipe against it.
Reference: storybookjs#34767