Build: Agentic Workflow improvements#33985
Conversation
…mdc, capture-terminal-output.ts, exec.ts)
…s for builder-vite and builder-webpack5
…detection jobs, streamline terminal output checks
…seline snapshots for builder-webpack5
|
View your CI Pipeline Execution ↗ for commit 67d335f
☁️ Nx Cloud last updated this comment at |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Claude skill workflows and verification docs, terminal-output capture/normalization tooling with baseline snapshots, CI GitHub Actions to run terminal-output checks (including Playwright browser install), workspace/storybook stories and e2e smoke tests, and supporting script/exec changes to enable snapshot capture and comparison. Changes
Sequence Diagram(s)sequenceDiagram
participant CI as "GitHub Actions (PR)"
participant Repo as "Repo (checkout)"
participant Setup as "Setup & Build"
participant Sandbox as "Sandbox Generator"
participant Capture as "capture-terminal-output"
participant Normalize as "Normalizer"
participant Compare as "Snapshot Comparator"
participant Reporter as "CI Reporter"
CI->>Repo: checkout
CI->>Setup: install deps + install Playwright browsers
Setup->>Setup: compile packages
CI->>Sandbox: generate sandboxes (react-vite, react-webpack)
CI->>Capture: invoke per-builder capture (build/dev)
Capture->>Sandbox: run yarn storybook / build-storybook
Sandbox->>Capture: stream terminal output
Capture->>Normalize: send raw output
Normalize->>Compare: provide normalized output
Compare->>Compare: load baseline snapshot
alt match
Compare->>Reporter: pass
else mismatch
Compare->>Reporter: fail + unified diff
end
Reporter->>CI: exit with status
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (7)
.claude/skills/renderer-bug-workflow/SKILL.md (1)
53-57: Use a safer port cleanup command beforekill -9.At Line 55, leading with SIGKILL is unnecessarily aggressive. Prefer graceful termination first, then escalate only if needed.
Proposed command update
- lsof -ti :6006 | xargs kill -9 + lsof -ti :6006 | xargs kill + # if a process does not exit, then force kill: + lsof -ti :6006 | xargs kill -9🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/renderer-bug-workflow/SKILL.md around lines 53 - 57, Update the "Known failure — port in use" guidance to avoid starting with SIGKILL: replace the current one-line command that uses kill -9 with a two-step safer sequence that first attempts a graceful SIGTERM (kill -15) against the pid(s) found by lsof (or equivalent) and only escalates to SIGKILL (kill -9) if the process remains; ensure the commands use xargs safely (e.g., handle empty input) and mention retrying the dev server after successful termination; reference the text block titled "Known failure — port in use" and the existing lsof | xargs kill usage when making this change.scripts/terminal-output-snapshots/builder-vite-build.snap.txt (1)
16-277: Consider reducing snapshot volatility by filtering per-asset listings.This baseline captures a very large chunk/file inventory, which is likely to churn on unrelated changes and generate noisy failures. Keeping phase markers, warnings, and completion lines would preserve intent with better signal-to-noise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/terminal-output-snapshots/builder-vite-build.snap.txt` around lines 16 - 277, The snapshot is too noisy because it records every per-asset listing (lines containing "Vite storybook-static/assets/" and large entries like "Vite storybook-static/assets/iframe-<HASH>.js"), causing churn; update the snapshot generation or test that produces builder-vite-build.snap.txt to filter or collapse per-asset file entries (keep only phase markers/warnings/completion lines and aggregate asset counts or sizes) so snapshots only assert high-level output rather than each asset path, e.g., replace detailed asset listings with a single summary line or whitelist only important filenames (retain "Vite storybook-static", "iframe-<HASH>.js" and top-level warnings/completion text).storybook.code-workspace (1)
24-28: Pinchrome-devtools-mcpto a concrete version instead of@latest.At Line 27,
@latestmakes agent tooling behavior non-reproducible and can introduce sudden breakage. Recent releases (e.g., v0.18.0) include behavior-affecting changes like "move emulation settings to context" that could unexpectedly alter tool functionality. Pin to a stable version like0.18.1and update intentionally.Proposed change
- "chrome-devtools-mcp@latest" + "chrome-devtools-mcp@0.18.1"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@storybook.code-workspace` around lines 24 - 28, Replace the floating tag "chrome-devtools-mcp@latest" in the args array (the npx command invocation) with a concrete version to ensure reproducible tooling, e.g., "chrome-devtools-mcp@0.18.1"; locate the args entry that currently reads "chrome-devtools-mcp@latest" and update it to the chosen fixed version so future runs use the pinned release instead of `@latest`..github/workflows/copilot-verification.yml (2)
64-66: Missing newline at end of file.Add a newline at the end of the file for POSIX compliance.
Add trailing newline
if [ $failed -eq 1 ]; then exit 1 fi +🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/copilot-verification.yml around lines 64 - 66, The file ends without a trailing newline; add a single newline character at the end of the file so the last line ("if [ $failed -eq 1 ]; then" / "exit 1" / "fi") is terminated by a newline to satisfy POSIX and linters—simply ensure the final line is followed by one newline character.
3-5: Consider adding path filters to reduce unnecessary CI runs.This workflow runs on every PR, including those that don't affect builder code or terminal output. Consider adding path filters to run only when relevant files change.
Example path filter configuration
on: pull_request: types: [opened, synchronize, reopened] + paths: + - 'code/builders/**' + - 'scripts/capture-terminal-output.ts' + - 'scripts/terminal-output-snapshots/**' + - '.github/workflows/copilot-verification.yml'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/copilot-verification.yml around lines 3 - 5, The pull_request trigger currently fires for all PRs (see the on: pull_request: types: [opened, synchronize, reopened] block); add path filters (paths and/or paths-ignore) to that pull_request trigger so the workflow only runs when relevant files change (e.g., builder, CI, or terminal/output-related files); update the on: pull_request block to include appropriate path patterns to limit runs and keep the existing event types unchanged.code/renderers/react/template/stories/visual-render-verification.stories.tsx (1)
26-29: Consider usingsatisfiesfor better type safety.Using
as Meta<...>allows type coercion, whilesatisfiesprovides type checking without widening. This is a minor stylistic preference.Alternative using satisfies
export default { component: VisualRenderVerification, parameters: { chromatic: { disableSnapshot: true } }, -} as Meta<typeof VisualRenderVerification>; +} satisfies Meta<typeof VisualRenderVerification>;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@code/renderers/react/template/stories/visual-render-verification.stories.tsx` around lines 26 - 29, The default export uses "as Meta<typeof VisualRenderVerification>" which allows coercion—replace the type assertion with the "satisfies" operator to retain type checking without widening: change the export default object so it ends with " } satisfies Meta<typeof VisualRenderVerification>;" (referencing the exported object, VisualRenderVerification and Meta) and ensure your TS target supports "satisfies"; no runtime behavior changes needed, only the type annotation.scripts/capture-terminal-output.ts (1)
80-141: Diff function outputs all content as context instead of limiting to surrounding lines.The
CONTEXT_LINESconstant is defined but never used to limit context output. The function currently outputs the entire content as context lines rather than showing only the surrounding context around changes (typically 3 lines before/after). This could result in very large diff outputs for files with small changes.Additionally, line 91 filters out empty lines (
filter((line) => line.length > 0)), which loses blank line information and could make the diff output misleading when comparing files with significant whitespace.Consider limiting context output
- const lines = text.split('\n').filter((line) => line.length > 0); + const lines = text.split('\n'); + // Remove only the trailing empty string from split if text ends with newline + if (lines.length > 0 && lines[lines.length - 1] === '') { + lines.pop(); + }For a proper unified diff with limited context, consider using a dedicated diff library like
diff(from npm) which has built-in support for unified diff format with configurable context lines.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/capture-terminal-output.ts` around lines 80 - 141, The createUnifiedDiff currently treats all equal lines as context (and strips blank lines) instead of limiting to CONTEXT_LINES; fix it by (1) removing the .filter((line) => line.length > 0) so blank lines are preserved, (2) change how contextBuffer is used: make it a sliding window that only retains the last CONTEXT_LINES of {type:0,line} and when you detect the first non-equal change (hunkStart set in createUnifiedDiff) flush the preceding up-to-CONTEXT_LINES context into the output for that hunk, then collect following changed lines and up-to-CONTEXT_LINES trailing equal lines after the change before closing the hunk, and (3) recompute hunkStart, beforeIndex and afterIndex counts based on the trimmed context and the actual lines emitted so @@ -start,count +start,count reflects the trimmed context; use the existing symbols (CONTEXT_LINES, contextBuffer, hunkStart, beforeIndex, afterIndex, diffLines) to locate where to implement this logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/fix-bug/SKILL.md:
- Around line 418-430: The checklist contains contradictory guidance about
force-pushing: replace the two conflicting items with an explicit policy and
instructions. Update the checklist text so it reads that contributors should not
force-push for routine fixes, but when resolving merge conflicts during a rebase
(the block that currently shows the git fetch/rebase/resolve sequence) they may
rebase and push using git push --force-with-lease; add a short note requiring a
comment in the PR or approval from a maintainer before performing the
force-with-lease push to make the exception explicit.
- Around line 27-43: The fenced code blocks that begin with "Step 1: Fetch &
Understand Issue" (and other similar untyped fences) are missing language
identifiers and trigger markdownlint MD040; update each opening triple-backtick
for these blocks to include a language token (e.g., ```text) so the fences are
typed, ensuring all occurrences in SKILL.md use consistent language identifiers
and re-run linting to confirm MD040 is resolved.
In @.claude/skills/manager-bug-workflow/SKILL.md:
- Around line 44-46: The markdown code fence containing the line
"{title-prefix}-{story-title}--{export-name}" is missing a language identifier
(MD040); update the opening fence from ``` to include a language tag (for
example change it to ```text) so the block becomes a fenced code block with a
language identifier for that exact content.
In `@CLAUDE.md`:
- Around line 5-23: The fenced code block in CLAUDE.md is unlabeled (triple
backticks around the repository tree) which triggers markdownlint MD040; update
the opening fence to include a language specifier such as "text" (change ``` to
```text) for the block that contains the repository layout so the code fence is
labeled while keeping the closing ``` unchanged and preserving the content of
the tree.
In `@scripts/utils/exec.ts`:
- Around line 52-57: The loop over the command array incorrectly detects the
last subcommand by comparing subcommand text, so duplicate command strings can
trigger an early return; update the loop in the exec utility to detect the final
element by position (e.g., use an indexed for loop or track the index) and only
perform the captureOutput return when the current index === command.length - 1,
keeping references to currentChild, logger, execa, captureOutput and options
intact.
---
Nitpick comments:
In @.claude/skills/renderer-bug-workflow/SKILL.md:
- Around line 53-57: Update the "Known failure — port in use" guidance to avoid
starting with SIGKILL: replace the current one-line command that uses kill -9
with a two-step safer sequence that first attempts a graceful SIGTERM (kill -15)
against the pid(s) found by lsof (or equivalent) and only escalates to SIGKILL
(kill -9) if the process remains; ensure the commands use xargs safely (e.g.,
handle empty input) and mention retrying the dev server after successful
termination; reference the text block titled "Known failure — port in use" and
the existing lsof | xargs kill usage when making this change.
In @.github/workflows/copilot-verification.yml:
- Around line 64-66: The file ends without a trailing newline; add a single
newline character at the end of the file so the last line ("if [ $failed -eq 1
]; then" / "exit 1" / "fi") is terminated by a newline to satisfy POSIX and
linters—simply ensure the final line is followed by one newline character.
- Around line 3-5: The pull_request trigger currently fires for all PRs (see the
on: pull_request: types: [opened, synchronize, reopened] block); add path
filters (paths and/or paths-ignore) to that pull_request trigger so the workflow
only runs when relevant files change (e.g., builder, CI, or
terminal/output-related files); update the on: pull_request block to include
appropriate path patterns to limit runs and keep the existing event types
unchanged.
In
`@code/renderers/react/template/stories/visual-render-verification.stories.tsx`:
- Around line 26-29: The default export uses "as Meta<typeof
VisualRenderVerification>" which allows coercion—replace the type assertion with
the "satisfies" operator to retain type checking without widening: change the
export default object so it ends with " } satisfies Meta<typeof
VisualRenderVerification>;" (referencing the exported object,
VisualRenderVerification and Meta) and ensure your TS target supports
"satisfies"; no runtime behavior changes needed, only the type annotation.
In `@scripts/capture-terminal-output.ts`:
- Around line 80-141: The createUnifiedDiff currently treats all equal lines as
context (and strips blank lines) instead of limiting to CONTEXT_LINES; fix it by
(1) removing the .filter((line) => line.length > 0) so blank lines are
preserved, (2) change how contextBuffer is used: make it a sliding window that
only retains the last CONTEXT_LINES of {type:0,line} and when you detect the
first non-equal change (hunkStart set in createUnifiedDiff) flush the preceding
up-to-CONTEXT_LINES context into the output for that hunk, then collect
following changed lines and up-to-CONTEXT_LINES trailing equal lines after the
change before closing the hunk, and (3) recompute hunkStart, beforeIndex and
afterIndex counts based on the trimmed context and the actual lines emitted so
@@ -start,count +start,count reflects the trimmed context; use the existing
symbols (CONTEXT_LINES, contextBuffer, hunkStart, beforeIndex, afterIndex,
diffLines) to locate where to implement this logic.
In `@scripts/terminal-output-snapshots/builder-vite-build.snap.txt`:
- Around line 16-277: The snapshot is too noisy because it records every
per-asset listing (lines containing "Vite storybook-static/assets/" and large
entries like "Vite storybook-static/assets/iframe-<HASH>.js"), causing churn;
update the snapshot generation or test that produces builder-vite-build.snap.txt
to filter or collapse per-asset file entries (keep only phase
markers/warnings/completion lines and aggregate asset counts or sizes) so
snapshots only assert high-level output rather than each asset path, e.g.,
replace detailed asset listings with a single summary line or whitelist only
important filenames (retain "Vite storybook-static", "iframe-<HASH>.js" and
top-level warnings/completion text).
In `@storybook.code-workspace`:
- Around line 24-28: Replace the floating tag "chrome-devtools-mcp@latest" in
the args array (the npx command invocation) with a concrete version to ensure
reproducible tooling, e.g., "chrome-devtools-mcp@0.18.1"; locate the args entry
that currently reads "chrome-devtools-mcp@latest" and update it to the chosen
fixed version so future runs use the pinned release instead of `@latest`.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
.claude/skills/builder-bug-workflow/SKILL.md.claude/skills/fix-bug/SKILL.md.claude/skills/manager-bug-workflow/SKILL.md.claude/skills/renderer-bug-workflow/SKILL.md.claude/skills/verification-checklist/SKILL.md.github/copilot-instructions.md.github/workflows/copilot-setup-steps.yml.github/workflows/copilot-verification.yml.gitignoreCLAUDE.mdcode/e2e-tests/manager.spec.tscode/renderers/react/template/stories/copilot-verification-example.stories.tsxcode/renderers/react/template/stories/visual-render-verification.stories.tsxpackage.jsonscripts/capture-terminal-output.tsscripts/package.jsonscripts/terminal-output-snapshots/README.mdscripts/terminal-output-snapshots/builder-vite-build.snap.txtscripts/terminal-output-snapshots/builder-webpack5-build.snap.txtscripts/utils/exec.tsstorybook.code-workspace
💤 Files with no reviewable changes (1)
- .gitignore
| - [ ] Address any feedback in new commits (do NOT force-push) | ||
| - [ ] Re-run verification if requested | ||
| - [ ] Respond to questions about the fix | ||
|
|
||
| **If merge conflicts occur**: | ||
|
|
||
| ```bash | ||
| git fetch origin | ||
| git rebase origin/next | ||
| # Resolve conflicts | ||
| cd code && yarn test | ||
| git push --force-with-lease origin agent/fix-issue-$ARGUMENTS[0] | ||
| ``` |
There was a problem hiding this comment.
Resolve contradictory force-push guidance.
Line 418 says “do NOT force-push,” but Line 429 instructs --force-with-lease. Please make the policy explicit so contributors know when force-push is allowed.
Suggested fix
-- [ ] Address any feedback in new commits (do NOT force-push)
+- [ ] Address any feedback in new commits (avoid force-push except when required after a local rebase conflict resolution)
git fetch origin
git rebase origin/next
# Resolve conflicts
cd code && yarn test
-git push --force-with-lease origin agent/fix-issue-$ARGUMENTS[0]
+git push --force-with-lease origin agent/fix-issue-$ARGUMENTS[0] # allowed here after conflict-resolution rebase🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/fix-bug/SKILL.md around lines 418 - 430, The checklist
contains contradictory guidance about force-pushing: replace the two conflicting
items with an explicit policy and instructions. Update the checklist text so it
reads that contributors should not force-push for routine fixes, but when
resolving merge conflicts during a rebase (the block that currently shows the
git fetch/rebase/resolve sequence) they may rebase and push using git push
--force-with-lease; add a short note requiring a comment in the PR or approval
from a maintainer before performing the force-with-lease push to make the
exception explicit.
| ``` | ||
| {title-prefix}-{story-title}--{export-name} | ||
| ``` |
There was a problem hiding this comment.
Add a language identifier to the fenced block (MD040).
The code fence at Line 44 is missing a language tag and will trigger markdownlint.
Proposed fix
-```
+```text
{title-prefix}-{story-title}--{export-name}</details>
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 44-44: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/manager-bug-workflow/SKILL.md around lines 44 - 46, The
markdown code fence containing the line
"{title-prefix}-{story-title}--{export-name}" is missing a language identifier
(MD040); update the opening fence from ``` to include a language tag (for
example change it to ```text) so the block becomes a fenced code block with a
language identifier for that exact content.
| for (const subcommand of command) { | ||
| logger.debug(`> ${subcommand}`); | ||
| currentChild = execa(subcommand, { ...defaultOptions, ...options }); | ||
| await currentChild; | ||
| const result = await currentChild; | ||
| if (captureOutput && subcommand === command[command.length - 1]) { | ||
| return result.all ?? ''; |
There was a problem hiding this comment.
Fix last-subcommand detection to avoid early return on duplicate commands.
Line 56 compares command text instead of position. If a subcommand string repeats, this can return too early and skip remaining commands.
Suggested fix
- for (const subcommand of command) {
+ for (const [index, subcommand] of command.entries()) {
logger.debug(`> ${subcommand}`);
currentChild = execa(subcommand, { ...defaultOptions, ...options });
const result = await currentChild;
- if (captureOutput && subcommand === command[command.length - 1]) {
+ if (captureOutput && index === command.length - 1) {
return result.all ?? '';
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for (const subcommand of command) { | |
| logger.debug(`> ${subcommand}`); | |
| currentChild = execa(subcommand, { ...defaultOptions, ...options }); | |
| await currentChild; | |
| const result = await currentChild; | |
| if (captureOutput && subcommand === command[command.length - 1]) { | |
| return result.all ?? ''; | |
| for (const [index, subcommand] of command.entries()) { | |
| logger.debug(`> ${subcommand}`); | |
| currentChild = execa(subcommand, { ...defaultOptions, ...options }); | |
| const result = await currentChild; | |
| if (captureOutput && index === command.length - 1) { | |
| return result.all ?? ''; | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/utils/exec.ts` around lines 52 - 57, The loop over the command array
incorrectly detects the last subcommand by comparing subcommand text, so
duplicate command strings can trigger an early return; update the loop in the
exec utility to detect the final element by position (e.g., use an indexed for
loop or track the index) and only perform the captureOutput return when the
current index === command.length - 1, keeping references to currentChild,
logger, execa, captureOutput and options intact.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/builder-bug-workflow/SKILL.md:
- Line 3: Update the inconsistent flow numbering in the workflow description:
find all occurrences of "Flow 1", "Flow 2", and "Flow 3" in the SKILL.md
workflow text (the description line and the paragraphs that label builder
verification flows) and make them consistent so that the builder verification
flows referenced at the top (frontend output vs terminal output) match the later
flow labels (e.g., change the later "Flow 1" and "Flow 2" back to "Flow 2" and
"Flow 3" or vice versa), ensuring each flow name used in the sentence fragments
and headings is identical across the file; update any cross-references/snapshots
or sandbox requirements that mention the old numbering to the corrected numbers.
In @.claude/skills/fix-bug/SKILL.md:
- Line 30: The document has inconsistent step numbering (the overview lists
"Step 2: Determine Verification Flow" but the planning section and the later
verification-flow section use different step numbers), so pick the correct
canonical order (e.g., keep "Determine Verification Flow" as Step 2) and update
every occurrence to match: change the header and any inline references in the
overview, the planning section (currently labeled as Step 2 at Line 69), the
reintroduced verification-flow header (currently Step 4 at Line 198), and the
later reference (Line 248) so all mentions of these steps and the Table of
Contents use the same step numbers and wording (search for the literal tokens
"Step 2" and "Step 4" and the section title "Determine Verification Flow" to
ensure consistency).
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.claude/skills/builder-bug-workflow/SKILL.md.claude/skills/fix-bug/SKILL.mdCLAUDE.md
| @@ -0,0 +1,150 @@ | |||
| --- | |||
| name: builder-bug-workflow | |||
| description: Complete workflows for verifying builder bug fixes in code/builders/**. Includes Flow 2 (frontend output) and Flow 3 (terminal output) with sandbox requirements, hash normalization, and snapshot management. | |||
There was a problem hiding this comment.
Fix inconsistent flow numbering in this workflow doc.
Lines 3 and 8 define builder verification as Flow 2 (frontend) and Flow 3 (terminal), but Lines 17 and 79 rename them to Flow 1 and Flow 2. This can misroute verification steps; keep numbering aligned everywhere.
Proposed doc fix
-# Flow 1 — Builder Bug Verification (Frontend Output)
+# Flow 2 — Builder Bug Verification (Frontend Output)
-# Flow 2 — Builder Bug Verification (Terminal Output)
+# Flow 3 — Builder Bug Verification (Terminal Output)Also applies to: 8-8, 17-17, 79-79
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/builder-bug-workflow/SKILL.md at line 3, Update the
inconsistent flow numbering in the workflow description: find all occurrences of
"Flow 1", "Flow 2", and "Flow 3" in the SKILL.md workflow text (the description
line and the paragraphs that label builder verification flows) and make them
consistent so that the builder verification flows referenced at the top
(frontend output vs terminal output) match the later flow labels (e.g., change
the later "Flow 1" and "Flow 2" back to "Flow 2" and "Flow 3" or vice versa),
ensuring each flow name used in the sentence fragments and headings is identical
across the file; update any cross-references/snapshots or sandbox requirements
that mention the old numbering to the corrected numbers.
Package BenchmarksCommit: No significant changes detected, all good. 👏 |
3d59ac4 to
ca406a5
Compare
- Added detailed decision-making steps for baseline creation and updates in builder-bug-workflow. - Revised fix-bug workflow to clarify steps for planning, implementing, and verifying bug fixes. - Introduced implement-and-verify-fix skill to streamline the implementation and verification process. - Updated manager-bug-workflow to include comprehensive evidence quality checklists. - Created open-pull-request skill to automate PR creation with flow-specific templates and required labels. - Established plan-bug-fix skill to facilitate thorough issue understanding and fix planning. - Improved renderer-bug-workflow with enhanced evidence quality criteria for screenshots. - Refined CLAUDE.md to clarify linting and formatting processes.
ca406a5 to
d962d1a
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (3)
.claude/skills/plan-bug-fix/SKILL.md (1)
27-37: Consider adding language identifiers to template/diagram fences.Lines 27, 57, 76, and 129 have unlabeled fences containing workflow diagrams and templates. Adding
textidentifiers resolves markdownlint MD040.Example fix for line 27
-``` +```text Step 1: Fetch & Understand Issue ↓Also applies to: 57-62, 76-95, 129-161
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/plan-bug-fix/SKILL.md around lines 27 - 37, Add a language identifier (use "text") to each unlabeled fenced code block that contains workflow diagrams or templates in SKILL.md (e.g., the fence that begins with "Step 1: Fetch & Understand Issue", the block covering the checkpoint "Confirm routing", the larger template blocks around Steps 3–4, and the template starting near "COMPLETE — Ready for implementation") so markdownlint MD040 is resolved; update each opening triple-backtick to ```text for those four diagram/template fences..claude/skills/open-pull-request/SKILL.md (1)
33-40: Add language identifier to workflow overview fence.Line 33 has an unlabeled fence for the workflow diagram. Adding
textresolves markdownlint MD040.Suggested fix
-``` +```text Step 1: Push Feature Branch ↓🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/open-pull-request/SKILL.md around lines 33 - 40, The fenced block containing the workflow diagram in SKILL.md is missing a language identifier (causing markdownlint MD040); update the opening fence for the diagram from ``` to ```text so the workflow overview fence is labeled, e.g., change the code fence that wraps the "Step 1: Push Feature Branch ... [DONE: PR created, labeled, and ready]" diagram to use ```text.CLAUDE.md (1)
29-32: Add language identifiers to bash code fences.Lines 29, 36, 42, 49, 65, and 78 have unlabeled fences for bash commands. Adding
bashidentifiers improves syntax highlighting and resolves markdownlint MD040.Example fix for line 29
-``` +```bash yarn nx run-many -t compile -c production yarn nx compile <package-name> -c productionApply the same pattern to the other bash code blocks. </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@CLAUDE.mdaround lines 29 - 32, Add the Bash language identifier to each
unlabeled code fence containing shell commands so markdownlint MD040 is
satisfied; specifically update the fences that wrap the blocks with the commands
"yarn nx run-many -t compile -c production" / "yarn nx compile -c
production" and the other unlabeled blocks that contain similar yarn/npm shell
examples by changing the opening "" to "bash" for each such block.</details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.claude/skills/open-pull-request/SKILL.md:
- Around line 33-40: The fenced block containing the workflow diagram in
SKILL.md is missing a language identifier (causing markdownlint MD040); update
the opening fence for the diagram fromtotext so the workflow overview
fence is labeled, e.g., change the code fence that wraps the "Step 1: Push
Feature Branch ... [DONE: PR created, labeled, and ready]" diagram to useIn @.claude/skills/plan-bug-fix/SKILL.md: - Around line 27-37: Add a language identifier (use "text") to each unlabeled fenced code block that contains workflow diagrams or templates in SKILL.md (e.g., the fence that begins with "Step 1: Fetch & Understand Issue", the block covering the checkpoint "Confirm routing", the larger template blocks around Steps 3–4, and the template starting near "COMPLETE — Ready for implementation") so markdownlint MD040 is resolved; update each opening triple-backtick to ```text for those four diagram/template fences. In `@CLAUDE.md`: - Around line 29-32: Add the Bash language identifier to each unlabeled code fence containing shell commands so markdownlint MD040 is satisfied; specifically update the fences that wrap the blocks with the commands "yarn nx run-many -t compile -c production" / "yarn nx compile <package-name> -c production" and the other unlabeled blocks that contain similar yarn/npm shell examples by changing the opening "```" to "```bash" for each such block.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.claude/skills/builder-bug-workflow/SKILL.md.claude/skills/fix-bug/SKILL.md.claude/skills/implement-and-verify-fix/SKILL.md.claude/skills/manager-bug-workflow/SKILL.md.claude/skills/open-pull-request/SKILL.md.claude/skills/plan-bug-fix/SKILL.md.claude/skills/renderer-bug-workflow/SKILL.mdCLAUDE.md
There was a problem hiding this comment.
♻️ Duplicate comments (1)
.claude/skills/fix-bug/SKILL.md (1)
29-39:⚠️ Potential issue | 🟡 MinorAdd language identifiers to all fenced code blocks.
markdownlintMD040 is still triggered by untyped fences (e.g., Line 29 and the other listed blocks). Please annotate each opening fence (textfor workflow/output examples,bashfor commands).Suggested patch
-``` +```text Step 1: Plan Bug Fix (understand, route, plan) ↓ [via /plan-bug-fix skill] Step 2: Implement and Verify Fix (code, test, verify) ↓ [MUST PASS: All tests, evidence gathered] Step 3: Prepare PR Description (with evidence) ↓ Step 4: Open PR (via /open-pull-request skill) ↓ [Handles: push, PR creation, labels] ✅ COMPLETE — PR ready for review@@
-+text
/plan-bug-fix [issue-number]@@ -``` +```text /plan-bug-fix 12345@@
-+text
/implement-and-verify-fix [issue-number]@@ -``` +```text /implement-and-verify-fix 12345@@
-+text
/open-pull-request [issue-number]@@ -``` +```text /open-pull-request 12345</details> Also applies to: 56-58, 62-64, 102-104, 108-110, 195-197, 201-203 <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In @.claude/skills/fix-bug/SKILL.md around lines 29 - 39, Update all fenced code
blocks in .claude/skills/fix-bug/SKILL.md to include explicit language
identifiers (e.g., usetext for workflow/output examples andbash for
shell commands) so markdownlint MD040 stops flagging untyped fences;
specifically add language tags to the top workflow block (the multi-line fence
around "Step 1: Plan Bug Fix..." and the individual command/example fences that
show/plan-bug-fix,/implement-and-verify-fix,/open-pull-requestand
their example invocations), and apply the same change to the other untyped
fences reported (the blocks near the other listed locations) so every opening
🧹 Nitpick comments (1)
.claude/skills/fix-bug/SKILL.md (1)
176-177: Narrow staging scope to avoid accidental unrelated docs commits.Line 176 stages the entire
.claude/skills/tree, which can unintentionally include unrelated edits. Prefer explicit file paths (orgit add -p) in this workflow doc.Suggested patch
- git add CLAUDE.md .claude/skills/ + # Stage only files you intentionally changed + git add CLAUDE.md .claude/skills/fix-bug/SKILL.md + # or use: git add -p git commit -m "Docs: Improve [skill-name] instructions based on workflow execution"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/fix-bug/SKILL.md around lines 176 - 177, Replace the broad staging command "git add CLAUDE.md .claude/skills/" with a narrowly-scoped add (e.g., "git add CLAUDE.md .claude/skills/<specific-file>.md" or use "git add -p" to interactively stage hunks) so unrelated files under .claude/skills/ are not included; update the workflow text that shows the commit command accordingly (keep the git commit -m line but ensure the preceding git add shows the explicit path or -p usage).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.claude/skills/fix-bug/SKILL.md:
- Around line 29-39: Update all fenced code blocks in
.claude/skills/fix-bug/SKILL.md to include explicit language identifiers (e.g.,
use ```text for workflow/output examples and ```bash for shell commands) so
markdownlint MD040 stops flagging untyped fences; specifically add language tags
to the top workflow block (the multi-line fence around "Step 1: Plan Bug Fix..."
and the individual command/example fences that show `/plan-bug-fix`,
`/implement-and-verify-fix`, `/open-pull-request` and their example
invocations), and apply the same change to the other untyped fences reported
(the blocks near the other listed locations) so every opening ``` has a language
identifier.
---
Nitpick comments:
In @.claude/skills/fix-bug/SKILL.md:
- Around line 176-177: Replace the broad staging command "git add CLAUDE.md
.claude/skills/" with a narrowly-scoped add (e.g., "git add CLAUDE.md
.claude/skills/<specific-file>.md" or use "git add -p" to interactively stage
hunks) so unrelated files under .claude/skills/ are not included; update the
workflow text that shows the commit command accordingly (keep the git commit -m
line but ensure the preceding git add shows the explicit path or -p usage).
b9c0c21 to
f0afb5b
Compare
…lan-bug-fix skills for clarity
…l to streamline usage
…aved in the repository with the fix
…ration steps for clarity
…g and PR attachment process
There was a problem hiding this comment.
Why is this a GitHub Workflow instead of another job in CircleCI?
- Replace page.goto() + waitUntilLoaded() pattern with navigateToStory() - Add note: Controls panel argType rows need both args + control to render - Add troubleshooting tips: error-context.md ARIA snapshot and react-aria dialog accessible name behaviour Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Apply labels via 'gh pr edit --add-label' after PR creation instead of '--label' in 'gh pr create'. The --label flag can fail silently when GitHub's label lookup is slow, leaving the PR without the 'agent' label. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…GitHub.com Copilot Remove stale 'Copilot handles it automatically' wording from description, overview, and summary. The skill now explicitly uses 'gh pr create' + 'gh pr edit --add-label' in Step 3, which works in both Claude Code CLI and GitHub.com Copilot coding agent environments. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
'gh pr edit --add-label' only works for maintainers/Copilot with write access. For non-maintainers, fall back to leaving a PR comment listing the labels to apply. Use '|| true' so the step doesn't hard-fail. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ci:normal is now added alongside agent+bug in Step 3, but the step is explicitly gated by a pre-flight checklist: verification-checklist must pass, flow-specific verification must be done, and PR evidence section must be fully populated. ci:normal is never applied if any of those are incomplete. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Story IDs are hard to predict due to Storybook's title calculation logic. Add a note to manager-bug-workflow and renderer-bug-workflow explaining that agents can query http://localhost:6006/index.json to look up the exact story id, title, and name instead of guessing the URL pattern. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This reverts commit 7a2dcef.
Closes #33953
What I did
This PR introduces a comprehensive infrastructure for AI agents (GitHub Copilot/Claude) to automatically verify bug fixes in Storybook with proper evidence collection.
What's Added
1. Bug Verification Workflows (Claude Skills)
Five new skills in skills that guide AI agents through systematic bug fix verification:
/fix-bug: End-to-end orchestration: fetch issue → plan fix → implement → verify → open PR. Now requires PRs to be labeled with agent, ci:normal, and bug/verification-checklist: Universal pre-PR checks for all bug types (Flow 0)/renderer-bug-workflow: Verify renderer bugs with template stories and visual inspection (Flow 1)/builder-bug-workflow: Verify builder bugs with frontend snapshots (Flow 2) or terminal output (Flow 3)/manager-bug-workflow: Verify Manager UI bugs with E2E tests and screenshots (Flow 4)Key Feature: Smart routing logic that determines verification requirements based on where the bug manifests to users, not just where code is changed.
2. Terminal Output Snapshot Testing
New script
capture-terminal-output.ts that:3. GitHub Actions Workflow
New workflow
copilot-verification.ymlthat:4. Documentation Updates
Why This Matters
Before: AI agents would fix bugs but couldn't systematically verify the fix worked or collect evidence for reviewers.
After: AI agents can:
✅ builder-vite-build.snap.txt
✅ builder-webpack5-build.snap.txt
Checklist for Contributors
Testing
The changes in this PR are covered in the following automated tests:
Manual testing
Caution
This section is mandatory for all contributions. If you believe no manual test is necessary, please state so explicitly. Thanks!
Documentation
MIGRATION.MD
Checklist for Maintainers
When this PR is ready for testing, make sure to add
ci:normal,ci:mergedorci:dailyGH label to it to run a specific set of sandboxes. The particular set of sandboxes can be found incode/lib/cli-storybook/src/sandbox-templates.tsMake sure this PR contains one of the labels below:
Available labels
bug: Internal changes that fixes incorrect behavior.maintenance: User-facing maintenance tasks.dependencies: Upgrading (sometimes downgrading) dependencies.build: Internal-facing build tooling & test updates. Will not show up in release changelog.cleanup: Minor cleanup style change. Will not show up in release changelog.documentation: Documentation only changes. Will not show up in release changelog.feature request: Introducing a new feature.BREAKING CHANGE: Changes that break compatibility in some way with current major version.other: Changes that don't fit in the above categories.🦋 Canary release
This PR does not have a canary release associated. You can request a canary release of this pull request by mentioning the
@storybookjs/coreteam here.core team members can create a canary release here or locally with
gh workflow run --repo storybookjs/storybook publish.yml --field pr=<PR_NUMBER>Summary by CodeRabbit
Documentation
Tests
Chores