Skip to content

Telemetry: Report CssCheck result in ai-setup-final-scoring#34600

Merged
yannbf merged 5 commits into
project/sb-agentic-setupfrom
cursor/telemetry-css-check-story
Apr 28, 2026
Merged

Telemetry: Report CssCheck result in ai-setup-final-scoring#34600
yannbf merged 5 commits into
project/sb-agentic-setupfrom
cursor/telemetry-css-check-story

Conversation

@kasperpeulen
Copy link
Copy Markdown
Member

@kasperpeulen kasperpeulen commented Apr 20, 2026

Closes #

What I did

This PR adds cssCheck to TestRunAnalysis so AI setup telemetry can report whether the dedicated CssCheck story passed, failed, or was not run.

analyzeTestResults looks for a story whose id ends in --css-check and reports:

Story outcome cssCheck
Story ran and passed 'pass'
Story ran and failed 'fail'
No CssCheck story in the suite 'not-run'
Story existed but was skipped, filtered, or otherwise not executed 'not-run'

The field is always present and only contains a fixed enum, so no user-authored content is sent. This makes the dedicated CSS check visible in telemetry instead of hiding it inside the overall pass/fail totals.

Checklist for Contributors

Testing

The changes in this PR are covered in the following automated tests:

  • stories
  • unit tests
  • integration tests
  • end-to-end tests

Unit tests cover all three values, first-match-wins behavior, and suffix matching. parse-vitest-report.test.ts confirms the value flows from the Vitest report into the final summary.

Manual testing

Caution

This section is mandatory for all contributions. If you believe no manual test is necessary, please state so explicitly. Thanks!

  1. Run npx storybook ai setup in a React + Vitest project.
  2. Ensure the generated stories include export const CssCheck.
  3. Run the test flow and inspect the ai-setup-final-scoring telemetry payload.
  4. Confirm results.cssCheck is reported as 'pass', 'fail', or 'not-run'.

Documentation

  • Add or update documentation reflecting your changes
  • If you are deprecating/removing a feature, make sure to update
    MIGRATION.MD

No user-facing documentation — internal telemetry plumbing only.

Checklist for Maintainers

  • When this PR is ready for testing, make sure to add ci:normal, ci:merged or ci:daily GH label to it to run a specific set of sandboxes. The particular set of sandboxes can be found in code/lib/cli-storybook/src/sandbox-templates.ts

  • Make sure this PR contains one of the labels below:

    Available labels
    • bug: Internal changes that fixes incorrect behavior.
    • maintenance: User-facing maintenance tasks.
    • dependencies: Upgrading (sometimes downgrading) dependencies.
    • build: Internal-facing build tooling & test updates. Will not show up in release changelog.
    • cleanup: Minor cleanup style change. Will not show up in release changelog.
    • documentation: Documentation only changes. Will not show up in release changelog.
    • feature request: Introducing a new feature.
    • BREAKING CHANGE: Changes that break compatibility in some way with current major version.
    • other: Changes that don't fit in the above categories.

🦋 Canary release

This pull request has been released as version 0.0.0-pr-34600-sha-60a3006d. Try it out in a new sandbox by running npx storybook@0.0.0-pr-34600-sha-60a3006d sandbox or in an existing project with npx storybook@0.0.0-pr-34600-sha-60a3006d upgrade.

More information
Published version 0.0.0-pr-34600-sha-60a3006d
Triggered by @yannbf
Repository storybookjs/storybook
Branch cursor/telemetry-css-check-story
Commit 60a3006d
Datetime Thu Apr 23 16:47:02 UTC 2026 (1776962822)
Workflow run 24847421751

To request a new release of this pull request, mention the @storybookjs/core team.

core team members can create a new canary release here or locally with gh workflow run --repo storybookjs/storybook publish.yml --field pr=34600

The AI setup prompt now asks the agent to include exactly one story
named `CssCheck` — whose `play` function asserts a component-specific
computed style via `getComputedStyle`. That story proves the shared
preview actually loaded the app's CSS (rather than just: something
rendered that we can call `toBeVisible` on).

Currently, the `ai-setup-final-scoring` telemetry event only reports
an aggregate pass/fail count. We cannot tell "all tests passed" from
"CSS check passed" — if the suite fails, the reason is opaque.

Wires the CssCheck story's individual result into the event:

- `TestRunSummary`: new optional `storyResults` field with the raw
  `StoryTestResult[]` (no other callers changed — `ghost-stories`
  stays on the aggregate summary).
- `parseVitestResults`: populates `storyResults` (it was already
  building the array internally and discarding it).
- `ai-setup-utils`: new `findCssCheckStoryResult(storyResults)`
  helper, locates the story by its CSF-sanitized `--css-check`
  storyId suffix. Defensive `.toLowerCase()` kept in case upstream
  stops sanitizing.
- `ai-setup-channel`: calls the helper after `runStoryTests`, spreads
  `cssCheck: { storyId, status }` into the telemetry payload when
  a match is present; omits the field when the agent did not create
  the story.

Background: the alternative considered was an after-each runtime hook
that sniffs stylesheet loading. Rejected upstream because it needs a
fragile "not ours" filter against vitest-browser, Storybook, and
addon sheets, and it can only answer "was something applied" rather
than "did this component's expected styles apply". A specifically
named story gives us a clean telemetry signal without any of that.

Companion PRs add the story name to both prompts:
- #34595 (eval harness prompt + grade check)
- #34596 (user-facing `storybook setup` prompt)
@kasperpeulen kasperpeulen added maintenance User-facing maintenance tasks ci:normal labels Apr 20, 2026
@nx-cloud
Copy link
Copy Markdown

nx-cloud Bot commented Apr 20, 2026

View your CI Pipeline Execution ↗ for commit 60a3006

Command Status Duration Result
nx run-many -t compile,check,knip,test,lint,fmt... ✅ Succeeded 9m 23s View ↗

☁️ Nx Cloud last updated this comment at 2026-04-22 13:27:41 UTC

@nx-cloud
Copy link
Copy Markdown

nx-cloud Bot commented Apr 20, 2026

View your CI Pipeline Execution ↗ for commit d0fe90e

Command Status Duration Result
nx run-many -t compile,check,knip,test,lint,fmt... ✅ Succeeded 11m 44s View ↗

☁️ Nx Cloud last updated this comment at 2026-04-20 14:37:03 UTC

@kasperpeulen kasperpeulen changed the title Telemetry: attribute the CssCheck story result in ai-setup-final-scoring Telemetry: Attribute the CssCheck story result in ai-setup-final-scoring Apr 21, 2026
Pure documentation; no behavior change.

- CssCheckStoryResult: call out that `status` includes `'PENDING'`
  (story existed but Vitest did not run it) and that consumers should
  treat that as "unknown", not as a health signal.
- findCssCheckStoryResult: note that on prompt violation (agent
  created multiple `CssCheck` stories), only the first is reported,
  and clarify that `PENDING` is distinct from the `undefined` "story
  missing" case.
- TestRunSummary.storyResults: order is per-file then per-assertion
  (Vitest emission order), not input-file order — the helper does
  not care about order, but the JSDoc was misleading.
@storybook-app-bot
Copy link
Copy Markdown

storybook-app-bot Bot commented Apr 21, 2026

Package Benchmarks

Commit: 60a3006, ran on 22 April 2026 at 13:29:17 UTC

The following packages have significant changes to their size or dependencies:

@storybook/builder-webpack5

Before After Difference
Dependency count 186 186 0
Self size 76 KB 76 KB 0 B
Dependency size 32.75 MB 32.78 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/angular

Before After Difference
Dependency count 185 185 0
Self size 140 KB 140 KB 0 B
Dependency size 30.41 MB 30.44 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/ember

Before After Difference
Dependency count 190 190 0
Self size 15 KB 15 KB 🚨 +18 B 🚨
Dependency size 29.47 MB 29.50 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/nextjs

Before After Difference
Dependency count 535 535 0
Self size 651 KB 650 KB 🎉 -120 B 🎉
Dependency size 60.77 MB 60.80 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/react-webpack5

Before After Difference
Dependency count 273 273 0
Self size 23 KB 23 KB 0 B
Dependency size 45.38 MB 45.41 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/server-webpack5

Before After Difference
Dependency count 198 198 0
Self size 16 KB 16 KB 0 B
Dependency size 34.01 MB 34.04 MB 🚨 +31 KB 🚨
Bundle Size Analyzer Link Link

@storybook/cli

Before After Difference
Dependency count 184 184 0
Self size 819 KB 819 KB 🎉 -84 B 🎉
Dependency size 68.22 MB 68.25 MB 🚨 +26 KB 🚨
Bundle Size Analyzer Link Link

@storybook/codemod

Before After Difference
Dependency count 177 177 0
Self size 32 KB 32 KB 🚨 +36 B 🚨
Dependency size 66.74 MB 66.76 MB 🚨 +26 KB 🚨
Bundle Size Analyzer Link Link

@storybook/preset-react-webpack

Before After Difference
Dependency count 166 166 0
Self size 18 KB 18 KB 0 B
Dependency size 31.75 MB 31.78 MB 🚨 +30 KB 🚨
Bundle Size Analyzer Link Link

Pure review-follow-up; reworks the shape of the signal added in the
previous two commits on this branch. No behavior change at the call
sites beyond what the new shape implies.

Why
---
Emitting `cssCheck: { storyId, status }` meant putting a user-authored
storyId (from a user-named component) into telemetry. Component names
are user code — they cannot go into analytics.

It also added a new `storyResults?: StoryTestResult[]` field to
`TestRunSummary`, a separate `findCssCheckStoryResult` helper, and a
tri-state (`'PASS' | 'FAIL' | 'PENDING'`) that the JSDoc itself said
consumers should treat as a boolean anyway.

What
----
- `TestRunAnalysis` gets an optional `cssCheck?: boolean`. Sits
  alongside `passed`, `successRate`, etc. — where every other
  derived-from-results signal lives.
- `analyzeTestResults` computes it inline from the story list:
  PASS -> true, FAIL -> false, PENDING or no match -> absent
  (conditional spread, so the key is missing, not `undefined`).
  PENDING is genuinely unknown; collapsing it to false would
  inflate CSS-failure rates in telemetry.
- `ai-setup-channel` emits `results: summary` as before; the boolean
  rides along automatically, no helper import, no extra spread at
  the call site.

Delete
------
- `TestRunSummary.storyResults` field + its JSDoc.
- The `storyResults: storyTestResults` line in `parseVitestResults`.
- `findCssCheckStoryResult` helper, `CssCheckStoryResult` interface,
  and the `CSS_CHECK_STORY_ID_SUFFIX` constant in
  `ai-setup-utils.ts`, plus the `StoryTestResult` import.
- The `findCssCheckStoryResult` describe block in
  `ai-setup-utils.test.ts` (7 cases); equivalents live under
  `analyze-test-results.test.ts` now.

Tests
-----
- 7 new cases under `describe('cssCheck', ...)` in
  `analyze-test-results.test.ts` (true/false, absent on
  no-match/PENDING/empty, first-match-wins on prompt violation,
  case-insensitive defense).
- The `parse-vitest-report` test keeps its realistic fixture
  (expected rgb(37, 99, 235) but got rgba(0, 0, 0, 0)) and now
  asserts `summary.cssCheck === false` instead of checking the
  removed `storyResults` array.

Net diff vs base
----------------
+132 / -0 across 4 files (was +179 / -1 across 6 files). No new
public surface on `TestRunSummary`.

Ghost-stories side-effect
-------------------------
`ghost-stories-channel.ts` emits `results: testRunResult.summary` too,
so a user whose suite happens to contain a story named `CssCheck`
would see a `cssCheck` boolean in ghost-stories telemetry as well.
Benign (boolean, no user data), very unlikely in practice; not gating
the computation behind a parameter.
@kasperpeulen kasperpeulen marked this pull request as ready for review April 21, 2026 16:29
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 838b1f64-a213-4dc6-ad1c-5eaf6e3c3972

📥 Commits

Reviewing files that changed from the base of the PR and between 5d2a7fb and 60a3006.

📒 Files selected for processing (1)
  • code/core/src/core-server/server-channel/ghost-stories-channel.test.ts

📝 Walkthrough

Walkthrough

Add a three-state cssCheck field ('pass' | 'fail' | 'not-run') to test-run analysis and parse results; detect story IDs ending with --css-check (case-insensitive) to set the field and update related tests and telemetry expectations.

Changes

Cohort / File(s) Summary
Type Definitions
code/core/src/shared/utils/test-result-types.ts
Added required `cssCheck: 'pass'
Analysis Implementation
code/core/src/shared/utils/analyze-test-results.ts
Added logic to find the first storyId that case-insensitively ends with --css-check and set cssCheck to 'pass'/'fail' or 'not-run' otherwise.
Analysis Tests
code/core/src/shared/utils/analyze-test-results.test.ts
Expanded tests with describe('cssCheck') covering pass/fail/not-run, empty input, non-PASS statuses, multiple matches (first wins), and case-insensitive matching.
Vitest Parser Tests
code/core/src/core-server/utils/ghost-stories/parse-vitest-report.test.ts
Updated expectations to include summary.cssCheck: 'not-run' for all-pass fixtures and added a test asserting summary.cssCheck: 'fail' when a components-button--css-check story fails.
Telemetry Tests
code/core/src/core-server/server-channel/ghost-stories-channel.test.ts
Updated two successful-discovery telemetry assertions to include results.cssCheck: 'not-run' in the expected payload.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
code/core/src/shared/utils/analyze-test-results.test.ts (1)

166-168: Optional: assert key omission on empty input for consistency.

You already assert omission with 'cssCheck' in analysis in other absent cases; doing the same here keeps the contract check uniform.

Suggested test tweak
       it('is absent for an empty result list', () => {
-        expect(analyzeTestResults([]).cssCheck).toBeUndefined();
+        const analysis = analyzeTestResults([]);
+        expect(analysis.cssCheck).toBeUndefined();
+        expect('cssCheck' in analysis).toBe(false);
       });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/core/src/shared/utils/analyze-test-results.test.ts` around lines 166 -
168, Test for analyzeTestResults should assert key omission uniformly: in the
"'is absent for an empty result list'" test, replace the current
expect(analyzeTestResults([]).cssCheck).toBeUndefined() with an assertion that
the 'cssCheck' property is not present on the returned analysis object (use the
"'cssCheck' in analysis" style used in other tests) — locate the test by the it
block name and the analyzeTestResults function to update the assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@code/core/src/shared/utils/analyze-test-results.test.ts`:
- Around line 166-168: Test for analyzeTestResults should assert key omission
uniformly: in the "'is absent for an empty result list'" test, replace the
current expect(analyzeTestResults([]).cssCheck).toBeUndefined() with an
assertion that the 'cssCheck' property is not present on the returned analysis
object (use the "'cssCheck' in analysis" style used in other tests) — locate the
test by the it block name and the analyzeTestResults function to update the
assertion.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d924e720-9ad3-49e5-8b86-60b884f29f86

📥 Commits

Reviewing files that changed from the base of the PR and between 8ed4332 and a320350.

📒 Files selected for processing (4)
  • code/core/src/core-server/utils/ghost-stories/parse-vitest-report.test.ts
  • code/core/src/shared/utils/analyze-test-results.test.ts
  • code/core/src/shared/utils/analyze-test-results.ts
  • code/core/src/shared/utils/test-result-types.ts

Widens the field from `cssCheck?: boolean` to a required
`cssCheck: 'pass' | 'fail' | 'not-run'`.

Why
---
An optional boolean collapsed two operationally different cases into
an absent key:

  - agent didn't author a CssCheck story (prompt non-compliance)
  - agent authored it, Vitest returned PENDING — skipped, todo,
    filtered out, etc.

Neither produces a pass/fail signal, so from the dashboard's point of
view they're the same bucket — but the code treated "absent" as a
fourth state alongside true/false/missing. A required three-valued
enum collapses them explicitly under `'not-run'` and removes the
absent case entirely, so every run carries a self-describing value.

Shape
-----
- `'pass'`    — a CssCheck story ran and passed.
- `'fail'`    — a CssCheck story ran and failed.
- `'not-run'` — no pass/fail signal: either no CssCheck story in the
                 suite, or the story existed but wasn't executed
                 (Vitest PENDING: skipped / pending / todo / filtered).

Still only a string enum — no storyId or component name — so no
user-authored data enters telemetry.

Consumers that don't know about the field simply ignore it; the
three values are self-describing in dashboards and logs.
@yannbf yannbf requested review from Sidnioulz and yannbf April 21, 2026 19:18
@kasperpeulen kasperpeulen changed the title Telemetry: Attribute the CssCheck story result in ai-setup-final-scoring Telemetry: Report CssCheck result in ai-setup-final-scoring Apr 22, 2026
Copy link
Copy Markdown
Member

@Sidnioulz Sidnioulz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic for handling and reporting the css check story LGTM, but I'm wondering: shouldn't that story be filtered out from the other test statistics we report, as it is not actually a story designed to represent component behaviour?

@yannbf yannbf merged commit 59b8813 into project/sb-agentic-setup Apr 28, 2026
129 checks passed
@yannbf yannbf deleted the cursor/telemetry-css-check-story branch April 28, 2026 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci:normal maintenance User-facing maintenance tasks

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants