test(a2ui): use playground render preview for ui-judge tests #2673
📝 Walkthrough
This PR integrates UI Judge visual correctness testing into the CI workflow end-to-end: enhances the playground runtime to support zero-delay simulation and multiple protocol variants, provides a playground server test helper replacing local fixtures, creates a GitHub composite action for posting results to PR comments, wires the workflow with fallback artifact handling, and documents operational practices.
Changes: UI Judge Visual Correctness Testing Integration
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Codecov Report
✅ All modified and coverable lines are covered by tests.
Merging this PR will degrade performance by 7.3%

| | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| ❌ | transform 1000 view elements | 40 ms | 43.2 ms | -7.3% |

Comparing hw/codex/ui-judge-playground-preview (83162b2) with main (7e6ff74)
Footnote: 26 benchmarks were skipped, so the baseline results were used instead.
React External #1678 Bundle Size: 698.01KiB (0%).
83162b2 (current) vs 7e6ff74 main #1660 (baseline)
Bundle metrics

| | Current #1678 | Baseline #1660 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 3 | 3 |
| | 17 | 17 |
| | 5 | 5 |
| | 8.59% | 8.59% |
| | 0 | 0 |
| | 0 | 0 |

Generated by RelativeCI
React Example with Element Template #831 Bundle Size: 202.16KiB (0%).
83162b2 (current) vs 7e6ff74 main #814 (baseline)
Bundle metrics

| | Current #831 | Baseline #814 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 4 | 4 |
| | 100 | 100 |
| | 30 | 30 |
| | 39.22% | 39.22% |
| | 2 | 2 |
| | 0 | 0 |

Bundle size by type: no changes

| | Current #831 | Baseline #814 |
|---|---|---|
| | 145.76KiB | 145.76KiB |
| | 56.41KiB | 56.41KiB |
React Example #8563 Bundle Size: 237.81KiB (0%).
83162b2 (current) vs 7e6ff74 main #8545 (baseline)
Bundle metrics

| | Current #8563 | Baseline #8545 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 4 | 4 |
| | 200 | 200 |
| | 80 | 80 |
| | 44.68% | 44.68% |
| | 2 | 2 |
| | 0 | 0 |

Bundle size by type: no changes

| | Current #8563 | Baseline #8545 |
|---|---|---|
| | 145.76KiB | 145.76KiB |
| | 92.05KiB | 92.05KiB |
React MTF Example #1696 Bundle Size: 208.75KiB (0%).
83162b2 (current) vs 7e6ff74 main #1678 (baseline)
Bundle metrics

| | Current #1696 | Baseline #1678 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 3 | 3 |
| | 195 | 195 |
| | 77 | 77 |
| | 44.17% | 44.17% |
| | 2 | 2 |
| | 0 | 0 |

Bundle size by type: no changes

| | Current #1696 | Baseline #1678 |
|---|---|---|
| | 111.23KiB | 111.23KiB |
| | 97.52KiB | 97.52KiB |
Web Explorer #10137 Bundle Size: 903.53KiB (0%).
83162b2 (current) vs 7e6ff74 main #10119 (baseline)
Bundle size by type

| | Current #10137 | Baseline #10119 |
|---|---|---|
| | 499.15KiB | 499.15KiB |
| | 402.16KiB | 402.16KiB |
| | 2.22KiB | 2.22KiB |
81ba50d to 609c9e0
d1209b1 to cbf445d
3423b1a to a5842bd
UI Judge
Average score: 2 / 5 across 1 result.
Details: Result 1
dbd4430 to 0002eec
0002eec to a968ffe
Actionable comments posted: 2
🧹 Nitpick comments (5)
.github/workflows/test.yml (1)
109-130: 💤 Low value
De-duplicate the repeated `if:` expression on the three steps.
The same multi-clause expression is repeated on checkout, download-artifact, and the comment step. Hoist it into a single env var (or a single job-level guard plus a per-step skip) so it only has to be maintained in one place.
♻️ Example using a job-level env
```diff
   ui-judge-comment:
     needs: ui-judge
     if: always()
     runs-on: lynx-ubuntu-24.04-medium
     permissions:
       contents: read
       issues: write
       pull-requests: write
+    env:
+      SHOULD_COMMENT: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && needs.ui-judge.result != 'skipped' && needs.ui-judge.result != 'cancelled' }}
     steps:
       - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
-        if: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && needs.ui-judge.result != 'skipped' && needs.ui-judge.result != 'cancelled' }}
+        if: ${{ env.SHOULD_COMMENT == 'true' }}
         with:
           persist-credentials: false
       - uses: actions/download-artifact@634f93cb2916e3fdff6788551b99b062d0335ce0 # v5
-        if: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && needs.ui-judge.result != 'skipped' && needs.ui-judge.result != 'cancelled' }}
+        if: ${{ env.SHOULD_COMMENT == 'true' }}
         with:
           name: ui-judge-results
       - name: Comment UI Judge result
-        if: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && needs.ui-judge.result != 'skipped' && needs.ui-judge.result != 'cancelled' }}
+        if: ${{ env.SHOULD_COMMENT == 'true' }}
         uses: ./.github/actions/ui-judge-comment
         with:
           result-file: ui-judge-results.json
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test.yml around lines 109 - 130, The repeated multi-clause condition used in the three steps of the ui-judge-comment job should be centralized: create a single job-level guard or job env variable (e.g., set env like UI_JUDGE_RUN_CONDITION or move the entire if: expression to the job-level if for ui-judge-comment) and then remove the duplicated per-step if: on the actions/checkout, actions/download-artifact and the "Comment UI Judge result" step; update those steps to rely on the job-level guard (or a simple per-step conditional that checks the new env var) so the expression is maintained in one place and all three steps reference that single symbol instead of duplicating the long clause.
.github/actions/ui-judge-comment/comment.mjs (1)
219-231: 💤 Low value
Simplify table row construction.
The `.join(' | ').replace(/^/, '| ').replace(/$/, ' |')` chain is non-obvious; a template literal expresses the same intent more clearly without the regex round-trip.
♻️ Proposed simplification
```diff
-  return [
-    String(index + 1),
-    escapeTableCell(result.dimension),
-    `${result.score} / 5`,
-    page,
-    status,
-  ].join(' | ').replace(/^/, '| ').replace(/$/, ' |');
+  const cells = [
+    String(index + 1),
+    escapeTableCell(result.dimension),
+    `${result.score} / 5`,
+    page,
+    status,
+  ];
+  return `| ${cells.join(' | ')} |`;
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/actions/ui-judge-comment/comment.mjs around lines 219 - 231, The table row construction in formatTableRow uses an opaque join(' | ').replace(/^/, '| ').replace(/$/, ' |') pattern; simplify by building the row with a single template literal that inserts String(index + 1), escapeTableCell(result.dimension), `${result.score} / 5`, the page variable (from sanitizeUrlForMarkdown(result.url)), and status separated by " | " and wrapped with leading and trailing pipes, ensuring the same output but clearer intent and easier maintenance.
packages/genui/ui-judge/tests/helpers/playground-preview-server.ts (2)
159-180: 💤 Low value
Free-port allocation has an inherent TOCTOU race.
`findFreePort` closes the listener before `pnpm dev` binds, so another local process could grab the port in between and the readiness loop would silently time out on a foreign server. For a CI helper this is usually acceptable, but consider passing the listening server's port via `SO_REUSEADDR`-style handoff or accepting `PORT` via env to allow a CI-provided port range. Optional hardening only.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/genui/ui-judge/tests/helpers/playground-preview-server.ts` around lines 159 - 180, findFreePort currently picks and closes a free port, creating a TOCTOU race between allocation and the child process binding; to harden it, first allow honoring an externally provided port (process.env.PORT) or an allowed port-range argument and validate it in findFreePort, and if no env port is provided change the API to either return a still-bound server/socket (so the caller can hand it off to the child process or keep it open until the child confirms binding) or implement a simple handoff (e.g., keep the listener open and pass the port to the child via env + wait for child to accept, then close), and update the readiness loop to use the new behavior; touch the findFreePort implementation and any callers in the readiness loop to consume the bound server or PORT env instead of relying on a closed-port result.
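One way to read the "accept `PORT` via env" part of this suggestion is a small resolver that prefers an externally supplied port and only falls back to probing when none is given. The sketch below is illustrative only: `resolvePreviewPort` is a hypothetical name and is not taken from `playground-preview-server.ts`.

```javascript
// Hypothetical helper (not from the PR): prefer a CI-provided port.
// Returns a validated port number, or null when the caller should fall
// back to its own free-port lookup (which still has the TOCTOU window).
function resolvePreviewPort(env) {
  const raw = env.PORT;
  if (raw === undefined || raw === '') {
    return null;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}
```

A caller could use `resolvePreviewPort(process.env) ?? (await findFreePort())`, which removes the race only when CI supplies the port.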
10-16: 💤 Low value
`protocol` option type is narrower than the playground accepts.
`render.tsx` now treats `'0.9'`, `'a2ui'`, and `'openui'` as valid, but `PlaygroundDemoPreviewOptions.protocol` here is `'a2ui' | 'openui'`. Not a bug today (tests only use `a2ui`), but it makes the helper inconsistent with the runtime contract.
Suggested change
```diff
-  protocol?: 'a2ui' | 'openui';
+  protocol?: '0.9' | 'a2ui' | 'openui';
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/genui/ui-judge/tests/helpers/playground-preview-server.ts` around lines 10 - 16, PlaygroundDemoPreviewOptions.protocol is too narrow compared to runtime expectations in render.tsx; update the protocol type in the PlaygroundDemoPreviewOptions interface to include '0.9' (e.g. '0.9' | 'a2ui' | 'openui') or widen it to string so the test helper matches render.tsx's accepted values and avoids type mismatches when render.tsx treats '0.9' as valid.
packages/genui/ui-judge/tests/judge-page.spec.ts (1)
52-84: 💤 Low value
Render-only checks are gated behind `MIDSCENE_MODEL_NAME`.
The parameterized `renders playground example …` tests don't invoke `judgePage` or any Midscene model, but they live inside this `describe` block and are therefore skipped whenever `MIDSCENE_MODEL_NAME` is absent. If the intent is to use them as cheap smoke tests for the playground preview in regular CI, consider splitting them out of the model-gated `describe` so they run unconditionally; if the intent is to keep them only on model-backed runs, this is fine as-is.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/genui/ui-judge/tests/judge-page.spec.ts` around lines 52 - 84, The tests under the describe "A2UI playground preview" are being skipped when hasMidsceneModelConfig() is false because test.skip wraps the entire block, but the parameterized tests that iterate PLAYGROUND_DEMO_CASES (the `renders playground example ${demo.demoId} with speed zero` tests) do not require the Midscene model; move those render-only tests out of the model-gated describe or create a new describe that is not guarded by test.skip, leaving model-dependent tests (that call judgePage/Midscene) inside the existing gated block; adjust references to startPlaygroundPreviewServer(), previewServer.createDemoPreviewUrl(), waitForPreviewText(), and the PLAYGROUND_DEMO_CASES loop so the render-only loop runs unconditionally while preserving the gated tests that actually need hasMidsceneModelConfig().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/actions/ui-judge-comment/comment.mjs:
- Around line 317-324: findExistingComment currently fetches only the first page
of comments and can miss matches on busy PRs; update it to paginate the
/repos/{owner}/{repo}/issues/{prNumber}/comments endpoint by following the Link
header (rel="next") from the client response and iterate pages until you find a
comment whose body includes the marker, returning immediately when found; ensure
the loop handles the client’s response shape (comments array and response
headers) and stops when no next link is present to avoid infinite loops.
In `@packages/genui/ui-judge/README.md`:
- Around line 39-47: The README command and the prereq check message disagree:
update either the README snippet in packages/genui/ui-judge/README.md or the
error text in playground-preview-server.ts so both instruct the same canonical
build command; pick the canonical command used by your build system (e.g., pnpm
--filter `@lynx-js/a2ui-reactlynx` build or pnpm turbo build:lynx --filter
a2ui-playground), then replace the mismatched string in README.md or the
hardcoded message in playground-preview-server.ts (search for the runtime error
text "Run `pnpm --filter `@lynx-js/a2ui-reactlynx` build`..." and the README build
block) so users get one consistent prep command.
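The pagination finding for `findExistingComment` can be sketched as a loop that follows `rel="next"` links until the marker is found. This is a minimal sketch under stated assumptions: `parseNextLink`, `findMarkedComment`, and the `fetchPage(url)` callback returning `{ comments, linkHeader }` are hypothetical names and shapes, not the actual client in `comment.mjs`.

```javascript
// Extract the rel="next" URL from a GitHub-style Link header, or null.
function parseNextLink(linkHeader) {
  if (!linkHeader) return null;
  for (const part of linkHeader.split(',')) {
    const match = part.match(/<([^>]+)>\s*;\s*rel="next"/);
    if (match) return match[1];
  }
  return null;
}

// Hypothetical shape: fetchPage(url) resolves to { comments, linkHeader }.
// Walks pages until a comment body contains the marker; the `seen` set
// guards against a server that keeps pointing back at an earlier page.
async function findMarkedComment(fetchPage, firstUrl, marker) {
  const seen = new Set();
  let url = firstUrl;
  while (url && !seen.has(url)) {
    seen.add(url);
    const { comments, linkHeader } = await fetchPage(url);
    const hit = comments.find(
      (comment) => typeof comment.body === 'string' && comment.body.includes(marker),
    );
    if (hit) return hit;
    url = parseNextLink(linkHeader);
  }
  return null;
}
```

Returning as soon as a match is found keeps the common case (marker on the first page) at a single request.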
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 01f9d408-aec2-41be-afe6-3e20735c758b
📒 Files selected for processing (18)
- .github/actions/ui-judge-comment/README.md
- .github/actions/ui-judge-comment/action.yml
- .github/actions/ui-judge-comment/comment.mjs
- .github/scripts/write-ui-judge-failure-result.mjs
- .github/ui-judge-ci.instructions.md
- .github/ui-judge.instructions.md
- .github/workflows/test.yml
- .github/workflows/workflow-test.yml
- .gitignore
- AGENTS.md
- packages/genui/a2ui-playground/lynx-src/a2ui/App.tsx
- packages/genui/a2ui-playground/src/render.tsx
- packages/genui/a2ui-playground/src/utils/renderUrl.ts
- packages/genui/ui-judge/README.md
- packages/genui/ui-judge/src/index.ts
- packages/genui/ui-judge/tests/fixtures/interactive.html
- packages/genui/ui-judge/tests/helpers/playground-preview-server.ts
- packages/genui/ui-judge/tests/judge-page.spec.ts
💤 Files with no reviewable changes (1)
- packages/genui/ui-judge/tests/fixtures/interactive.html
🧹 Nitpick comments (2)
.github/workflows/test.yml (1)
115-123: 💤 Low value
Consider DRYing the repeated step-level condition.
The same `github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && needs.ui-judge.result != 'skipped' && needs.ui-judge.result != 'cancelled'` is repeated verbatim on the checkout, download-artifact, and comment steps. The reason for keeping it at step-level (rather than at job-level) is presumably so the job still resolves to `success` for the `done` aggregator on non-PR/fork events; that's fine. As a minor cleanup, you could compute it once into an env value (e.g. `env.SHOULD_COMMENT`) at the job level and reference `${{ env.SHOULD_COMMENT == 'true' }}` per step to avoid drift if the eligibility rules change. Non-blocking.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test.yml around lines 115 - 123, The repeated complex step-level condition used on the checkout, download-artifact, and "Comment UI Judge result" steps should be computed once and reused; add a job-level env variable (e.g. env.SHOULD_COMMENT) that evaluates the same expression to 'true' or 'false' and then replace each step's if: with a simple check referencing that env (e.g. if: ${{ env.SHOULD_COMMENT == 'true' }}), updating the steps that include actions/checkout, actions/download-artifact@..., and the "Comment UI Judge result" step to use the new env variable.
.github/scripts/write-ui-judge-result.mjs (1)
7-10: 💤 Low value
Guard against `GITHUB_WORKSPACE` being unset.
When neither `UI_JUDGE_RESULT_FILE` nor `GITHUB_WORKSPACE` is set (e.g., when running the script locally for the dry-run testing path mentioned in the PR description), `join(undefined, 'ui-judge-results.json')` throws a `TypeError [ERR_INVALID_ARG_TYPE]` with a stack trace rather than a helpful message. Adding an explicit check (or falling back to `process.cwd()`) keeps the failure mode clean for local invocations.
🛡️ Suggested defensive fallback
```diff
-const resultFile = process.env.UI_JUDGE_RESULT_FILE
-  || join(process.env.GITHUB_WORKSPACE, 'ui-judge-results.json');
+const workspace = process.env.GITHUB_WORKSPACE || process.cwd();
+const resultFile = process.env.UI_JUDGE_RESULT_FILE
+  || join(workspace, 'ui-judge-results.json');
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/scripts/write-ui-judge-result.mjs around lines 7 - 10, The code assumes process.env.GITHUB_WORKSPACE is defined when computing resultFile (using join(process.env.GITHUB_WORKSPACE, 'ui-judge-results.json')), which throws if it's undefined; change the logic that sets resultFile to first use process.env.UI_JUDGE_RESULT_FILE, and if absent compute the path using a safe workspace variable (e.g., const workspace = process.env.GITHUB_WORKSPACE || process.cwd()) before calling join; update the resultFile assignment to use that workspace fallback so local/dry-run invocations don't throw a TypeError.
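The comment also mentions an explicit check as an alternative to silently falling back. A minimal sketch of a variant that does both, assuming a hypothetical `resolveResultFile(env, cwd)` function; it uses plain string joining to stay import-free here, whereas the script under review uses `join`.

```javascript
// Hypothetical variant (not from the PR): resolve the result file path
// with a clear error when no location can be determined.
function resolveResultFile(env, cwd) {
  if (env.UI_JUDGE_RESULT_FILE) {
    return env.UI_JUDGE_RESULT_FILE;
  }
  const workspace = env.GITHUB_WORKSPACE || cwd;
  if (!workspace) {
    throw new Error('Set UI_JUDGE_RESULT_FILE or GITHUB_WORKSPACE before running this script.');
  }
  // Trim trailing slashes so the joined path has a single separator.
  return `${workspace.replace(/\/+$/, '')}/ui-judge-results.json`;
}
```

Passing `env` and `cwd` as arguments, rather than reading `process` directly, keeps the function easy to exercise in a local dry run.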
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7692921e-e4e1-40c5-98cd-b823a0d3a3f1
📒 Files selected for processing (4)
- .github/scripts/write-ui-judge-result.mjs
- .github/ui-judge-ci.instructions.md
- .github/workflows/test.yml
- .github/workflows/workflow-test.yml
💤 Files with no reviewable changes (1)
- .github/workflows/workflow-test.yml
✅ Files skipped from review due to trivial changes (1)
- .github/ui-judge-ci.instructions.md
Summary
- … `render.html` preview path.
- … `packages/genui/a2ui-playground` with `pnpm dev` from a test-only helper, and cover several direct examples with `speed=0`.
- … `judgePage` as the only public `@lynx-js/ui-judge` API while keeping validation tests server-free.
- … `.github/actions/ui-judge-comment`, a dependency-free composite action that creates or updates a PR comment from `UiJudgeResult` JSON.

Details
- … `/render.html?protocol=a2ui&demoUrl=.%2Fa2ui.web.js&theme=light&demo=recs&speed=0`.
- `speed=0` now means no streaming delay, so all messages are processed immediately.
- … `MIDSCENE_MODEL_NAME` is configured.
- … `result-file` or `result-json`, supports dry-run output, updates an existing marked comment by default, and uses `github.token` unless a token is supplied.
- The `ui-judge` job is gated to pull requests with Midscene secrets and UI Judge/A2UI/playground-related file changes; it writes `UI_JUDGE_RESULT_FILE` and calls the new comment action.
- The `ui-judge` changed-file gate uses the GitHub pull request files API instead of local `git diff`, because the custom container runner does not expose the checkout as a Git worktree inside shell commands.
- The `ui-judge` job now follows the web-elements Playwright pattern: it depends on the repository `build` job, restores the strict `.turbo` cache for the current commit, runs `pnpm turbo build --summarize` in the Playwright container, then prepares the A2UI playground Lynx artifact before testing.

Test Plan
- `pnpm turbo build:lynx --filter a2ui-playground`
- `pnpm --filter @lynx-js/ui-judge build`
- `UI_JUDGE_RESULT_FILE=/tmp/ui-judge-results.json pnpm --filter @lynx-js/ui-judge test` (6 passed; generated a score result JSON)
- `INPUT_RESULT_FILE=/tmp/ui-judge-results.json INPUT_DRY_RUN=true GITHUB_REPOSITORY=lynx-family/lynx-stack GITHUB_RUN_ID=123 node .github/actions/ui-judge-comment/comment.mjs`
- `node --check .github/actions/ui-judge-comment/comment.mjs`
- `pnpm eslint .github/actions/ui-judge-comment/comment.mjs packages/genui/ui-judge/tests/judge-page.spec.ts --flag v10_config_lookup_from_file`
- `pnpm biome check .github/actions/ui-judge-comment/comment.mjs packages/genui/ui-judge/tests/judge-page.spec.ts`
- `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/test.yml"); puts "yaml ok"'`
- The following gate dry run:

```sh
env -u MIDSCENE_MODEL_NAME -u MIDSCENE_MODEL_API_KEY bash -ceu 'tmp=$(mktemp); event=$(mktemp); printf "{\"number\":2673}\n" > "$event"; export GITHUB_OUTPUT="$tmp" GITHUB_EVENT_PATH="$event" GITHUB_EVENT_NAME=pull_request; node --input-type=module <<'"'"'NODE'"'"'
import { appendFileSync } from "node:fs";
let shouldRun = false;
let reason = "UI Judge only comments on pull_request events.";
if (process.env.GITHUB_EVENT_NAME === "pull_request") {
  if (!process.env.MIDSCENE_MODEL_NAME || !process.env.MIDSCENE_MODEL_API_KEY) {
    reason = "Midscene model secrets are not configured for this pull request.";
  }
}
appendFileSync(process.env.GITHUB_OUTPUT, `should-run=${shouldRun}\n`);
appendFileSync(process.env.GITHUB_OUTPUT, `reason=${reason}\n`);
console.info(reason);
NODE
cat "$tmp"'
```

- `git diff --check`

Summary by CodeRabbit
Release Notes
New Features
Improvements
- … `speed=0` for no-delay rendering mode

Documentation
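
The preview URL quoted under Details can be assembled from its query parameters. The following is a sketch only: `buildDemoPreviewUrl` and its option names are hypothetical, chosen to mirror the parameters visible in that URL, and do not reproduce the helper's real `createDemoPreviewUrl` signature.

```javascript
// Hypothetical builder for the playground render preview URL.
// Defaults mirror the example URL in the PR description.
function buildDemoPreviewUrl(origin, options) {
  const params = new URLSearchParams({
    protocol: options.protocol ?? 'a2ui',
    demoUrl: options.demoUrl ?? './a2ui.web.js',
    theme: options.theme ?? 'light',
    demo: options.demo,
    // speed=0 requests no streaming delay per the PR description.
    speed: String(options.speed ?? 0),
  });
  return `${origin}/render.html?${params.toString()}`;
}
```

Using `URLSearchParams` handles the percent-encoding of `./a2ui.web.js` as `.%2Fa2ui.web.js`, which matches the encoded form shown in the description.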