feat: add page-based GenUI UI judge package#2629
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Merging this PR will improve performance by 17.81%
Web Explorer #10038 Bundle Size: 903.49KiB (~-0.01%). a010ecb (current) vs 531ef76 main #10033 (baseline)
Bundle analysis report. Branch: PupilTong:codex/genuiuijudge-0. Generated by RelativeCI.
React Example with Element Template #733 Bundle Size: 200.08KiB (0%). a010ecb (current) vs 531ef76 main #728 (baseline)
Bundle size by type

| | Current #733 | Baseline #728 |
|---|---|---|
| | 145.76KiB | 145.76KiB |
| | 54.32KiB | 54.32KiB |
React External #1579 Bundle Size: 695.64KiB (0%). a010ecb (current) vs 531ef76 main #1574 (baseline)
Bundle metrics

| | Current #1579 | Baseline #1574 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 3 | 3 |
| | 17 | 17 |
| | 5 | 5 |
| | 8.59% | 8.59% |
| | 0 | 0 |
| | 0 | 0 |
React Example #8464 Bundle Size: 237.24KiB (0%). a010ecb (current) vs 531ef76 main #8459 (baseline)
Bundle metrics

| | Current #8464 | Baseline #8459 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 4 | 4 |
| | 198 | 198 |
| | 80 | 80 |
| | 44.74% | 44.74% |
| | 2 | 2 |
| | 0 | 0 |

Bundle size by type (no changes)

| | Current #8464 | Baseline #8459 |
|---|---|---|
| | 145.76KiB | 145.76KiB |
| | 91.48KiB | 91.48KiB |
React MTF Example #1597 Bundle Size: 208.18KiB (0%). a010ecb (current) vs 531ef76 main #1592 (baseline)
Bundle metrics

| | Current #1597 | Baseline #1592 |
|---|---|---|
| | 0B | 0B |
| | 0B | 0B |
| | 0% | 0% |
| | 0 | 0 |
| | 3 | 3 |
| | 193 | 193 |
| | 77 | 77 |
| | 44.24% | 44.24% |
| | 2 | 2 |
| | 0 | 0 |

Bundle size by type (no changes)

| | Current #1597 | Baseline #1592 |
|---|---|---|
| | 111.23KiB | 111.23KiB |
| | 96.95KiB | 96.95KiB |
a01f968 to 32f64bd
32f64bd to e36f807
This reverts commit e36f807.
🧹 Nitpick comments (2)

packages/genui/ui-judge/tests/fixtures/interactive.html (1)

108-110: ⚡ Quick win: Fix inconsistent indentation in the script block.

Line 108 has no indentation while lines 109-110 have 6 spaces. All variable declarations should use consistent indentation.

✨ Proposed fix for consistent indentation

     <script>
    -const details = document.getElementById('details');
    +      const details = document.getElementById('details');
           const viewport = document.getElementById('viewport');
           const reveal = document.getElementById('reveal');

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/genui/ui-judge/tests/fixtures/interactive.html` around lines 108 - 110, The const declarations for the DOM elements (const details, const viewport, const reveal) have inconsistent indentation; make all three declarations use the same indentation level (e.g., align the leading whitespace so each line starts with the same number of spaces or tabs) so the script block is consistently formatted; update the lines that define document.getElementById('details'), document.getElementById('viewport'), and document.getElementById('reveal') to match the chosen indentation style.

packages/genui/ui-judge/package.json (1)

25-30: ⚡ Quick win: Move `@playwright/test` to devDependencies.

The runtime source (`src/index.ts`) only imports `Page` as a type, while `@playwright/test` is actually needed for test execution and build configuration. Moving it from `dependencies` to `devDependencies` prevents test tooling from being included in runtime installs.

♻️ Proposed manifest adjustment

       "dependencies": {
    -    "@midscene/web": "^1.8.0",
    -    "@playwright/test": "^1.58.2"
    +    "@midscene/web": "^1.8.0"
       },
       "devDependencies": {
    -    "@types/node": "^24.10.13"
    +    "@playwright/test": "^1.58.2",
    +    "@types/node": "^24.10.13"
       },

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/genui/ui-judge/package.json` around lines 25 - 30, Update the package manifest so `@playwright/test` is listed under devDependencies instead of dependencies: remove "`@playwright/test`" from the "dependencies" block and add it to "devDependencies" (keeping the same version "^1.58.2"); this ensures runtime imports like the type-only Page in src/index.ts do not pull test tooling into production installs and keeps test-only packages with other dev tooling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0ea55898-b698-48bb-9d04-c1ebb9f31cee
⛔ Files ignored due to path filters (1)
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`

📒 Files selected for processing (12)
- `.github/ui-judge.instructions.md`
- `packages/genui/tsconfig.json`
- `packages/genui/ui-judge/README.md`
- `packages/genui/ui-judge/package.json`
- `packages/genui/ui-judge/playwright.config.ts`
- `packages/genui/ui-judge/rslib.config.ts`
- `packages/genui/ui-judge/src/index.ts`
- `packages/genui/ui-judge/tests/fixtures/interactive.html`
- `packages/genui/ui-judge/tests/judge-page.spec.ts`
- `packages/genui/ui-judge/tsconfig.build.json`
- `packages/genui/ui-judge/tsconfig.json`
- `packages/genui/ui-judge/turbo.json`
Summary
- Add `@lynx-js/ui-judge` under `packages/genui/ui-judge` with a single public `judgePage` API.
- […] `page`; callers own navigation, viewport, cookies, route mocks, authentication, and page lifecycle.
- […] `aiAct`/`aiNumber` to interact with the current page and return a JSON-serializable `visual-correctness` score from `0` to `5`; the returned `url` is read from `page.url()`.

Self-review
- […] `judgePage` and exported TypeScript types.
- […] `url`, or calls `page.goto()` internally.
- […] `0-5` value and does not reintroduce `GRADE:` or letter grades.
- […] `score: 0` and `error.message` instead of escaping as unhandled failures.

Validation
- `pnpm run build`
- `pnpm -F @lynx-js/ui-judge build`
- `pnpm eslint packages/genui/ui-judge --flag v10_config_lookup_from_file`
- `pnpm -F @lynx-js/ui-judge test` in the current session: `1 passed, 1 skipped` because this Codex shell does not have `MIDSCENE_MODEL_NAME`.
- […] `2 passed (17.8s)`.
- `pnpm dprint check packages/genui/ui-judge .github/ui-judge.instructions.md` returned exit code 0 with a sandbox cache-write warning.
- `git diff --check`
- […] `eslint`, `biome`, `dprint`, and `sort-package-json`.

Summary by CodeRabbit

Release Notes

New Features
- `@lynx-js/ui-judge` package for automated UI evaluation
- `judgePage` API for assessing visual correctness on a 0-5 scale

Documentation

Chores
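
For reviewers who want a concrete picture of the contract described in the Summary and Self-review items, here is a minimal sketch. It is not the package's implementation. `PageLike`, `JudgeResultSketch`, `Scorer`, `clampScore`, and `judgePageSketch` are hypothetical names introduced for illustration; the injected scorer stands in for the AI-driven interaction, and clamping out-of-range values is an assumption rather than documented behaviour.

```typescript
// Hypothetical minimal surface of a caller-prepared page. The real package
// takes a Playwright `Page`; only `url()` is assumed here.
interface PageLike {
  url(): string;
}

// Assumed result shape: JSON-serializable, score between 0 and 5,
// and an `error.message` when judging fails.
interface JudgeResultSketch {
  url: string;
  score: number;
  error?: { message: string };
}

// Hypothetical stand-in for the AI-driven scoring step.
type Scorer = (page: PageLike) => Promise<number>;

// Assumption: non-finite values collapse to 0 and others are clamped to [0, 5].
function clampScore(raw: number): number {
  if (!Number.isFinite(raw)) return 0;
  return Math.min(5, Math.max(0, raw));
}

async function judgePageSketch(
  page: PageLike,
  scorer: Scorer,
): Promise<JudgeResultSketch> {
  // The URL is read from the page itself; the sketch never navigates.
  const url = page.url();
  try {
    const score = clampScore(await scorer(page));
    return { url, score };
  } catch (err) {
    // Scoring failures become a score of 0 plus a message instead of
    // propagating to the caller.
    const message = err instanceof Error ? err.message : String(err);
    return { url, score: 0, error: { message } };
  }
}

// Usage: the caller owns the page and its lifecycle; here a plain object
// plays that role and the scorer fails on purpose.
const demoPage: PageLike = { url: () => 'https://example.test/demo' };
judgePageSketch(demoPage, async () => {
  throw new Error('model unavailable');
}).then((result) => console.log(JSON.stringify(result)));
```

Injecting the scorer keeps the sketch runnable without a model or a browser, which loosely mirrors why the model-backed test in Validation is skipped when `MIDSCENE_MODEL_NAME` is absent.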