fix: internal calibration CLI improvements (#63)#82
Conversation
Reclassify categories: layout + ai-readability → structure, handoff-risk → behavior. Merge overlapping rules: no-auto-layout absorbs ambiguous-structure + missing-layout-hint, fixed-size-in-auto-layout absorbs fixed-width-in-responsive-context, missing-min-width + missing-max-width → missing-size-constraint, invisible-layer + empty-frame → unnecessary-node, hardcode-risk removed (covered by absolute-position-in-auto-layout). Add 3 new rules: overflow-behavior-unknown, wrap-behavior-unknown, variant-structure-mismatch. Final count: structure(9) token(7) component(4) naming(5) behavior(4) = 29 rules. Closes #79 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-size) Merged rules should provide more detailed fix guidance that covers all absorbed symptoms, not just the base rule's original fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Was too vague ("Use a consistent naming convention"). Now includes
specific examples and explains why AI needs consistent naming.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…checks, test fixes - Add INSTANCE to isContainerNode helper (structure rules) - Guard noUncheckedIndexedAccess: early continue for undefined children - Skip wrap-behavior-unknown when children lack bounding box data - Add +1 tolerance comment for floating-point rounding in overflow check - Fix misleading test title for hidden overlapping children case - Add 48x48 boundary test for unnecessary-node placeholder exemption Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…c assertions - Add boundary-condition tests for text-truncation-unhandled (length=50, width=300, length=51) - Fix non-deterministic test in responsive-fields: remove if-guard, assert deterministically - Rename misleading test title to match actual expectation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TypeScript strict types + JSDoc on public APIs serve as documentation. The 80% threshold wasn't respecting the configured 40% override anyway. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… accuracy - Rename test titles to match actual scenarios (both missing vs one missing) - Add minWidth to fixture that tests maxWidth-only path - Add "only minWidth missing" test for complete coverage of 3 violation paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove `outputPath: "unused"` from calibrate-run's call to runCalibrationAnalyze, since the function never uses it - Add CalibrationConfigInput type (z.input) so the function accepts optional fields with defaults (outputPath, maxConversionNodes, etc.) - calibrate-evaluate optional args and calibrate-gap-report minRepeat edge case were already fixed on this branch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughA comprehensive restructuring of the rule categorization system, replacing Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 4❌ Failed checks (4 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/core/rules/structure/z-index-dependent-layout.test.ts (1)
4-83: 🧹 Nitpick | 🔵 TrivialConsider restoring boundary condition tests.
Based on the AI summary, this test file previously included tests for:
- Exact 20% overlap boundary — The implementation uses
overlapArea > smallerArea * 0.2(strictly greater than), so exact 20% overlap should returnnull. This boundary test would catch off-by-one changes to the threshold.- Children without
absoluteBoundingBox— The implementation handles this withif (!boxA || !boxB) continue;, but a regression test ensures this graceful handling persists.These boundary conditions are important for preventing regressions. Consider restoring them:
💚 Suggested boundary tests
it("returns null when overlap is exactly 20% (boundary)", () => { // Create children where overlap area = exactly 20% of smaller element const childA = makeNode({ id: "c:1", absoluteBoundingBox: { x: 0, y: 0, width: 100, height: 100 }, // area = 10000 }); const childB = makeNode({ id: "c:2", // 20% of 10000 = 2000, so overlap needs to be exactly 2000 // overlap of 50x40 = 2000 absoluteBoundingBox: { x: 50, y: 60, width: 100, height: 100 }, }); const node = makeNode({ type: "FRAME", children: [childA, childB], }); expect(zIndexDependentLayout.check(node, makeContext())).toBeNull(); }); it("returns null when children have no absoluteBoundingBox", () => { const childA = makeNode({ id: "c:1" }); const childB = makeNode({ id: "c:2" }); const node = makeNode({ type: "FRAME", children: [childA, childB], }); expect(zIndexDependentLayout.check(node, makeContext())).toBeNull(); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/rules/structure/z-index-dependent-layout.test.ts` around lines 4 - 83, Add two regression tests to cover boundary conditions: one that constructs two children via makeNode whose overlap area is exactly 20% of the smaller element and asserts zIndexDependentLayout.check(node, makeContext()) returns null (verifying the implementation's strict > 20% threshold), and another that creates two children without absoluteBoundingBox properties and asserts zIndexDependentLayout.check(...) returns null (verifying the early-continue handling when boxA/boxB are missing); place these alongside the existing tests referencing zIndexDependentLayout.check, makeNode, and makeContext so future changes won't regress these cases.src/agents/orchestrator.ts (1)
174-185:⚠️ Potential issue | 🟠 MajorHonor
targetNodeIdwhen the caller provides it.The schema accepts
targetNodeId, but Lines 182-185 still buildanalyzeOptionsonly from the node ID extracted out ofinput. A caller that passestargetNodeIdseparately will silently analyze the wrong subtree.🔧 Suggested fix
const parsed = CalibrationConfigSchema.parse(config); const { file, fileKey, nodeId } = await loadFile(parsed.input, parsed.token); + if (parsed.targetNodeId && nodeId && parsed.targetNodeId !== nodeId) { + throw new Error("targetNodeId conflicts with the node id encoded in input"); + } - const analyzeOptions = nodeId ? { targetNodeId: nodeId } : {}; + const targetNodeId = parsed.targetNodeId ?? nodeId; + const analyzeOptions = targetNodeId ? { targetNodeId } : {}; const analysisResult = analyzeFile(file, analyzeOptions);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agents/orchestrator.ts` around lines 174 - 185, runCalibrationAnalyze builds analyzeOptions from the nodeId returned by loadFile but ignores a caller-supplied parsed.targetNodeId; update the logic in runCalibrationAnalyze to honor parsed.targetNodeId (e.g. use parsed.targetNodeId if present, otherwise fall back to nodeId from loadFile) when constructing analyzeOptions before calling analyzeFile so the correct subtree is analyzed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/engine/scoring.ts`:
- Around line 75-80: TOTAL_RULES_PER_CATEGORY is hard-coded and can get out of
sync with the actual registered rules; replace the manual map by computing
per-category counts at module initialization from the authoritative rule
registry (e.g., iterate the registered rules/definitions collection used
elsewhere in this module) and assign that result to TOTAL_RULES_PER_CATEGORY (or
a new derived constant) so density/diversity calculations use the real
denominators; update any references in scoring functions (e.g., where
TOTAL_RULES_PER_CATEGORY is used) to use the derived counts so adding/removing
rules no longer requires manual edits.
In `@src/core/rules/behavior/overflow-behavior-unknown.test.ts`:
- Line 1: The import list in the test file unnecessarily includes makeFile;
remove makeFile from the import statement (leaving makeNode and makeContext) to
eliminate the unused import and clean up the module import for the file where
makeNode, makeContext are used.
In `@src/core/rules/behavior/wrap-behavior-unknown.test.ts`:
- Line 1: The import statement includes an unused symbol `makeFile`; update the
import at the top of the test to remove `makeFile` so only the actually used
helpers (`makeNode`, `makeContext`) are imported, keeping the import list
minimal and avoiding the unused `makeFile` identifier.
In `@src/core/rules/component/index.ts`:
- Around line 391-404: The current comparison uses the first element as baseline
(const base = fingerprints[0]) causing order-dependent counts; update the logic
in this block to compute the canonical fingerprint as the most frequent value
(mode) across fingerprints and then compute mismatched as those not equal to
that mode so mismatchCount is order-independent, then return the same object
(ruleId: variantStructureMismatchDef.id, nodeId: node.id, nodePath:
context.path.join(" > "), message using node.name) but with the corrected
mismatchCount/totalVariants values.
In `@src/core/rules/component/variant-structure-mismatch.test.ts`:
- Line 1: The import list at the top includes an unused symbol `makeFile`;
remove `makeFile` from the import statement so only the used helpers
(`makeNode`, `makeContext`) are imported; update the import in the test file
that currently references `makeNode, makeFile, makeContext` to drop `makeFile`
to eliminate the unused-import lint error.
In `@src/core/rules/structure/no-auto-layout.test.ts`:
- Line 1: The import list includes an unused symbol `makeFile` in the module
where `makeNode` and `makeContext` are used; either remove `makeFile` from the
import statement or use it in the tests (e.g., replace a helper call with
`makeFile` where appropriate). Update the import that currently reads `import {
makeNode, makeFile, makeContext }` to drop `makeFile` if it's not needed, or add
a test case that uses `makeFile` so the import is justified.
---
Outside diff comments:
In `@src/agents/orchestrator.ts`:
- Around line 174-185: runCalibrationAnalyze builds analyzeOptions from the
nodeId returned by loadFile but ignores a caller-supplied parsed.targetNodeId;
update the logic in runCalibrationAnalyze to honor parsed.targetNodeId (e.g. use
parsed.targetNodeId if present, otherwise fall back to nodeId from loadFile)
when constructing analyzeOptions before calling analyzeFile so the correct
subtree is analyzed.
In `@src/core/rules/structure/z-index-dependent-layout.test.ts`:
- Around line 4-83: Add two regression tests to cover boundary conditions: one
that constructs two children via makeNode whose overlap area is exactly 20% of
the smaller element and asserts zIndexDependentLayout.check(node, makeContext())
returns null (verifying the implementation's strict > 20% threshold), and
another that creates two children without absoluteBoundingBox properties and
asserts zIndexDependentLayout.check(...) returns null (verifying the
early-continue handling when boxA/boxB are missing); place these alongside the
existing tests referencing zIndexDependentLayout.check, makeNode, and
makeContext so future changes won't regress these cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2cab55c8-10d6-4a21-aae2-592e85b6ebf2
📒 Files selected for processing (47)
.coderabbit.yamlsrc/agents/analysis-agent.test.tssrc/agents/contracts/calibration.tssrc/agents/evidence-collector.test.tssrc/agents/gap-rule-report.test.tssrc/agents/orchestrator.test.tssrc/agents/orchestrator.tssrc/agents/report-generator.test.tssrc/cli/commands/internal/calibrate-run.tssrc/core/contracts/category.tssrc/core/contracts/rule.tssrc/core/engine/integration.test.tssrc/core/engine/rule-engine.test.tssrc/core/engine/scoring.test.tssrc/core/engine/scoring.tssrc/core/report-html/index.tssrc/core/rules/ai-readability/ambiguous-structure.test.tssrc/core/rules/ai-readability/empty-frame.test.tssrc/core/rules/ai-readability/index.tssrc/core/rules/ai-readability/invisible-layer.test.tssrc/core/rules/ai-readability/missing-layout-hint.test.tssrc/core/rules/behavior/index.tssrc/core/rules/behavior/overflow-behavior-unknown.test.tssrc/core/rules/behavior/prototype-link-in-design.test.tssrc/core/rules/behavior/text-truncation-unhandled.test.tssrc/core/rules/behavior/wrap-behavior-unknown.test.tssrc/core/rules/component/index.tssrc/core/rules/component/variant-structure-mismatch.test.tssrc/core/rules/custom/custom-rule-loader.test.tssrc/core/rules/handoff-risk/hardcode-risk.test.tssrc/core/rules/handoff-risk/prototype-link-in-design.test.tssrc/core/rules/index.tssrc/core/rules/layout/absolute-position.test.tssrc/core/rules/layout/deep-nesting.test.tssrc/core/rules/layout/fixed-width-in-responsive-context.test.tssrc/core/rules/layout/group-usage.test.tssrc/core/rules/layout/index.tssrc/core/rules/layout/missing-responsive-behavior.test.tssrc/core/rules/layout/no-auto-layout.test.tssrc/core/rules/naming/index.tssrc/core/rules/rule-config.tssrc/core/rules/structure/index.tssrc/core/rules/structure/no-auto-layout.test.tssrc/core/rules/structure/responsive-fields.test.tssrc/core/rules/structure/unnecessary-node.test.tssrc/core/rules/structure/z-index-dependent-layout.test.tssrc/core/ui-constants.ts
💤 Files with no reviewable changes (15)
- src/cli/commands/internal/calibrate-run.ts
- src/core/rules/layout/missing-responsive-behavior.test.ts
- src/core/rules/layout/group-usage.test.ts
- src/core/rules/handoff-risk/hardcode-risk.test.ts
- src/core/rules/ai-readability/missing-layout-hint.test.ts
- src/core/rules/ai-readability/ambiguous-structure.test.ts
- src/core/rules/handoff-risk/prototype-link-in-design.test.ts
- src/core/rules/ai-readability/empty-frame.test.ts
- src/core/rules/ai-readability/invisible-layer.test.ts
- src/core/rules/layout/deep-nesting.test.ts
- src/core/rules/layout/no-auto-layout.test.ts
- src/core/rules/layout/absolute-position.test.ts
- src/core/rules/layout/fixed-width-in-responsive-context.test.ts
- src/core/rules/ai-readability/index.ts
- src/core/rules/layout/index.ts
| const TOTAL_RULES_PER_CATEGORY: Record<Category, number> = { | ||
| layout: 8, | ||
| structure: 9, | ||
| token: 7, | ||
| component: 3, | ||
| component: 4, | ||
| naming: 5, | ||
| "ai-readability": 5, | ||
| "handoff-risk": 4, | ||
| behavior: 4, |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Avoid hand-maintaining category rule totals.
These literals have to stay in sync with the registered rules. If a rule is added or removed and this map is missed, diversity percentages become silently wrong. Please derive the denominators from the rule definitions/registry at module init instead of updating them by hand.
As per coding guidelines, src/core/engine/scoring.ts: Verify density/diversity weighting logic is mathematically sound.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/engine/scoring.ts` around lines 75 - 80, TOTAL_RULES_PER_CATEGORY is
hard-coded and can get out of sync with the actual registered rules; replace the
manual map by computing per-category counts at module initialization from the
authoritative rule registry (e.g., iterate the registered rules/definitions
collection used elsewhere in this module) and assign that result to
TOTAL_RULES_PER_CATEGORY (or a new derived constant) so density/diversity
calculations use the real denominators; update any references in scoring
functions (e.g., where TOTAL_RULES_PER_CATEGORY is used) to use the derived
counts so adding/removing rules no longer requires manual edits.
| @@ -0,0 +1,89 @@ | |||
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |||
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unused import: makeFile
The makeFile helper is imported but never used in this test file.
🧹 Proposed fix
-import { makeNode, makeFile, makeContext } from "../test-helpers.js";
+import { makeNode, makeContext } from "../test-helpers.js";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |
| import { makeNode, makeContext } from "../test-helpers.js"; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/rules/behavior/overflow-behavior-unknown.test.ts` at line 1, The
import list in the test file unnecessarily includes makeFile; remove makeFile
from the import statement (leaving makeNode and makeContext) to eliminate the
unused import and clean up the module import for the file where makeNode,
makeContext are used.
| @@ -0,0 +1,121 @@ | |||
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |||
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unused import: makeFile
The makeFile helper is imported but never used in this test file.
🧹 Proposed fix
-import { makeNode, makeFile, makeContext } from "../test-helpers.js";
+import { makeNode, makeContext } from "../test-helpers.js";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |
| import { makeNode, makeContext } from "../test-helpers.js"; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/rules/behavior/wrap-behavior-unknown.test.ts` at line 1, The import
statement includes an unused symbol `makeFile`; update the import at the top of
the test to remove `makeFile` so only the actually used helpers (`makeNode`,
`makeContext`) are imported, keeping the import list minimal and avoiding the
unused `makeFile` identifier.
| // Compare all fingerprints to the first one | ||
| const base = fingerprints[0]; | ||
| const mismatched = fingerprints.filter(fp => fp !== base); | ||
|
|
||
| if (mismatched.length === 0) return null; | ||
|
|
||
| const mismatchCount = mismatched.length; | ||
| const totalVariants = fingerprints.length; | ||
|
|
||
| return { | ||
| ruleId: variantStructureMismatchDef.id, | ||
| nodeId: node.id, | ||
| nodePath: context.path.join(" > "), | ||
| message: `"${node.name}" has ${mismatchCount}/${totalVariants} variants with different child structures — AI cannot create a unified component template`, |
There was a problem hiding this comment.
Make the mismatch count independent of variant order.
Line 392 uses the first variant as the baseline. For fingerprints like [A, B, B], this reports 2/3 mismatches if the outlier happens to be first, and 1/3 if it is not. The issue itself is valid, but the reported count becomes order-dependent and can overstate divergence.
🔧 Suggested fix
- // Compare all fingerprints to the first one
- const base = fingerprints[0];
- const mismatched = fingerprints.filter(fp => fp !== base);
-
- if (mismatched.length === 0) return null;
-
- const mismatchCount = mismatched.length;
+ const fingerprintCounts = new Map<string, number>();
+ let dominantCount = 0;
+ for (const fingerprint of fingerprints) {
+ const nextCount = (fingerprintCounts.get(fingerprint) ?? 0) + 1;
+ fingerprintCounts.set(fingerprint, nextCount);
+ if (nextCount > dominantCount) dominantCount = nextCount;
+ }
+
+ const mismatchCount = fingerprints.length - dominantCount;
+ if (mismatchCount === 0) return null;
const totalVariants = fingerprints.length;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/rules/component/index.ts` around lines 391 - 404, The current
comparison uses the first element as baseline (const base = fingerprints[0])
causing order-dependent counts; update the logic in this block to compute the
canonical fingerprint as the most frequent value (mode) across fingerprints and
then compute mismatched as those not equal to that mode so mismatchCount is
order-independent, then return the same object (ruleId:
variantStructureMismatchDef.id, nodeId: node.id, nodePath: context.path.join(" >
"), message using node.name) but with the corrected mismatchCount/totalVariants
values.
| @@ -0,0 +1,95 @@ | |||
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |||
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unused import: makeFile
The makeFile helper is imported but never used in this test file.
🧹 Proposed fix
-import { makeNode, makeFile, makeContext } from "../test-helpers.js";
+import { makeNode, makeContext } from "../test-helpers.js";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |
| import { makeNode, makeContext } from "../test-helpers.js"; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/rules/component/variant-structure-mismatch.test.ts` at line 1, The
import list at the top includes an unused symbol `makeFile`; remove `makeFile`
from the import statement so only the used helpers (`makeNode`, `makeContext`)
are imported; update the import in the test file that currently references
`makeNode, makeFile, makeContext` to drop `makeFile` to eliminate the
unused-import lint error.
| @@ -0,0 +1,140 @@ | |||
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |||
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unused import: makeFile
The makeFile helper is imported but never used in this test file.
🧹 Proposed fix
-import { makeNode, makeFile, makeContext } from "../test-helpers.js";
+import { makeNode, makeContext } from "../test-helpers.js";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import { makeNode, makeFile, makeContext } from "../test-helpers.js"; | |
| import { makeNode, makeContext } from "../test-helpers.js"; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/rules/structure/no-auto-layout.test.ts` at line 1, The import list
includes an unused symbol `makeFile` in the module where `makeNode` and
`makeContext` are used; either remove `makeFile` from the import statement or
use it in the tests (e.g., replace a helper call with `makeFile` where
appropriate). Update the import that currently reads `import { makeNode,
makeFile, makeContext }` to drop `makeFile` if it's not needed, or add a test
case that uses `makeFile` so the import is justified.
Summary
outputPathfrom call torunCalibrationAnalyze()CalibrationConfigInput(input type) so callers can omit fields with schema defaultsDepends on #81
Closes #63
Test plan
pnpm lintpassespnpm test:run590/590 passes🤖 Generated with Claude Code
Summary by CodeRabbit
Refactor
New Features