feat: responsive viewport comparison in calibration (#182) #183
Conversation
Add a --figma-screenshot option to the visual-compare CLI for local screenshot comparison without an API fetch. The converter now runs a 2-pass comparison: original viewport plus expanded viewport from fixture screenshots. responsiveDelta feeds into evaluation to validate responsive-critical rule scores with measured pixel data instead of AI opinion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
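The responsiveDelta mentioned here is simply the drop in similarity between the original-viewport pass and the expanded-viewport pass. A minimal sketch (the function name is illustrative, not the repository's actual helper):

```typescript
// Sketch of the 2-pass delta: a positive delta means the converted HTML
// degrades when the viewport expands beyond the original design size.
function computeResponsiveDelta(
  similarity: number,
  responsiveSimilarity: number,
): number {
  return similarity - responsiveSimilarity;
}

// With the example values from this PR's conversion.json (87 vs 72),
// the delta is 15.
console.log(computeResponsiveDelta(87, 72));
```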
Caution: Review failed. Pull request was closed or merged during review.

📝 Walkthrough

Adds an optional responsive-viewport comparison and HTML/CSS code-metric collection to the converter; the orchestrator forwards a numeric responsiveDelta to the evaluation agent.

Changes
Sequence Diagrams

```mermaid
sequenceDiagram
    participant Converter
    participant Orchestrator
    participant EvaluationAgent
    participant RuleClassifier
    Converter->>Orchestrator: emit conversionJson (includes responsiveDelta, code metrics)
    Orchestrator->>EvaluationAgent: runEvaluationAgent(input + responsiveDelta)
    EvaluationAgent->>EvaluationAgent: if responsiveDelta != null\ncompute responsiveDifficulty
    EvaluationAgent->>RuleClassifier: classifyFlaggedRule(score, responsiveDifficulty)
    RuleClassifier-->>EvaluationAgent: updated type & actualDifficulty
    EvaluationAgent->>EvaluationAgent: update reasoning & validatedRuleSet
    EvaluationAgent-->>Orchestrator: return updated mismatches & report
```
```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant VisualCompare
    participant FileSystem
    User->>CLI: visual-compare --figma-screenshot <path>
    CLI->>VisualCompare: visualCompare({figmaScreenshotPath: path, ...})
    VisualCompare->>FileSystem: stat/validate path
    FileSystem-->>VisualCompare: file metadata (width, exists)
    VisualCompare->>VisualCompare: inferExportScale(png.width)
    VisualCompare->>FileSystem: copy PNG -> outputDir/figma.png
    VisualCompare->>VisualCompare: run comparison with inferred scale
    VisualCompare-->>CLI: comparison results (similarity, responsiveSimilarity, metrics)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
- Delete rule from validatedRuleSet when override changes type away from validated - Clamp negative responsiveDelta to 0 (treat as easy) - Add 4 tests: override to underscored, keep validated, skip non-responsive, negative delta Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
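The clamping behavior in this commit can be sketched as below. The 5/15 cutoffs are illustrative assumptions, not the repository's actual thresholds; only the clamp-negative-to-easy behavior comes from the commit message:

```typescript
type Difficulty = "easy" | "moderate" | "hard";

// Negative responsiveDelta (expanded viewport scored *higher* than the
// original) is clamped to 0 and treated as easy, per the commit above.
// The 5/15 cutoffs are hypothetical.
function classifyResponsiveDifficulty(responsiveDelta: number): Difficulty {
  const delta = Math.max(0, responsiveDelta);
  if (delta < 5) return "easy";
  if (delta < 15) return "moderate";
  return "hard";
}
```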
Reuse ablation helpers' KNOWN_1X_WIDTHS convention: 1920/768 = @1x, others = @2x. When --figma-screenshot is used, inferExportScale() reads PNG width and sets the correct scale automatically. Previously defaulted to @2x for all local screenshots, causing dimension mismatch for 1920px/768px fixture screenshots. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
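The width-based scale inference described above reduces to a small lookup. This sketch mirrors the convention stated in the commit (1920/768 = @1x, everything else = @2x); the exact signature in `visual-compare-helpers.ts` may differ:

```typescript
// Fixture convention reused from the ablation helpers: 1920px and 768px
// screenshots are @1x exports; all other widths are assumed @2x.
const KNOWN_1X_WIDTHS = [1920, 768];

function inferExportScale(pngWidth: number): 1 | 2 {
  return KNOWN_1X_WIDTHS.includes(pngWidth) ? 1 : 2;
}
```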
- Add computeCodeMetrics() to visual-compare-helpers.ts (htmlBytes, htmlLines, cssClassCount, cssVariableCount) - Converter records metrics in conversion.json after generating output.html - De-duplicate: ablation helpers.ts re-exports from core instead of own copy - Metrics proxy for token cost — subagents can't measure tokens directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
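A plausible shape for the `computeCodeMetrics()` helper this commit describes. The regexes for classes and custom properties are assumptions about how counting is done, not the actual implementation:

```typescript
interface CodeMetrics {
  htmlBytes: number;
  htmlLines: number;
  cssClassCount: number;
  cssVariableCount: number;
}

function computeCodeMetrics(html: string): CodeMetrics {
  // Byte size and line count of the generated HTML.
  const htmlBytes = new TextEncoder().encode(html).length;
  const htmlLines = html.split("\n").length;

  // Pull the contents of all <style> blocks.
  const css = Array.from(
    html.matchAll(/<style[^>]*>([\s\S]*?)<\/style>/g),
    (m) => m[1],
  ).join("\n");

  // Count unique class selector names and unique custom-property names.
  const cssClassCount = new Set(
    Array.from(css.matchAll(/\.([a-zA-Z_][\w-]*)/g), (m) => m[1]),
  ).size;
  const cssVariableCount = new Set(
    Array.from(css.matchAll(/--[a-zA-Z0-9_-]+(?=\s*:)/g), (m) => m[0]),
  ).size;

  return { htmlBytes, htmlLines, cssClassCount, cssVariableCount };
}
```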
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/agents/calibration/converter.md:
- Around line 63-77: The ordered list item labelled "5. **Responsive
comparison**" has incorrect/inconsistent numbering; update the list numbering
to continue the sequence used earlier (e.g., ensure it follows the prior item
number) and adjust the markdown so there is a blank line before and after the
fenced code block to satisfy MD031; locate the "Responsive comparison" section
and the fenced code block that starts with "```bash" and correct the list
numeral and surrounding blank lines accordingly.
In `@src/agents/evaluation-agent.ts`:
- Around line 172-193: The code uses RULE_ID_CATEGORY[mismatch.ruleId as RuleId]
without guarding for an unknown key; change the block in evaluate/adjustment so
you first ensure mismatch.ruleId is present and that
RULE_ID_CATEGORY[mismatch.ruleId] is !== undefined before comparing to
"responsive-critical" (remove the unsafe `as RuleId` cast), e.g. compute const
category = RULE_ID_CATEGORY[mismatch.ruleId]; if (!mismatch.ruleId || category
=== undefined) continue; then proceed with classifyFlaggedRule, setting
mismatch.type, actualDifficulty, reasoning and updating validatedRuleSet as
before.
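The guarded lookup this comment asks for could look like the sketch below. The `RULE_ID_CATEGORY` entries shown are assumptions based on rule IDs mentioned elsewhere in this PR; the real map lives in the codebase:

```typescript
// Hypothetical slice of the rule-category map used by the evaluation agent.
const RULE_ID_CATEGORY: Record<string, string | undefined> = {
  "fixed-size-in-auto-layout": "responsive-critical",
  "missing-size-constraint": "responsive-critical",
};

// Guard unknown rule IDs before indexing, instead of an unsafe `as RuleId`
// cast: a missing ruleId or an unknown key is simply not responsive-critical.
function isResponsiveCritical(ruleId: string | undefined): boolean {
  if (!ruleId) return false;
  const category = RULE_ID_CATEGORY[ruleId];
  return category !== undefined && category === "responsive-critical";
}
```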
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5eb89a93-5ef4-4fde-a67b-6fea4343f405
📒 Files selected for processing (8)
- .claude/agents/calibration/converter.md
- src/agents/contracts/evaluation-agent.ts
- src/agents/evaluation-agent.test.ts
- src/agents/evaluation-agent.ts
- src/agents/orchestrator.ts
- src/cli/commands/visual-compare.ts
- src/core/engine/visual-compare-helpers.ts
- src/core/engine/visual-compare.ts
Orchestrator records converterTokens, converterToolUses, converterDurationMs from subagent usage metadata. Direct measurement — no API proxy needed. Code metrics (htmlBytes, cssClassCount etc.) remain for output quality, token usage now measured from subagent execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix ordered list numbering in converter.md (MD031/MD029) - Guard unknown ruleId before RULE_ID_CATEGORY access (noUncheckedIndexedAccess) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/agents/ablation/helpers.ts (1)
154-156: 🧹 Nitpick | 🔵 Trivial

Duplicate `KNOWN_1X_WIDTHS` constant — import from shared helpers instead.

This constant duplicates `KNOWN_1X_WIDTHS` from `src/core/engine/visual-compare-helpers.ts` (line 19). Since you're already importing from that module, consider importing this constant as well to maintain a single source of truth.

♻️ Proposed fix

```diff
-import { compareScreenshots } from "../../core/engine/visual-compare-helpers.js";
+import { compareScreenshots, KNOWN_1X_WIDTHS } from "../../core/engine/visual-compare-helpers.js";
```

Then remove lines 155-156:

```diff
 // Figma save-fixture exports at `@2x` by default. 1920/768 condition screenshots are `@1x`.
 // Detect: if width matches a known `@1x` size (1920, 768), use scale 1. Otherwise `@2x`.
-const KNOWN_1X_WIDTHS = [1920, 768];
 const exportScale = KNOWN_1X_WIDTHS.includes(figmaWidth) ? 1 : 2;
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agents/ablation/helpers.ts` around lines 154-156, remove the duplicate KNOWN_1X_WIDTHS constant in src/agents/ablation/helpers.ts and import the shared KNOWN_1X_WIDTHS from src/core/engine/visual-compare-helpers.ts instead; update the code that computes exportScale (the line using KNOWN_1X_WIDTHS.includes(figmaWidth) ? 1 : 2) to reference the imported constant and delete the local declaration so the module uses the single source of truth.

.claude/agents/calibration/converter.md (1)
81-90: ⚠️ Potential issue | 🟡 Minor

Duplicate step numbering: two steps labeled "6".

Lines 81 and 90 are both numbered "6". The second should be "7" to maintain sequential ordering.

📝 Suggested fix

```diff
-6. **MANDATORY — Rule Impact Assessment**: For EVERY rule ID in `nodeIssueSummaries[].flaggedRuleIds`, assess its actual impact on conversion.
+7. **MANDATORY — Rule Impact Assessment**: For EVERY rule ID in `nodeIssueSummaries[].flaggedRuleIds`, assess its actual impact on conversion.
```

Also update the subsequent steps:
- Line 94: "7." → "8."
- Line 99: "8." → "9."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/agents/calibration/converter.md around lines 81 - 90, Fix the duplicate step numbering in the calibration converter docs by renumbering the second "6." (the Rule Impact Assessment step that references nodeIssueSummaries[].flaggedRuleIds and ruleImpactAssessment) to "7.", then increment the following steps accordingly (change the current "7." to "8." and "8." to "9.") so the sequence is consecutive.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In @.claude/agents/calibration/converter.md:
- Around line 81-90: Fix the duplicate step numbering in the calibration
converter docs by renumbering the second "6." (the Rule Impact Assessment step
that references nodeIssueSummaries[].flaggedRuleIds and ruleImpactAssessment) to
"7.", then increment the following steps accordingly (change the current "7." to
"8." and "8." to "9.") so the sequence is consecutive.
In `@src/agents/ablation/helpers.ts`:
- Around line 154-156: Remove the duplicate KNOWN_1X_WIDTHS constant in
src/agents/ablation/helpers.ts and import the shared KNOWN_1X_WIDTHS from
src/core/engine/visual-compare-helpers.ts instead; update the code that computes
exportScale (the line using KNOWN_1X_WIDTHS.includes(figmaWidth) ? 1 : 2) to
reference the imported constant and delete the local declaration so the module
uses the single source of truth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fae58546-060d-4004-a18a-5a100762f686
📒 Files selected for processing (5)
- .claude/agents/calibration/converter.md
- .claude/commands/calibrate-loop.md
- src/agents/ablation/helpers.ts
- src/agents/evaluation-agent.ts
- src/core/engine/visual-compare-helpers.ts
- ablation helpers.ts: import inferExportScale() from core instead of local constant - converter.md: fix duplicate step 6 → sequential 7/8/9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 4
♻️ Duplicate comments (1)
.claude/agents/calibration/converter.md (1)
63-63: ⚠️ Potential issue | 🟡 Minor

List numbering still incorrect after previous fix attempt.
Static analysis continues to flag this line as incorrectly numbered (expected 6, actual 5). The past review comment indicated this was addressed, but the issue persists.
Root cause: Line 56 duplicates step number 3, causing all subsequent steps to be off by one. While line 56 is not part of this PR's changes, the new responsive comparison step should be numbered correctly as step 6 to reflect its actual position in the sequence.
🔢 Numbering fix
```diff
-5. **Responsive comparison** (if expanded screenshot exists):
+6. **Responsive comparison** (if expanded screenshot exists):
```

Also fix subsequent steps: 6→7, 7→8, 8→9, 9→10, or address the root cause by fixing the duplicate step 3 at line 56.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/agents/calibration/converter.md at line 63, The numbered list for "Responsive comparison" is currently labeled 5 but should be 6; update the list item text "5. **Responsive comparison** (if expanded screenshot exists):" to "6. **Responsive comparison** (if expanded screenshot exists):" and then renumber the following steps (change 6→7, 7→8, 8→9, 9→10) OR remove the duplicate "3." entry at the earlier step (the duplicate on line 56) so the entire sequence shifts correctly; ensure the label text for the item containing "Responsive comparison" and all subsequent step labels are consistent with the corrected sequence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/agents/calibration/converter.md:
- Around line 79-80: The instructions currently say "set both to null" when only
1 screenshot exists but the schema defines three responsive fields; update the
converter text to explicitly set responsiveSimilarity, responsiveDelta, and
responsiveViewport to null when fewer than two screenshots exist (i.e., skip
responsive comparison and assign null to all three fields).
- Line 98: The description for cssVariableCount uses the ambiguous pattern
`--*:`; update it to an explicit pattern or example so readers know what to
count — e.g., change `--*:` to a concrete regex `--[a-zA-Z0-9_-]+:` (or show an
example like `--primary-color:`) and mention that cssVariableCount should count
unique occurrences of custom properties defined in a <style> block using that
pattern.
- Around line 65-77: The doc is missing explicit steps to pick the largest
screenshot, extract its width, and read the similarity result; update the
instructions to: list and sort screenshot-*.png into an array (e.g.,
SCREENSHOTS), set LARGEST to the last element of that sorted list, extract the
numeric width from LARGEST into LARGEST_WIDTH (e.g., by regex matching the
screenshot-<number>.png pattern), pass "$LARGEST" and "$LARGEST_WIDTH" to the
existing npx canicode visual-compare invocation, and after the run read
responsiveSimilarity from the comparison output (e.g., parse
$RUN_DIR/responsive/comparison.json and read the similarity field into
RESPONSIVE_SIMILARITY). Include these exact symbol names (SCREENSHOTS, LARGEST,
LARGEST_WIDTH, RESPONSIVE_SIMILARITY) and reference the screenshot-*.png pattern
and $RUN_DIR/responsive/comparison.json so reviewers can locate the changes.
- Around line 94-99: The documented code-metrics fields (htmlBytes, htmlLines,
cssClassCount, cssVariableCount, responsiveViewport, responsiveSimilarity) in
step 8 and the conversion.json schema are unused downstream (downstream agents
only read ruleImpactAssessment, uncoveredStruggles, responsiveDelta); either
remove these six fields from the converter.md step 8 and the conversion.json
schema so the converter agent stops collecting them, or update the evaluation
pipeline/downstream extraction to consume them (i.e., add mappings from
htmlBytes/htmlLines/cssClassCount/cssVariableCount/responsiveViewport/responsiveSimilarity
into the downstream outputs or ruleImpactAssessment/responsiveDelta), and ensure
conversion.json, step 8 text, and any schema references (conversion.json,
ruleImpactAssessment, uncoveredStruggles, responsiveDelta) are kept consistent.
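The screenshot-selection steps requested above (sort `screenshot-*.png`, take the largest, extract its width) can be sketched in TypeScript. The function name and return shape are illustrative, not the repository's actual helper:

```typescript
// Pick the widest fixture screenshot from a directory listing by parsing
// the numeric width out of the screenshot-<width>.png filename pattern.
function pickLargestScreenshot(
  files: string[],
): { largest: string; width: number } | null {
  const candidates: { file: string; width: number }[] = [];
  for (const file of files) {
    const m = /screenshot-(\d+)\.png$/.exec(file);
    if (m) candidates.push({ file, width: Number(m[1]) });
  }
  // Numeric sort, smallest first; the last entry is the expanded viewport.
  candidates.sort((a, b) => a.width - b.width);
  const last = candidates[candidates.length - 1];
  return last ? { largest: last.file, width: last.width } : null;
}
```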
---
Duplicate comments:
In @.claude/agents/calibration/converter.md:
- Line 63: The numbered list for "Responsive comparison" is currently labeled 5
but should be 6; update the list item text "5. **Responsive comparison** (if
expanded screenshot exists):" to "6. **Responsive comparison** (if expanded
screenshot exists):" and then renumber the following steps (change 6→7, 7→8,
8→9, 9→10) OR remove the duplicate "3." entry at the earlier step (the duplicate
on line 56) so the entire sequence shifts correctly; ensure the label text for
the item containing "Responsive comparison" and all subsequent step labels are
consistent with the corrected sequence.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 186753e8-3a2c-4501-9542-a9c1bbb41336
📒 Files selected for processing (2)
- .claude/agents/calibration/converter.md
- src/agents/ablation/helpers.ts
> 8. **Code metrics**: After writing `output.html`, record these in conversion.json:
>    - `htmlBytes`: file size in bytes
>    - `htmlLines`: line count
>    - `cssClassCount`: unique CSS class selectors in `<style>` block
>    - `cssVariableCount`: unique CSS custom properties (`--*:`) in `<style>` block
> 9. Note any difficulties NOT covered by existing rules as `uncoveredStruggles`
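The ambiguous `--*:` pattern quoted above can be made concrete with a regex like the one below. This is a sketch of the suggested fix, not code from the repository:

```typescript
// Matches a custom-property name (e.g. --primary-color) only when it is
// being declared, i.e. immediately followed by a colon. Usages such as
// var(--primary-color) are not matched.
const CSS_VARIABLE_PATTERN = /--[a-zA-Z0-9_-]+(?=\s*:)/g;

function countCssVariables(css: string): number {
  // Count unique declarations, so re-declarations aren't double-counted.
  return new Set(css.match(CSS_VARIABLE_PATTERN) ?? []).size;
}
```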
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify that code metrics and responsive fields (except responsiveDelta) are not consumed
rg -n --type=ts -C3 'htmlBytes|htmlLines|cssClassCount|cssVariableCount|responsiveSimilarity|responsiveViewport' \
  -g '!*.test.ts' \
  -g '!*.md'
```

Repository: let-sunny/canicode
Length of output: 5457

🏁 Script executed:

```bash
# Check orchestrator.ts at the referenced lines to see what fields are extracted
find . -name "orchestrator.ts" -type f | head -5
```

Repository: let-sunny/canicode
Length of output: 90

🏁 Script executed:

```bash
# Also search for calibration converter implementation and how it's used
rg -l "calibration.*converter" --type=ts -g '!*.test.ts'
```

Repository: let-sunny/canicode
Length of output: 44

🏁 Script executed:

```bash
# Check what the converter agent actually does with these metrics
rg -n "conversion\.json" --type=ts -C2 -g '!*.test.ts' | head -50
```

Repository: let-sunny/canicode
Length of output: 2245

🏁 Script executed:

```bash
# Check orchestrator.ts at the referenced lines
sed -n '300,330p' src/agents/orchestrator.ts
```

Repository: let-sunny/canicode
Length of output: 866

🏁 Script executed:

```bash
# Check calibrate-debate.ts to see what fields it extracts from conversion.json
sed -n '28,80p' src/cli/commands/internal/calibrate-debate.ts
```

Repository: let-sunny/canicode
Length of output: 2013

🏁 Script executed:

```bash
# Also search for where these metrics (htmlBytes, htmlLines, etc.) are used after being read
rg -n "ruleImpactAssessment|uncoveredStruggles|conversion" src/cli/commands/internal/calibrate-debate.ts -A3 -B3
```

Repository: let-sunny/canicode
Length of output: 1700

🏁 Script executed:

```bash
# Read the converter.md file mentioned in the review
cat .claude/agents/calibration/converter.md | head -150
```

Repository: let-sunny/canicode
Length of output: 6743

🏁 Script executed:

```bash
# Search for actual converter agent implementation or execution
rg -l "converter" src/agents --type=ts -g '!*.test.ts' | head -10
```

Repository: let-sunny/canicode
Length of output: 94

🏁 Script executed:

```bash
# Look for where conversion.json is written to understand the schema
rg -n "conversion.json" src --type=ts -B5 -A5 | grep -A20 "writeFileSync.*conversion"
```

Repository: let-sunny/canicode
Length of output: 1761

🏁 Script executed:

```bash
# Verify responsiveSimilarity and responsiveViewport are extracted
rg -n "responsiveSimilarity|responsiveViewport" src/agents/orchestrator.ts -B2 -A2
```

Repository: let-sunny/canicode
Length of output: 44

🏁 Script executed:

```bash
# Double-check evaluation-agent contract for these fields
grep -A30 "responsiveSimilarity\|responsiveViewport" src/agents/contracts/evaluation-agent.ts
```

Repository: let-sunny/canicode
Length of output: 451

🏁 Script executed:

```bash
# Check for all responsive-related field usage in orchestrator
rg -n "responsive" src/agents/orchestrator.ts -i
```

Repository: let-sunny/canicode
Length of output: 281

🏁 Script executed:

```bash
# Check if responsiveViewport appears anywhere in codebase
rg -n "responsiveViewport" src --type=ts
```

Repository: let-sunny/canicode
Length of output: 44

🏁 Script executed:

```bash
# Check the full conversion.json schema comment in converter.md to see all fields
sed -n '114,155p' .claude/agents/calibration/converter.md
```

Repository: let-sunny/canicode
Length of output: 1380
Remove or use code metrics fields documented in step 8.
The instructions direct the converter agent to collect htmlBytes, htmlLines, cssClassCount, cssVariableCount, responsiveViewport, and responsiveSimilarity (lines 94-99 and the conversion.json schema). However, downstream agents extract only ruleImpactAssessment, uncoveredStruggles, and responsiveDelta from conversion.json. The six metrics above are never consumed and create unnecessary effort for the agent.
Either remove these fields from the documentation or integrate them into the evaluation pipeline if they serve a future purpose.
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 94-94: Ordered list item prefix
Expected: 9; Actual: 8; Style: 1/2/3
(MD029, ol-prefix)
[warning] 99-99: Ordered list item prefix
Expected: 10; Actual: 9; Style: 1/2/3
(MD029, ol-prefix)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/agents/calibration/converter.md around lines 94 - 99, The documented
code-metrics fields (htmlBytes, htmlLines, cssClassCount, cssVariableCount,
responsiveViewport, responsiveSimilarity) in step 8 and the conversion.json
schema are unused downstream (downstream agents only read ruleImpactAssessment,
uncoveredStruggles, responsiveDelta); either remove these six fields from the
converter.md step 8 and the conversion.json schema so the converter agent stops
collecting them, or update the evaluation pipeline/downstream extraction to
consume them (i.e., add mappings from
htmlBytes/htmlLines/cssClassCount/cssVariableCount/responsiveViewport/responsiveSimilarity
into the downstream outputs or ruleImpactAssessment/responsiveDelta), and ensure
conversion.json, step 8 text, and any schema references (conversion.json,
ruleImpactAssessment, uncoveredStruggles, responsiveDelta) are kept consistent.
…s, css pattern - Fix step numbering: 1-10 sequential (was duplicate 3, duplicate 6) - Concrete bash: array sort + grep for screenshot width extraction - "set both to null" → explicit 3 fields (responsiveSimilarity, responsiveDelta, responsiveViewport) - Code metrics annotated as "for analysis/reporting — not consumed by evaluation" - CSS variable pattern: --*: → example (--primary-color:, --spacing-md:) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Add responsive viewport comparison to the calibration pipeline. Scores for responsive-critical rules are validated with pixel-level measurements instead of AI opinion.
Problem
The existing pipeline ran a single comparison at the design's original size. Ablation experiments showed that scores for responsive-critical rules such as `fixed-size-in-auto-layout` and `missing-size-constraint` were judged validated/overscored based on AI opinion alone, with no measurement.

Changes
1. `visual-compare` CLI — add `--figma-screenshot` option
   - Combined with `--figma-url`: the URL is used for parsing, the screenshot comes from the local file
   - Using `--figma-url` alone is unchanged (backward compatible)
2. `visual-compare` core — `figmaScreenshotPath` option
   - Add a `figmaScreenshotPath` field to `VisualCompareOptions`
3. Converter prompt — 2-pass comparison
   - Glob `screenshot-*.png` → numeric sort → smallest = original, largest = expanded
   - Record `similarity`, `responsiveSimilarity`, and `responsiveDelta = similarity - responsiveSimilarity`
4. `conversion.json` — add responsive fields
   - `{ "similarity": 87, "responsiveSimilarity": 72, "responsiveDelta": 15, "responsiveViewport": 1920, ... }`
5. Evaluation — difficulty decided from responsiveDelta
   - Difficulty is judged from `responsiveDelta`; `type` and `reasoning` are overwritten with measured values, and the original AI opinion is kept on record
6. Orchestrator — forward responsiveDelta
   - Extract `responsiveDelta` from `conversion.json` → pass it to the evaluation input

Architecture
Test plan

- `pnpm lint` — pass
- `pnpm test:run` — 646 tests pass

Closes #182
🤖 Generated with Claude Code