feat: weight diversity scoring by rule severity#111
🚥 Pre-merge checks: ✅ 4 passed
let-sunny left a comment
Code Review

Overall, the problem definition is clear and the chosen approach is sound. A few pieces of feedback below.

1. Hardcoded TOTAL_SCORE_PER_CATEGORY → sync risk

Whenever a score in rule-config.ts changes or a rule is added or removed, this constant has to be updated by hand. The comment says "Must be updated", but in practice that is easy to forget. Consider computing it dynamically from rule-config.ts.
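A minimal sketch of that dynamic computation, assuming the RULE_CONFIGS / RULE_ID_CATEGORY shapes this PR works with (the inline rule data below is illustrative, not the real config):

```typescript
// Hypothetical stand-ins for the real rule-config.ts exports.
type Category = "structure" | "naming";

const RULE_ID_CATEGORY: Record<string, Category> = {
  "no-auto-layout": "structure",
  "unnecessary-node": "structure",
  "bad-name": "naming",
};

const RULE_CONFIGS: Record<string, { score: number; enabled: boolean }> = {
  "no-auto-layout": { score: -10, enabled: true },
  "unnecessary-node": { score: -2, enabled: true },
  "bad-name": { score: -5, enabled: true },
};

// Sum |score| per category once at module load — this stays in sync
// automatically when rules are added/removed or scores change.
const TOTAL_SCORE_PER_CATEGORY: Record<Category, number> = (() => {
  const totals: Record<Category, number> = { structure: 0, naming: 0 };
  for (const [id, config] of Object.entries(RULE_CONFIGS)) {
    if (config.enabled) {
      totals[RULE_ID_CATEGORY[id]] += Math.abs(config.score);
    }
  }
  return totals;
})();
```

With the illustrative data above, this yields structure = 12 and naming = 5; the real module would derive the totals from its own exports instead.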
2. Asymmetric score usage between density and diversity

- density: uses issue.calculatedScore (= base score × depthWeight) → reflects position (depth)
- diversity: uses issue.config.score (= base score only) → ignores position

This looks like a deliberate design choice (diversity measures "what kinds of problems exist", so depth is irrelevant), but documenting the rationale for the asymmetry in a code comment would reduce future confusion.
3. Edge case from the changed diversity semantics

Before: "how many different kinds of problems exist" (breadth)
After: "severity-weighted breadth"

Example: 3 suggestion rules trigger (scores -2, -1, -1):
- Before: 3/9 = 33% ratio → diversity 67%
- After: 4/42 = 9.5% ratio → diversity 90%

When many low-severity rules trigger, diversity can come out disproportionately high. Please confirm this is the intended behavior.
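The arithmetic behind those two ratios, sketched out (the 9-rule count and 42-point structure total are the figures quoted in this review):

```typescript
// Three low-severity suggestion rules trigger in "structure".
const triggeredScores = [-2, -1, -1];

// Old equal-count diversity: each triggered rule counts as 1/N.
const totalRuleCount = 9;
const oldRatio = triggeredScores.length / totalRuleCount; // 3/9 ≈ 0.333
const oldDiversity = Math.round((1 - oldRatio) * 100);    // 67

// New severity-weighted diversity: each rule counts as |score|/Σ|score|.
const totalScore = 42;
const weightedTriggered = triggeredScores.reduce(
  (sum, s) => sum + Math.abs(s),
  0
); // 4
const newRatio = weightedTriggered / totalScore;          // 4/42 ≈ 0.095
const newDiversity = Math.round((1 - newRatio) * 100);    // 90
```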
4. Re-evaluation condition from issue #106 not yet addressed

The body of issue #106 says:

Re-evaluate after #104 whether this change is still needed, or whether density alone provides sufficient differentiation.

Since #104 has already been merged as #110, it would be good to include in the PR description the result of re-evaluating whether density alone reflects severity differences sufficiently.
5. Tests & correctness

- The new test cases cover the key scenarios well ✓
- Cross-checked the TOTAL_SCORE_PER_CATEGORY values against rule-config.ts — all correct ✓
Review Feedback Addressed (5b4c80b)

1. ✅ Dynamic TOTAL_SCORE_PER_CATEGORY computation
2. ✅ Code comment on the density vs diversity asymmetry
3. ✅ Low-severity edge case
4. ✅ Re-evaluation result after #104
5. Tests & correctness
✅ LGTM — all five review feedback items confirmed addressed. Ready to merge.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/engine/scoring.test.ts`:
- Around line 89-117: Test currently uses makeIssue where opts.score populates
both issue.config.score and issue.calculatedScore, so it doesn't prove density
uses calculatedScore; change the fixture to create an issue whose config.score
and calculatedScore differ and assert density/weightedIssueCount follows
calculatedScore. Locate the tests using calculateScores, makeResult and
makeIssue (the two "uses calculatedScore..." and "differentiates rules..."
cases) and either call makeIssue with an explicit calculatedScore different from
config.score (or mutate the produced issue to set issue.calculatedScore) before
passing to makeResult, then keep the existing expectations on
byCategory.structure.densityScore and weightedIssueCount to ensure they reflect
calculatedScore rather than config.score.
In `@src/core/engine/scoring.ts`:
- Around line 69-82: TOTAL_SCORE_PER_CATEGORY is computed from the static
RULE_CONFIGS while weightedTriggered/issue.config.score come from the
preset-adjusted map returned by getConfigsWithPreset, causing mismatched
numerator/denominator; change the code to compute totals from the same preset
config map used to produce result.issues (i.e., derive totals from the config
object returned by getConfigsWithPreset instead of RULE_CONFIGS). Specifically,
when you call getConfigsWithPreset (or wherever result.issues' issue.config is
created), pass that resulting config map into the TOTAL_SCORE_PER_CATEGORY
calculation (or replace the current IIFE with a function that accepts the preset
config map) so totals are computed by iterating over that preset-adjusted config
map (use the same rule ids and RULE_ID_CATEGORY mapping) to sum
Math.abs(config.score) per category; ensure any other places noted (lines
~214-218) also use the preset config totals rather than the base RULE_CONFIGS.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 10a9a73c-a096-450d-bd7a-c8c6829b65ec
📒 Files selected for processing (3)
- src/core/engine/scoring.test.ts
- src/core/engine/scoring.ts
- src/core/rules/rule-config.ts
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes applied successfully — fixed 4 file(s) based on 2 unresolved review comments.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/core/engine/scoring.test.ts (1)
32-46: 🧹 Nitpick | 🔵 Trivial — Consider adding a calculatedScore parameter to the makeIssue helper.

The tests at lines 91-95 and 110-114 manually override calculatedScore after calling makeIssue. Adding an optional parameter would be cleaner.

♻️ Suggested helper improvement

```diff
 function makeIssue(opts: {
   ruleId: string;
   category: Category;
   severity: Severity;
   score?: number;
+  calculatedScore?: number;
 }): AnalysisIssue {
   return {
     violation: makeViolation(opts.ruleId),
     rule: makeRule({ id: opts.ruleId, category: opts.category }),
     config: makeConfig(opts.severity, opts.score ?? -5),
     depth: 0,
     maxDepth: 5,
-    calculatedScore: opts.score ?? -5,
+    calculatedScore: opts.calculatedScore ?? opts.score ?? -5,
   };
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/engine/scoring.test.ts` around lines 32 - 46, The helper makeIssue currently sets calculatedScore from opts.score and tests override it manually; add an optional calculatedScore?: number field to the makeIssue opts and use it to set the returned object's calculatedScore (falling back to opts.score ?? -5) so tests can pass a specific calculatedScore directly; update the function signature and the returned object (references: makeIssue, makeRule, makeConfig, calculatedScore) and update callers in the tests to pass the new arg instead of mutating the result.
♻️ Duplicate comments (1)
src/core/engine/scoring.ts (1)
229-236: ⚠️ Potential issue | 🟡 Minor — Potential division by zero if a category has no enabled rules.

If totalScorePerCategory[category] is 0 (e.g., all rules in a category are disabled via preset), line 234 will produce Infinity or NaN.

🛡️ Proposed fix to guard against zero denominator

```diff
 let diversityScore = 100;
 if (catScore.issueCount > 0) {
   const ruleScores = ruleScorePerCategory.get(category)!;
   const weightedTriggered = Array.from(ruleScores.values()).reduce((sum, s) => sum + s, 0);
   const weightedTotal = totalScorePerCategory[category];
-  const diversityRatio = weightedTriggered / weightedTotal;
+  const diversityRatio = weightedTotal > 0 ? weightedTriggered / weightedTotal : 0;
   diversityScore = clamp(Math.round((1 - diversityRatio) * 100), 0, 100);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/engine/scoring.ts` around lines 229 - 236, The division can produce Infinity/NaN when totalScorePerCategory[category] is zero; in the block that computes diversityScore (using catScore.issueCount, ruleScorePerCategory, totalScorePerCategory, and clamp) guard against a zero/negative weightedTotal by checking if weightedTotal <= 0 and in that case leave diversityScore at its default (100) or treat diversityRatio as 0 before computing clamp; ensure the check is added before computing diversityRatio so no division occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/engine/scoring.test.ts`:
- Around line 179-195: The comment and thresholds rely on a hardcoded total
score "42" which can go stale; update the test in scoring.test.ts to compute the
total dynamically from the current RULE_CONFIGS (or derive it via existing
helpers) instead of embedding 42 in the comment and expectations, e.g., use
RULE_CONFIGS sum-of-absolute-scores (or call the same scoring helpers used by
calculateScores/makeIssue/makeResult) to compute the ratio explanation and keep
the assertions (expect(...diversityScore).toBeGreaterThan(90) etc.) valid
against changing rule weights; adjust the inline comment to reference the
computed total rather than the literal "42".
In `@src/core/engine/scoring.ts`:
- Around line 189-194: The fallback to RULE_CONFIGS when configs is falsy can
cause numerator/denominator mismatch if issues were generated with a preset;
update the function or surrounding JSDoc to clearly state that callers must
provide the same configs/preset used to produce issues to ensure denominator
alignment, and document this behavior next to the computeTotalScorePerCategory
usage and the configs parameter (and mention RULE_CONFIGS as the default
fallback) so callers know to always pass configs when using presets.
---
Outside diff comments:
In `@src/core/engine/scoring.test.ts`:
- Around line 32-46: The helper makeIssue currently sets calculatedScore from
opts.score and tests override it manually; add an optional calculatedScore?:
number field to the makeIssue opts and use it to set the returned object's
calculatedScore (falling back to opts.score ?? -5) so tests can pass a specific
calculatedScore directly; update the function signature and the returned object
(references: makeIssue, makeRule, makeConfig, calculatedScore) and update
callers in the tests to pass the new arg instead of mutating the result.
---
Duplicate comments:
In `@src/core/engine/scoring.ts`:
- Around line 229-236: The division can produce Infinity/NaN when
totalScorePerCategory[category] is zero; in the block that computes
diversityScore (using catScore.issueCount, ruleScorePerCategory,
totalScorePerCategory, and clamp) guard against a zero/negative weightedTotal by
checking if weightedTotal <= 0 and in that case leave diversityScore at its
default (100) or treat diversityRatio as 0 before computing clamp; ensure the
check is added before computing diversityRatio so no division occurs.
📒 Files selected for processing (4)
- src/cli/commands/analyze.ts
- src/core/engine/scoring.test.ts
- src/core/engine/scoring.ts
- src/mcp/server.ts
Replace equal-count diversity (each triggered rule = 1/N) with severity-weighted diversity (each triggered rule = |score|/totalScore).

Before: no-auto-layout (score -10) and unnecessary-node (score -2) both counted as 1/9 in structure diversity — a single concentrated blocking problem got a high diversity score.

After: no-auto-layout counts as 10/42, unnecessary-node as 2/42. Concentrated blocking issues now correctly lower the diversity score.

Closes #106
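The before/after per-rule weights described here can be sketched as follows (the 9-rule count and 42-point structure total come from this PR; everything else is illustrative):

```typescript
// Per-rule diversity weight, before and after this PR.
const structureScores: Record<string, number> = {
  "no-auto-layout": -10,  // blocking
  "unnecessary-node": -2, // suggestion
};
const RULE_COUNT = 9;   // old denominator: number of structure rules
const TOTAL_SCORE = 42; // new denominator: Σ|score| over structure rules

// Old: every triggered rule contributes equally.
const oldWeight = (_ruleId: string): number => 1 / RULE_COUNT;

// New: a rule's contribution is proportional to its severity score.
const newWeight = (ruleId: string): number =>
  Math.abs(structureScores[ruleId]) / TOTAL_SCORE;

// Old: both rules contribute 1/9 ≈ 0.111 to the diversity ratio.
// New: no-auto-layout contributes 10/42 ≈ 0.238,
//      unnecessary-node only 2/42 ≈ 0.048.
```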
1. Dynamic TOTAL_SCORE_PER_CATEGORY: Replace hardcoded values with computation from RULE_CONFIGS + RULE_ID_CATEGORY mapping, eliminating sync risk when rules are added/removed or scores change.
2. Document density vs diversity asymmetry: density uses calculatedScore (with depthWeight), diversity uses config.score (without depthWeight). Intentional — diversity measures "what types of problems", not "where".
3. Low-severity edge case: documented as intentional behavior with a test. Within structure, 1 suggestion rule (score -2) → diversity 95%; 1 blocking rule (score -10) → diversity 76%. This correctly reflects that low-severity issues are less concerning for implementation.
Fixed 4 file(s) based on 2 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
1. Division by zero guard: diversityRatio defaults to 0 when weightedTotal is 0 (all category rules disabled via preset).
2. Clarify fallback comment: document that omitting the configs param is only correct when issues were produced with the default RULE_CONFIGS.
3. Remove hardcoded "42" from the test comment to avoid stale values when calibration changes rule scores.
4. Restore trailing newlines removed by autofix in analyze.ts and server.ts.
Force-pushed 60841c1 to 8413767
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/core/engine/scoring.test.ts (1)
179-194: 🧹 Nitpick | 🔵 Trivial — Make the diversity assertions calibration-proof.

These > 90 / < 80 checks still encode today's structure weights, so a calibration-only change in src/core/rules/rule-config.ts can fail the test without any regression in diversity logic. Derive the expected percentages from the current structure total instead of hard-coded thresholds. Based on learnings, src/core/rules/rule-config.ts is automatically adjusted by a nightly calibration pipeline.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/engine/scoring.test.ts` around lines 179 - 194, Replace the hard-coded >90 / <80 assertions with thresholds computed from the current "structure" category total in the rule configuration: import or access the structure category total from the rule-config (e.g., via a getCategoryTotal("structure") or the exported config in rule-config.ts), compute expectedPercentageLow = (lowSeverity.byCategory.structure.diversityScore / structureTotal) * 100 and expectedPercentageHigh = (highSeverity.byCategory.structure.diversityScore / structureTotal) * 100 (or otherwise compute the percentages consistently), and assert those relative values instead of fixed numeric literals so the test adapts to nightly calibration changes; update the assertions that reference calculateScores, makeResult, makeIssue, and the byCategory.structure.diversityScore accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/engine/scoring.ts`:
- Around line 70-84: computeTotalScorePerCategory currently uses the built-in
RULE_ID_CATEGORY lookup so any custom rules merged into configs are ignored;
update computeTotalScorePerCategory to derive the category per rule id from the
active rule definitions/category map used by calculateScores (e.g.,
RULE_DEFINITIONS[id].category or an activeRuleCategoryMap passed in) instead of
RULE_ID_CATEGORY, and include enabled custom rules' Math.abs(config.score) in
the totals so weightedTotal/weightedTriggered operate over the same rule set.
---
Duplicate comments:
In `@src/core/engine/scoring.test.ts`:
- Around line 179-194: Replace the hard-coded >90 / <80 assertions with
thresholds computed from the current "structure" category total in the rule
configuration: import or access the structure category total from the
rule-config (e.g., via a getCategoryTotal("structure") or the exported config in
rule-config.ts), compute expectedPercentageLow =
(lowSeverity.byCategory.structure.diversityScore / structureTotal) * 100 and
expectedPercentageHigh = (highSeverity.byCategory.structure.diversityScore /
structureTotal) * 100 (or otherwise compute the percentages consistently), and
assert those relative values instead of fixed numeric literals so the test
adapts to nightly calibration changes; update the assertions that reference
calculateScores, makeResult, makeIssue, and the
byCategory.structure.diversityScore accordingly.
📒 Files selected for processing (5)
- src/cli/commands/analyze.ts
- src/core/engine/scoring.test.ts
- src/core/engine/scoring.ts
- src/core/rules/rule-config.ts
- src/mcp/server.ts
```ts
function computeTotalScorePerCategory(
  configs: Record<RuleId, RuleConfig>
): Record<Category, number> {
  const totals = Object.fromEntries(
    CATEGORIES.map(c => [c, 0])
  ) as Record<Category, number>;

  for (const [id, config] of Object.entries(configs)) {
    const category = RULE_ID_CATEGORY[id as RuleId];
    if (category && config.enabled) {
      totals[category] += Math.abs(config.score);
    }
  }

  return totals;
}
```
Include custom rules in the diversity denominator.
RULE_ID_CATEGORY only covers built-in IDs, but both updated call sites now merge custom rule configs before invoking calculateScores. Any enabled custom rule is skipped here, while Line 205 still counts it once triggered, so weightedTotal and weightedTriggered stop describing the same rule set. That over-penalizes custom rules and can even leave diversity at 100 when a category only has custom rules enabled. Build the denominator from the active rule definitions/category map, not just the built-in lookup.
Also applies to: 205-205
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/engine/scoring.ts` around lines 70 - 84,
computeTotalScorePerCategory currently uses the built-in RULE_ID_CATEGORY lookup
so any custom rules merged into configs are ignored; update
computeTotalScorePerCategory to derive the category per rule id from the active
rule definitions/category map used by calculateScores (e.g.,
RULE_DEFINITIONS[id].category or an activeRuleCategoryMap passed in) instead of
RULE_ID_CATEGORY, and include enabled custom rules' Math.abs(config.score) in
the totals so weightedTotal/weightedTriggered operate over the same rule set.
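One possible shape for that fix, as a sketch rather than the repo's actual API: derive each rule's category from the active rule definition itself instead of the built-in RULE_ID_CATEGORY lookup, so custom rules (whose ids that lookup does not know) also land in the denominator. The RuleDef/activeRules names are assumptions.

```typescript
type Category = "structure" | "naming";

interface RuleDef { id: string; category: Category; }
interface RuleConfig { score: number; enabled: boolean; }

// Build the denominator from the active rule set (built-in + custom),
// so every rule that can appear in weightedTriggered is also counted here.
function computeTotalScorePerCategory(
  activeRules: RuleDef[],
  configs: Record<string, RuleConfig>
): Record<Category, number> {
  const totals: Record<Category, number> = { structure: 0, naming: 0 };
  for (const rule of activeRules) {
    const config = configs[rule.id];
    if (config?.enabled) {
      totals[rule.category] += Math.abs(config.score);
    }
  }
  return totals;
}
```

Because the same rule list drives both the numerator and this denominator, an enabled custom rule can no longer be counted on one side and skipped on the other.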
Summary

- Replace equal-count diversity (triggered rules / total rules) with severity-weighted diversity (Σ|score| of triggered / Σ|score| of all)
- TOTAL_SCORE_PER_CATEGORY computed dynamically from RULE_CONFIGS + RULE_ID_CATEGORY (no hardcoded sync risk)

Re-evaluation: Is density alone sufficient after #104?
After #110 landed calculatedScore-based density, density does differentiate rules by score. However, diversity still adds value in a specific scenario: density alone misses breadth.

Consider two designs with identical density (same weightedIssueCount / nodeCount): Design A's issues all come from no-auto-layout (1 rule, systematic), while Design B's issues span 4 different rules. Density scores are identical, but Design B is harder to fix — 4 different problems require 4 different solutions. Diversity captures this by penalizing breadth.
Severity-weighting makes diversity more accurate: in old equal-count diversity, triggering 1 blocking rule and 1 suggestion rule had equal impact. Now blocking rules correctly dominate the diversity penalty.
Conclusion: Diversity remains necessary as a complement to density. The change in this PR makes it more accurate by aligning diversity penalties with actual rule severity.
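To make the breadth argument concrete, a sketch with hypothetical per-design rule sets (only the diversity formula, the 42-point structure total, and no-auto-layout's -10 score come from this PR; Design B's -5/-3/-2 scores are illustrative):

```typescript
// Distinct triggered rules per design, by score.
const totalScore = 42;             // Σ|score| over structure rules
const designA = [-10];             // 1 rule, many violations (concentrated)
const designB = [-10, -5, -3, -2]; // 4 different rules (spread)

// Severity-weighted diversity: ratio of triggered |score| mass to total.
const ratio = (scores: number[]): number =>
  scores.reduce((sum, s) => sum + Math.abs(s), 0) / totalScore;

const diversityA = Math.round((1 - ratio(designA)) * 100); // 10/42 → 76
const diversityB = Math.round((1 - ratio(designB)) * 100); // 20/42 → 52
```

Even when density is equal, the design that spreads its score mass across more rules gets the lower diversity score, which is exactly the breadth penalty argued for above.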
Edge case: low-severity rules
3 suggestion rules (score -2, -1, -1) triggering in structure → weighted ratio 4/42 = 9.5% → diversity 90%.
This is intentional: low-severity issues represent minor concerns that shouldn't heavily penalize the design score.
Depends on
Closes #106
Summary by CodeRabbit
Improvements
Tests