feat: evidence category validation + gap collection + CLI arg validation (#174, #175) by let-sunny · Pull Request #176 · let-sunny/canicode

let-sunny · 2026-03-29T11:24:51Z

Summary

Two issues combined:

#174 — Discovery evidence category validation + gap-analysis collection

Category validation: appendDiscoveryEvidence now warns when entries have non-standard categories (not in CategorySchema). Safety net for converter typos/old category labels — the converter prompt was fixed in #173, this is the code-side guard.

Gap evidence collection: New CLI calibrate-collect-gap-evidence <runDir> extracts actionable uncovered gaps from gaps.json and appends to data/discovery-evidence.json as source: "gap-analysis" entries. Previously only the evaluation path (source: "evaluation") fed discovery evidence — gap-analysis was a dead end.

calibrate-loop.md updated to run this after Step 3 (Gap Analysis).

#175 — Zod argument validation for internal CLI commands

All 7 new internal commands now validate args with z.string().trim().min(1):

calibrate-gather-evidence, calibrate-finalize-debate
calibrate-enrich-evidence, calibrate-prune-evidence
discovery-filter-evidence, rule-apply-decision, calibrate-collect-gap-evidence

Test plan

643 tests pass (4 new for collectGapEvidence)
Type check passes
Gap collection: actionable + uncovered only, skips covered/non-actionable
Category warning fires on non-standard categories
Zod validation rejects empty/whitespace args

Closes #174, closes #175

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- New CLI step that collects uncovered, actionable analysis gaps and appends them to discovery evidence.
Bug Fixes
- Stronger CLI argument validation with clearer handling for invalid run directories and keywords.
- Discovery-evidence now warns when a non-standard category is provided.
Tests
- Expanded tests for gap-evidence collection, category validation, and related control flow.

1. appendDiscoveryEvidence warns on non-standard categories (CategorySchema validation as safety net for converter typos) 2. calibrate-collect-gap-evidence CLI command: Reads gaps.json, extracts actionable + uncovered gaps, appends to data/discovery-evidence.json as source: "gap-analysis" entries. Fills the gap-analysis → discovery evidence path that was missing. 3. calibrate-loop.md updated to run gap evidence collection after Step 3 Tests: collectGapEvidence (4 new) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

All new internal CLI commands now validate string arguments with Zod before processing (z.string().trim().min(1)): - calibrate-gather-evidence, calibrate-finalize-debate (calibrate-debate.ts) - discovery-filter-evidence, rule-apply-decision, calibrate-collect-gap-evidence (rule-discovery.ts) - calibrate-enrich-evidence, calibrate-prune-evidence (fixture-management.ts) Invalid args → console.log + early return (internal CLI convention). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-29T11:24:57Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Zod validation for internal CLI runDir/keyword args, warns on non-standard discovery-evidence categories during append, and introduces a new CLI command that reads gaps.json, selects actionable uncovered gaps, and appends them as source: "gap-analysis" evidence.

Changes

Cohort / File(s)	Summary
Orchestration note `\.claude/commands/calibrate-loop.md`	Documented new deterministic CLI step: `npx canicode calibrate-collect-gap-evidence $RUN_DIR` after gap-analysis.
CLI helpers & registrations `src/cli/commands/internal/cli-helpers.ts`, `src/cli/index.ts`	Added `RUN_DIR_ARG_SCHEMA`, `KEYWORD_ARG_SCHEMA`, and `resolveRunDir(runDir)`; registered the new `collect-gap-evidence` internal command.
Internal CLI commands (run-dir validation) `src/cli/commands/internal/calibrate-debate.ts`, `src/cli/commands/internal/fixture-management.ts`, `src/cli/commands/internal/rule-discovery.ts`	Replaced ad-hoc `resolve`/`existsSync` checks with shared `resolveRunDir`; commands early-return on unresolved dirs and use parsed/validated keyword where applicable.
Gap evidence collection (impl + tests) `src/cli/commands/internal/rule-discovery.ts`, `src/cli/commands/internal/rule-discovery.test.ts`	Added exported `collectGapEvidence(runDir, fixture)` and `registerCollectGapEvidence(cli)`; reads/parses `gaps.json`, filters `actionable === true && coveredByRule == null`, maps to `DiscoveryEvidenceEntry[]` (`source: "gap-analysis"`, `impact: "medium"`, ISO timestamp), exposes CLI command to append entries; tests cover missing/malformed files and filter outcomes.
Discovery evidence validation `src/agents/evidence-collector.ts`, `src/agents/evidence-collector.test.ts`	`appendDiscoveryEvidence` now validates each entry's `category` with `CategorySchema.safeParse()` and emits `console.warn` for non-standard categories; tests updated to assert warnings and adjusted fixture categories.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI Handler
    participant Zod as Zod Validator
    participant FS as File System
    participant Parser as JSON Parser
    participant Filter as Gap Filter
    participant Appender as appendDiscoveryEvidence

    CLI->>Zod: safeParse(runDir / keyword)
    alt Invalid
        Zod-->>CLI: Parse Error
        CLI->>CLI: Log + return
    else Valid
        Zod-->>CLI: Validated args
        CLI->>FS: read `gaps.json` from runDir
        alt Missing or unreadable
            FS-->>CLI: not found / error
            CLI->>CLI: return []
        else Read file
            FS-->>Parser: file contents
            Parser->>Parser: parse JSON (safe)
            alt JSON invalid
                Parser-->>CLI: parse error
                CLI->>CLI: return []
            else JSON valid
                Parser-->>Filter: gaps[]
                Filter->>Filter: select actionable && coveredByRule == null
                Filter-->>Appender: actionable uncovered gaps
                Appender->>Appender: map -> DiscoveryEvidenceEntry[] (source: "gap-analysis", impact: "medium", timestamp)
                Appender->>Zod: validate entry.category with CategorySchema
                alt Category invalid
                    Zod-->>Appender: invalid
                    Appender-->>CLI: console.warn (non-standard category)
                end
                Appender->>FS: append entries to discovery-evidence.json
                CLI->>CLI: log appended count
            end
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: discovery-evidence dedupe + schema versioning #43: Modifies discovery-evidence persistence/append logic and tests — closely related to category validation and append behavior.
feat: structured debate with pro/con, confidence, early-stop (#144) #172: Implements/changes deterministic CLI gap-evidence collection (collectGapEvidence / calibrate-collect-gap-evidence) — directly related to the new command and flow.
fix: CLI input validation with Zod schemas (#60) #68: Adds Zod-based runtime CLI input validation — related to the new RUN_DIR_ARG_SCHEMA / KEYWORD_ARG_SCHEMA changes.

Poem

🐰 I hopped through gaps and read the run,
I trimmed the paths till each was one.
I whispered warnings for odd category names,
I gathered uncovered gaps into evidence frames.
Hooray — now more traces for rule-makers to fun!

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main changes: evidence category validation, gap collection, and CLI argument validation, with issue references.
Linked Issues check	✅ Passed	Code changes fully implement requirements from `#174` (category validation, gap-analysis collection, CLI command) and `#175` (Zod validation for internal CLI commands).
Out of Scope Changes check	✅ Passed	All changes align with the stated objectives; no unrelated modifications detected beyond `#174` and `#175` scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/evidence-validation-174-175

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…ategory - Test: category warning fires on non-standard, does not fire on standard (2 new) - coveredByRule: truthy check instead of null/undefined (handles empty string) - Legacy migration test: "color" → "pixel-critical" (avoids CI warning noise) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

User corrections: - coveredByRule: truthy → nullish check (!=null), empty string counts as "marked covered" — more precise than truthy which would also skip 0 - Empty string coveredByRule test added - Existing test categories updated to standard values Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/commands/internal/fixture-management.ts`:
- Around line 169-175: The code in registerEvidencePrune calls
resolve(parsed.data) multiple times; assign the resolved path to a local
variable (e.g., resolvedDir) after RunDirArgSchema.safeParse succeeds, use that
variable for the existsSync check and for parseDebateResult to avoid duplicate
resolution and match the pattern used in registerEvidenceEnrich (use parsed,
resolve, existsSync, and parseDebateResult to locate the change).

🪄 Autofix (Beta)

✅ Autofix completed

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e13ee687-bf18-4f83-b159-a8d93df92230

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe1ea5 and 47ff8f2.

📒 Files selected for processing (8)

.claude/commands/calibrate-loop.md
src/agents/evidence-collector.test.ts
src/agents/evidence-collector.ts
src/cli/commands/internal/calibrate-debate.ts
src/cli/commands/internal/fixture-management.ts
src/cli/commands/internal/rule-discovery.test.ts
src/cli/commands/internal/rule-discovery.ts
src/cli/index.ts

src/cli/commands/internal/fixture-management.ts

coderabbitai · 2026-03-29T11:39:55Z

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 1 file(s) based on 1 unresolved review comment.

Files modified:

src/cli/commands/internal/fixture-management.ts

Commit: ebb9b8a8fdb0b4f2db86daa8c90783e9a46ca923

The changes have been pushed to the feat/evidence-validation-174-175 branch.

Time taken: 3m 12s

Fixed 1 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/commands/internal/fixture-management.ts`:
- Line 6: Rename the constant RunDirArgSchema to SCREAMING_SNAKE_CASE
(RUN_DIR_ARG_SCHEMA) and update all local references and any imports/exports
that use RunDirArgSchema; locate the definition of RunDirArgSchema
(z.string().trim().min(1, "runDir is required")) and rename the identifier to
RUN_DIR_ARG_SCHEMA, adjust any named export or import statements that reference
it, and run a quick project-wide search to replace remaining uses to avoid
breaking references.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8eec8d15-c583-47fd-8dcc-2f27d733eb37

📥 Commits

Reviewing files that changed from the base of the PR and between 47ff8f2 and ebb9b8a.

📒 Files selected for processing (1)

src/cli/commands/internal/fixture-management.ts

src/cli/commands/internal/fixture-management.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

src/cli/commands/internal/rule-discovery.ts (1)

139-149: 🛠️ Refactor suggestion | 🟠 Major

Rename the schema constants to SCREAMING_SNAKE_CASE.

GapSchema and GapsFileSchema are module-level constants, so they should follow the same naming rule as RUN_DIR_ARG_SCHEMA and KEYWORD_ARG_SCHEMA.

♻️ Suggested rename

-const GapSchema = z.object({
+const GAP_SCHEMA = z.object({
   category: z.string(),
   description: z.string(),
   actionable: z.boolean(),
   coveredByRule: z.unknown().optional(),
 }).passthrough();

-const GapsFileSchema = z.object({
+const GAPS_FILE_SCHEMA = z.object({
   fixture: z.string().optional(),
-  gaps: z.array(GapSchema),
+  gaps: z.array(GAP_SCHEMA),
 }).passthrough();
...
-  const parsed = GapsFileSchema.safeParse(raw);
+  const parsed = GAPS_FILE_SCHEMA.safeParse(raw);

As per coding guidelines, "Constants must use SCREAMING_SNAKE_CASE (e.g., MY_CONSTANT, not myConstant)".

Also applies to: 165-165

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands/internal/rule-discovery.ts` around lines 139 - 149, The
module-level schema constants GapSchema and GapsFileSchema should be renamed to
SCREAMING_SNAKE_CASE (GAP_SCHEMA and GAPS_FILE_SCHEMA); update the const
declarations for GapSchema and GapsFileSchema to GAP_SCHEMA and GAPS_FILE_SCHEMA
respectively and replace every use/reference of GapSchema and GapsFileSchema in
this file (including the related usage around the other occurrence noted near
line ~165) so the code compiles and follows the constant naming convention.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/commands/internal/calibrate-debate.ts`:
- Around line 109-114: The validation currently only checks existence via
RUN_DIR_ARG_SCHEMA.safeParse and existsSync(dir) which allows existing files to
pass; update both handlers (calibrate-gather-evidence and
calibrate-finalize-debate) to reject non-directories by adding an is-directory
guard after resolve(parsed.data) — e.g., call fs.statSync(dir) or
fs.lstatSync(dir) and check .isDirectory() and if false log the same "Run
directory not found" (or similar) and return; keep the existing parsed/dir
variables (RUN_DIR_ARG_SCHEMA, parsed, dir) so the new check sits immediately
after the existsSync check to prevent later ENOTDIR errors and misleading
messages.

In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 210-220: The catch block around
collectGapEvidence/appendDiscoveryEvidence currently swallows persistence errors
and exits 0; change it so filesystem failures from appendDiscoveryEvidence
signal command failure: in the catch for the try that calls
appendDiscoveryEvidence(entries) keep the console.log of the error but then
either rethrow the error or call process.exit(1) so the process returns a
non-zero exit code; locate the try/catch that wraps collectGapEvidence and
appendDiscoveryEvidence in rule-discovery.ts and adjust the catch to emit the
same message and then fail the command (process.exit(1) or throw) instead of
returning successfully.
- Around line 143-145: The schema currently allows coveredByRule to be missing
because it uses z.unknown().optional(), so gaps with a missing field are treated
as uncovered; change the property definition(s) for coveredByRule in the Zod
schemas in rule-discovery.ts from z.unknown().optional() to explicitly require
null (e.g., z.null()) so only records with coveredByRule === null pass, and
update both occurrences where coveredByRule is defined (the schema around the
actionable selection and the other schema at the later block).

---

Duplicate comments:
In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 139-149: The module-level schema constants GapSchema and
GapsFileSchema should be renamed to SCREAMING_SNAKE_CASE (GAP_SCHEMA and
GAPS_FILE_SCHEMA); update the const declarations for GapSchema and
GapsFileSchema to GAP_SCHEMA and GAPS_FILE_SCHEMA respectively and replace every
use/reference of GapSchema and GapsFileSchema in this file (including the
related usage around the other occurrence noted near line ~165) so the code
compiles and follows the constant naming convention.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 893192c1-e0e3-4d1f-aa32-7efd4a99e7de

📥 Commits

Reviewing files that changed from the base of the PR and between ebb9b8a and 3b338a2.

📒 Files selected for processing (3)

src/cli/commands/internal/calibrate-debate.ts
src/cli/commands/internal/fixture-management.ts
src/cli/commands/internal/rule-discovery.ts

src/cli/commands/internal/calibrate-debate.ts

coderabbitai · 2026-03-29T11:58:52Z

src/cli/commands/internal/rule-discovery.ts

+  coveredByRule: z.unknown().optional(),
+}).passthrough();
+


⚠️ Potential issue | 🟠 Major

Require an explicit coveredByRule: null before importing a gap.

The selection contract here is “actionable and coveredByRule === null”. With coveredByRule optional plus the != null check, gaps where the field is simply missing are treated as uncovered and get appended anyway. That turns older or malformed gaps.json into false-positive discovery evidence.

♻️ Suggested contract tightening

const GapSchema = z.object({ category: z.string(), description: z.string(), actionable: z.boolean(), - coveredByRule: z.unknown().optional(), + coveredByRule: z.unknown().nullable(), }).passthrough(); ... - if (gap.coveredByRule != null) continue; + if (gap.coveredByRule !== null) continue;

Also applies to: 171-175

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli/commands/internal/rule-discovery.ts` around lines 143 - 145, The schema currently allows coveredByRule to be missing because it uses z.unknown().optional(), so gaps with a missing field are treated as uncovered; change the property definition(s) for coveredByRule in the Zod schemas in rule-discovery.ts from z.unknown().optional() to explicitly require null (e.g., z.null()) so only records with coveredByRule === null pass, and update both occurrences where coveredByRule is defined (the schema around the actionable selection and the other schema at the later block).

coderabbitai · 2026-03-29T11:58:52Z

src/cli/commands/internal/rule-discovery.ts

+      try {
+        const entries = collectGapEvidence(dir, fixture);
+        if (entries.length === 0) {
+          console.log("No uncovered actionable gaps found");
+          return;
+        }
+
+        appendDiscoveryEvidence(entries);
+        console.log(`Collected ${entries.length} gap evidence entries for fixture "${fixture}"`);
+      } catch (err) {
+        console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`);


⚠️ Potential issue | 🟠 Major

Mark append failures as command failures.

appendDiscoveryEvidence() can still throw on filesystem errors, but this catch only prints a message and exits successfully. That makes the calibration loop continue without the new evidence even though the command’s only side effect failed. Keep the stdout message if needed, but set a non-zero exit code here.

♻️ Suggested failure signaling

} catch (err) { console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`); + process.exitCode = 1; }

Based on learnings, the stdout-and-exit-0 convention in src/cli/commands/internal/ is for expected cases like missing run directories and parseDebateResult returning null, not unexpected persistence failures.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try {

const entries = collectGapEvidence(dir, fixture);

if (entries.length === 0) {

console.log("No uncovered actionable gaps found");

return;

}

appendDiscoveryEvidence(entries);

console.log(`Collected ${entries.length} gap evidence entries for fixture "${fixture}"`);

} catch (err) {

console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`);

try {

const entries = collectGapEvidence(dir, fixture);

if (entries.length === 0) {

console.log("No uncovered actionable gaps found");

return;

}

appendDiscoveryEvidence(entries);

console.log(`Collected ${entries.length} gap evidence entries for fixture "${fixture}"`);

} catch (err) {

console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`);

process.exitCode = 1;

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli/commands/internal/rule-discovery.ts` around lines 210 - 220, The catch block around collectGapEvidence/appendDiscoveryEvidence currently swallows persistence errors and exits 0; change it so filesystem failures from appendDiscoveryEvidence signal command failure: in the catch for the try that calls appendDiscoveryEvidence(entries) keep the console.log of the error but then either rethrow the error or call process.exit(1) so the process returns a non-zero exit code; locate the try/catch that wraps collectGapEvidence and appendDiscoveryEvidence in rule-discovery.ts and adjust the catch to emit the same message and then fail the command (process.exit(1) or throw) instead of returning successfully.

- calibrate-gather-evidence, calibrate-finalize-debate: add isDirectory() (was: existsSync only, file path would cause ENOTDIR) - GapSchema.coveredByRule: .default(null) instead of .optional() (missing field now defaults to null → treated as uncovered, not silently skipped) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cli/commands/internal/rule-discovery.ts (1)

45-59: ⚠️ Potential issue | 🟡 Minor

Validate --run-dir before branching on it.

keyword is now trimmed/validated, but options.runDir still bypasses Zod entirely. That means --run-dir "" is treated as “option absent” instead of invalid input, and whitespace-padded paths are never trimmed before the directory check.

♻️ Proposed fix

     .option("--run-dir <path>", "Write filtered evidence to run directory")
     .action((keyword: string, options: { runDir?: string }) => {
       const kParsed = KEYWORD_ARG_SCHEMA.safeParse(keyword);
       if (!kParsed.success) { console.log(`Invalid keyword: ${kParsed.error.issues[0]?.message}`); return; }
       try {
         const filtered = filterDiscoveryEvidence(kParsed.data);
 
-        if (options.runDir) {
-          const dir = resolve(options.runDir);
+        if (options.runDir !== undefined) {
+          const runDirParsed = RUN_DIR_ARG_SCHEMA.safeParse(options.runDir);
+          if (!runDirParsed.success) { console.log(`Invalid runDir: ${runDirParsed.error.issues[0]?.message}`); return; }
+          const dir = resolve(runDirParsed.data);
           if (!existsSync(dir) || !statSync(dir).isDirectory()) {
             console.log(`Run directory not found or is not a directory: ${options.runDir}`);
             return;
           }

As per coding guidelines, src/**/*.ts: Use Zod schemas for external input validation (schemas in contracts/ directory).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands/internal/rule-discovery.ts` around lines 45 - 59, Validate
the --run-dir option with a Zod schema instead of treating falsy/whitespace as
absent: add or import a RUN_DIR_ARG_SCHEMA (in contracts/) and call
RUN_DIR_ARG_SCHEMA.safeParse(options.runDir) inside the .action callback (near
where KEYWORD_ARG_SCHEMA is used); if parsing fails log the validation error and
return, otherwise use the parsed/trimmed value (not the raw options.runDir) for
resolve(), existence and isDirectory checks and for building outPath before
calling writeFileSync and the final console.log; keep filterDiscoveryEvidence
and the rest of the flow unchanged, only replace direct uses of options.runDir
with the validated value.

♻️ Duplicate comments (1)

src/cli/commands/internal/rule-discovery.ts (1)
210-220: ⚠️ Potential issue | 🟠 Major

Surface append failures as command failures.

By the time execution reaches this catch, the expected exit-0 cases have already been handled (gaps.json missing/invalid or no uncovered gaps). If appendDiscoveryEvidence() throws here, the command's only side effect failed, but the process still returns success and the calibration loop can continue without the new evidence.
♻️ Proposed fix
       } catch (err) {
         console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`);
+        process.exitCode = 1;
       }
Based on learnings, the stdout-and-exit-0 convention in src/cli/commands/internal/ applies to expected cases like missing run directories and parseDebateResult returning null, not unexpected persistence failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands/internal/rule-discovery.ts` around lines 210 - 220, The
catch block swallowing persistence errors should be changed to surface failures
as command failures: when appendDiscoveryEvidence(entries) throws inside the try
in function handling the rule-discovery command, log the error (use
console.error or processLogger.error) with the full error message and stack and
then terminate with a non-zero exit (e.g., throw the error or call
process.exit(1)) so the process does not return success; update the catch that
currently logs `Failed to collect gap evidence` to rethrow or exit non‑zero to
ensure appendDiscoveryEvidence failures fail the command.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/commands/internal/calibrate-debate.ts`:
- Around line 109-113: The statSync call can throw and currently can abort the
CLI instead of performing the intended console.log + return flow; wrap the
directory check in a try/catch around the statSync (after resolving parsed.data
and verifying existsSync) so any thrown error results in the same
console.log(`Run directory not found or is not a directory: ${runDir}`) and an
early return; update the check around the dir variable used after
RUN_DIR_ARG_SCHEMA.safeParse (and apply the same try/catch pattern to the
similar check later in the file around the lines that validate run directories
at the second block) to ensure exception-safe directory validation.

---

Outside diff comments:
In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 45-59: Validate the --run-dir option with a Zod schema instead of
treating falsy/whitespace as absent: add or import a RUN_DIR_ARG_SCHEMA (in
contracts/) and call RUN_DIR_ARG_SCHEMA.safeParse(options.runDir) inside the
.action callback (near where KEYWORD_ARG_SCHEMA is used); if parsing fails log
the validation error and return, otherwise use the parsed/trimmed value (not the
raw options.runDir) for resolve(), existence and isDirectory checks and for
building outPath before calling writeFileSync and the final console.log; keep
filterDiscoveryEvidence and the rest of the flow unchanged, only replace direct
uses of options.runDir with the validated value.

---

Duplicate comments:
In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 210-220: The catch block swallowing persistence errors should be
changed to surface failures as command failures: when
appendDiscoveryEvidence(entries) throws inside the try in function handling the
rule-discovery command, log the error (use console.error or processLogger.error)
with the full error message and stack and then terminate with a non-zero exit
(e.g., throw the error or call process.exit(1)) so the process does not return
success; update the catch that currently logs `Failed to collect gap evidence`
to rethrow or exit non‑zero to ensure appendDiscoveryEvidence failures fail the
command.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 91478502-1e72-404e-9804-19975aa78192

📥 Commits

Reviewing files that changed from the base of the PR and between 3b338a2 and 43d0d47.

📒 Files selected for processing (2)

src/cli/commands/internal/calibrate-debate.ts
src/cli/commands/internal/rule-discovery.ts

coderabbitai · 2026-03-29T12:08:37Z

src/cli/commands/internal/calibrate-debate.ts

+      const parsed = RUN_DIR_ARG_SCHEMA.safeParse(runDir);
+      if (!parsed.success) { console.log(`Invalid runDir: ${parsed.error.issues[0]?.message}`); return; }
+      const dir = resolve(parsed.data);
+      if (!existsSync(dir) || !statSync(dir).isDirectory()) {
+        console.log(`Run directory not found or is not a directory: ${runDir}`);


⚠️ Potential issue | 🟠 Major

Guard statSync exceptions to preserve stdout-and-return contract.

statSync(dir) can throw (e.g., ENOENT race/EACCES), which would abort the command instead of following the expected console.log + early return flow.

♻️ Suggested fix (shared validator + exception-safe dir check)

+function resolveValidatedRunDir(runDir: string): string | null { + const parsed = RUN_DIR_ARG_SCHEMA.safeParse(runDir); + if (!parsed.success) { + console.log(`Invalid runDir: ${parsed.error.issues[0]?.message}`); + return null; + } + + const dir = resolve(parsed.data); + try { + if (!statSync(dir).isDirectory()) { + console.log(`Run directory not found or is not a directory: ${runDir}`); + return null; + } + } catch { + console.log(`Run directory not found or is not a directory: ${runDir}`); + return null; + } + + return dir; +} + export function registerGatherEvidence(cli: CAC): void { cli @@ ) .action((runDir: string) => { - const parsed = RUN_DIR_ARG_SCHEMA.safeParse(runDir); - if (!parsed.success) { console.log(`Invalid runDir: ${parsed.error.issues[0]?.message}`); return; } - const dir = resolve(parsed.data); - if (!existsSync(dir) || !statSync(dir).isDirectory()) { - console.log(`Run directory not found or is not a directory: ${runDir}`); - return; - } + const dir = resolveValidatedRunDir(runDir); + if (!dir) return; @@ export function registerFinalizeDebate(cli: CAC): void { cli @@ ) .action((runDir: string) => { - const parsed = RUN_DIR_ARG_SCHEMA.safeParse(runDir); - if (!parsed.success) { console.log(`Invalid runDir: ${parsed.error.issues[0]?.message}`); return; } - const dir = resolve(parsed.data); - if (!existsSync(dir) || !statSync(dir).isDirectory()) { - console.log(`Run directory not found or is not a directory: ${runDir}`); - return; - } + const dir = resolveValidatedRunDir(runDir); + if (!dir) return;

Based on learnings, internal calibration CLI commands must treat missing/invalid run directories as expected outcomes via console.log + return (exit code 0).

Also applies to: 142-146

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli/commands/internal/calibrate-debate.ts` around lines 109 - 113, The statSync call can throw and currently can abort the CLI instead of performing the intended console.log + return flow; wrap the directory check in a try/catch around the statSync (after resolving parsed.data and verifying existsSync) so any thrown error results in the same console.log(`Run directory not found or is not a directory: ${runDir}`) and an early return; update the check around the dir variable used after RUN_DIR_ARG_SCHEMA.safeParse (and apply the same try/catch pattern to the similar check later in the file around the lines that validate run directories at the second block) to ensure exception-safe directory validation.

Shared cli-helpers.ts with resolveRunDir() — validates arg, resolves path, checks isDirectory, catches statSync exceptions. All internal commands now use this single function instead of duplicating the pattern. Removes RunDirArgSchema duplication across 3 files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

src/cli/commands/internal/rule-discovery.ts (2)

195-206: ⚠️ Potential issue | 🟠 Major

Signal collection failures as command failures.

If appendDiscoveryEvidence() throws here, the command only logs and exits successfully even though its only side effect failed. Set process.exitCode = 1 after the log so the calibration loop can detect the failure.

🛡️ Proposed fix

       } catch (err) {
         console.log(`Failed to collect gap evidence: ${err instanceof Error ? err.message : String(err)}`);
+        process.exitCode = 1;
       }

Based on learnings, internal calibration CLI commands should only "log the relevant message to stdout with `console.log` and complete with exit code `0`" for expected cases like missing run directories or `parseDebateResult` returning `null`.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands/internal/rule-discovery.ts` around lines 195 - 206, The
catch block around collectGapEvidence/appendDiscoveryEvidence currently only
logs errors so failures from appendDiscoveryEvidence are treated as success;
modify the catch in the try/catch that wraps collectGapEvidence and
appendDiscoveryEvidence so that after logging the error you set process.exitCode
= 1 to signal command failure (keep console.log unchanged for expected cases
like missing directories or parseDebateResult null); reference the block
handling entries = collectGapEvidence(dir, fixture) and the call to
appendDiscoveryEvidence(entries) and ensure the change is limited to adding
process.exitCode = 1 in that catch branch.

133-165: ⚠️ Potential issue | 🟠 Major

Require an explicit coveredByRule: null before importing a gap.

Line 133 turns a missing coveredByRule into null, so Line 165 imports legacy or malformed gaps as uncovered. That violates the coveredByRule === null contract and can create false-positive discovery evidence.

🐛 Proposed fix

 const GapSchema = z.object({
   category: z.string(),
   description: z.string(),
   actionable: z.boolean(),
-  coveredByRule: z.unknown().default(null),
+  coveredByRule: z.unknown().nullable().optional(),
 }).passthrough();
@@
-    if (gap.coveredByRule != null) continue;
+    if (gap.coveredByRule !== null) continue;

In Zod 4, does `z.object({ coveredByRule: z.unknown().default(null) }).safeParse({})` return `{ coveredByRule: null }`, and what is the recommended way to distinguish a missing field from an explicit `null` value?

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands/internal/rule-discovery.ts` around lines 133 - 165, The
schema currently sets coveredByRule to default(null) so a missing field is
treated as explicit null; update the Zod schema (GapSchema/GapsFileSchema used
by collectGapEvidence) to remove the default and make coveredByRule optional
(e.g., z.unknown().optional()) so absence stays absent, and then change the loop
in collectGapEvidence to only import gaps when the property is present and
explicitly null — e.g., use Object.prototype.hasOwnProperty.call(gap,
"coveredByRule") to check presence and require gap.coveredByRule === null before
treating the gap as uncovered.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 129-139: Rename the zod schema constants from GapSchema and
GapsFileSchema to SCREAMING_SNAKE_CASE (e.g., GAP_SCHEMA and GAPS_FILE_SCHEMA);
update the declaration identifiers and every reference/usage (including the
other occurrences around the second set of schema definitions) to the new names,
and adjust any exports/imports accordingly so the rest of the module compiles
with the new constant identifiers.

---

Duplicate comments:
In `@src/cli/commands/internal/rule-discovery.ts`:
- Around line 195-206: The catch block around
collectGapEvidence/appendDiscoveryEvidence currently only logs errors so
failures from appendDiscoveryEvidence are treated as success; modify the catch
in the try/catch that wraps collectGapEvidence and appendDiscoveryEvidence so
that after logging the error you set process.exitCode = 1 to signal command
failure (keep console.log unchanged for expected cases like missing directories
or parseDebateResult null); reference the block handling entries =
collectGapEvidence(dir, fixture) and the call to
appendDiscoveryEvidence(entries) and ensure the change is limited to adding
process.exitCode = 1 in that catch branch.
- Around line 133-165: The schema currently sets coveredByRule to default(null)
so a missing field is treated as explicit null; update the Zod schema
(GapSchema/GapsFileSchema used by collectGapEvidence) to remove the default and
make coveredByRule optional (e.g., z.unknown().optional()) so absence stays
absent, and then change the loop in collectGapEvidence to only import gaps when
the property is present and explicitly null — e.g., use
Object.prototype.hasOwnProperty.call(gap, "coveredByRule") to check presence and
require gap.coveredByRule === null before treating the gap as uncovered.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 15e377ad-9a15-4272-9734-be7d8dc71f2a

📥 Commits

Reviewing files that changed from the base of the PR and between 43d0d47 and 43572a6.

📒 Files selected for processing (4)

src/cli/commands/internal/calibrate-debate.ts
src/cli/commands/internal/cli-helpers.ts
src/cli/commands/internal/fixture-management.ts
src/cli/commands/internal/rule-discovery.ts

coderabbitai · 2026-03-29T12:22:39Z

src/cli/commands/internal/rule-discovery.ts

+const GapSchema = z.object({
+  category: z.string(),
+  description: z.string(),
+  actionable: z.boolean(),
+  coveredByRule: z.unknown().default(null),
+}).passthrough();
+
+const GapsFileSchema = z.object({
+  fixture: z.string().optional(),
+  gaps: z.array(GapSchema),
+}).passthrough();


🛠️ Refactor suggestion | 🟠 Major

Rename the schema constants to SCREAMING_SNAKE_CASE.

GapSchema and GapsFileSchema are values, so they should follow the repo's constant naming rule.

♻️ Proposed rename

-const GapSchema = z.object({ +const GAP_SCHEMA = z.object({ category: z.string(), description: z.string(), actionable: z.boolean(), coveredByRule: z.unknown().default(null), }).passthrough(); -const GapsFileSchema = z.object({ +const GAPS_FILE_SCHEMA = z.object({ fixture: z.string().optional(), - gaps: z.array(GapSchema), + gaps: z.array(GAP_SCHEMA), }).passthrough(); @@ - const parsed = GapsFileSchema.safeParse(raw); + const parsed = GAPS_FILE_SCHEMA.safeParse(raw);

As per coding guidelines, "Constants must use SCREAMING_SNAKE_CASE (e.g., `MY_CONSTANT`, not `myConstant`)".
Also applies to: 155-156

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli/commands/internal/rule-discovery.ts` around lines 129 - 139, Rename the zod schema constants from GapSchema and GapsFileSchema to SCREAMING_SNAKE_CASE (e.g., GAP_SCHEMA and GAPS_FILE_SCHEMA); update the declaration identifiers and every reference/usage (including the other occurrences around the second set of schema definitions) to the new names, and adjust any exports/imports accordingly so the rest of the module compiles with the new constant identifiers.

let-sunny and others added 2 commits March 29, 2026 20:22

let-sunny and others added 2 commits March 29, 2026 20:28

let-sunny marked this pull request as ready for review March 29, 2026 11:34

coderabbitai bot reviewed Mar 29, 2026

View reviewed changes

src/cli/commands/internal/fixture-management.ts Outdated Show resolved Hide resolved

fix: apply CodeRabbit auto-fixes

ebb9b8a

Fixed 1 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

coderabbitai bot reviewed Mar 29, 2026

View reviewed changes

src/cli/commands/internal/fixture-management.ts Outdated Show resolved Hide resolved

fix: rename arg schemas to SCREAMING_SNAKE_CASE + dedupe resolve call

3b338a2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot reviewed Mar 29, 2026

View reviewed changes

let-sunny merged commit 6cb2b5b into main Mar 29, 2026
3 checks passed

let-sunny deleted the feat/evidence-validation-174-175 branch March 29, 2026 12:24

This was referenced Mar 30, 2026

fix: address calibration/discovery logic risks from #188 #198

Merged

refactor: remove Rule Discovery pipeline #220

Merged

Conversation

let-sunny commented Mar 29, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

#174 — Discovery evidence category validation + gap-analysis collection

#175 — Zod argument validation for internal CLI commands

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixes Applied Successfully

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

let-sunny commented Mar 29, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 29, 2026 •

edited

Loading

coderabbitai bot left a comment •

edited

Loading

coderabbitai bot commented Mar 29, 2026 •

edited

Loading