
feat: add custom judge type support for external repo integration#1265

Closed
peri044 wants to merge 9 commits into NVIDIA-NeMo:main from peri044:peri044/external_judge

Conversation


@peri044 peri044 commented Feb 20, 2026

Add generic 'custom' judge type that allows external repos to register their own judge task creators using the locate() pattern. This enables external benchmarks to define judge tasks without modifying nemo-skills source code.

Changes:

  • Add _create_custom_judge_tasks() function to dynamically import and call external judge creators
  • Add 'custom' judge type routing in prepare_eval_commands()
  • Uses existing locate() pattern (same as METRICS_TYPE, eval_type)
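The external-repo side of this contract could be sketched as follows. This is a minimal illustration, not the actual nemo-skills API: the module path, parameter names, and returned task shape are all assumptions for the example.

```python
# my_benchmark/judges.py -- hypothetical external-repo module.
# A benchmark would reference it as "my_benchmark.judges::create_judge_tasks"
# so that eval.py can resolve it with locate() at runtime.


def create_judge_tasks(expname, benchmark, judge_pipeline_args, output_dir, **kwargs):
    """Build judge task descriptions without modifying nemo-skills source.

    All names here are illustrative; the real contract is defined by the
    call site in prepare_eval_commands().
    """
    cmd = (
        f"python -m my_benchmark.judge "
        f"--input-dir {judge_pipeline_args['input_dir']} "
        f"--output-dir {output_dir}"
    )
    # Return a list of task specs for the scheduler; the concrete task
    # object type depends on the nemo-skills pipeline internals.
    return [{"name": f"{expname}-{benchmark}-judge", "cmd": cmd}]
```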

Summary by CodeRabbit

  • New Features

    • Support for dynamic judge loading by path, including user-provided custom judge creators.
    • Added built-in Comet and NVEmbed judge integrations for translation and embedding-based evaluations.
  • Refactor

    • Consolidated judge creation to the dynamic loader; removed legacy specialized judge creation paths.
    • Preserved existing LLM-judge fallback and ensured judge model and pipeline args propagate to dynamic judges.

This change is minimal (~50 lines) and reusable for any external repo.
@peri044 peri044 requested a review from Kipok February 20, 2026 21:42

coderabbitai bot commented Feb 20, 2026

📝 Walkthrough

Replaces hardcoded per-benchmark judge creators with a dynamic judge_step loader (via locate), removes internal comet/nvembed creators from eval.py, adds a nemo_skills.pipeline.judges package with separate comet and nvembed create_judge_tasks implementations, and propagates judge_model into judge pipeline args.
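The `module::function` convention resolved by locate() can be illustrated with a minimal stand-in. The real helper lives in nemo_skills.dataset.utils and may differ in error handling; this sketch only shows the resolution mechanics.

```python
import importlib


def locate(path: str):
    """Resolve a 'package.module::attribute' string to the attribute.

    Minimal stand-in for the nemo-skills helper; the actual
    implementation may validate and report errors differently.
    """
    module_name, _, attr_name = path.partition("::")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A judge_step path such as
# "nemo_skills.pipeline.judges.comet_judge::create_judge_tasks"
# would be resolved the same way as this stdlib example:
join_fn = locate("os.path::join")
```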

Changes

Cohort / File(s) Summary
Eval pipeline
nemo_skills/pipeline/eval.py
Removed _create_comet_judge_tasks and _create_nvembed_judge_tasks. Introduces dynamic judge loading using judge_step/judge_creator_path with locate(), calls loaded create_judge_tasks(...) with standardized params (benchmarks as list, output_dir, judge_pipeline_args) and falls back to existing LLM judge flow when no dynamic creator is provided. Removed judge_type parameter and related logic; judge_model is forwarded into judge_pipeline_args.
Judges package init
nemo_skills/pipeline/judges/__init__.py
Adds package initializer (license header and module docstring) to expose nemo_skills.pipeline.judges.
Comet judge module
nemo_skills/pipeline/judges/comet_judge.py
New create_judge_tasks(...) implementing Comet-based evaluation task creation: seed selection, remaining-jobs check, build/install/run commands (unbabel-comet), GPU/cluster task creation, dependencies, and returns created task(s).
NVEmbed judge module
nemo_skills/pipeline/judges/nvembed_judge.py
New create_judge_tasks(...) implementing NVEmbed embedding-based judge task creation: input/output/seed handling, remaining-jobs check, build/run command for nvembed_judge, GPU/cluster task creation, dependencies, and returns created task(s).
Dataset metadata update
nemo_skills/dataset/mmau-pro/closed_form/__init__.py
Replaced JUDGE_PIPELINE_ARGS key judge_type with judge_step pointing to nemo_skills.pipeline.judges.nvembed_judge::create_judge_tasks (switches config to dynamic creator path).
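The config switch described above amounts to replacing a symbolic judge_type with an importable path. A hedged sketch of the resulting dataset config (any surrounding keys are omitted; only the changed key is shown):

```python
# nemo_skills/dataset/mmau-pro/closed_form/__init__.py (sketch)
# Before: JUDGE_PIPELINE_ARGS carried a symbolic key, e.g. "judge_type".
# After: point directly at the creator function via the :: convention,
# so the eval pipeline can resolve it dynamically with locate().
JUDGE_PIPELINE_ARGS = {
    "judge_step": "nemo_skills.pipeline.judges.nvembed_judge::create_judge_tasks",
}
```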

Sequence Diagram(s)

sequenceDiagram
    participant Eval as Eval Pipeline
    participant Loc as locate()
    participant Judge as Judge Module\n(create_judge_tasks)
    participant Scheduler as Task Scheduler / Cluster

    Eval->>Loc: resolve `judge_step` / creator path
    Loc-->>Eval: returns create_judge_tasks function
    Eval->>Judge: call create_judge_tasks(exp, expname, benchmarks, judge_pipeline_args, output_dir, ...)
    Judge-->>Eval: returns list of judge tasks
    Eval->>Scheduler: schedule returned tasks (sbatch / executor)
    Scheduler-->>Eval: task execution results / done markers

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

run GPU tests

Suggested reviewers

  • Kipok
  • gwarmstrong
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check (Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (Passed): the title accurately describes the main change of introducing custom judge type support for external judge task creators.
  • Docstring Coverage (Passed): 100.00% coverage meets the required 80.00% threshold.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemo_skills/pipeline/eval.py (1)

388-388: ⚠️ Potential issue | 🟡 Minor

judge_type help string doesn't mention the new "custom" value (or the existing "comet").

📝 Proposed fix
-    judge_type: str = typer.Option("llm", help="Type of judge to use: 'llm' (default) or 'nvembed'"),
+    judge_type: str = typer.Option("llm", help="Type of judge to use: 'llm' (default), 'nvembed', 'comet', or 'custom'"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/pipeline/eval.py` at line 388, The help text for the Typer option
judge_type (in the function signature where judge_type: str = typer.Option(...))
is missing the "custom" (and currently "comet") allowed values; update the help
string for judge_type to list all supported values (e.g., "'llm' (default),
'nvembed', 'comet', or 'custom'") or otherwise describe the accepted options so
users know "custom" and "comet" are valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemo_skills/pipeline/eval.py`:
- Around line 215-235: The _create_custom_judge_tasks function is missing the
rerun_done parameter so external creators cannot implement skip-if-done logic;
add rerun_done to the _create_custom_judge_tasks signature and ensure it is
forwarded into any internal calls that call get_remaining_jobs (matching how
_create_comet_judge_tasks and _create_nvembed_judge_tasks accept rerun_done),
and update the call site that invokes _create_custom_judge_tasks to pass through
the rerun_done argument so completed outputs can be skipped.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemo_skills/pipeline/judges/comet_judge.py`:
- Around line 77-79: Replace silent .get() calls for required
judge_pipeline_args keys with direct dict access so missing required values
raise immediately: change output_dir_path =
judge_pipeline_args.get("output_dir") and comet_model_path =
judge_pipeline_args.get("judge_model") to output_dir_path =
judge_pipeline_args["output_dir"] and comet_model_path =
judge_pipeline_args["judge_model"], and likewise when you reference "input_dir"
in the branch that runs when input_file is None use
judge_pipeline_args["input_dir"] instead of .get(); leave input_file using
.get() since None is a valid sentinel, and ensure the values passed into
get_remaining_jobs(...) and the constructed command string come from these
direct accesses to fail fast with a clear KeyError.

In `@nemo_skills/pipeline/judges/nvembed_judge.py`:
- Around line 77-78: Replace the silent .get() accesses for required keys with
direct dictionary indexing so missing required args raise immediately: change
uses of judge_pipeline_args.get("output_dir") and
judge_pipeline_args.get("input_file") to judge_pipeline_args["output_dir"] and
judge_pipeline_args["input_file"], and likewise inside the branch use
judge_pipeline_args["input_dir"] instead of .get(); this will cause a KeyError
to fail fast (consistent with comet_judge.py) and prevent passing None into the
run command or get_remaining_jobs.

Comment on lines +77 to +79
output_dir_path = judge_pipeline_args.get("output_dir")
input_file = judge_pipeline_args.get("input_file")
comet_model_path = judge_pipeline_args.get("judge_model")

🛠️ Refactor suggestion | 🟠 Major

Use direct dict access for required judge_pipeline_args keys.

output_dir, judge_model, and input_dir (inside the if input_file is None branch) are all expected to be present. Using .get() on them silently yields None, which then propagates into both get_remaining_jobs(output_dir=None, …) and the constructed command string (e.g., --output-dir None --comet-model-path None), producing confusing downstream failures instead of an immediate, clear KeyError. Per coding guidelines, use direct dict access dict[key] for required keys.

input_file correctly uses .get() since None is its legitimate "not-provided" sentinel.

♻️ Proposed fix
-    output_dir_path = judge_pipeline_args.get("output_dir")
-    input_file = judge_pipeline_args.get("input_file")
-    comet_model_path = judge_pipeline_args.get("judge_model")
+    output_dir_path = judge_pipeline_args["output_dir"]
+    input_file = judge_pipeline_args.get("input_file")   # intentional sentinel
+    comet_model_path = judge_pipeline_args["judge_model"]
     if input_file is None:
-        input_dir = judge_pipeline_args.get("input_dir")
+        input_dir = judge_pipeline_args["input_dir"]
         script_args.append(f"--input-dir {input_dir}")

As per coding guidelines, "Do not use .get() for accessing dictionary keys if the code expects them to be present; use direct dictionary access dict[key] instead to allow proper error handling and fail fast with clear errors."

Also applies to: 101-101, 104-105
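The failure mode behind this suggestion is easy to reproduce in isolation: `.get()` on a missing required key silently yields None, which gets stringified into the command, whereas direct indexing fails at the real source of the bug. This is a toy dict, not the actual pipeline args:

```python
args = {"input_file": "preds.jsonl"}  # "output_dir" accidentally omitted

# Silent failure: None is interpolated into the command string and only
# surfaces later as a confusing downstream error.
cmd = f"judge --output-dir {args.get('output_dir')} --input {args['input_file']}"
assert "--output-dir None" in cmd

# Fail fast: direct access raises KeyError right at the access site.
try:
    cmd = f"judge --output-dir {args['output_dir']}"
except KeyError as err:
    print(f"missing required judge arg: {err}")
```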


Comment on lines +77 to +78
output_dir_path = judge_pipeline_args.get("output_dir")
input_file = judge_pipeline_args.get("input_file")

🛠️ Refactor suggestion | 🟠 Major

Same .get() on required keys as in comet_judge.py.

output_dir and input_dir (inside the if input_file is None branch) are required fields. Using .get() silently produces None, which embeds --output-dir None / --input-dir None in the run command and passes None to get_remaining_jobs, rather than failing immediately with a clear KeyError.

♻️ Proposed fix
-    output_dir_path = judge_pipeline_args.get("output_dir")
-    input_file = judge_pipeline_args.get("input_file")
+    output_dir_path = judge_pipeline_args["output_dir"]
+    input_file = judge_pipeline_args.get("input_file")   # intentional sentinel
     if input_file is None:
-        input_dir = judge_pipeline_args.get("input_dir")
+        input_dir = judge_pipeline_args["input_dir"]
         script_args.append(f"--input-dir {input_dir}")

As per coding guidelines, "Do not use .get() for accessing dictionary keys if the code expects them to be present; use direct dictionary access dict[key] instead to allow proper error handling and fail fast with clear errors."

Also applies to: 100-100, 103-104


Signed-off-by: Dheeraj Peri <dperi@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemo_skills/pipeline/eval.py (1)

490-518: ⚠️ Potential issue | 🟡 Minor

has_tasks not set in the dynamic judge path — experiment may silently not run.

The LLM path (line 521) explicitly sets has_tasks = True before creating judge tasks. The dynamic path omits this. If job_batches is empty (all eval outputs already exist, so the main loop never sets has_tasks = True) and auto_summarize_results=False, pipeline_utils.run_exp at line 653 is never called, and the experiment silently no-ops despite dynamic judge tasks having been created and appended to all_tasks.

🐛 Proposed fix
             if judge_creator_path:
-                # Use locate() to dynamically load judge creator function
-                from nemo_skills.dataset.utils import locate
-
+                has_tasks = True
                 judge_creator_fn = locate(judge_creator_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/pipeline/eval.py` around lines 490 - 518, The dynamic
judge-creator branch (when judge_creator_path is used and judge_creator_fn
returns judge_tasks) never sets the has_tasks flag, so downstream
pipeline_utils.run_exp may be skipped; after calling judge_creator_fn and
confirming judge_tasks is non-empty (the judge_tasks variable created in this
block), set has_tasks = True and ensure those tasks are appended to all_tasks so
run_exp (pipeline_utils.run_exp) will execute just like the LLM path does.
🧹 Nitpick comments (1)
nemo_skills/pipeline/eval.py (1)

490-492: Hoist locate import out of the benchmark loop.

from nemo_skills.dataset.utils import locate is re-executed on every benchmark iteration that has a judge. Python's import cache makes it a no-op after the first call, but placing imports inside a loop is non-idiomatic and obscures the module's true dependencies.

♻️ Proposed fix
 from nemo_skills.pipeline.utils.eval import combine_cmds, prepare_eval_commands
+from nemo_skills.dataset.utils import locate
 from nemo_skills.utils import (

Then remove the inline import:

-            if judge_creator_path:
-                # Use locate() to dynamically load judge creator function
-                from nemo_skills.dataset.utils import locate
-
-                judge_creator_fn = locate(judge_creator_path)
+            if judge_creator_path:
+                judge_creator_fn = locate(judge_creator_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/pipeline/eval.py` around lines 490 - 492, The inline import "from
nemo_skills.dataset.utils import locate" inside the block that checks
judge_creator_path should be hoisted to module-level (or at least outside the
benchmark loop) so the import is declared once and dependencies are clear;
remove the inline import from the loop and add a single top-level import for
locate, leaving the existing logic that uses judge_creator_path and locate
unchanged.

Signed-off-by: Dheeraj Peri <dperi@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemo_skills/pipeline/eval.py (1)

511-546: ⚠️ Potential issue | 🟡 Minor

has_tasks is not set for the dynamic judge path.

Line 513 sets has_tasks = True only in the LLM judge branch. The dynamic judge branch (Lines 482–510) skips it. While main eval tasks will have already set has_tasks = True in practice, this is inconsistent and fragile if the flow is ever refactored.

Move the flag into the shared null-check block so both paths are covered:

Proposed fix
-            else:
-                # Use default LLM judge pipeline
-                has_tasks = True
-                judge_tasks = _create_llm_judge_tasks(
+            else:
+                # Use default LLM judge pipeline
+                judge_tasks = _create_llm_judge_tasks(
                     ...
                 )
             # _generate can return None when there are no jobs to run (e.g., outputs already exist)
             # Only record and extend when tasks are present to avoid NoneType errors
             if judge_tasks:
+                has_tasks = True
                 benchmark_to_judge_tasks[benchmark] = judge_tasks
                 all_tasks.extend(judge_tasks)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/pipeline/eval.py` around lines 511 - 546, The has_tasks flag is
only set inside the LLM judge branch; move setting has_tasks = True into the
shared post-generation null-check so it's set for both the LLM and dynamic judge
paths: after obtaining judge_tasks (from _create_llm_judge_tasks or the dynamic
judge generator) and before you assign benchmark_to_judge_tasks or extend
all_tasks, set has_tasks = True when judge_tasks is truthy; this ensures the
flag reflects any generated judge tasks regardless of which creation function
produced them.
🧹 Nitpick comments (1)
nemo_skills/pipeline/eval.py (1)

482-510: The judge module interface is well-documented but consider adding a discoverable reference at the call site.

Both nvembed_judge.py and comet_judge.py define create_judge_tasks() functions with comprehensive docstrings documenting all parameters and return types, matching exactly the 20 kwargs passed in eval.py lines 489–510. The interface contract is properly specified in the judge modules themselves.

However, external authors looking at eval.py won't see these docstrings without examining the judge modules. Consider adding a brief reference comment near line 481–484 (before the locate() call) pointing to the expected signature in the judge modules, or documenting the standard create_judge_tasks() interface in judges/__init__.py for improved discoverability.
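A discoverable interface stub along these lines could live in judges/__init__.py. The parameter list below is reconstructed from this review thread; treat it as a sketch of the contract, not the verified call-site signature in eval.py:

```python
# Sketch of an interface stub for nemo_skills/pipeline/judges/__init__.py.
# Parameter names follow the review discussion and may not match the
# actual call site exactly.
def create_judge_tasks(
    exp,
    expname,
    benchmark,
    judge_pipeline_args,
    output_dir,
    rerun_done=False,
    **pipeline_kwargs,
):
    """Contract every judge module is expected to implement.

    Returns the list of created tasks, or None when all outputs already
    exist and rerun_done is False.
    """
    raise NotImplementedError("implemented by comet_judge, nvembed_judge, ...")
```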

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/pipeline/eval.py` around lines 482 - 510, Add a discoverable
reference to the expected judge creator interface by inserting a short comment
before the locate() call in eval.py that names the standardized function and its
parameters (e.g., "expects create_judge_tasks(exp, expname, benchmarks,
judge_pipeline_args, rerun_done, log_dir, output_dir, cluster_config,
judge_server_gpus, judge_server_nodes, partition, run_after, reuse_code_exp,
reuse_code, dependent_tasks, all_tasks, _task_dependencies,
installation_command, skip_hf_home_check, sbatch_kwargs)") and/or add a concise
module-level docstring in judges/__init__.py that documents the
create_judge_tasks signature so external authors can discover the contract
without opening nvembed_judge.py or comet_judge.py; update references to
judge_creator_path, locate, and judge_creator_fn accordingly.

@Kipok Kipok left a comment

Could you please update a reference to judge_type in the docs in here /home/igitman/workspace/NeMo-Skills/docs/evaluation/multilingual.md?

Also can you please re-create from a branch, so that our tests can run?

benchmark_judge_type = judge_pipeline_args.pop("judge_type", judge_type)
# judge_step is a :: path to the judge creator function (locate() convention).
# Benchmarks set this directly in JUDGE_PIPELINE_ARGS; falls back to None for LLM judge.
judge_creator_path = judge_pipeline_args.pop("judge_step", None)
Collaborator

can we have this as an explicit pipeline parameter? Similar to how judge_type was? That way it will be simpler to update from command line

Collaborator Author

@peri044 peri044 commented Feb 24, 2026

Renamed the judge_step to judge_path and added it as a pipeline parameter.

exp=exp,
expname=expname,
benchmark=benchmark,
benchmarks=[benchmark],
Collaborator

what's the reason to have multiple benchmarks here?

Collaborator Author

No specific reason. Reverted this.
