add swe-rebench to excluded datasets by gwarmstrong · Pull Request #1154 · NVIDIA-NeMo/Skills

gwarmstrong · 2026-01-06T19:20:00Z

Summary by CodeRabbit

Tests
- "swe-rebench" dataset is now excluded from evaluation runs.

Signed-off-by: George Armstrong <georgea@nvidia.com>

coderabbitai · 2026-01-06T19:22:16Z

📝 Walkthrough

The pull request adds "swe-rebench" to the EXCLUDED_DATASETS list in the GPU test evaluation script. This prevents the dataset from being processed during dataset preparation and evaluation phases.
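As a rough illustration of the change described above, the sketch below shows what an exclusion set with the new entry might look like and how it filters a list of dataset names. Only `"swe-bench"` and `"swe-rebench"` come from this PR page. The real `EXCLUDED_DATASETS` in `tests/gpu-tests/test_eval.py` contains other entries not shown here, the review comments disagree on whether it is a list or a set (a set is assumed), and `filter_excluded` and the name `"example-dataset"` are hypothetical.

```python
# Sketch only: the real collection in tests/gpu-tests/test_eval.py has more
# entries that are not visible on this PR page. A set is assumed here.
EXCLUDED_DATASETS = {
    "swe-bench",
    "swe-rebench",  # the entry added by this PR
}


def filter_excluded(dataset_names):
    """Hypothetical helper: drop excluded names and sort for stable output."""
    return sorted(name for name in dataset_names if name not in EXCLUDED_DATASETS)


# "example-dataset" is a placeholder name used only for illustration.
print(filter_excluded(["example-dataset", "swe-bench", "swe-rebench"]))
```

With the new entry in place, only the placeholder dataset survives the filter.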

Changes

Cohort / File(s)	Summary
Test Dataset Configuration `tests/gpu-tests/test_eval.py`	Added "swe-rebench" to EXCLUDED_DATASETS list to skip this dataset during evaluation.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

FIX ioi ignore #1131: Modifies the same EXCLUDED_DATASETS set in tests/gpu-tests/test_eval.py, consolidating ioi entries.

Suggested labels

run GPU tests

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately describes the main change: adding 'swe-rebench' to the excluded datasets list in the test file.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c039f5 and 1fc4fb2.

📒 Files selected for processing (1)

tests/gpu-tests/test_eval.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: gpu-tests-qwen
GitHub Check: Greptile Review
GitHub Check: pre-commit
GitHub Check: unit-tests

🔇 Additional comments (1)

tests/gpu-tests/test_eval.py (1)

40-40: LGTM!

The addition of "swe-rebench" to the exclusion list is correctly placed and follows the existing pattern. Grouping it with "swe-bench" makes logical sense.

greptile-apps · 2026-01-06T19:22:35Z

Greptile Summary

Added swe-rebench to the EXCLUDED_DATASETS set in test_eval.py:40. This exclusion follows the same pattern as swe-bench (line 39), which is appropriate because:

- SWE-rebench requires explicit parameters like `container_formatter`, `start_date`, and `end_date` in its `prepare.py` script.
- The dataset doesn't support the simple `max_samples` parameter used by the test suite.
- SWE-rebench was recently added in "Evaluation support for SWE-rebench" (#1102) and shares the same evaluation infrastructure as swe-bench.

The change maintains consistency with the existing exclusion policy stated in the comment on line 27: "These don't support max_samples, require explicit parameters, or are very heavy to prepare"
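The incompatibility behind the reasons listed above can be sketched with two stand-in functions. This is an assumption-heavy model: the real `prepare.py` files are scripts whose exact interfaces are not shown on this page, so both function signatures and the `try_prepare` helper are hypothetical. Only the parameter names `max_samples`, `container_formatter`, `start_date`, and `end_date` come from the review summary.

```python
def prepare_simple(max_samples=None):
    """Hypothetical stand-in for a dataset whose preparation accepts max_samples."""
    return {"max_samples": max_samples}


def prepare_swe_rebench_like(*, container_formatter, start_date, end_date):
    """Hypothetical stand-in: the parameter names are from the review summary,
    but the signature itself is assumed."""
    return {
        "container_formatter": container_formatter,
        "start_date": start_date,
        "end_date": end_date,
    }


def try_prepare(prepare_fn, **kwargs):
    """Call a prepare function the way a generic test might, reporting failure."""
    try:
        prepare_fn(**kwargs)
        return "ok"
    except TypeError as exc:
        # Raised when an argument is unexpected or a required one is missing.
        return f"failed: {type(exc).__name__}"


# A generic test that only knows about max_samples works for the simple case
# and fails for the dataset that needs explicit parameters.
print(try_prepare(prepare_simple, max_samples=10))
print(try_prepare(prepare_swe_rebench_like, max_samples=10))
```

A test loop that passes the same arguments to every dataset cannot satisfy the second function, which is the kind of mismatch the exclusion avoids.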

Confidence Score: 5/5

- This PR is safe to merge with minimal risk.
- The change is a single-line addition that correctly excludes swe-rebench from automated testing. The exclusion is justified and consistent with the existing pattern for swe-bench. No logical issues, syntax errors, or security concerns exist.
- No files require special attention.

Important Files Changed

Filename	Overview
tests/gpu-tests/test_eval.py	Added `swe-rebench` to excluded datasets list - consistent with swe-bench exclusion pattern

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Test as test_aaa_prepare_and_eval_all_datasets
    participant GDS as get_preparable_datasets()
    participant Datasets as Dataset Directory
    
    Dev->>Test: Run test suite
    Test->>GDS: Get list of preparable datasets
    GDS->>Datasets: Scan dataset directory
    Datasets-->>GDS: Return all datasets with prepare.py
    GDS->>GDS: Filter out EXCLUDED_DATASETS (includes swe-rebench)
    GDS-->>Test: Return filtered dataset list
    Test->>Test: Prepare and evaluate datasets
    Note over Test,GDS: swe-rebench now excluded<br/>like swe-bench (requires<br/>explicit parameters)
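
The flow in the diagram above (scan the dataset directory, keep entries that have a `prepare.py`, then drop excluded names) can be sketched as follows. This is not the implementation in `test_eval.py`, which is not shown on this page. The `datasets_dir` parameter, the `demo` helper, and the name `"example-dataset"` are assumptions made so the sketch runs on its own, and the exclusion set is reduced to the two names mentioned in this PR.

```python
import tempfile
from pathlib import Path

# Subset of the real exclusions; only these two names appear in this PR.
EXCLUDED_DATASETS = {"swe-bench", "swe-rebench"}


def get_preparable_datasets(datasets_dir):
    """Return sorted names of subdirectories that contain prepare.py,
    skipping anything listed in EXCLUDED_DATASETS.

    The datasets_dir parameter is an assumption made for this sketch.
    """
    datasets_dir = Path(datasets_dir)
    found = [
        child.name
        for child in datasets_dir.iterdir()
        if child.is_dir() and (child / "prepare.py").is_file()
    ]
    return sorted(name for name in found if name not in EXCLUDED_DATASETS)


def demo():
    """Build a throwaway directory tree and run the filter against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("example-dataset", "swe-bench", "swe-rebench"):
            (root / name).mkdir()
            (root / name / "prepare.py").write_text("# placeholder\n")
        # A directory without prepare.py is never considered preparable.
        (root / "no-prepare").mkdir()
        return get_preparable_datasets(root)


print(demo())
```

Keeping the exclusion as a filter on names means a newly added dataset directory is picked up by the test automatically unless it is explicitly listed.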

add swe-rebench to excluded datasets

1fc4fb2

Signed-off-by: George Armstrong <georgea@nvidia.com>

gwarmstrong added the run GPU tests label Jan 6, 2026

gwarmstrong merged commit a04f8e0 into main Jan 6, 2026
6 of 7 checks passed

gwarmstrong deleted the georgea/fix-integration-tests-swe-rebench branch January 6, 2026 20:08

blahblahasdf pushed a commit to blahblahasdf/Skills that referenced this pull request Jan 8, 2026

add swe-rebench to excluded datasets (NVIDIA-NeMo#1154)

4b49e26

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dlord <dlord@nvidia.com>

hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026

add swe-rebench to excluded datasets (#1154)

3220591

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>

coderabbitai bot mentioned this pull request Feb 11, 2026

Add CritPt benchmark #1200

Merged

coderabbitai bot mentioned this pull request Feb 24, 2026

Exclude numb3rs form test_eval.py #1275

Merged

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

add swe-rebench to excluded datasets (#1154)

3ceddd9

Signed-off-by: George Armstrong <georgea@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

add swe-rebench to excluded datasets (#1154)

e58d0e1

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>
