fix: IFBench error handling and build improvements by gwarmstrong · Pull Request #1073 · NVIDIA-NeMo/Skills

gwarmstrong · 2025-12-04T18:15:02Z

Summary

Re-applied IFBench error handling from "Add try-except to catch any errors for ifbench" (#947)
Improved Docker build reproducibility by pinning IFBench to a fixed commit

Signed-off-by: George Armstrong <georgea@nvidia.com>

coderabbitai · 2025-12-04T18:21:26Z

Walkthrough

The changes introduce parametrized, commit-specific IFBench fetching in the Dockerfile and enhance error handling in the IFBench evaluation library by wrapping instruction checks in try/except blocks with logging, while disabling the en_core_web_sm download.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Docker Build Configuration: `dockerfiles/Dockerfile.nemo-skills` | Replaces git clone + per-directory pip install with parametrized commit-specific approach. Introduces ARGs for `IFBENCH_COMMIT`, `IFBENCH_REPO`, and `IFBENCH_DIR`; initializes target directory, sets remote, fetches specified commit, and hard-resets to FETCH_HEAD. |
| IFBench Patch (Error Handling & Logging): `dockerfiles/ifbench.patch` | Modifies `evaluation_lib.py` to wrap `instruction.check_following()` calls in try/except blocks with logging in both strict and loose instruction-following tests; skips response logging on exception. Disables `en_core_web_sm` download in `instructions.py`. Adds minor formatting. |
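
The fetch flow summarized in the table can be sketched as a sequence of git commands. The helper below is only an illustration, written in Python so it can be checked without touching the network: it builds the argument lists and executes nothing. The function name, the example URL, the example commit, and the `--depth 1` flag are assumptions, not taken from the Dockerfile.

```python
from typing import List


def pinned_checkout_commands(repo: str, commit: str, target_dir: str) -> List[List[str]]:
    """Build (but do not run) the git commands for a commit-pinned checkout."""
    git = ["git", "-C", target_dir]  # run each later command inside target_dir
    return [
        ["git", "init", target_dir],                        # initialize the target directory
        git + ["remote", "add", "origin", repo],            # set the remote
        git + ["fetch", "--depth", "1", "origin", commit],  # fetch only the pinned commit (depth is an assumption)
        git + ["reset", "--hard", "FETCH_HEAD"],            # hard-reset to what was just fetched
    ]


# Placeholder values for illustration only.
commands = pinned_checkout_commands(
    repo="https://example.com/IFBench.git",
    commit="0123abc",
    target_dir="/opt/benchmarks/IFBench",
)
for command in commands:
    print(" ".join(command))
```

Pinning to a commit rather than a branch means a rebuild always sees the same IFBench sources, which is what keeps a patch file applicable across builds.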

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Dockerfile ARG introduction and git fetch/reset flow require verification of commit-pinning logic
Try/except placement in evaluation_lib.py and exception handling flow in both test functions need careful review
Confirm that skipping the en_core_web_sm download in instructions.py doesn't break downstream spaCy dependencies
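
One way to approach the last check, assuming the spaCy model is installed as a regular Python package (as spaCy's packaged models are), is to test whether it is importable before relying on the skipped download. This is a sketch; `model_installed` is a hypothetical helper, not part of IFBench or NeMo-Skills.

```python
import importlib.util


def model_installed(package_name: str = "en_core_web_sm") -> bool:
    """Return True if the named top-level package can be found, without importing it."""
    return importlib.util.find_spec(package_name) is not None


if not model_installed():
    print("en_core_web_sm is not installed; code that loads it would fail")
```

Running this inside the built image would show whether the "assumed to be predownloaded" comment in the patch actually holds.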

Possibly related PRs

Add try-except to catch any errors for ifbench #947 – Modifies the same ifbench.patch files with try/except guards and logging around instruction checking
Fix ifbench dependency #925 – Updates dockerfiles/Dockerfile.nemo-skills for IFBench setup changes
Fixing ifbench #921 – Patches ifbench.patch to skip en_core_web_sm download and alter error handling in evaluation

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately reflects the main changes: error handling improvements in IFBench code (try/except blocks, logging) and build improvements in the Dockerfile (parametrized checkout approach). |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

dockerfiles/Dockerfile.nemo-skills (1)

40-45: Use IFBENCH_DIR consistently or drop it as a configurable ARG

The commit-pinned checkout flow looks fine, but IFBENCH_DIR is only used in these two RUN steps. Later instructions (COPY and RUN cd /opt/benchmarks/IFBench && git apply ifbench.patch) still hard-code the path, so overriding --build-arg IFBENCH_DIR=... would break the build.

Either (a) replace those hard-coded /opt/benchmarks/IFBench usages with ${IFBENCH_DIR}, or (b) remove IFBENCH_DIR as a build-time override and keep the path fixed to avoid misleading configurability.

dockerfiles/ifbench.patch (1)

72-77: Prefer logging over print for the en_core_web_sm download notice

Commenting out the download('en_core_web_sm') call makes sense given the image now pre-downloads the model, but a bare print at import time can be noisy in library contexts.

Consider using the module logger instead, and optionally gating the message behind a debug/info level:

-# assumed to be predownloaded
-print("skipping download of en_core_web_sm")
-# download('en_core_web_sm')
+# assumed to be predownloaded
+logger = logging.getLogger(__name__)
+logger.info("Skipping download of en_core_web_sm (assumed preinstalled)")
+# download('en_core_web_sm')

(Adjust if a logger is already defined earlier in the file; this also assumes `logging` is imported there.)
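
To make the difference from a bare print concrete, here is a small self-contained sketch showing that an info-level notice is silent when the logger is set to WARNING and appears only once the level is lowered. The logger name `ifbench_demo` and the helper `emit_notice` are hypothetical and exist only for this illustration.

```python
import io
import logging

logger = logging.getLogger("ifbench_demo")  # hypothetical logger name


def emit_notice(level: int) -> str:
    """Log the notice with the logger set to `level` and return the captured text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setLevel(level)
    previous_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        logger.info("Skipping download of en_core_web_sm (assumed preinstalled)")
    finally:
        # Restore the logger so repeated calls do not accumulate handlers.
        logger.removeHandler(handler)
        logger.setLevel(previous_level)
    return buffer.getvalue()


quiet = emit_notice(logging.WARNING)  # info message is filtered out
verbose = emit_notice(logging.INFO)   # info message is emitted
print(repr(quiet), repr(verbose))
```

With a print, every importer sees the message; with a logger, the application decides.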

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbfad3d and 9f891bd.

📒 Files selected for processing (2)

dockerfiles/Dockerfile.nemo-skills (1 hunks)
dockerfiles/ifbench.patch (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: unit-tests

🔇 Additional comments (2)

dockerfiles/ifbench.patch (2)

39-57: Exception handling in the loose-following test looks solid

Wrapping instruction.check_following(r) in a try/except, skipping empty responses explicitly, and logging via logging.exception prevents a single bad response or buggy checker from aborting the entire evaluation run while still surfacing the stack trace.

The behavior for non-empty, non-exceptional responses remains the same (first response that passes check_following marks the instruction as followed), so this change is safe.
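
The pattern described in this comment can be sketched roughly as follows. This is not the patched `evaluation_lib.py` code; `followed_by_any` and `FlakyChecker` are hypothetical names used to illustrate skipping empty responses, catching checker errors, and logging the stack trace.

```python
import logging
from typing import Iterable

logger = logging.getLogger(__name__)


def followed_by_any(instruction, responses: Iterable[str]) -> bool:
    """Return True if any non-empty response passes instruction.check_following."""
    for response in responses:
        if not response.strip():
            continue  # skip empty responses explicitly
        try:
            if instruction.check_following(response):
                return True  # first passing response marks the instruction as followed
        except Exception:
            # Log the stack trace and move on instead of aborting the whole run.
            logger.exception("check_following raised; treating this response as not followed")
    return False


class FlakyChecker:
    """Hypothetical checker that raises on one specific input."""

    def check_following(self, response: str) -> bool:
        if response == "boom":
            raise ValueError("simulated checker bug")
        return response == "ok"


result = followed_by_any(FlakyChecker(), ["", "boom", "ok"])
print(result)
```

Here the empty string is skipped, the failing check on "boom" is logged rather than raised, and the later "ok" response still counts.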

61-65: Trailing newline in print_report is harmless

Adding a blank line after the report loop doesn’t affect correctness and only tweaks formatting of the printed report. No changes needed.

gwarmstrong added 2 commits December 4, 2025 09:34

MAINT restore ifbench.patch

503612f

Signed-off-by: George Armstrong <georgea@nvidia.com>

MAINT update diffs for ifbench patch

9f891bd

Signed-off-by: George Armstrong <georgea@nvidia.com>

coderabbitai bot changed the title ~~@coderabbitai title~~ build: Improve IFBench fetching and error handling Dec 4, 2025

coderabbitai bot reviewed Dec 4, 2025

gwarmstrong changed the title ~~build: Improve IFBench fetching and error handling~~ fix: IFBench error handling and build improvements Dec 4, 2025

gwarmstrong merged commit 9115aef into main Dec 4, 2025
5 checks passed

gwarmstrong deleted the georgea/add-back-ifbench-error-escape branch December 4, 2025 18:33

Jorjeous pushed a commit that referenced this pull request Dec 11, 2025

fix: IFBench error handling and build improvements (#1073)

75bf380

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

wasiahmad pushed a commit that referenced this pull request Dec 12, 2025

fix: IFBench error handling and build improvements (#1073)

6cf5fb9

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

wasiahmad pushed a commit that referenced this pull request Feb 4, 2026

fix: IFBench error handling and build improvements (#1073)

1b1f66e

Signed-off-by: George Armstrong <georgea@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

fix: IFBench error handling and build improvements (#1073)

a4bf4e7

Signed-off-by: George Armstrong <georgea@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

fix: IFBench error handling and build improvements (#1073)

0295d20

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>
