fix: IFBench error handling and build improvements #1073

Merged
gwarmstrong merged 2 commits into main from georgea/add-back-ifbench-error-escape
Dec 4, 2025

@gwarmstrong gwarmstrong commented Dec 4, 2025

Summary

Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
@coderabbitai coderabbitai bot changed the title from "@coderabbitai title" to "build: Improve IFBench fetching and error handling" Dec 4, 2025
coderabbitai bot commented Dec 4, 2025

Walkthrough

The changes introduce parametrized, commit-specific IFBench fetching in the Dockerfile and enhance error handling in the IFBench evaluation library by wrapping instruction checks in try/except blocks with logging, while disabling the en_core_web_sm download.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docker Build Configuration: `dockerfiles/Dockerfile.nemo-skills` | Replaces git clone + per-directory pip install with a parametrized, commit-specific approach. Introduces ARGs for `IFBENCH_COMMIT`, `IFBENCH_REPO`, and `IFBENCH_DIR`; initializes the target directory, sets the remote, fetches the specified commit, and hard-resets to `FETCH_HEAD`. |
| IFBench Patch – Error Handling & Logging: `dockerfiles/ifbench.patch` | Modifies `evaluation_lib.py` to wrap `instruction.check_following()` calls in try/except blocks with logging in both the strict and loose instruction-following tests; skips response logging on exception. Disables the `en_core_web_sm` download in `instructions.py`. Adds minor formatting. |
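For readers who want the commit-pinned fetch spelled out, here is a minimal sketch of the four steps the summary describes (init, set remote, fetch the commit, hard-reset to `FETCH_HEAD`). It only builds the git command lines and does not run them. The function name, the remote name `origin`, the example repository URL, and the placeholder commit are assumptions for illustration; the actual Dockerfile may phrase these steps differently.

```python
import shlex


def pinned_checkout_commands(repo_url, commit, target_dir):
    """Return, in order, the git commands for a commit-pinned checkout.

    Hypothetical helper: nothing is executed here, the commands are only built.
    """
    git_in_dir = ["git", "-C", target_dir]
    return [
        ["git", "init", target_dir],                         # initialize the target directory
        git_in_dir + ["remote", "add", "origin", repo_url],  # set the remote
        git_in_dir + ["fetch", "origin", commit],            # fetch only the requested commit
        git_in_dir + ["reset", "--hard", "FETCH_HEAD"],      # check out exactly what was fetched
    ]


# Placeholder values standing in for IFBENCH_REPO, IFBENCH_COMMIT and IFBENCH_DIR.
for cmd in pinned_checkout_commands(
    "https://example.com/IFBench.git", "0123abc", "/opt/benchmarks/IFBench"
):
    print(shlex.join(cmd))
```

Because every command takes the same `target_dir`, overriding the directory in one place keeps all steps consistent.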

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Dockerfile ARG introduction and git fetch/reset flow require verification of commit-pinning logic
  • Try/except placement in evaluation_lib.py and exception handling flow in both test functions need careful review
  • Confirm that skipping the en_core_web_sm download in instructions.py doesn't break downstream spaCy dependencies

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately reflects the main changes: error handling improvements in IFBench code (try/except blocks, logging) and build improvements in the Dockerfile (parametrized checkout approach). |
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
dockerfiles/Dockerfile.nemo-skills (1)

40-45: Use IFBENCH_DIR consistently or drop it as a configurable ARG

The commit-pinned checkout flow looks fine, but IFBENCH_DIR is only used in these two RUN steps. Later instructions (COPY and RUN cd /opt/benchmarks/IFBench && git apply ifbench.patch) still hard-code the path, so overriding --build-arg IFBENCH_DIR=... would break the build.

Either (a) replace those hard-coded /opt/benchmarks/IFBench usages with ${IFBENCH_DIR}, or (b) remove IFBENCH_DIR as a build-time override and keep the path fixed to avoid misleading configurability.

dockerfiles/ifbench.patch (1)

72-77: Prefer logging over print for the en_core_web_sm download notice

Commenting out the download('en_core_web_sm') call makes sense given the image now pre-downloads the model, but a bare print at import time can be noisy in library contexts.

Consider using the module logger instead, and optionally gating the message behind a debug/info level:

-# assumed to be predownloaded
-print("skipping download of en_core_web_sm")
-# download('en_core_web_sm')
+# assumed to be predownloaded
+logger = logging.getLogger(__name__)
+logger.info("Skipping download of en_core_web_sm (assumed preinstalled)")
+# download('en_core_web_sm')

(Adjust if a logger is already defined earlier in the file.)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbfad3d and 9f891bd.

📒 Files selected for processing (2)
  • dockerfiles/Dockerfile.nemo-skills (1 hunks)
  • dockerfiles/ifbench.patch (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests
🔇 Additional comments (2)
dockerfiles/ifbench.patch (2)

39-57: Exception handling in the loose-following test looks solid

Wrapping instruction.check_following(r) in a try/except, skipping empty responses explicitly, and logging via logging.exception prevents a single bad response or buggy checker from aborting the entire evaluation run while still surfacing the stack trace.

The behavior for non-empty, non-exceptional responses remains the same (first response that passes check_following marks the instruction as followed), so this change is safe.
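The pattern described above could look roughly like the sketch below. This is not the patched `evaluation_lib.py`; `follows_loosely`, `MinWordsInstruction`, and `BuggyInstruction` are hypothetical names used only to show how an exception in one checker is logged and treated as "not followed" rather than aborting the run.

```python
import logging

logger = logging.getLogger(__name__)


class MinWordsInstruction:
    """Hypothetical checker: passes when the response has at least `n` words."""

    def __init__(self, n):
        self.n = n

    def check_following(self, response):
        return len(response.split()) >= self.n


class BuggyInstruction:
    """Hypothetical checker that always raises, standing in for a buggy one."""

    def check_following(self, response):
        raise ValueError("checker bug")


def follows_loosely(instruction, responses):
    """Return True if any non-empty response passes the checker.

    A checker that raises is logged with its stack trace and counted as
    "not followed" for that response instead of stopping the evaluation.
    """
    for r in responses:
        if not r.strip():
            continue  # skip empty responses explicitly
        try:
            if instruction.check_following(r):
                return True  # first passing response marks the instruction as followed
        except Exception:
            logger.exception("check_following failed; treating response as not followed")
    return False


print(follows_loosely(MinWordsInstruction(3), ["", "too short", "this one is long enough"]))
print(follows_loosely(BuggyInstruction(), ["anything"]))
```

The first call succeeds on the third response; the second logs the `ValueError` and reports the instruction as not followed, so the loop over the remaining examples can continue.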


61-65: Trailing newline in print_report is harmless

Adding a blank line after the report loop doesn’t affect correctness and only tweaks formatting of the printed report. No changes needed.

@gwarmstrong gwarmstrong changed the title from "build: Improve IFBench fetching and error handling" to "fix: IFBench error handling and build improvements" Dec 4, 2025
@gwarmstrong gwarmstrong merged commit 9115aef into main Dec 4, 2025
5 checks passed
@gwarmstrong gwarmstrong deleted the georgea/add-back-ifbench-error-escape branch December 4, 2025 18:33
Jorjeous pushed a commit that referenced this pull request Dec 11, 2025
Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Dec 12, 2025
Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
wasiahmad pushed a commit that referenced this pull request Feb 4, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>