Stepheng/prover gpt oss fix by stephencge · Pull Request #1114 · NVIDIA-NeMo/Skills

stephencge · 2025-12-16T00:02:24Z

update prover to work with gpt-oss format

Summary by CodeRabbit

New Features
- Enhanced support for parsing and handling reasoning/thinking content from model outputs.
- Improved message structure consistency across generation workflows with optional reasoning fields.
Bug Fixes
- Added fallback parsing behavior to ensure content extraction when standard parsing fails.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Add _make_assistant_message helper to properly handle reasoning_content from models like gpt-oss that output <|channel|> tags. The chat template expects these in a 'thinking' field rather than embedded in 'content'. This fix is backwards compatible - the 'thinking' field is only added when reasoning_content is present in the generation output. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Stephen Ge <stepheng@nvidia.com>

The gpt-oss model outputs <|channel|> tags embedded in the content field, which causes apply_chat_template to fail. The chat template expects analysis/thinking content in a separate 'thinking' field. Added _parse_gpt_oss_output() to extract: - Analysis content (between <|channel|>analysis<|message|> and <|end|>) -> goes to 'thinking' field - Final content (after <|channel|>final<|message|>) -> goes to 'content' field Updated _make_assistant_message() to automatically parse gpt-oss format when reasoning_content is not provided by the server. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Stephen Ge <stepheng@nvidia.com>

coderabbitai · 2025-12-16T00:04:56Z

📝 Walkthrough

Walkthrough

The changes introduce helper methods to parse GPT-OSS formatted outputs and construct standardized assistant messages with optional reasoning content, integrating reasoning handling throughout the data-point generation flow in the prover task.

Changes

Cohort / File(s)	Summary
ProverTask Reasoning Support `nemo_skills/inference/prover.py`	Added `_parse_gpt_oss_output()` method to extract thinking and final content from gpt-oss tagged outputs. Added `_make_assistant_message()` method to construct standardized assistant message dicts with optional thinking field. Integrated reasoning content handling into the data-point generation flow, replacing direct message creation with calls to the new helper to ensure consistent structure and optional thinking inclusion. Updated message finalization to preserve or omit reasoning content based on availability.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

Pay attention to all integration points where _make_assistant_message() is now called throughout the data-point generation flow to ensure reasoning content is correctly captured and passed
Verify the parsing fallback logic in _parse_gpt_oss_output() handles edge cases where final content cannot be extracted from channel tags
Check that both prompt_turn_list and full_prompt_turn_list are consistently updated with the new message structure including optional thinking fields

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Title check	❓ Inconclusive	The title 'Stepheng/prover gpt oss fix' is vague and uses branch naming conventions rather than describing the actual change. It doesn't clearly convey what the fix accomplishes.	Consider renaming to something like 'Add GPT-OSS parsing and reasoning support to prover' to better describe the actual changes being made.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch stepheng/prover-gpt-oss-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

nemo_skills/inference/prover.py (1)
209-209: Remove redundant re import.

The re module is already imported at line 16, so this local import is unnecessary.
-        import re
-
         # Check if the content contains gpt-oss channel tags

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 294b552 and 3c25ee1.

📒 Files selected for processing (1)

nemo_skills/inference/prover.py (3 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: pre-commit
GitHub Check: unit-tests

🔇 Additional comments (3)

nemo_skills/inference/prover.py (3)

236-252: LGTM!

The helper method is well-structured with clear documentation and appropriate fallback behavior for parsing GPT-OSS output when reasoning_content isn't provided.

302-306: LGTM!

The integration correctly captures the optional reasoning_content from the generation and passes it to the new helper method.

316-329: LGTM, with a minor note on shared reference.

The refactoring correctly uses the helper method. The distinction between prompt_turn_list (code-only when remove_cot) and full_prompt_turn_list (full generation with reasoning) is preserved correctly.

Note: Lines 327-329 share the same assistant_msg dict reference in both lists. This is fine given current usage (append-only), but could cause subtle issues if these dicts are mutated elsewhere.

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>

Signed-off-by: dgitman <dgitman@nvidia.com>

stephencge and others added 2 commits December 15, 2025 12:59

stephencge requested review from Kipok and gwarmstrong December 16, 2025 00:02

coderabbitai bot reviewed Dec 16, 2025

View reviewed changes

stephencge enabled auto-merge (squash) December 16, 2025 00:39

Kipok approved these changes Dec 16, 2025

View reviewed changes

Merge branch 'main' into stepheng/prover-gpt-oss-fix

0b6a141

stephencge merged commit 9ddefa5 into main Dec 16, 2025
5 checks passed

stephencge deleted the stepheng/prover-gpt-oss-fix branch December 16, 2025 02:32

wasiahmad pushed a commit that referenced this pull request Dec 19, 2025

Stepheng/prover gpt oss fix (#1114)

1faf905

wasiahmad pushed a commit that referenced this pull request Dec 19, 2025

Stepheng/prover gpt oss fix (#1114)

579e765

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026

Stepheng/prover gpt oss fix (#1114)

30457b8

Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>

wasiahmad pushed a commit that referenced this pull request Feb 4, 2026

Stepheng/prover gpt oss fix (#1114)

e9ad754

coderabbitai bot mentioned this pull request Mar 2, 2026

fix: clean parse-failure retries in prover #1284

Merged

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

Stepheng/prover gpt oss fix (#1114)

a14b5ec

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

Stepheng/prover gpt oss fix (#1114)

4a5d309

Signed-off-by: dgitman <dgitman@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stepheng/prover gpt oss fix#1114

Stepheng/prover gpt oss fix#1114
stephencge merged 3 commits intomainfrom
stepheng/prover-gpt-oss-fix

stephencge commented Dec 16, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 16, 2025

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

stephencge commented Dec 16, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 16, 2025

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

stephencge commented Dec 16, 2025 •

edited by coderabbitai bot

Loading