Skip to content

Stepheng/prover gpt oss fix#1114

Merged
stephencge merged 3 commits intomainfrom
stepheng/prover-gpt-oss-fix
Dec 16, 2025
Merged

Stepheng/prover gpt oss fix#1114
stephencge merged 3 commits intomainfrom
stepheng/prover-gpt-oss-fix

Conversation

@stephencge
Copy link
Collaborator

@stephencge stephencge commented Dec 16, 2025

update prover to work with gpt-oss format

Summary by CodeRabbit

  • New Features

    • Enhanced support for parsing and handling reasoning/thinking content from model outputs.
    • Improved message structure consistency across generation workflows with optional reasoning fields.
  • Bug Fixes

    • Added fallback parsing behavior to ensure content extraction when standard parsing fails.

✏️ Tip: You can customize this high-level summary in your review settings.

stephencge and others added 2 commits December 15, 2025 12:59
Add _make_assistant_message helper to properly handle reasoning_content
from models like gpt-oss that output <|channel|> tags. The chat template
expects these in a 'thinking' field rather than embedded in 'content'.

This fix is backwards compatible - the 'thinking' field is only added
when reasoning_content is present in the generation output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Stephen Ge <stepheng@nvidia.com>
The gpt-oss model outputs <|channel|> tags embedded in the content field,
which causes apply_chat_template to fail. The chat template expects
analysis/thinking content in a separate 'thinking' field.

Added _parse_gpt_oss_output() to extract:
- Analysis content (between <|channel|>analysis<|message|> and <|end|>)
  -> goes to 'thinking' field
- Final content (after <|channel|>final<|message|>)
  -> goes to 'content' field

Updated _make_assistant_message() to automatically parse gpt-oss format
when reasoning_content is not provided by the server.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Stephen Ge <stepheng@nvidia.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

📝 Walkthrough

Walkthrough

The changes introduce helper methods to parse GPT-OSS formatted outputs and construct standardized assistant messages with optional reasoning content, integrating reasoning handling throughout the data-point generation flow in the prover task.

Changes

Cohort / File(s) Summary
ProverTask Reasoning Support
nemo_skills/inference/prover.py
Added _parse_gpt_oss_output() method to extract thinking and final content from gpt-oss tagged outputs. Added _make_assistant_message() method to construct standardized assistant message dicts with optional thinking field. Integrated reasoning content handling into the data-point generation flow, replacing direct message creation with calls to the new helper to ensure consistent structure and optional thinking inclusion. Updated message finalization to preserve or omit reasoning content based on availability.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

  • Pay attention to all integration points where _make_assistant_message() is now called throughout the data-point generation flow to ensure reasoning content is correctly captured and passed
  • Verify the parsing fallback logic in _parse_gpt_oss_output() handles edge cases where final content cannot be extracted from channel tags
  • Check that both prompt_turn_list and full_prompt_turn_list are consistently updated with the new message structure including optional thinking fields

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Stepheng/prover gpt oss fix' is vague and uses branch naming conventions rather than describing the actual change. It doesn't clearly convey what the fix accomplishes. Consider renaming to something like 'Add GPT-OSS parsing and reasoning support to prover' to better describe the actual changes being made.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch stepheng/prover-gpt-oss-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
nemo_skills/inference/prover.py (1)

209-209: Remove redundant re import.

The re module is already imported at line 16, so this local import is unnecessary.

-        import re
-
         # Check if the content contains gpt-oss channel tags
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 294b552 and 3c25ee1.

📒 Files selected for processing (1)
  • nemo_skills/inference/prover.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: pre-commit
  • GitHub Check: unit-tests
🔇 Additional comments (3)
nemo_skills/inference/prover.py (3)

236-252: LGTM!

The helper method is well-structured with clear documentation and appropriate fallback behavior for parsing GPT-OSS output when reasoning_content isn't provided.


302-306: LGTM!

The integration correctly captures the optional reasoning_content from the generation and passes it to the new helper method.


316-329: LGTM, with a minor note on shared reference.

The refactoring correctly uses the helper method. The distinction between prompt_turn_list (code-only when remove_cot) and full_prompt_turn_list (full generation with reasoning) is preserved correctly.

Note: Lines 327-329 share the same assistant_msg dict reference in both lists. This is fine given current usage (append-only), but could cause subtle issues if these dicts are mutated elsewhere.

@stephencge stephencge enabled auto-merge (squash) December 16, 2025 00:39
@stephencge stephencge merged commit 9ddefa5 into main Dec 16, 2025
5 checks passed
@stephencge stephencge deleted the stepheng/prover-gpt-oss-fix branch December 16, 2025 02:32
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026
Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Feb 4, 2026
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: dgitman <dgitman@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants