Skip to content

fix: dpo mistral nightly needs more time#1225

Merged
chtruong814 merged 1 commit intomainfrom
tk/dpo-mistral-needs-more-time
Sep 29, 2025
Merged

fix: dpo mistral nightly needs more time#1225
chtruong814 merged 1 commit intomainfrom
tk/dpo-mistral-needs-more-time

Conversation

@terrykong
Copy link
Copy Markdown
Collaborator

@terrykong terrykong commented Sep 29, 2025

There seems to be a jump in the total step time according to our CI

image

https://wandb.ai/nvidia/nemo-rl/panel/z7zystoxj?nw=sorg4p7b5x

But upon inspecting the commit range there weren't any suspicious PRs
image

This perf jump may just be infra related since i don't see it with other runs
image

https://wandb.ai/nvidia/nemo-rl/panel/z7zystoxj?nw=np5jodktgg

For now, just give this test slightly more time so it can run till completion since 100 steps has been taking 45min exactly

Summary by CodeRabbit

  • Documentation
    • Standardized tips/notes/important blocks to GitHub-style admonitions for consistent look and readability.
    • Cleaned up text wrapping and code fences; aligned examples and line breaks.
    • Clarified GPU recommendation tip presentation without changing guidance.
    • Merged environment variable assignment and command into a single profiling example for easier copy-paste.
  • Refactor
    • Improved docs build to convert admonitions during content processing, ensuring consistent formatting across sources.

@terrykong terrykong requested a review from a team as a code owner September 29, 2025 05:52
@terrykong terrykong added the CI:docs Run doctest label Sep 29, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Sep 29, 2025
@terrykong terrykong enabled auto-merge (squash) September 29, 2025 05:52
@terrykong terrykong closed this Sep 29, 2025
auto-merge was automatically disabled September 29, 2025 05:55

Pull request was closed

@terrykong terrykong force-pushed the tk/dpo-mistral-needs-more-time branch from 1de1fde to 5166d74 Compare September 29, 2025 05:55
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong reopened this Sep 29, 2025
@terrykong terrykong requested a review from a team as a code owner September 29, 2025 05:56
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Sep 29, 2025
@terrykong terrykong added documentation Improvements or additions to documentation CI:L0 Run doctests and unit tests and removed CI:docs Run doctest labels Sep 29, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Sep 29, 2025

📝 Walkthrough

Walkthrough

Documentation markdown admonitions were standardized across multiple docs/*.md files. The Sphinx docs build hook in docs/conf.py was refactored to use an in-place conversion helper and now runs on both include-read and source-read events via lightweight wrapper adapters.

Changes

Cohort / File(s) Summary
Docs admonition formatting
docs/cluster.md, docs/design-docs/logger.md, docs/nsys-profiling.md, docs/testing.md
Replaced MyST-style ::: blocks (tip/note/important) with blockquote admonitions ([!TIP]/[!NOTE]/[!IMPORTANT]); minor text wrapping and code block alignment adjustments; no content changes.
Sphinx admonition conversion refactor
docs/conf.py
Introduced _convert_gh_admonitions_inplace(contents) and wrapper adapters; connected handlers to "include-read" and new "source-read"; logic continues to convert GitHub-style admonitions to MyST in place; GitHub path transformation unchanged.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Dev as Sphinx App
    participant SR as source-read handler
    participant IR as include-read handler
    participant Conv as _convert_gh_admonitions_inplace

    Dev->>SR: source-read(docname, source[])
    SR->>Conv: mutate(source[])
    Conv-->>SR: return

    Dev->>IR: include-read(relative_path, parent_docname, contents[])
    IR->>Conv: mutate(contents[])
    Conv-->>IR: return
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

CI:docs

Suggested reviewers

  • chtruong814

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title concisely describes the primary change—extending time for the DPO Mistral nightly job—and uses conventional commit styling, aligning with the pull request objective of increasing allotted CI time. It is specific and clearly related to the main modifications without extraneous details.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Test Results For Major Changes ✅ Passed The PR primarily refactors documentation admonition handling and rewraps existing docs into the new Markdown tip/note syntax, which does not introduce new features or risk product functionality. Because these are minor documentation and tooling adjustments rather than major behavioral changes, the custom check does not require explicit test evidence in the description. Therefore, the condition for failing the check is not met.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch tk/dpo-mistral-needs-more-time

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5166d74 and 1de1fde.

📒 Files selected for processing (5)
  • docs/cluster.md (4 hunks)
  • docs/conf.py (3 hunks)
  • docs/design-docs/logger.md (1 hunks)
  • docs/nsys-profiling.md (1 hunks)
  • docs/testing.md (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
docs/**/*.md

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

When a markdown doc under docs/**/*.md is added or renamed, update docs/index.md to include it in the appropriate section

Files:

  • docs/cluster.md
  • docs/design-docs/logger.md
  • docs/testing.md
  • docs/nsys-profiling.md
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts

Files:

  • docs/conf.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Docs_Tests
  • GitHub Check: Docs_Tests
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (11)
docs/design-docs/logger.md (1)

174-183: Admonition conversion looks good

Using > [!NOTE] aligns with the conf.py converter and MyST. No issues spotted in the converted block.

docs/nsys-profiling.md (1)

66-72: Clearer Megatron profiling instructions

Good switch to > [!IMPORTANT] and folding LD_LIBRARY_PATH into the one-liner. Reads cleanly and should reduce copy/paste errors.

docs/testing.md (2)

132-144: TIP block conversion LGTM

The > [!TIP] block with the jq example formats well and will convert correctly via the Sphinx hook.


148-150: Functional tests note reads well

Concise and consistent with the new admonition style.

docs/cluster.md (4)

28-30: Tip block conversion LGTM

Matches MyST style and improves readability.


42-44: Interactive debugging tip reads clearly

Nice improvement to explain the benefit of staying on the head node.


112-119: Env var pass-through tip is helpful

Good callout; makes behavior explicit for users of ray.sub.


170-173: Note block conversion LGTM

Consistent with the rest of the doc set.

docs/conf.py (3)

161-165: Wrapper for include-read is fine (post-fix)

With the unified in-place converter above, this adapter remains correct.


167-172: source-read adapter LGTM

Delegation is clean; the docstring comment helps future maintenance.


245-248: Connecting both include-read and source-read is the right call

Covers both top-level docs and included fragments.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chtruong814 chtruong814 added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Sep 29, 2025
@chtruong814 chtruong814 enabled auto-merge (squash) September 29, 2025 12:21
@chtruong814 chtruong814 merged commit 0dca729 into main Sep 29, 2025
80 of 82 checks passed
@chtruong814 chtruong814 deleted the tk/dpo-mistral-needs-more-time branch September 29, 2025 14:30
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L0 Run doctests and unit tests documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants