@pggPL pggPL commented Oct 24, 2025

Description

Our documentation build produced a lot of warnings, and some of them turned out to be legitimate: about half of our PyTorch API was not rendered. This PR fixes all the warnings and makes the GitHub workflow fail if any docs warning appears.

The most serious problem was caused by cyclic imports, which left part of our PyTorch API unrendered. The other warnings were mostly caused by wrong formatting, and some of them had no visible effect on the output.

Our docs for 2.9 contain only part of the PyTorch API exposed in 2.8, so this needs an urgent fix. I will update them in a separate PR.

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Checklist:

  • [x] I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Pawel Gadzinski <[email protected]>

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR resolves critical documentation rendering issues where approximately half of the PyTorch API was not being rendered in the generated documentation. The root causes were cyclic import dependencies and reStructuredText formatting violations. The fix involves restructuring imports across JAX and PyTorch modules to break circular dependencies (primarily moving QuantizeLayout and torch_version imports to their source modules), correcting RST section underlines to match header text lengths, converting docstring formatting to proper RST syntax, and enforcing strict documentation builds in CI by adding the -W flag to treat warnings as errors.
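
As a rough illustration of the "cached function" pattern mentioned for torch_version, the sketch below defers a version lookup until the first call instead of computing it when the module is imported. `interpreter_version` is a hypothetical stand-in that reads `sys.version_info` so the example runs without PyTorch; the actual helper in transformer_engine/pytorch/utils.py may look different.

```python
import functools
import sys


@functools.lru_cache(maxsize=None)
def interpreter_version() -> tuple:
    """Return (major, minor, micro), computed on the first call and then cached."""
    # Nothing is computed at import time; the lookup happens on first call,
    # and later calls return the cached tuple.
    return tuple(sys.version_info[:3])
```

Callers then write `interpreter_version() >= (3, 9, 0)` rather than importing a module-level variable, which keeps the importing module free of import-time work.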

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to Sphinx builds to make documentation warnings fatal in CI |
| transformer_engine/pytorch/utils.py | 5/5 | Converted torch_version from module-level import to cached function to break cyclic dependencies |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Converted parallel_cross_entropy from bare function reference to documented wrapper function |
| transformer_engine/jax/quantize/quantizer.py | 5/5 | Removed QuantizeLayout from __all__ to break circular dependency chain |
| transformer_engine/jax/quantize/hadamard.py | 5/5 | Changed QuantizeLayout import from delayed local import to top-level C++ extension import |
| transformer_engine/jax/cpp_extensions/misc.py | 5/5 | Moved QuantizeLayout import from relative path to C++ extension to break cycle |
| transformer_engine/jax/cpp_extensions/activation.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/quantization.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/gemm.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/normalization.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/dense.py | 5/5 | Moved QuantizeLayout import from local module to top-level package |
| transformer_engine/pytorch/jit.py | 5/5 | Changed torch_version import from current module to utils submodule |
| transformer_engine/pytorch/distributed.py | 5/5 | Changed torch_version import from package-level to explicit utils module |
| transformer_engine/pytorch/quantization.py | 5/5 | Reorganized imports with proper blank line separation for clarity |
| transformer_engine/pytorch/module/linear.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/module/layernorm_linear.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/module/layernorm_mlp.py | 4/5 | Changed torch_version import and improved docstring RST formatting |
| transformer_engine/pytorch/ops/_common.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/ops/basic/l2normalization.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/transformer.py | 5/5 | Fixed torch_version import and improved docstring RST formatting |
| transformer_engine/pytorch/module/base.py | 5/5 | Converted docstring code blocks from markdown to RST syntax |
| transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py | 5/5 | Improved RST formatting of docstring with proper inline code and bullet lists |
| transformer_engine/pytorch/attention/multi_head_attention.py | 5/5 | Improved RST formatting of docstring with proper inline code and bullet lists |
| transformer_engine/jax/flax/transformer.py | 5/5 | Converted ASCII table in docstring to RST table directive |
| docs/conf.py | 4/5 | Added custom logging filter to suppress unavoidable duplicate namespace warnings |
| docs/api/pytorch.rst | 4/5 | Reorganized API documentation with new sections and deprecated functions section |
| docs/debug.rst | 5/5 | Added blank line after copyright header to fix RST parsing |
| docs/debug/api.rst | 5/5 | Added blank line after copyright header to fix RST parsing |
| docs/debug/3_api_features.rst | 5/5 | Extended section underline to match header length |
| docs/debug/3_api_debug_setup.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/4_distributed.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/2_config_file_structure.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/1_getting_started.rst | 0/5 | Section underlines still don't match header text lengths exactly - will cause RST warnings |
| docs/examples/attention/attention.ipynb | 5/5 | Updated API reference links to lowercase format for Sphinx compatibility |
| docs/examples/te_gemma/tutorial_generation_gemma_with_te.ipynb | 5/5 | Fixed broken internal documentation link |

Confidence score: 2/5

  • This PR contains one file (docs/debug/1_getting_started.rst) with incorrect RST section underline lengths that will cause Sphinx warnings, directly contradicting the PR's goal of eliminating all warnings
  • All other changes are formatting and import refactoring with minimal risk, but the incomplete fix in 1_getting_started.rst will cause the new CI check (which treats warnings as errors) to fail
  • Pay close attention to docs/debug/1_getting_started.rst - the underlines need to match the exact character length of each section header, not use a fixed-width approach
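
A check like the one described above could be scripted. The following is a minimal sketch, not part of this PR: `check_underlines` is a hypothetical helper, the set of adornment characters is a deliberately small subset, and it assumes that docutils only warns when an underline is shorter than its title, so a longer underline is reported separately as a style mismatch.

```python
import re

# A line made of one repeated adornment character (small subset; an assumption).
ADORNMENT = re.compile(r"^([=\-~^#*+])\1+$")


def check_underlines(text: str) -> list:
    """Return (line_number, title, problem) for title/underline pairs whose lengths differ."""
    lines = text.splitlines()
    problems = []
    for i in range(1, len(lines)):
        title, underline = lines[i - 1].rstrip(), lines[i].rstrip()
        # Skip blank titles, overlines, and lines that are not adornments.
        if not title or ADORNMENT.match(title) or not ADORNMENT.match(underline):
            continue
        if len(underline) < len(title):
            problems.append((i + 1, title, "too short"))
        elif len(underline) > len(title):
            problems.append((i + 1, title, "longer than title"))
    return problems


if __name__ == "__main__":
    sample = "Getting started\n" + "=" * 14 + "\n"
    print(check_underlines(sample))
```

Line numbers in the result are 1-indexed and point at the underline, which is the line that needs editing.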

35 files reviewed, 8 comments


Getting started
==============
===============================

syntax: underline mismatch - title has 15 chars ('Getting started') but underline has 31 '=' chars (docs/debug/1_getting_started.rst, line 7)


Example training script
----------------------
------------------------------

syntax: underline mismatch - title has 23 chars ('Example training script') but underline has 30 '-' chars (docs/debug/1_getting_started.rst, line 41)


Config file
----------
------------------------------

syntax: underline mismatch - title has 11 chars ('Config file') but underline has 30 '-' chars (docs/debug/1_getting_started.rst, line 84)


Adjusting Python file
--------------------
----------------------------

syntax: underline mismatch - title has 21 chars ('Adjusting Python file') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 117)

Inspecting the logs
------------------
----------------------------

syntax: underline mismatch - title has 19 chars ('Inspecting the logs') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 149)

Logging using TensorBoard
------------------------
----------------------------

syntax: underline mismatch - title has 25 chars ('Logging using TensorBoard') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 218)

docs/conf.py Outdated
Comment on lines 109 to 110
# There are 2 warnings about the same namespace (transformer_engine) in two different c++ api
# docs pages. This seems to be the only way to suppress these warnings.

syntax: trailing whitespace on line 109

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This review covers only the most recent changes that address syntax and formatting issues identified in previous reviews. The developer has made targeted fixes to resolve trailing whitespace, RST heading underline mismatches, and missing newlines across several documentation files. These changes are part of the broader documentation quality enforcement effort.

The recent commits address:

  • Trailing whitespace in docs/conf.py (line 109)
  • RST heading underline length mismatches in docs/debug/1_getting_started.rst (6 sections)
  • Missing newline at end of file in docs/api/pytorch.rst

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| docs/conf.py | 5/5 | Fixed trailing whitespace on line 109 |
| docs/debug/1_getting_started.rst | 3/5 | Adjusted RST heading underlines but mismatches remain between title lengths and underline lengths |
| docs/api/pytorch.rst | 5/5 | Added missing newline at end of file |

Confidence score: 3/5

  • This PR contains syntax fixes but incomplete RST formatting corrections that will still cause Sphinx warnings
  • Score reflects that while trailing whitespace and missing newlines are fixed, the RST heading underlines in docs/debug/1_getting_started.rst still don't match their title character counts (e.g., "Getting started" has 15 chars but 31 '=' underline chars), which will cause Sphinx build failures when the -W flag is active
  • Pay close attention to docs/debug/1_getting_started.rst - all six section heading underlines need to be adjusted to exactly match their title character counts before the CI will pass

Additional Comments (1)

  1. docs/debug/3_api_debug_setup.rst, line 89 (link)

    syntax: missing newline at end of file

34 files reviewed, 12 comments

)

from .constants import dist_group_type

style: unnecessary blank line between local imports. Import groups (stdlib, third-party, local) are typically separated by blank lines, but a blank line inside the local group breaks that pattern.

Suggested change
from .constants import dist_group_type
from .constants import dist_group_type
from .utils import get_device_compute_capability

Signed-off-by: Pawel Gadzinski <[email protected]>

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The developer has successfully resolved all remaining ReStructuredText (RST) formatting issues across five debug documentation files. These changes correct section underline mismatches where underline lengths did not match their corresponding title lengths—a strict RST syntax requirement that causes Sphinx documentation build warnings. The fixes affect files in docs/debug/ including 1_getting_started.rst, 3_api_debug_setup.rst, 3_api_features.rst, api.rst, and 4_distributed.rst. Each correction ensures that section titles and their underlines have exactly matching character counts (e.g., "API" with 3 '=' chars, "Getting started" with 15 '=' chars), which eliminates documentation warnings and ensures proper rendering of the PyTorch API documentation. These changes are part of a broader effort to fix all documentation warnings and enforce clean doc builds in CI.

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| docs/debug/3_api_debug_setup.rst | 5/5 | Fixed RST underline lengths for three function headers (initialize(), set_tensor_reduction_group(), set_weight_tensor_tp_group_reduce()) |
| docs/debug/3_api_features.rst | 5/5 | Corrected section underline from 27 to 14 '=' chars to match "Debug features" title |
| docs/debug/api.rst | 5/5 | Fixed "API" section underline from 12 to 3 '=' chars |
| docs/debug/4_distributed.rst | 5/5 | Corrected underlines for main title and four subsections throughout the distributed training documentation |
| docs/debug/1_getting_started.rst | 5/5 | Fixed six section underlines to match title lengths across getting started guide |

Confidence score: 5/5

  • This PR is safe to merge with minimal risk—all changes are pure RST formatting corrections with no functional impact
  • Score reflects that these are syntax-only corrections (underline length matching) with zero functional changes, no logic modifications, and no code execution paths affected
  • No files require special attention—all changes follow the same trivial pattern of correcting underline character counts to match section titles

5 files reviewed, no comments

Signed-off-by: Pawel Gadzinski <[email protected]>
@pggPL pggPL mentioned this pull request Oct 24, 2025
@ksivaman ksivaman self-requested a review October 27, 2025 23:30
Comment on lines 39 to 40

.. autoapifunction:: transformer_engine.pytorch.fp8_autocast

.. autoapifunction:: transformer_engine.pytorch.fp8_model_init

Member

We do want to have them still in the documentation though, just marked as deprecated.

Member

Ok, I see that you just moved it.

Comment on lines 53 to 54
Recipe availability
------------------------

Member

In the past I had some trouble with titles where the underline was not exactly the same length as the text.

Collaborator Author

A warning is generated only if the underline is shorter than the title. Here the underline is longer, so there is no warning and the page renders properly; I guess it's OK now.

Comment on lines 19 to 20
"- [transformer_engine.pytorch.DotProductAttention](../../api/pytorch.rst#transformer_engine.pytorch.dotproductattention)\n",
"- [transformer_engine.jax.flax.DotProductAttention](../../api/jax.rst#transformer_engine.jax.flax.dotproductattention)"

Member

Are you sure about that? Just looked at the current docs page and the link works properly there.

Collaborator Author

Oh, some debugging leftover. It does also work, but I will remove this change.

Parameters
----------
_input : torch.Tensor

Member

Why _input and not just input?

Collaborator Author

I also noticed this. This function already uses _input as an argument name, so I documented what is in the code rather than changing it, since renaming would be a breaking change. By the way, this function currently does not render because of an error.

Collaborator Author

I put _input into kwargs and changed it into input in args. I removed _input entirely from the docs, because I doubt anyone uses it. In MCore they use only non-keyword arguments.
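
A minimal sketch of that kind of rename, assuming the old keyword is still accepted through `**kwargs` and triggers a deprecation warning. `scale` and its `factor` parameter are hypothetical and only illustrate the pattern; this is not the actual `parallel_cross_entropy` signature.

```python
import warnings


def scale(input=None, factor=2, **kwargs):
    """Hypothetical function whose first argument was renamed from ``_input`` to ``input``."""
    if "_input" in kwargs:
        # Old keyword spelling: still works, but warns the caller.
        warnings.warn(
            "'_input' is deprecated; use 'input' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        input = kwargs.pop("_input")
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    if input is None:
        raise TypeError("missing required argument: 'input'")
    return input * factor
```

Positional callers such as `scale(3)` are unaffected, while `scale(_input=3)` keeps working and emits a `DeprecationWarning`, so only the documented name changes.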

Signed-off-by: Pawel Gadzinski <[email protected]>

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR fixes critical documentation build warnings that prevented half of the PyTorch API from being rendered in v2.9 documentation. The changes resolve cyclic import issues, improve RST formatting, and enforce warning-free builds in CI.

Key changes:

  • Fixed cyclic import by moving torch_version from a cached function in __init__.py to a module-level variable in utils.py, with updated imports across affected modules
  • Reorganized docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) for better API discoverability
  • Improved RST docstring formatting with proper code literals (backticks), bullet lists, and table syntax for better Sphinx rendering
  • Added SPHINXOPTS="-W" to GitHub workflow to fail builds on warnings, preventing future documentation regressions
  • Added custom logging filter in conf.py to suppress unavoidable duplicate C++ namespace warnings
  • Cleared Jupyter notebook execution counts to reduce version control noise
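
The logging filter mentioned in the list above is not shown in this thread. A rough sketch of how such a filter could look with the standard `logging` module follows; the class name and the matched message text are assumptions, not the code from docs/conf.py.

```python
import logging


class DuplicateNamespaceFilter(logging.Filter):
    """Drop log records that report a duplicate C++ declaration for one namespace."""

    def __init__(self, namespace: str = "transformer_engine"):
        super().__init__()
        self.namespace = namespace

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Assumed wording of the warning; adjust to the text Sphinx actually emits.
        is_duplicate = "Duplicate C++ declaration" in message
        # Returning False suppresses the record; everything else passes through.
        return not (is_duplicate and self.namespace in message)
```

In a conf.py, a filter like this would be attached with `addFilter` to whichever logger or handler carries the warning; which one Sphinx uses for this message is not stated in the thread.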

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it fixes documentation issues without affecting runtime behavior
  • All changes are focused on documentation quality improvements and formatting fixes. The cyclic import fix is well-executed with consistent updates across all dependent modules. The workflow enforcement ensures future documentation quality. Only minor formatting issue found (missing newline)
  • No files require special attention - only a trivial missing newline in docs/api/pytorch.rst

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to Sphinx build to error on warnings, enforcing documentation quality |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ namespace warnings, updated exclude patterns |
| transformer_engine/pytorch/__init__.py | 5/5 | Moved torch_version from function to module-level variable, fixing cyclic import issues |
| docs/api/pytorch.rst | 4/5 | Reorganized API docs with sections, moved deprecated functions, fixed rendering issues |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GHA as GitHub Actions
    participant Sphinx as Sphinx Builder
    participant Code as Python Code
    participant Docs as Documentation

    Dev->>Code: Fix cyclic imports (torch_version)
    Dev->>Code: Update import paths
    Dev->>Docs: Reorganize API structure
    Dev->>Docs: Fix RST formatting issues
    Dev->>GHA: Add SPHINXOPTS="-W" flag
    
    Note over Dev,Docs: PR Changes Applied
    
    GHA->>Sphinx: Trigger docs build with -W
    Sphinx->>Code: Import Python modules
    Code-->>Sphinx: Successfully imported (no cycles)
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: Valid RST syntax
    Sphinx->>Sphinx: Apply custom warning filter
    Sphinx-->>GHA: Build succeeds (no warnings)
    
    Note over GHA: Future builds will error<br/>if warnings appear

9 files reviewed, 1 comment


.. autoapifunction:: transformer_engine.pytorch.fp8_autocast

.. autoapifunction:: transformer_engine.pytorch.fp8_model_init

syntax: missing final newline

Suggested change
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

Fixed critical documentation build issues that prevented half of the PyTorch API from rendering in version 2.9. Resolved cyclic import in pytorch/__init__.py by moving torch_version() to utils.py, and enforced documentation quality by adding -W flag to Sphinx build.

  • Cyclic import resolution: Moved torch_version() from pytorch/__init__.py to pytorch/utils.py to break import cycle that prevented API documentation generation
  • Sphinx enforcement: Added -W flag to GitHub workflow to treat documentation warnings as build errors
  • RST formatting fixes: Corrected underline mismatches in debug documentation files (15+ instances across multiple files)
  • Docstring improvements: Enhanced formatting with proper RST syntax (backticks, bullet lists) in attention, cross_entropy, and other modules
  • API documentation reorganization: Restructured pytorch.rst with logical sections and moved deprecated functions to the end
  • Import corrections: Fixed several import issues in JAX and PyTorch modules (e.g., QuantizeLayout in JAX, torch_version references)
  • Warning suppression: Added custom logging filter for unavoidable duplicate C++ declaration warnings
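
To make the Sphinx enforcement item above concrete, here is a small sketch that assembles a strict build command. `sphinx_build_command` and the directory names are hypothetical; the only Sphinx-specific piece is the `-W` option, which turns warnings into errors. The actual workflow passes the flag through SPHINXOPTS instead.

```python
import sys


def sphinx_build_command(source_dir: str, output_dir: str, strict: bool = True) -> list:
    """Build an argument list for ``python -m sphinx`` (not executed here)."""
    command = [sys.executable, "-m", "sphinx", "-b", "html"]
    if strict:
        # -W turns warnings into errors, so the build exits with a non-zero status.
        command.append("-W")
    command += [source_dir, output_dir]
    return command
```

The returned list can be handed to `subprocess.run(command, check=True)`, so that a warning in the docs raises `CalledProcessError` and fails the calling script.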

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it fixes critical documentation issues without changing runtime behavior
  • All changes are documentation-focused (RST formatting, docstrings, import reorganization). The cyclic import fix is a proper refactoring that maintains the same API surface. The workflow change enforces quality going forward. No logic changes to core functionality.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to make Sphinx treat warnings as errors, enforcing documentation quality |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ declaration warnings for transformer_engine namespace |
| transformer_engine/pytorch/__init__.py | 5/5 | Fixed cyclic import by removing torch_version() function from __init__.py (moved to utils.py) |
| transformer_engine/pytorch/utils.py | 5/5 | Added torch_version() function here to resolve cyclic import issues |
| docs/api/pytorch.rst | 5/5 | Reorganized API documentation with sections, moved deprecated functions to the end, added missing newline |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Added explicit function wrapper with full docstring for parallel_cross_entropy to improve API documentation |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Workflow
    participant Sphinx as Sphinx Builder
    participant Init as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    
    Dev->>GH: Push docs changes
    GH->>Sphinx: Run build with -W flag
    
    Note over Sphinx: Before: Warnings ignored
    Note over Sphinx: After: Warnings = Errors
    
    Sphinx->>Sphinx: Check RST formatting
    Sphinx->>Sphinx: Check docstring syntax
    
    Note over Init,Utils: Cyclic Import Fix
    Init->>Utils: torch_version() moved here
    Utils->>Utils: Function callable via import
    
    Sphinx->>Sphinx: Parse API docs successfully
    Sphinx->>GH: Build succeeds (no warnings)
    GH->>Dev: CI passes

35 files reviewed, no comments

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

Fixed critical documentation build warnings that prevented half of the PyTorch API from being rendered in version 2.9 docs. The PR resolves cyclic import issues and enforces documentation quality going forward.

Key Changes:

  • Cyclic Import Fix: Moved torch_version() function from pytorch/__init__.py to pytorch/utils.py to break circular dependency that prevented Sphinx from properly importing and documenting modules
  • Workflow Enforcement: Added -W flag to Sphinx build to error out on any documentation warnings, preventing future documentation regressions
  • RST Formatting: Fixed hundreds of reStructuredText formatting issues including:
    • Section underline length mismatches in debug documentation
    • Code block syntax (converted markdown triple backticks to RST :: syntax)
    • Improved docstring formatting for mathematical expressions and bullet lists
    • Proper use of inline code with double backticks
  • API Documentation Reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions)
  • Function Wrappers: Added proper function wrappers with complete docstrings (e.g., parallel_cross_entropy) to replace bare Function.apply assignments that Sphinx couldn't document
  • Warning Suppression: Added custom logging filter in docs/conf.py to suppress unavoidable duplicate C++ declaration warnings for the transformer_engine namespace
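
The difference between a bare alias and a documented wrapper, as described in the Function Wrappers item above, can be sketched without PyTorch. `_Square` is a hypothetical stand-in for a class exposing a static `apply` method, not the real autograd function.

```python
class _Square:
    """Hypothetical stand-in for a class exposing a static ``apply`` method."""

    @staticmethod
    def apply(x):
        return x * x


# Bare alias: the public name has no docstring or signature of its own.
square_alias = _Square.apply


def square(x):
    """Return ``x`` squared.

    Parameters
    ----------
    x : int or float
        Value to square.
    """
    # The wrapper only forwards the call; behavior is unchanged.
    return _Square.apply(x)
```

Both names compute the same result, but only `square` has its own `__name__` and a NumPy-style docstring for a documentation tool to pick up.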

Confidence Score: 5/5

  • This PR is safe to merge - it only fixes documentation and doesn't change runtime behavior
  • All changes are documentation-focused: RST formatting fixes, docstring improvements, reorganization of docs structure, and a critical cyclic import fix that only affects Sphinx's ability to introspect the codebase. The workflow change ensures future documentation quality. No runtime logic changes.
  • No files require special attention - all changes are straightforward documentation fixes

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to error out on documentation warnings |
| transformer_engine/pytorch/__init__.py | 5/5 | Removed torch_version() function to break cyclic import; moved to utils.py |
| transformer_engine/pytorch/utils.py | 5/5 | Added torch_version() function moved from __init__.py to resolve cyclic imports |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Added proper wrapper function with full docstring for parallel_cross_entropy for documentation |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ declaration warnings |
| docs/api/pytorch.rst | 5/5 | Reorganized API documentation sections and moved deprecated functions to dedicated section |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Sphinx as Sphinx Build
    participant PyInit as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    participant Docs as Documentation Output

    Note over Dev,Docs: Before: Cyclic Import Issue
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version() function
    PyInit->>Utils: Import utils
    Utils->>PyInit: Import torch_version (CYCLIC!)
    Note over Sphinx: API not rendered properly

    Note over Dev,Docs: After: Fixed Import Chain
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version as variable
    PyInit->>Utils: Import utils
    Utils->>Utils: torch_version() function
    Sphinx->>Docs: All APIs rendered correctly
    
    Note over Dev,Docs: Workflow Enforcement
    Dev->>Sphinx: Build docs with -W flag
    alt Warnings Found
        Sphinx-->>Dev: Build FAILS
    else No Warnings
        Sphinx-->>Docs: Build succeeds
    end

38 files reviewed, no comments

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR fixes critical documentation issues that prevented half of the PyTorch API from being rendered in Sphinx documentation and ensures future documentation quality through CI enforcement.

Key Changes

  • Cyclic import fix: Removed @functools.lru_cache decorator from torch_version() in transformer_engine/pytorch/__init__.py, converting it to a direct variable assignment to resolve Sphinx import issues
  • Missing API documentation: Converted parallel_cross_entropy from a simple function alias to a proper documented function with NumPy-style docstring, making it visible in generated docs
  • RST formatting fixes: Corrected underline length mismatches across all debug documentation files (15+ instances) and fixed code block syntax from markdown triple backticks to RST double-colon (::) format
  • API reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and moved deprecated functions to the bottom
  • CI enforcement: Added SPHINXOPTS="-W" to GitHub workflow to fail builds on any Sphinx warnings, preventing future documentation regressions
  • Warning suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ namespace warnings

Impact

This addresses the urgent issue where v2.9 docs were missing significant portions of the PyTorch API that were present in v2.8. The changes are purely documentation-focused with backward compatibility maintained (e.g., _input parameter deprecation warning in parallel_cross_entropy).

Confidence Score: 5/5

  • This PR is safe to merge - it fixes critical documentation issues without changing runtime behavior
  • All changes are documentation-focused (RST formatting, docstrings, Sphinx configuration) or maintain backward compatibility (cross_entropy parameter renaming with deprecation warning). The cyclic import fix simplifies code without changing functionality. CI enforcement ensures no regressions.
  • No files require special attention - all changes are well-structured documentation fixes

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to make Sphinx error on warnings, enforcing documentation quality |
| docs/conf.py | 4/5 | Added warning filter for duplicate C++ declarations, excluded sphinx_rtd_theme from patterns |
| docs/api/pytorch.rst | 5/5 | Reorganized API docs with sections, moved deprecated functions to bottom, added missing newline |
| transformer_engine/pytorch/__init__.py | 5/5 | Removed torch_version() function decorator to fix cyclic import issues with documentation |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Renamed _input to input, added proper function wrapper with NumPy-style docstring |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant CI as GitHub CI
    participant Sphinx as Sphinx Builder
    participant Docs as Documentation
    participant PyAPI as PyTorch API
    
    Note over Dev,PyAPI: Documentation Fix PR Flow
    
    Dev->>CI: Push PR with doc fixes
    CI->>Sphinx: Run with SPHINXOPTS="-W"
    
    Note over Sphinx: Critical Fix: Cyclic Import
    Sphinx->>PyAPI: Import PyTorch modules
    PyAPI-->>Sphinx: ✓ No cyclic imports (torch_version simplified)
    
    Note over Sphinx: Fix: RST Formatting
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: ✓ Underlines match titles
    Docs-->>Sphinx: ✓ Code blocks use :: syntax
    
    Note over Sphinx: Fix: API Documentation
    Sphinx->>PyAPI: Generate API docs
    PyAPI-->>Sphinx: ✓ parallel_cross_entropy has docstring
    PyAPI-->>Sphinx: ✓ All functions now rendered
    
    Note over Sphinx: Enhancement: Warning Filter
    Sphinx->>Sphinx: Filter C++ namespace duplicates
    
    Sphinx-->>CI: Build success (no warnings)
    CI-->>Dev: ✓ All checks pass

38 files reviewed, no comments


@pggPL
Collaborator Author

pggPL commented Nov 4, 2025

/te-ci

docs/api/jax.rst Outdated

  Pre-defined Variable of Logical Axes
- ------------------------------------
+ -------------------------------------
Member

Was this needed to fix the rendering? The earlier version seems fine and of the correct length

Collaborator Author

Yeah, that's incorrect. Fixed.
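
For context on the exchange above: reStructuredText only requires a title underline to be at least as long as the title text, so the original 36-character underline was already valid and the longer one changes nothing. A rough way to find underlines that really are too short is sketched below; the function names and the set of adornment characters are illustrative assumptions, and the check is a line-based heuristic rather than a real RST parser.

```python
# Characters treated as section adornments in this sketch (a subset of what RST allows).
ADORNMENT_CHARS = set("=-~^\"'`#*+")


def is_adornment(line: str) -> bool:
    """True if the line is a run of one repeated adornment character."""
    return len(line) >= 2 and len(set(line)) == 1 and line[0] in ADORNMENT_CHARS


def find_short_underlines(text: str) -> list[tuple[int, str]]:
    """Return (underline_line_number, title) for underlines shorter than their title."""
    lines = [line.rstrip() for line in text.splitlines()]
    problems = []
    for i in range(1, len(lines)):
        title, underline = lines[i - 1], lines[i]
        if not title or is_adornment(title):
            continue  # blank line or an overline, not a title
        if is_adornment(underline) and len(underline) < len(title):
            problems.append((i + 1, title))  # report 1-based line numbers
    return problems


if __name__ == "__main__":
    sample = "Pre-defined Variable of Logical Axes\n" + "-" * 36 + "\n"
    print(find_short_underlines(sample))
```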

* 'off-by-one': ``S[:,:,:,i] = exp(S[:,:,:,i])/(1 + sum(exp(S[:,:,:,:]), dim=-1))``
* 'learnable': ``S[:,j,:,i] = exp(S[:,j,:,i])/(exp(alpha[j]) + sum(exp(S[:,j,:,:]), dim=-1))``
where ``alpha`` is a learnable parameter in shape ``[h]``.
Member

Suggested change
- where ``alpha`` is a learnable parameter in shape ``[h]``.
+ where ``alpha`` is a learnable parameter of shape ``[h]``.

Collaborator Author

I changed all occurrences in TE (around 10).
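
The two softmax variants quoted in the docstring above can be illustrated on a single row of scores. This is a plain-Python sketch of the formulas only, not Transformer Engine code: the function names are made up, `alpha` is the per-head scalar `alpha[j]` from the docstring, and no max-subtraction is done for numerical stability.

```python
import math


def softmax_off_by_one(scores: list[float]) -> list[float]:
    """exp(s_i) / (1 + sum_k exp(s_k)) over one row of scores."""
    exps = [math.exp(s) for s in scores]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps]


def softmax_learnable(scores: list[float], alpha: float) -> list[float]:
    """exp(s_i) / (exp(alpha) + sum_k exp(s_k)) for one head's row of scores."""
    exps = [math.exp(s) for s in scores]
    denom = math.exp(alpha) + sum(exps)
    return [e / denom for e in exps]


if __name__ == "__main__":
    row = [0.5, -1.0, 2.0]
    # Both variants produce weights that sum to less than 1.
    print(sum(softmax_off_by_one(row)), sum(softmax_learnable(row, alpha=0.3)))
```

With `alpha = 0` the learnable variant reduces to the off-by-one variant, since `exp(0) = 1`.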

Comment on lines +17 to +18
torch_version = PkgVersion(str(torch.__version__)).release
assert torch_version >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version}."
Member

Please change the instances at the call sites as well. For example, linear.py still calls torch_version().

Comment on lines +22 to +25
@functools.lru_cache(maxsize=None)
def torch_version() -> tuple[int, ...]:
    """Get PyTorch version"""
    return PkgVersion(str(torch.__version__)).release
Member

Is this redefined here because of cyclic import issues when importing directly from __init__.py? We should have it defined in only one location.

Member

If the one in __init__.py is causing issues with imports, please remove it and change all instances to import from utils instead.

Collaborator Author

Yes, that is the case. And I cannot import utils before dynamically loading the C++ library, because utils imports some other torch classes. So I guess there are two ways of doing this: create a file torch_version.py containing only that function, or do what I did. I do not want to add a file for a one-line function. What do you think?

FP32 only. The returned loss is always in FP32, the input gradients are upcasted
to the datatype of the input.
If ``dist_process_group`` is passed for distributed loss calculation, the input to each
Member

There are multiple instances where we use a single backtick in the docstrings instead of double backticks. This PR seems like a good place to change all those instances as well.

Collaborator Author

Ok, I did it and it took a lot of time and a lot of files were changed. Also, I discovered more formatting errors, so this PR is not that small now.

  def forward(
      ctx,
-     _input,
+     input,
Member

I saw @ptrendx's comment above suggesting we should change it, but input is also the name of a Python built-in function and we should avoid shadowing it. Maybe we can go with inp as we do in our other modules. I'm surprised that the linter didn't complain about this change.

Member

That's a good point actually.

Collaborator Author

Changed to inp

Signed-off-by: Pawel Gadzinski <[email protected]>
@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

Fixed documentation rendering issues by correcting RST formatting errors and resolving cyclic import problems. The changes include:

  • Fixed underline lengths in docs/api/jax.rst to match section titles ("Jax", "Checkpointing", "Modules")
  • Renamed _input parameter to input in cross_entropy.py to follow Python naming conventions and enable proper Sphinx autodoc rendering
  • Converted parallel_cross_entropy from a bare function reference to a proper function wrapper with comprehensive NumPy-style docstring
  • Added backward compatibility with deprecated _input parameter via keyword-only argument with FutureWarning
  • Added missing imports (typing.Optional, warnings) to support the new implementation

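Together with the cyclic import fix elsewhere in this PR, these changes restore PyTorch API documentation that was not being rendered. The parallel_cross_entropy entry in particular becomes visible because it is now a proper function with a docstring rather than a bare alias.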
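
The backward-compatibility pattern described above, a keyword-only deprecated alias that emits a `FutureWarning`, might look like the following sketch. It is not the actual cross_entropy.py: the real function takes more parameters, `_cross_entropy_impl` is a placeholder for `CrossEntropyFunction.apply`, the warning text is invented, and the parameter is called `inp`, the name the review thread settled on.

```python
import warnings
from typing import Any, Optional


def _cross_entropy_impl(inp: Any, target: Any) -> tuple:
    """Placeholder for the real computation (CrossEntropyFunction.apply in the PR)."""
    return ("loss", inp, target)


def parallel_cross_entropy(
    inp: Optional[Any] = None,
    target: Optional[Any] = None,
    *,
    _input: Optional[Any] = None,
) -> tuple:
    """Compute the loss; ``_input`` is a deprecated keyword-only alias for ``inp``."""
    if _input is not None:
        warnings.warn(
            "The '_input' argument is deprecated; use 'inp' instead.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        inp = _input
    if inp is None or target is None:
        raise TypeError("both 'inp' and 'target' are required")
    return _cross_entropy_impl(inp, target)
```

Because `_input` is keyword-only, existing positional calls such as `parallel_cross_entropy(logits, labels)` bind to `inp` and `target` without triggering the warning; only callers that spell out `_input=` see it.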

Confidence Score: 4/5

  • Safe to merge with minor verification recommended - backward compatibility is preserved through deprecation warnings
  • The changes are well-implemented with proper backward compatibility handling. Score of 4 (not 5) because the deprecation warning approach should be tested to ensure existing code using positional arguments won't break, and the existing test suite appears to use positional calls that might need attention
  • Verify that tests/pytorch/test_parallel_cross_entropy.py still passes - it uses positional arguments which should work, but worth confirming the backward compatibility mechanism functions correctly

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| docs/api/jax.rst | 5/5 | Fixed RST heading underline lengths to match title lengths for proper documentation rendering |
| transformer_engine/pytorch/cross_entropy.py | 4/5 | Renamed _input to input throughout, added proper function wrapper with NumPy-style docstring and backward compatibility deprecation warning |

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant PCE as parallel_cross_entropy()
    participant CEF as CrossEntropyFunction
    participant TCE as triton_cross_entropy
    
    User->>PCE: Call with input/target
    Note over PCE: Check for deprecated _input param
    alt _input is provided
        PCE->>PCE: Issue FutureWarning
        PCE->>PCE: Use _input as input
    end
    PCE->>CEF: .apply(input, target, ...)
    CEF->>TCE: cross_entropy_forward(input, ...)
    TCE-->>CEF: Return (loss, input)
    CEF->>CEF: save_for_backward(input)
    CEF-->>PCE: Return loss
    PCE-->>User: Return loss
    
    Note over User,TCE: Backward Pass
    User->>CEF: backward(grad_output)
    CEF->>CEF: Retrieve saved input
    CEF->>TCE: cross_entropy_backward(input, ...)
    TCE-->>CEF: Return grad_input
    CEF-->>User: Return (grad_input, None, ...)

2 files reviewed, no comments

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR fixes critical documentation build warnings by resolving cyclic imports and correcting RST formatting issues throughout the codebase.

Key Changes:

  • Cyclic Import Resolution: Moved torch_version function from pytorch/__init__.py to pytorch/utils.py and converted it from a cached function to a simple variable in __init__.py, breaking the circular dependency that prevented half of the PyTorch API from being rendered
  • Documentation Structure: Reorganized docs/api/pytorch.rst with better sectioning (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and improved API organization
  • RST Formatting: Fixed hundreds of docstring formatting issues across 50+ files:
    • Changed single-backtick defaults like `True` to proper RST inline literals (``True``)
    • Fixed code references from single quotes to RST roles (e.g., 'forward' → :meth:`forward`)
    • Added proper math rendering using :math: directives
    • Fixed inline code formatting throughout
  • API Improvements: Enhanced parallel_cross_entropy by converting from bare function assignment to a proper wrapper with full documentation and backward compatibility for the _input parameter
  • CI Enhancement: Added SPHINXOPTS="-W" to GitHub workflow to fail builds on any documentation warnings
  • Warning Suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ declaration warnings

The changes are purely documentation-focused with minimal functional impact, except for the beneficial refactoring of the torch_version utility.

Confidence Score: 5/5

  • This PR is safe to merge - changes are documentation-focused with well-tested import refactoring
  • All changes are either documentation formatting improvements or minor refactoring to fix cyclic imports. The cyclic import fix is clean and preserves backward compatibility. The only remaining issue is a missing newline in docs/api/pytorch.rst
  • docs/api/pytorch.rst needs the final newline added

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| docs/api/pytorch.rst | 4/5 | Reorganized API documentation with better sectioning; still missing final newline |
| transformer_engine/pytorch/__init__.py | 5/5 | Fixed cyclic import by converting torch_version from function to variable |
| transformer_engine/pytorch/utils.py | 5/5 | Moved torch_version function here from __init__.py to resolve cyclic imports |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Renamed _input to inp and added proper function wrapper with docstrings |
| transformer_engine/pytorch/module/base.py | 5/5 | Fixed docstring formatting and removed unused Recipe import that caused cyclic dependency |
| .github/workflows/docs.yml | 5/5 | Added SPHINXOPTS=-W to fail builds on documentation warnings |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Workflow
    participant Sphinx as Sphinx Builder
    participant Import as Import System
    participant Docs as Generated Docs

    Dev->>GH: Push documentation changes
    GH->>Sphinx: Build docs with SPHINXOPTS="-W"
    
    Note over Sphinx,Import: Before this PR: Cyclic import issue
    Import--xImport: torch_version() in __init__.py<br/>creates circular dependency
    Import--xDocs: Half of PyTorch API not rendered
    
    Note over Sphinx,Import: After this PR: Import fixed
    Import->>Import: torch_version moved to utils.py
    Import->>Import: __init__.py uses simple variable
    Import->>Docs: All PyTorch API successfully imported
    
    Sphinx->>Sphinx: Process RST files
    Sphinx->>Sphinx: Parse docstrings with fixed formatting
    Sphinx->>Sphinx: Render math expressions correctly
    Sphinx->>Sphinx: Apply custom warning filter for C++ duplicates
    
    alt Documentation has warnings
        Sphinx->>GH: Build fails (exit code != 0)
        GH->>Dev: CI reports failure
    else Documentation clean
        Sphinx->>GH: Build succeeds
        GH->>Docs: Upload documentation artifact
        GH->>Dev: CI passes
    end

31 files reviewed, 1 comment



.. autoapifunction:: transformer_engine.pytorch.fp8_autocast

.. autoapifunction:: transformer_engine.pytorch.fp8_model_init

syntax: missing final newline

Suggested change
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init

3 participants