Docs fix #2301
base: main
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
This PR resolves critical documentation rendering issues where approximately half of the PyTorch API was not being rendered in the generated documentation. The root causes were cyclic import dependencies and reStructuredText formatting violations. The fix involves restructuring imports across JAX and PyTorch modules to break circular dependencies (primarily moving QuantizeLayout and torch_version imports to their source modules), correcting RST section underlines to match header text lengths, converting docstring formatting to proper RST syntax, and enforcing strict documentation builds in CI by adding the -W flag to treat warnings as errors.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to Sphinx builds to make documentation warnings fatal in CI |
| transformer_engine/pytorch/utils.py | 5/5 | Converted torch_version from module-level import to cached function to break cyclic dependencies |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Converted parallel_cross_entropy from bare function reference to documented wrapper function |
| transformer_engine/jax/quantize/quantizer.py | 5/5 | Removed QuantizeLayout from __all__ to break circular dependency chain |
| transformer_engine/jax/quantize/hadamard.py | 5/5 | Changed QuantizeLayout import from delayed local import to top-level C++ extension import |
| transformer_engine/jax/cpp_extensions/misc.py | 5/5 | Moved QuantizeLayout import from relative path to C++ extension to break cycle |
| transformer_engine/jax/cpp_extensions/activation.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/quantization.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/gemm.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/cpp_extensions/normalization.py | 5/5 | Moved QuantizeLayout import from quantize module to C++ extension module |
| transformer_engine/jax/dense.py | 5/5 | Moved QuantizeLayout import from local module to top-level package |
| transformer_engine/pytorch/jit.py | 5/5 | Changed torch_version import from current module to utils submodule |
| transformer_engine/pytorch/distributed.py | 5/5 | Changed torch_version import from package-level to explicit utils module |
| transformer_engine/pytorch/quantization.py | 5/5 | Reorganized imports with proper blank line separation for clarity |
| transformer_engine/pytorch/module/linear.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/module/layernorm_linear.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/module/layernorm_mlp.py | 4/5 | Changed torch_version import and improved docstring RST formatting |
| transformer_engine/pytorch/ops/_common.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/ops/basic/l2normalization.py | 5/5 | Changed torch_version import to come from utils submodule |
| transformer_engine/pytorch/transformer.py | 5/5 | Fixed torch_version import and improved docstring RST formatting |
| transformer_engine/pytorch/module/base.py | 5/5 | Converted docstring code blocks from markdown to RST syntax |
| transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py | 5/5 | Improved RST formatting of docstring with proper inline code and bullet lists |
| transformer_engine/pytorch/attention/multi_head_attention.py | 5/5 | Improved RST formatting of docstring with proper inline code and bullet lists |
| transformer_engine/jax/flax/transformer.py | 5/5 | Converted ASCII table in docstring to RST table directive |
| docs/conf.py | 4/5 | Added custom logging filter to suppress unavoidable duplicate namespace warnings |
| docs/api/pytorch.rst | 4/5 | Reorganized API documentation with new sections and deprecated functions section |
| docs/debug.rst | 5/5 | Added blank line after copyright header to fix RST parsing |
| docs/debug/api.rst | 5/5 | Added blank line after copyright header to fix RST parsing |
| docs/debug/3_api_features.rst | 5/5 | Extended section underline to match header length |
| docs/debug/3_api_debug_setup.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/4_distributed.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/2_config_file_structure.rst | 5/5 | Extended section underlines to match header lengths |
| docs/debug/1_getting_started.rst | 0/5 | Section underlines still don't match header text lengths exactly - will cause RST warnings |
| docs/examples/attention/attention.ipynb | 5/5 | Updated API reference links to lowercase format for Sphinx compatibility |
| docs/examples/te_gemma/tutorial_generation_gemma_with_te.ipynb | 5/5 | Fixed broken internal documentation link |
Confidence score: 2/5
- This PR contains one file (docs/debug/1_getting_started.rst) with incorrect RST section underline lengths that will cause Sphinx warnings, directly contradicting the PR's goal of eliminating all warnings
- All other changes are formatting and import refactoring with minimal risk, but the incomplete fix in 1_getting_started.rst will cause the new CI check (which treats warnings as errors) to fail
- Pay close attention to docs/debug/1_getting_started.rst - the underlines need to match the exact character length of each section header, not use a fixed-width approach
35 files reviewed, 8 comments
docs/debug/1_getting_started.rst
Outdated
 Getting started
-==============
+===============================
syntax: underline mismatch - title has 15 chars ('Getting started') but underline has 31 '=' chars (docs/debug/1_getting_started.rst, line 7)
docs/debug/1_getting_started.rst
Outdated
 Example training script
-----------------------
+------------------------------
syntax: underline mismatch - title has 23 chars ('Example training script') but underline has 30 '-' chars (docs/debug/1_getting_started.rst, line 41)
docs/debug/1_getting_started.rst
Outdated
 Config file
-----------
+------------------------------
syntax: underline mismatch - title has 11 chars ('Config file') but underline has 30 '-' chars (docs/debug/1_getting_started.rst, line 84)
docs/debug/1_getting_started.rst
Outdated
 Adjusting Python file
---------------------
+----------------------------
syntax: underline mismatch - title has 21 chars ('Adjusting Python file') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 117)
docs/debug/1_getting_started.rst
Outdated
 Inspecting the logs
-------------------
+----------------------------
syntax: underline mismatch - title has 19 chars ('Inspecting the logs') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 149)
docs/debug/1_getting_started.rst
Outdated
 Logging using TensorBoard
-------------------------
+----------------------------
syntax: underline mismatch - title has 25 chars ('Logging using TensorBoard') but underline has 28 '-' chars (docs/debug/1_getting_started.rst, line 218)
docs/conf.py
Outdated
# There are 2 warnings about the same namespace (transformer_engine) in two different c++ api
# docs pages. This seems to be the only way to suppress these warnings.
syntax: trailing whitespace on line 109 (docs/conf.py, lines 109-110)
for more information, see https://pre-commit.ci
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
This review covers only the most recent changes that address syntax and formatting issues identified in previous reviews. The developer has made targeted fixes to resolve trailing whitespace, RST heading underline mismatches, and missing newlines across several documentation files. These changes are part of the broader documentation quality enforcement effort.
The recent commits address:
- Trailing whitespace in docs/conf.py (line 109)
- RST heading underline length mismatches in docs/debug/1_getting_started.rst (6 sections)
- Missing newline at end of file in docs/api/pytorch.rst
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| docs/conf.py | 5/5 | Fixed trailing whitespace on line 109 |
| docs/debug/1_getting_started.rst | 3/5 | Adjusted RST heading underlines but mismatches remain between title lengths and underline lengths |
| docs/api/pytorch.rst | 5/5 | Added missing newline at end of file |
Confidence score: 3/5
- This PR contains syntax fixes but incomplete RST formatting corrections that will still cause Sphinx warnings
- Score reflects that while trailing whitespace and missing newlines are fixed, the RST heading underlines in docs/debug/1_getting_started.rst still don't match their title character counts (e.g., "Getting started" has 15 chars but 31 '=' underline chars), which will cause Sphinx build failures when the -W flag is active
- Pay close attention to docs/debug/1_getting_started.rst - all six section heading underlines need to be adjusted to exactly match their title character counts before the CI will pass
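One way to avoid hand-counting underline characters is to generate the heading from the title. The helper below is a minimal sketch; the name `rst_heading` is hypothetical and not part of this PR. It produces an underline exactly as long as the title, which satisfies the docutils rule that an underline must be at least as long as the title text.

```python
def rst_heading(title: str, underline_char: str = "=") -> str:
    """Return an RST section heading whose underline matches the title length."""
    if len(underline_char) != 1:
        raise ValueError("underline_char must be a single character")
    return f"{title}\n{underline_char * len(title)}"


# Print two of the headings discussed in the review.
for title, char in [("Getting started", "="), ("Example training script", "-")]:
    print(rst_heading(title, char))
    print()
```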
Additional Comments (1)
- docs/debug/3_api_debug_setup.rst, line 89: syntax: missing newline at end of file
34 files reviewed, 12 comments
)

from .constants import dist_group_type
style: blank line between imports - typically import groups (stdlib, third-party, local) are separated by blank lines, but mixing local imports like this breaks the pattern
from .constants import dist_group_type
from .constants import dist_group_type
from .utils import get_device_compute_capability
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The developer has successfully resolved all remaining ReStructuredText (RST) formatting issues across five debug documentation files. These changes correct section underline mismatches where underline lengths did not match their corresponding title lengths—a strict RST syntax requirement that causes Sphinx documentation build warnings. The fixes affect files in docs/debug/ including 1_getting_started.rst, 3_api_debug_setup.rst, 3_api_features.rst, api.rst, and 4_distributed.rst. Each correction ensures that section titles and their underlines have exactly matching character counts (e.g., "API" with 3 '=' chars, "Getting started" with 15 '=' chars), which eliminates documentation warnings and ensures proper rendering of the PyTorch API documentation. These changes are part of a broader effort to fix all documentation warnings and enforce clean doc builds in CI.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| docs/debug/3_api_debug_setup.rst | 5/5 | Fixed RST underline lengths for three function headers (initialize(), set_tensor_reduction_group(), set_weight_tensor_tp_group_reduce()) |
| docs/debug/3_api_features.rst | 5/5 | Corrected section underline from 27 to 14 '=' chars to match "Debug features" title |
| docs/debug/api.rst | 5/5 | Fixed "API" section underline from 12 to 3 '=' chars |
| docs/debug/4_distributed.rst | 5/5 | Corrected underlines for main title and four subsections throughout the distributed training documentation |
| docs/debug/1_getting_started.rst | 5/5 | Fixed six section underlines to match title lengths across getting started guide |
Confidence score: 5/5
- This PR is safe to merge with minimal risk—all changes are pure RST formatting corrections with no functional impact
- Score reflects that these are syntax-only corrections (underline length matching) with zero functional changes, no logic modifications, and no code execution paths affected
- No files require special attention—all changes follow the same trivial pattern of correcting underline character counts to match section titles
5 files reviewed, no comments
Signed-off-by: Pawel Gadzinski <[email protected]>
.. autoapifunction:: transformer_engine.pytorch.fp8_autocast

.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
We do want to have them still in the documentation though, just marked as deprecated.
Ok, I see that you just moved it.
docs/api/pytorch.rst
Outdated
Recipe availability
------------------------
In the past I had some trouble with titles where the underline was not exactly the same length as the text.
If the underline is shorter than the title, a warning is generated. In this case it is longer, so there is no warning and it renders properly, so I guess it's ok now.
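The rule described here (a warning only when the underline is shorter than the title) can be checked before running Sphinx. The following is a rough sketch, not part of this PR: `find_short_underlines` is a hypothetical helper, and it is only a line-based heuristic rather than a real docutils parse, so it can misread transitions or literal blocks.

```python
# Characters commonly used for RST section adornments.
ADORNMENT_CHARS = set("=-~^\"'`#*+.:_")


def is_adornment(line: str) -> bool:
    """True if the line is one adornment character repeated at least twice."""
    return len(line) >= 2 and line[0] in ADORNMENT_CHARS and line == line[0] * len(line)


def find_short_underlines(rst_text: str):
    """Yield (line_number, title, underline) when the underline is shorter than the title."""
    lines = rst_text.splitlines()
    for index in range(1, len(lines)):
        title = lines[index - 1].rstrip()
        underline = lines[index].rstrip()
        if not title or is_adornment(title):
            continue  # blank line or an overline, so not a title
        if is_adornment(underline) and len(underline) < len(title):
            yield index + 1, title, underline  # 1-based line number of the underline
```

A longer underline, such as the 24 dashes under "Recipe availability", is not reported, matching the behaviour described in the comment above.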
"- [transformer_engine.pytorch.DotProductAttention](../../api/pytorch.rst#transformer_engine.pytorch.dotproductattention)\n",
"- [transformer_engine.jax.flax.DotProductAttention](../../api/jax.rst#transformer_engine.jax.flax.dotproductattention)"
Are you sure about that? Just looked at the current docs page and the link works properly there.
Oh, some debugging leftover. It does also work, but I will remove this change.
Parameters
----------
_input : torch.Tensor
Why _input and not just input?
I also noticed this. This function already uses _input as an argument name, so I documented what is in the code rather than changing it - that would be a breaking change. This function does not render now due to an error, btw.
I put _input into kwargs and changed it into input in args. I removed _input entirely from the docs, because I doubt anyone uses it. In MCore they use only non-keyword arguments.
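The approach described in this thread (a documented wrapper whose first argument is `input`, with `_input` still accepted through keyword arguments) might look roughly like the sketch below. Everything in it is illustrative: `_ScaleByTwo` and `scale_by_two` are hypothetical stand-ins, not the actual `parallel_cross_entropy` code, and the deprecation warning follows a later review summary rather than the diff itself.

```python
import warnings


class _ScaleByTwo:
    """Hypothetical stand-in for an autograd-style class exposing ``apply``."""

    @staticmethod
    def apply(values):
        return [2 * v for v in values]


def scale_by_two(input=None, **kwargs):
    """Double every element of ``input``.

    Parameters
    ----------
    input : list of float
        Values to scale.

    Returns
    -------
    list of float
        The scaled values.
    """
    # Keep accepting the old keyword so existing callers do not break.
    if "_input" in kwargs:
        warnings.warn(
            "'_input' is deprecated; pass 'input' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        input = kwargs.pop("_input")
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    return _ScaleByTwo.apply(input)
```

Unlike a bare `scale_by_two = _ScaleByTwo.apply` assignment, the wrapper carries its own signature and docstring for a documentation tool to pick up.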
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
This PR fixes critical documentation build warnings that prevented half of the PyTorch API from being rendered in v2.9 documentation. The changes resolve cyclic import issues, improve RST formatting, and enforce warning-free builds in CI.
Key changes:
- Fixed cyclic import by moving torch_version from a cached function in __init__.py to a module-level variable in utils.py, with updated imports across affected modules
- Reorganized docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) for better API discoverability
- Improved RST docstring formatting with proper code literals (backticks), bullet lists, and table syntax for better Sphinx rendering
- Added SPHINXOPTS="-W" to GitHub workflow to fail builds on warnings, preventing future documentation regressions
- Added custom logging filter in conf.py to suppress unavoidable duplicate C++ namespace warnings
- Cleared Jupyter notebook execution counts to reduce version control noise
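To reproduce the strict build locally, the same `-W` option can be passed to Sphinx directly. The sketch below assumes Sphinx is installed and that the sources live in `docs` with output in `docs/_build/html`; both paths and the function names are assumptions, not taken from the workflow file.

```python
import subprocess
import sys


def sphinx_build_command(source_dir: str, output_dir: str, strict: bool = True):
    """Build the argument list for an HTML docs build."""
    command = [sys.executable, "-m", "sphinx", "-b", "html"]
    if strict:
        command.append("-W")  # Sphinx option: turn warnings into errors
    command += [source_dir, output_dir]
    return command


def build_docs(source_dir: str = "docs", output_dir: str = "docs/_build/html") -> int:
    """Run the build and return its exit code (non-zero when the build fails)."""
    return subprocess.run(sphinx_build_command(source_dir, output_dir)).returncode
```

Calling `build_docs()` from the repository root would then fail in the same situations as the CI job, assuming the CI job uses an equivalent command.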
Confidence Score: 5/5
- This PR is safe to merge with minimal risk - it fixes documentation issues without affecting runtime behavior
- All changes are focused on documentation quality improvements and formatting fixes. The cyclic import fix is well-executed with consistent updates across all dependent modules. The workflow enforcement ensures future documentation quality. Only minor formatting issue found (missing newline)
- No files require special attention - only a trivial missing newline in docs/api/pytorch.rst
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to Sphinx build to error on warnings, enforcing documentation quality |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ namespace warnings, updated exclude patterns |
| transformer_engine/pytorch/__init__.py | 5/5 | Moved torch_version from function to module-level variable, fixing cyclic import issues |
| docs/api/pytorch.rst | 4/5 | Reorganized API docs with sections, moved deprecated functions, fixed rendering issues |
Sequence Diagram
sequenceDiagram
participant Dev as Developer
participant GHA as GitHub Actions
participant Sphinx as Sphinx Builder
participant Code as Python Code
participant Docs as Documentation
Dev->>Code: Fix cyclic imports (torch_version)
Dev->>Code: Update import paths
Dev->>Docs: Reorganize API structure
Dev->>Docs: Fix RST formatting issues
Dev->>GHA: Add SPHINXOPTS="-W" flag
Note over Dev,Docs: PR Changes Applied
GHA->>Sphinx: Trigger docs build with -W
Sphinx->>Code: Import Python modules
Code-->>Sphinx: Successfully imported (no cycles)
Sphinx->>Docs: Parse RST files
Docs-->>Sphinx: Valid RST syntax
Sphinx->>Sphinx: Apply custom warning filter
Sphinx-->>GHA: Build succeeds (no warnings)
Note over GHA: Future builds will error<br/>if warnings appear
9 files reviewed, 1 comment
.. autoapifunction:: transformer_engine.pytorch.fp8_autocast

.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
syntax: missing final newline
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
.. autoapifunction:: transformer_engine.pytorch.fp8_model_init
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
Fixed critical documentation build issues that prevented half of the PyTorch API from rendering in version 2.9. Resolved cyclic import in pytorch/__init__.py by moving torch_version() to utils.py, and enforced documentation quality by adding -W flag to Sphinx build.
- Cyclic import resolution: Moved torch_version() from pytorch/__init__.py to pytorch/utils.py to break import cycle that prevented API documentation generation
- Sphinx enforcement: Added -W flag to GitHub workflow to treat documentation warnings as build errors
- RST formatting fixes: Corrected underline mismatches in debug documentation files (15+ instances across multiple files)
- Docstring improvements: Enhanced formatting with proper RST syntax (backticks, bullet lists) in attention, cross_entropy, and other modules
- API documentation reorganization: Restructured pytorch.rst with logical sections and moved deprecated functions to the end
- Import corrections: Fixed several import issues in JAX and PyTorch modules (e.g., QuantizeLayout in JAX, torch_version references)
- Warning suppression: Added custom logging filter for unavoidable duplicate C++ declaration warnings
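As an illustration of the cyclic import resolution, a version helper can live in a leaf module that imports nothing from its own package, so submodules can import it without going through `__init__.py`. This is only a sketch under that assumption: `parse_version` is a hypothetical helper, and the review summaries on this page disagree on whether `torch_version` ended up as a cached function or a module-level variable, so the cached form below is one of the two described variants.

```python
import functools


def parse_version(version_string: str) -> tuple:
    """Turn a string such as '2.3.1+cu121' into a comparable tuple (2, 3, 1)."""
    release = version_string.split("+", 1)[0]  # drop any local build suffix
    parts = []
    for piece in release.split(".")[:3]:
        digits = ""
        for character in piece:
            if not character.isdigit():
                break
            digits += character
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


@functools.lru_cache(maxsize=None)
def torch_version() -> tuple:
    """Return the installed PyTorch version, computed on the first call only."""
    # Imported lazily here; the point for the cycle is that this module
    # imports nothing from its parent package.
    import torch

    return parse_version(torch.__version__)
```

Callers would then use an explicit import such as `from .utils import torch_version` rather than importing the name from the package root.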
Confidence Score: 5/5
- This PR is safe to merge with minimal risk - it fixes critical documentation issues without changing runtime behavior
- All changes are documentation-focused (RST formatting, docstrings, import reorganization). The cyclic import fix is a proper refactoring that maintains the same API surface. The workflow change enforces quality going forward. No logic changes to core functionality.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to make Sphinx treat warnings as errors, enforcing documentation quality |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ declaration warnings for transformer_engine namespace |
| transformer_engine/pytorch/__init__.py | 5/5 | Fixed cyclic import by removing torch_version() function from __init__.py (moved to utils.py) |
| transformer_engine/pytorch/utils.py | 5/5 | Added torch_version() function here to resolve cyclic import issues |
| docs/api/pytorch.rst | 5/5 | Reorganized API documentation with sections, moved deprecated functions to the end, added missing newline |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Added explicit function wrapper with full docstring for parallel_cross_entropy to improve API documentation |
Sequence Diagram
sequenceDiagram
participant Dev as Developer
participant GH as GitHub Workflow
participant Sphinx as Sphinx Builder
participant Init as pytorch/__init__.py
participant Utils as pytorch/utils.py
Dev->>GH: Push docs changes
GH->>Sphinx: Run build with -W flag
Note over Sphinx: Before: Warnings ignored
Note over Sphinx: After: Warnings = Errors
Sphinx->>Sphinx: Check RST formatting
Sphinx->>Sphinx: Check docstring syntax
Note over Init,Utils: Cyclic Import Fix
Init->>Utils: torch_version() moved here
Utils->>Utils: Function callable via import
Sphinx->>Sphinx: Parse API docs successfully
Sphinx->>GH: Build succeeds (no warnings)
GH->>Dev: CI passes
35 files reviewed, no comments
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
Fixed critical documentation build warnings that prevented half of the PyTorch API from being rendered in version 2.9 docs. The PR resolves cyclic import issues and enforces documentation quality going forward.
Key Changes:
- Cyclic Import Fix: Moved torch_version() function from pytorch/__init__.py to pytorch/utils.py to break circular dependency that prevented Sphinx from properly importing and documenting modules
- Workflow Enforcement: Added -W flag to Sphinx build to error out on any documentation warnings, preventing future documentation regressions
- RST Formatting: Fixed hundreds of reStructuredText formatting issues including:
  - Section underline length mismatches in debug documentation
  - Code block syntax (converted markdown triple backticks to RST :: syntax)
  - Improved docstring formatting for mathematical expressions and bullet lists
  - Proper use of inline code with double backticks
- API Documentation Reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions)
- Function Wrappers: Added proper function wrappers with complete docstrings (e.g., parallel_cross_entropy) to replace bare Function.apply assignments that Sphinx couldn't document
- Warning Suppression: Added custom logging filter in docs/conf.py to suppress unavoidable duplicate C++ declaration warnings for the transformer_engine namespace
Confidence Score: 5/5
- This PR is safe to merge - it only fixes documentation and doesn't change runtime behavior
- All changes are documentation-focused: RST formatting fixes, docstring improvements, reorganization of docs structure, and a critical cyclic import fix that only affects Sphinx's ability to introspect the codebase. The workflow change ensures future documentation quality. No runtime logic changes.
- No files require special attention - all changes are straightforward documentation fixes
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to error out on documentation warnings |
| transformer_engine/pytorch/__init__.py | 5/5 | Removed torch_version() function to break cyclic import; moved to utils.py |
| transformer_engine/pytorch/utils.py | 5/5 | Added torch_version() function moved from __init__.py to resolve cyclic imports |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Added proper wrapper function with full docstring for parallel_cross_entropy for documentation |
| docs/conf.py | 5/5 | Added custom logging filter to suppress duplicate C++ declaration warnings |
| docs/api/pytorch.rst | 5/5 | Reorganized API documentation sections and moved deprecated functions to dedicated section |
Sequence Diagram
sequenceDiagram
participant Dev as Developer
participant Sphinx as Sphinx Build
participant PyInit as pytorch/__init__.py
participant Utils as pytorch/utils.py
participant Docs as Documentation Output
Note over Dev,Docs: Before: Cyclic Import Issue
Sphinx->>PyInit: Import module
PyInit->>PyInit: torch_version() function
PyInit->>Utils: Import utils
Utils->>PyInit: Import torch_version (CYCLIC!)
Note over Sphinx: API not rendered properly
Note over Dev,Docs: After: Fixed Import Chain
Sphinx->>PyInit: Import module
PyInit->>PyInit: torch_version as variable
PyInit->>Utils: Import utils
Utils->>Utils: torch_version() function
Sphinx->>Docs: All APIs rendered correctly
Note over Dev,Docs: Workflow Enforcement
Dev->>Sphinx: Build docs with -W flag
alt Warnings Found
Sphinx-->>Dev: Build FAILS
else No Warnings
Sphinx-->>Docs: Build succeeds
end
38 files reviewed, no comments
Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Pawel Gadzinski <[email protected]>
for more information, see https://pre-commit.ci
Greptile Overview
Greptile Summary
This PR fixes critical documentation issues that prevented half of the PyTorch API from being rendered in Sphinx documentation and ensures future documentation quality through CI enforcement.
Key Changes
- Cyclic import fix: Removed @functools.lru_cache decorator from torch_version() in transformer_engine/pytorch/__init__.py, converting it to a direct variable assignment to resolve Sphinx import issues
- Missing API documentation: Converted parallel_cross_entropy from a simple function alias to a proper documented function with NumPy-style docstring, making it visible in generated docs
- RST formatting fixes: Corrected underline length mismatches across all debug documentation files (15+ instances) and fixed code block syntax from markdown triple backticks to RST double-colon (::) format
- API reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and moved deprecated functions to the bottom
- CI enforcement: Added SPHINXOPTS="-W" to GitHub workflow to fail builds on any Sphinx warnings, preventing future documentation regressions
- Warning suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ namespace warnings
Impact
This addresses the urgent issue where v2.9 docs were missing significant portions of the PyTorch API that were present in v2.8. The changes are purely documentation-focused with backward compatibility maintained (e.g., _input parameter deprecation warning in parallel_cross_entropy).
Confidence Score: 5/5
- This PR is safe to merge - it fixes critical documentation issues without changing runtime behavior
- All changes are documentation-focused (RST formatting, docstrings, Sphinx configuration) or maintain backward compatibility (cross_entropy parameter renaming with deprecation warning). The cyclic import fix simplifies code without changing functionality. CI enforcement ensures no regressions.
- No files require special attention - all changes are well-structured documentation fixes
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| .github/workflows/docs.yml | 5/5 | Added -W flag to make Sphinx error on warnings, enforcing documentation quality |
| docs/conf.py | 4/5 | Added warning filter for duplicate C++ declarations, excluded sphinx_rtd_theme from patterns |
| docs/api/pytorch.rst | 5/5 | Reorganized API docs with sections, moved deprecated functions to bottom, added missing newline |
| transformer_engine/pytorch/__init__.py | 5/5 | Removed torch_version() function decorator to fix cyclic import issues with documentation |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Renamed _input to input, added proper function wrapper with NumPy-style docstring |
Sequence Diagram
sequenceDiagram
participant Dev as Developer
participant CI as GitHub CI
participant Sphinx as Sphinx Builder
participant Docs as Documentation
participant PyAPI as PyTorch API
Note over Dev,PyAPI: Documentation Fix PR Flow
Dev->>CI: Push PR with doc fixes
CI->>Sphinx: Run with SPHINXOPTS="-W"
Note over Sphinx: Critical Fix: Cyclic Import
Sphinx->>PyAPI: Import PyTorch modules
PyAPI-->>Sphinx: ✓ No cyclic imports (torch_version simplified)
Note over Sphinx: Fix: RST Formatting
Sphinx->>Docs: Parse RST files
Docs-->>Sphinx: ✓ Underlines match titles
Docs-->>Sphinx: ✓ Code blocks use :: syntax
Note over Sphinx: Fix: API Documentation
Sphinx->>PyAPI: Generate API docs
PyAPI-->>Sphinx: ✓ parallel_cross_entropy has docstring
PyAPI-->>Sphinx: ✓ All functions now rendered
Note over Sphinx: Enhancement: Warning Filter
Sphinx->>Sphinx: Filter C++ namespace duplicates
Sphinx-->>CI: Build success (no warnings)
CI-->>Dev: ✓ All checks pass
38 files reviewed, no comments
|
/te-ci |
docs/api/jax.rst
Outdated
|
|
||
| Pre-defined Variable of Logical Axes | ||
| ------------------------------------ | ||
| ------------------------------------- |
Was this needed to fix the rendering? The earlier version seems fine and of the correct length
Yeah, that's incorrect. Fixed.
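For context on this exchange: docutils only warns ("Title underline too short.") when a section underline is shorter than its title, so the original 36-character underline under the 36-character title was already valid. A minimal sketch of a checker for that rule follows; `find_short_underlines` and `ADORNMENT` are hypothetical names, only a subset of the valid adornment characters is recognised, and overlined titles are not handled.

```python
import re

# A line made of one repeated punctuation character (subset of RST adornments).
ADORNMENT = re.compile(r"^([-=~^#*+`_])\1+$")


def find_short_underlines(text: str):
    """Return (line_number, title, title_length, underline_length) for each
    underline that is shorter than the title directly above it."""
    lines = text.splitlines()
    problems = []
    for i in range(1, len(lines)):
        title = lines[i - 1].rstrip()
        underline = lines[i].rstrip()
        if not title.strip() or ADORNMENT.match(title):
            continue  # blank line or another adornment line, not a title
        if ADORNMENT.match(underline) and len(underline) < len(title):
            problems.append((i + 1, title, len(title), len(underline)))
    return problems


sample = (
    "Pre-defined Variable of Logical Axes\n"
    + "-" * 36
    + "\n\nModules\n----\n"
)
for line_no, title, want, got in find_short_underlines(sample):
    print(f"line {line_no}: {title!r} needs {want} characters, found {got}")
```

An underline longer than the title is accepted by docutils, so the checker only flags the shorter case.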
| * 'off-by-one': ``S[:,:,:,i] = exp(S[:,:,:,i])/(1 + sum(exp(S[:,:,:,:]), dim=-1))`` | ||
| * 'learnable': ``S[:,j,:,i] = exp(S[:,j,:,i])/(exp(alpha[j]) + sum(exp(S[:,j,:,:]), dim=-1))`` | ||
| where ``alpha`` is a learnable parameter in shape ``[h]``. |
| where ``alpha`` is a learnable parameter in shape ``[h]``. | |
| where ``alpha`` is a learnable parameter of shape ``[h]``. |
I changed all occurrences in TE (around 10).
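The two softmax variants quoted in the docstring above differ only in the extra term added to the denominator. A small worked sketch in plain Python, for a single row of scores belonging to one head (so `alpha` is the scalar for that head); the function names are illustrative and this is not the library's implementation:

```python
import math


def softmax_off_by_one(scores):
    """exp(s_i) / (1 + sum_k exp(s_k)) for one row of scores."""
    exps = [math.exp(s) for s in scores]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps]


def softmax_learnable(scores, alpha):
    """exp(s_i) / (exp(alpha) + sum_k exp(s_k)) for one row of one head."""
    exps = [math.exp(s) for s in scores]
    denom = math.exp(alpha) + sum(exps)
    return [e / denom for e in exps]


row = [0.5, -1.0, 2.0]
print(softmax_off_by_one(row))
print(softmax_learnable(row, alpha=0.0))
```

With `alpha = 0` the learnable form reduces to the off-by-one form, since `exp(0) = 1`. In both variants the outputs sum to less than 1 because of the extra denominator term.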
| torch_version = PkgVersion(str(torch.__version__)).release | ||
| assert torch_version >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version}." |
Please change the instances at the call sites as well. For example, linear.py still calls torch_version().
| @functools.lru_cache(maxsize=None) | ||
| def torch_version() -> tuple[int, ...]: | ||
| """Get PyTorch version""" | ||
| return PkgVersion(str(torch.__version__)).release |
Is this redefined here because of cyclic import issues when importing directly from __init__.py? We should have it defined in only one location.
If the one in __init__.py is causing import issues, please remove it and change all instances to import from utils instead.
Yes, that is the case. And I cannot import utils before dynamically loading the C++ lib, because it imports some other torch classes. So I guess there are two ways of doing that: create a file torch_version.py containing only that function, or do what I did. I do not want to add a file for a one-line function. What do you think?
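To make the `torch_version.py` option concrete, here is a sketch of what a dependency-free module of that kind could look like. It is stdlib-only so it runs without PyTorch: `release_tuple` is a hypothetical stand-in for `PkgVersion(str(torch.__version__)).release`, and the hard-coded version string stands in for `torch.__version__`.

```python
import functools
import re


@functools.lru_cache(maxsize=None)
def release_tuple(version_string: str) -> tuple[int, ...]:
    """Parse the leading release segment, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"Cannot parse version: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


# Option A: callers import the cached function and call it when needed.
# Option B: compute once at import time and export a plain variable.
TORCH_RELEASE = release_tuple("2.1.0+cu121")  # placeholder for torch.__version__
assert TORCH_RELEASE >= (2, 1), f"Minimum torch version 2.1 required. Found {TORCH_RELEASE}."
```

Because such a module imports nothing from the rest of the package, both `__init__.py` and `utils.py` could import from it without creating a cycle, which keeps a single definition as the reviewer asks.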
| FP32 only. The returned loss is always in FP32, the input gradients are upcasted | ||
| to the datatype of the input. | ||
| If ``dist_process_group`` is passed for distributed loss calculation, the input to each |
There are multiple instances where we use a single backtick in the docstrings instead of double backticks; this PR seems like a good place to change all of those as well.
Ok, I did it and it took a lot of time and a lot of files were changed. Also, I discovered more formatting errors, so this PR is not that small now.
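In reStructuredText a single-backtick span is interpreted text (the default role), not an inline literal, which is why docstring code references need double backticks. A rough sketch of how such spans could be promoted mechanically is below. It is not the process used in this PR; `promote_single_backticks` is a hypothetical helper, and the regex deliberately skips existing literals, roles such as :meth:`forward`, and hyperlink references.

```python
import re

# Alternatives are tried left to right; only the last one captures group 1.
_TOKEN = re.compile(
    r"``.+?``"                 # already an inline literal
    r"|:[\w:.+-]+:`[^`\n]+`"   # explicit role, e.g. :meth:`forward`
    r"|`[^`\n]+`__?"           # hyperlink reference, e.g. `docs <url>`_
    r"|`([^`\n]+)`"            # bare single-backtick span
)


def promote_single_backticks(text: str) -> str:
    """Rewrite bare `x` spans as ``x`` and leave other constructs untouched."""

    def fix(match: re.Match) -> str:
        inner = match.group(1)
        return match.group(0) if inner is None else f"``{inner}``"

    return _TOKEN.sub(fix, text)


doc = "Default is `True`; see :meth:`forward`, ``kept``, and `docs <https://example.com>`_."
print(promote_single_backticks(doc))
```

A pass like this still needs manual review, since some single-backtick spans are intentional cross-references under a configured default role.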
| def forward( | ||
| ctx, | ||
| _input, | ||
| input, |
That's a good point actually.
Changed to inp
Signed-off-by: Pawel Gadzinski <[email protected]>
Greptile Overview
Greptile Summary
Fixed documentation rendering issues by correcting RST formatting errors and resolving cyclic import problems. The changes include:
- Fixed underline lengths in `docs/api/jax.rst` to match section titles ("Jax", "Checkpointing", "Modules")
- Renamed `_input` parameter to `input` in `cross_entropy.py` to follow Python naming conventions and enable proper Sphinx autodoc rendering
- Converted `parallel_cross_entropy` from a bare function reference to a proper function wrapper with comprehensive NumPy-style docstring
- Added backward compatibility with deprecated `_input` parameter via keyword-only argument with `FutureWarning`
- Added missing imports (`typing.Optional`, `warnings`) to support the new implementation
These changes restore the missing PyTorch API documentation that wasn't being rendered due to cyclic imports caused by the underscore-prefixed parameter name.
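The backward-compatibility pattern described in this summary (a keyword-only deprecated `_input` argument that emits a `FutureWarning`) can be sketched as follows. This is an illustration under assumptions, not the code in `cross_entropy.py`: the function name and `_toy_cross_entropy` are hypothetical, the toy loss handles a single row of logits in plain Python, and the new parameter is called `inp`, the name the review thread settled on.

```python
import math
import warnings
from typing import Optional, Sequence


def _toy_cross_entropy(logits: Sequence[float], target: int) -> float:
    """Stand-in for the real kernel: -log softmax(logits)[target]."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum_exp - logits[target]


def parallel_cross_entropy_sketch(
    inp: Optional[Sequence[float]] = None,
    target: Optional[int] = None,
    *,
    _input: Optional[Sequence[float]] = None,  # deprecated, keyword-only
) -> float:
    if _input is not None:
        warnings.warn(
            "'_input' is deprecated; pass the logits as 'inp' instead.",
            FutureWarning,
            stacklevel=2,
        )
        if inp is not None:
            raise TypeError("Pass either 'inp' or the deprecated '_input', not both.")
        inp = _input
    if inp is None or target is None:
        raise TypeError("Both the logits and the target are required.")
    return _toy_cross_entropy(inp, target)


# Positional calls keep working and do not warn.
print(parallel_cross_entropy_sketch([2.0, 0.5, -1.0], 0))
```

Making the old name keyword-only means positional callers are unaffected by the rename, while callers who passed `_input=` by name get a warning instead of a `TypeError`.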
Confidence Score: 4/5
- Safe to merge with minor verification recommended - backward compatibility is preserved through deprecation warnings
- The changes are well-implemented with proper backward compatibility handling. Score of 4 (not 5) because the deprecation warning approach should be tested to ensure existing code using positional arguments won't break, and the existing test suite appears to use positional calls that might need attention
- Verify that `tests/pytorch/test_parallel_cross_entropy.py` still passes - it uses positional arguments which should work, but worth confirming the backward compatibility mechanism functions correctly
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| docs/api/jax.rst | 5/5 | Fixed RST heading underline lengths to match title lengths for proper documentation rendering |
| transformer_engine/pytorch/cross_entropy.py | 4/5 | Renamed _input to input throughout, added proper function wrapper with NumPy-style docstring and backward compatibility deprecation warning |
Sequence Diagram
sequenceDiagram
participant User as User Code
participant PCE as parallel_cross_entropy()
participant CEF as CrossEntropyFunction
participant TCE as triton_cross_entropy
User->>PCE: Call with input/target
Note over PCE: Check for deprecated _input param
alt _input is provided
PCE->>PCE: Issue FutureWarning
PCE->>PCE: Use _input as input
end
PCE->>CEF: .apply(input, target, ...)
CEF->>TCE: cross_entropy_forward(input, ...)
TCE-->>CEF: Return (loss, input)
CEF->>CEF: save_for_backward(input)
CEF-->>PCE: Return loss
PCE-->>User: Return loss
Note over User,TCE: Backward Pass
User->>CEF: backward(grad_output)
CEF->>CEF: Retrieve saved input
CEF->>TCE: cross_entropy_backward(input, ...)
TCE-->>CEF: Return grad_input
CEF-->>User: Return (grad_input, None, ...)
2 files reviewed, no comments
Signed-off-by: Pawel Gadzinski <[email protected]>
for more information, see https://pre-commit.ci
Greptile Overview
Greptile Summary
This PR fixes critical documentation build warnings by resolving cyclic imports and correcting RST formatting issues throughout the codebase.
Key Changes:
- Cyclic Import Resolution: Moved `torch_version` function from `pytorch/__init__.py` to `pytorch/utils.py` and converted it from a cached function to a simple variable in `__init__.py`, breaking the circular dependency that prevented half of the PyTorch API from being rendered
- Documentation Structure: Reorganized `docs/api/pytorch.rst` with better sectioning (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and improved API organization
- RST Formatting: Fixed hundreds of docstring formatting issues across 50+ files:
  - Changed backtick-quoted defaults like `True` to proper RST format ``True``
  - Fixed code references from single quotes to double backticks (e.g., 'forward' → :meth:`forward`)
  - Added proper math rendering using :math: directives
  - Fixed inline code formatting throughout
- API Improvements: Enhanced `parallel_cross_entropy` by converting from bare function assignment to a proper wrapper with full documentation and backward compatibility for the `_input` parameter
- CI Enhancement: Added `SPHINXOPTS="-W"` to GitHub workflow to fail builds on any documentation warnings
- Warning Suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ declaration warnings
The changes are purely documentation-focused with minimal functional impact, except for the beneficial refactoring of the torch_version utility.
Confidence Score: 5/5
- This PR is safe to merge - changes are documentation-focused with well-tested import refactoring
- All changes are either documentation formatting improvements or minor refactoring to fix cyclic imports. The cyclic import fix is clean and preserves backward compatibility. The only remaining issue is a missing newline in docs/api/pytorch.rst
- docs/api/pytorch.rst needs the final newline added
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| docs/api/pytorch.rst | 4/5 | Reorganized API documentation with better sectioning; still missing final newline |
| transformer_engine/pytorch/__init__.py | 5/5 | Fixed cyclic import by converting torch_version from function to variable |
| transformer_engine/pytorch/utils.py | 5/5 | Moved torch_version function here from __init__.py to resolve cyclic imports |
| transformer_engine/pytorch/cross_entropy.py | 5/5 | Renamed _input to inp and added proper function wrapper with docstrings |
| transformer_engine/pytorch/module/base.py | 5/5 | Fixed docstring formatting and removed unused Recipe import that caused cyclic dependency |
| .github/workflows/docs.yml | 5/5 | Added SPHINXOPTS=-W to fail builds on documentation warnings |
Sequence Diagram
sequenceDiagram
participant Dev as Developer
participant GH as GitHub Workflow
participant Sphinx as Sphinx Builder
participant Import as Import System
participant Docs as Generated Docs
Dev->>GH: Push documentation changes
GH->>Sphinx: Build docs with SPHINXOPTS="-W"
Note over Sphinx,Import: Before this PR: Cyclic import issue
Import--xImport: torch_version() in __init__.py<br/>creates circular dependency
Import--xDocs: Half of PyTorch API not rendered
Note over Sphinx,Import: After this PR: Import fixed
Import->>Import: torch_version moved to utils.py
Import->>Import: __init__.py uses simple variable
Import->>Docs: All PyTorch API successfully imported
Sphinx->>Sphinx: Process RST files
Sphinx->>Sphinx: Parse docstrings with fixed formatting
Sphinx->>Sphinx: Render math expressions correctly
Sphinx->>Sphinx: Apply custom warning filter for C++ duplicates
alt Documentation has warnings
Sphinx->>GH: Build fails (exit code != 0)
GH->>Dev: CI reports failure
else Documentation clean
Sphinx->>GH: Build succeeds
GH->>Docs: Upload documentation artifact
GH->>Dev: CI passes
end
31 files reviewed, 1 comment
|
|
||
| .. autoapifunction:: transformer_engine.pytorch.fp8_autocast | ||
|
|
||
| .. autoapifunction:: transformer_engine.pytorch.fp8_model_init |
syntax: missing final newline
| .. autoapifunction:: transformer_engine.pytorch.fp8_model_init | |
| .. autoapifunction:: transformer_engine.pytorch.fp8_model_init |
Description
Our documentation build produced a lot of warnings, and some of them turned out to be legitimate: about half of our PyTorch API was not rendered. This PR fixes all the warnings and makes the GitHub workflow fail if any docs warning appears.
The most serious error was related to cyclic imports, which resulted in part of our PyTorch API not being rendered. The others were mostly related to wrong formatting, and some warnings had no visible effect.
Our docs for 2.9 contain only part of the PyTorch API exposed in 2.8, so this needs an urgent fix; I will update them in a separate PR.
Type of change
Checklist: