
Support non orthogonal parallel axes and explicit replication annotation in dump comparator #19679

Merged
fzyzcjy merged 539 commits into sgl-project:main from fzyzcjy:ac8420/32
Mar 2, 2026

Collaborator

@fzyzcjy fzyzcjy commented Mar 2, 2026

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

…h dims+side field, move e2e tests to test_entrypoint.py
Move existing token_aligner logic into smart/ subdirectory.
Add concat.py for simple BS=1 concat+truncate mode.
AlignerPlan gains token_aligner_mode field for mode dispatch.
Default mode is concat (no aux tensors needed).
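As a rough illustration of what a BS=1 "concat + truncate" mode could look like, the sketch below concatenates the per-step token rows of each side and then cuts both sides to the shorter token count. This is an assumption about the semantics, not the code in concat.py: the function names are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List, Sequence, Tuple


def concat_steps(steps: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-step token rows in step order (batch size 1)."""
    out: List[float] = []
    for step in steps:
        out.extend(step)
    return out


def concat_and_truncate(
    baseline_steps: Sequence[Sequence[float]],
    target_steps: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Concatenate each side, then cut both to the shorter token count."""
    baseline = concat_steps(baseline_steps)
    target = concat_steps(target_steps)
    n = min(len(baseline), len(target))
    return baseline[:n], target[:n]


# Hypothetical data: the target ran one more decode step than the baseline.
baseline, target = concat_and_truncate(
    baseline_steps=[[1.0, 2.0, 3.0], [4.0]],
    target_steps=[[1.0, 2.0, 3.0], [4.0], [5.0]],
)
print(baseline, target)
```

Under this reading no aux tensors are needed, because alignment relies only on step order and token count.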
TokenAlignerMode is an implementation detail of the token_aligner
package. Outer modules (bundle_comparator, planner, entrypoint)
now use Optional[str] instead of the typed alias.
Tests in TestEntrypointAlignment rely on aux tensor filtering which
only happens in smart mode. With the new concat default, these tests
need to explicitly request smart mode.
THD zigzag tests require thd_global_seq_lens from aux tensor processing,
which only happens in smart mode.
11 new tests in TestEntrypointConcatMode covering:
- multi-step concat with different data + truncation
- concat combined with TP/CP unshard
- unequal step counts between baseline and target
- token dim resolution (t / s / fallback dim 0)
- step ordering preservation
- aux tensors not filtered in concat mode
- aligner_plan field verification
- failure path validation

Also adds two helpers: _create_multi_step_rank_dump and
_create_multi_step_tp_sharded_dumps for tests needing
different tensors per step.
layer_id=[012] is not valid Python (leading zero in integer literal).
Change to layer_id in [0, 1, 2] which eval() can handle correctly.
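The difference between the two spellings can be checked directly. The sketch below assumes the expression is a filter that gets passed to eval() with layer_id bound as a variable; the helper names are hypothetical.

```python
def is_valid_expression(expr: str) -> bool:
    """Return True if `expr` compiles as a Python expression."""
    try:
        compile(expr, "<filter>", "eval")
    except SyntaxError:
        return False
    return True


def matches_filter(expr: str, **variables: object) -> bool:
    """Evaluate a filter expression with the given variables bound."""
    return bool(eval(expr, {"__builtins__": {}}, variables))


old_ok = is_valid_expression("layer_id=[012]")
new_ok = is_valid_expression("layer_id in [0, 1, 2]")
hit = matches_filter("layer_id in [0, 1, 2]", layer_id=1)
miss = matches_filter("layer_id in [0, 1, 2]", layer_id=7)
print(old_ok, new_ok, hit, miss)  # False True True False
```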
Add ParallelAxis.DP, ParallelState enum (SHARDED/CONCATED),
DimSpec.parallel_state field, and axis:state colon parsing
(e.g. "t(dp:concated)") to support DP Attention scenarios.
Concated axes (parallel_state == CONCATED) are excluded from both
sharded and replicated sets, so they don't participate in unshard,
validation, or coord tracking. This makes DP transparent when
tensors are already dp_gathered (concated).
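A minimal sketch of the axis:state idea follows. The names ParallelAxis, ParallelState, and DimSpec.parallel_state come from the commit message; the `parallel` field, the regex, and the helper functions are assumptions made for illustration and may differ from the real parser.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Set


class ParallelAxis(Enum):
    TP = "tp"
    CP = "cp"
    DP = "dp"


class ParallelState(Enum):
    SHARDED = "sharded"
    CONCATED = "concated"


@dataclass(frozen=True)
class DimSpec:
    name: str
    parallel: Optional[ParallelAxis] = None  # hypothetical field name
    parallel_state: ParallelState = ParallelState.SHARDED


# name, optionally followed by "(axis)" or "(axis:state)"
_TOKEN = re.compile(r"^(\w+)(?:\((\w+)(?::(\w+))?\))?$")


def parse_dim(token: str) -> DimSpec:
    match = _TOKEN.match(token)
    if match is None:
        raise ValueError(f"invalid dim token: {token!r}")
    name, axis, state = match.groups()
    if axis is None:
        return DimSpec(name=name)
    return DimSpec(
        name=name,
        parallel=ParallelAxis(axis),  # raises ValueError on unknown axis
        parallel_state=ParallelState(state) if state else ParallelState.SHARDED,
    )


def axes_to_unshard(dims: Iterable[DimSpec]) -> Set[ParallelAxis]:
    """Only SHARDED axes take part in unsharding; CONCATED ones are skipped."""
    return {
        d.parallel
        for d in dims
        if d.parallel is not None and d.parallel_state is ParallelState.SHARDED
    }


dims = [parse_dim("t(dp:concated)"), parse_dim("h(tp)")]
print(axes_to_unshard(dims))
```

With these definitions the DP axis never reaches the unsharder, which matches the "transparent when already gathered" behavior described above.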
- test_dims: axis:state parsing, DP axis, error cases, dp_attn_mlp_dims
- test_planner: TestConcatedAxes (skipped/empty/not-replicated/not-in-info)
- test_parallel_info: test_dp_info
- entrypoint test_planner: dp_concated_with_tp_sharded, _make_meta dp support
TestEntrypointDpConcated: test_dp2_tp2_concated and test_dp2_tp4_concated
verify that DP concated axis is skipped by unsharder while TP is
correctly concatenated, producing unified_shape matching full tensor.
fzyzcjy added 23 commits March 1, 2026 10:21
# Conflicts:
#	test/registered/debug_utils/comparator/aligner/unsharder/test_executor.py
# Conflicts:
#	python/sglang/srt/debug_utils/comparator/bundle_comparator.py
#	python/sglang/srt/debug_utils/comparator/dims.py
#	python/sglang/srt/debug_utils/comparator/entrypoint.py
#	test/registered/debug_utils/comparator/aligner/reorderer/test_planner.py
#	test/registered/debug_utils/comparator/aligner/unsharder/test_executor.py
#	test/registered/debug_utils/comparator/test_dims.py
#	test/registered/debug_utils/comparator/test_entrypoint.py
#	test/registered/debug_utils/comparator/test_model_validation.py
#	test/registered/debug_utils/source_patcher/test_code_patcher.py
#	test/registered/debug_utils/source_patcher/test_dumper_integration.py
#	test/registered/debug_utils/source_patcher/test_source_editor.py
# Conflicts:
#	test/registered/debug_utils/comparator/test_entrypoint.py
# Conflicts:
#	test/registered/debug_utils/comparator/test_dims.py
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the dump comparator's ability to handle complex tensor layouts and parallelization strategies. By introducing a new dimension specification system, it enables the tool to accurately align and compare tensors with fused dimensions and explicitly declared replicated axes. The refactored logging and CLI presets further enhance the usability and diagnostic capabilities of the comparator, making it more adaptable to diverse debugging needs.
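To make the fused-dimension and replication annotations concrete, here is a small sketch of how a dims string might be split into its parts. It assumes whitespace-separated dimension tokens, fused dimensions written as `(a*b)` without inner spaces, and a trailing `# axis:replicated` comment; the `ParsedDims` type and `parse_dims` body are illustrative and not the dims_spec implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ParsedDims:
    # One entry per tensor dimension; more than one name means a fused dim.
    dims: List[Tuple[str, ...]]
    replicated_axes: Set[str] = field(default_factory=set)


def parse_dims(dims_str: str) -> ParsedDims:
    body, _, comment = dims_str.partition("#")

    dims: List[Tuple[str, ...]] = []
    for token in body.split():
        if token.startswith("(") and token.endswith(")"):
            sub_names = tuple(part.strip() for part in token[1:-1].split("*"))
        else:
            sub_names = (token,)
        if not all(name.isidentifier() for name in sub_names):
            raise ValueError(f"invalid dim token: {token!r}")
        dims.append(sub_names)

    replicated: Set[str] = set()
    for decl in comment.split():
        axis, sep, state = decl.partition(":")
        if sep and state == "replicated":
            replicated.add(axis)

    return ParsedDims(dims=dims, replicated_axes=replicated)


parsed = parse_dims("t (num_heads*head_dim) # tp:replicated")
print(parsed.dims, parsed.replicated_axes)
```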

Highlights

  • Enhanced Dimension Specification: Introduced a new dims_spec package to replace the simpler dims module. This new package provides a more robust and flexible way to parse and interpret dimension strings, supporting complex annotations like fused dimensions (e.g., (num_heads*head_dim)) and explicit replication (e.g., # tp:replicated).
  • Non-Orthogonal Parallel Axes Support: The axis aligner now correctly handles non-orthogonal parallel axes, allowing for more sophisticated tensor comparison across different parallelization schemes. This includes new logic for semantic name matching, canonical order building, and side-specific pattern generation for einops rearrangements.
  • Explicit Replication Annotation: Added support for explicitly marking parallel axes as replicated using a new # axis:replicated syntax in dimension strings. This allows the comparator to correctly identify and verify replicated tensors, preventing false positives in comparisons.
  • Refactored Logging System: Replaced the warning_sink with a more structured log_sink that differentiates between ErrorLog and InfoLog. This improves clarity and allows for better categorization of messages during the comparison process.
  • Improved CLI Argument Handling with Presets: Implemented a preset system for command-line arguments, allowing users to quickly configure the comparator for common scenarios (e.g., sglang_dev, sglang_megatron). This simplifies usage and reduces boilerplate.
  • Extended Parallel Info Collection: The dumper now collects additional parallelization information, specifically moe_dp_rank, moe_dp_size, attn_cp_rank, and attn_cp_size, providing more comprehensive metadata for debugging.
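The preset mechanism in the highlights could be sketched with argparse as below. Only the preset names sglang_dev and sglang_megatron appear in the summary; the `--preset` and `--token-aligner` flags and the values each preset expands to are hypothetical placeholders.

```python
import argparse
from typing import Dict, List, Sequence

# Hypothetical preset table: names from the summary, expansions invented.
PRESETS: Dict[str, List[str]] = {
    "sglang_dev": ["--token-aligner", "concat"],
    "sglang_megatron": ["--token-aligner", "smart"],
}


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--preset", choices=sorted(PRESETS), default=None)
    parser.add_argument("--token-aligner", default=None)

    # First pass: find out which preset (if any) was requested.
    pre, _ = parser.parse_known_args(argv)
    preset_args = PRESETS[pre.preset] if pre.preset else []

    # Preset flags go first so explicit flags later on the command line win.
    return parser.parse_args([*preset_args, *argv])


from_preset = parse_args(["--preset", "sglang_megatron"])
overridden = parse_args(["--preset", "sglang_megatron", "--token-aligner", "concat"])
print(from_preset.token_aligner, overridden.token_aligner)
```

Placing the expanded preset before the user's own arguments relies on argparse keeping the last value seen for a repeated option.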


Changelog
  • python/sglang/srt/debug_utils/comparator/__init__.py
    • Renamed ComparisonRecord to TensorComparisonRecord for clarity.
    • Updated model_rebuild() call to reflect the new class name.
  • python/sglang/srt/debug_utils/comparator/aligner/axis_aligner.py
    • Updated imports to use the new dims_spec package and log_sink.
    • Refactored compute_axis_aligner_plan to use DimSpec objects and new helper functions for canonical order and pattern building.
    • Added _semantic_names_match to verify semantic dimension name consistency.
    • Introduced _expand_and_skip_squeeze for processing dimension specifications.
    • Implemented _build_canonical_order to determine a unified dimension order, supporting fused dimensions.
    • Created _build_side_pattern to generate einops rearrangement patterns for each tensor side.
    • Added validation for the side argument in execute_axis_aligner_plan.
  • python/sglang/srt/debug_utils/comparator/aligner/axis_swapper.py
    • Adjusted parse_dims calls to access the .dims attribute from the new DimsSpec object.
  • python/sglang/srt/debug_utils/comparator/aligner/entrypoint/planner.py
    • Updated imports to reflect the new dims_spec package and its types.
    • Modified compute_per_step_sub_plans to extract and pass explicit_replicated_axes to the unsharder planner.
  • python/sglang/srt/debug_utils/comparator/aligner/reorderer/executor.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/reorderer/planner.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/concat_steps/executor.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/entrypoint.py
    • Replaced GeneralWarning with InfoLog and warning_sink with log_sink.
    • Refactored token aligner mode selection logic.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/smart/aux_loader.py
    • Updated imports to use dims_spec, ErrorLog, InfoLog, and log_sink.
    • Replaced GeneralWarning instances with ErrorLog or InfoLog.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/smart/aux_plugins.py
    • Updated imports to use dims_spec, InfoLog, and log_sink.
    • Modified infer_cp_sharded_dims to use [] for modifiers instead of ().
    • Replaced GeneralWarning instances with InfoLog.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/smart/executor.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/token_aligner/smart/types.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/unsharder/executor.py
    • Updated imports to use the new dims_spec package.
    • Refactored _verify_replicated_group into _check_replicated_pair to handle optional diff information.
  • python/sglang/srt/debug_utils/comparator/aligner/unsharder/parallel_info.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/aligner/unsharder/planner.py
    • Updated imports to use the new dims_spec package.
    • Modified compute_unsharder_plan to accept explicit_replicated_axes and use sanitized_name for modifiers.
    • Added logic to automatically mark RECOMPUTE_PSEUDO as replicated.
    • Introduced _validate_explicit_replicated for robust validation of replicated axis declarations.
  • python/sglang/srt/debug_utils/comparator/aligner/unsharder/types.py
    • Updated imports to use the new dims_spec package.
  • python/sglang/srt/debug_utils/comparator/bundle_comparator.py
    • Updated imports to use dims_spec, ErrorLog, NonTensorComparisonRecord, SkipComparisonRecord, TensorComparisonRecord, and log_sink.
    • Adjusted return types for comparison functions to reflect new record types.
    • Integrated log_sink for collecting errors and infos.
    • Changed the order of DP filter and meta override application.
    • Added _extract_dp_alias_from_items to support DP group aliases.
  • python/sglang/srt/debug_utils/comparator/dims.py
    • Removed the file, as its functionality has been refactored into the dims_spec package.
  • python/sglang/srt/debug_utils/comparator/dims_spec/__init__.py
    • Added a new package for structured dimension specification, re-exporting all new modules.
  • python/sglang/srt/debug_utils/comparator/dims_spec/comment_parser.py
    • Added a new module to parse comment suffixes in dimension strings for DP aliases and replicated axes.
  • python/sglang/srt/debug_utils/comparator/dims_spec/dim_parser.py
    • Added a new module to parse individual dimension tokens, including support for fused dimensions like (a*b).
  • python/sglang/srt/debug_utils/comparator/dims_spec/dims_parser.py
    • Added a new module to parse full dimension strings, handling _SingletonDimUtil, fused dimensions, and extracting DP group aliases and replicated axes.
  • python/sglang/srt/debug_utils/comparator/dims_spec/modifier_parser.py
    • Added a new module to parse dimension modifiers (e.g., [tp:partial]).
  • python/sglang/srt/debug_utils/comparator/dims_spec/tensor_naming.py
    • Added a new module with utilities for finding, resolving, applying, and stripping dimension names from tensors.
  • python/sglang/srt/debug_utils/comparator/dims_spec/types.py
    • Added a new module defining core types for dimension specification, including DimSpec, DimsSpec, ParallelAxis, Ordering, Reduction, ParallelModifier, and TokenLayout.
  • python/sglang/srt/debug_utils/comparator/dp_utils.py
    • Modified filter_to_non_empty_dp_rank and _extract_dp_info to accept an optional dp_group_alias for flexible data parallel group filtering.
  • python/sglang/srt/debug_utils/comparator/entrypoint.py
    • Removed unused re import.
    • Updated imports to use new output record types and log_sink.
    • Refactored _compute_exit_code into utils.py.
    • Simplified _compute_skip_keys and removed has_token_aligner argument.
    • Updated _consume_comparison_records to handle new record types and return failed_names.
    • Introduced parse_args to handle CLI argument parsing and preset expansion.
    • Added allow_failed_pattern CLI argument for controlling exit code behavior.
    • Refactored _maybe_load_tokenizer for improved logic.
  • python/sglang/srt/debug_utils/comparator/log_sink.py
    • Added a new module implementing a structured logging sink, replacing the previous warning_sink.
  • python/sglang/srt/debug_utils/comparator/output_types.py
    • Refactored GeneralWarning into BaseLog, ErrorLog, and InfoLog.
    • Added _split_logs helper for partitioning log types.
    • Renamed SkipRecord to SkipComparisonRecord, ComparisonRecord to TensorComparisonRecord, and NonTensorRecord to NonTensorComparisonRecord.
    • Introduced RecordLocation to track the step of a comparison.
    • Updated _OutputRecord to manage errors and infos lists.
    • Renamed WarningRecord to LogRecord.
  • python/sglang/srt/debug_utils/comparator/per_token_visualizer.py
    • Updated ComparisonRecord to TensorComparisonRecord in function signatures.
  • python/sglang/srt/debug_utils/comparator/preset.py
    • Added a new module defining CLI argument presets for common comparator configurations.
  • python/sglang/srt/debug_utils/comparator/tensor_comparator/formatter.py
    • Modified format_replicated_checks to gracefully handle cases where diff might be None.
  • python/sglang/srt/debug_utils/comparator/utils.py
    • Added compute_exit_code and _is_all_match_pattern for robust exit code determination based on comparison results and allowance patterns.
  • python/sglang/srt/debug_utils/comparator/warning_sink.py
    • Removed the file, as its functionality has been replaced by log_sink.py.
  • python/sglang/srt/debug_utils/dumper.py
    • Extended parallel info collection to include moe_dp_rank, moe_dp_size, attn_cp_rank, and attn_cp_size.
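One plausible reading of the exit-code entries in this changelog is sketched below: the run fails unless every failed tensor name matches the allowed pattern. The signature and the choice of a full regex match are assumptions; the real compute_exit_code in utils.py may take different arguments.

```python
import re
from typing import Iterable, Optional


def compute_exit_code(
    failed_names: Iterable[str],
    allow_failed_pattern: Optional[str] = None,
) -> int:
    """Return 0 if nothing failed or every failure is explicitly allowed."""
    failed = list(failed_names)
    if not failed:
        return 0
    if allow_failed_pattern is None:
        return 1
    pattern = re.compile(allow_failed_pattern)
    all_allowed = all(pattern.fullmatch(name) for name in failed)
    return 0 if all_allowed else 1


print(compute_exit_code([]))
print(compute_exit_code(["layer0.attn"], r"layer\d+\.attn"))
print(compute_exit_code(["layer0.attn", "mlp"], r"layer\d+\.attn"))
```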

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant and well-executed refactoring of the dump comparator utility. The core of the changes is the new dims_spec package, which provides a more structured and robust way to parse and handle tensor dimension specifications, including support for fused dimensions and explicit replication annotations. The logging system has also been improved by replacing the warning_sink with a more expressive log_sink that distinguishes between errors and informational messages. Additionally, the command-line interface is now more user-friendly with the introduction of presets. Overall, these changes greatly enhance the functionality and maintainability of the comparator. I've found one minor issue with a duplicated function.
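For readers unfamiliar with the pattern, a sink that separates errors from informational messages might look roughly like this. The class names BaseLog, ErrorLog, and InfoLog appear in the changelog; the `message` field, the `LogSink` class, and its `collect`/`add` methods are hypothetical and only illustrate the idea.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class BaseLog:
    message: str


class ErrorLog(BaseLog):
    """A problem that should mark the comparison as failed."""


class InfoLog(BaseLog):
    """A note that is worth showing but does not indicate failure."""


class LogSink:
    def __init__(self) -> None:
        self._stack: List[List[BaseLog]] = []

    @contextmanager
    def collect(self) -> Iterator[List[BaseLog]]:
        """Capture every log added while the context is active."""
        bucket: List[BaseLog] = []
        self._stack.append(bucket)
        try:
            yield bucket
        finally:
            self._stack.pop()

    def add(self, log: BaseLog) -> None:
        if self._stack:
            self._stack[-1].append(log)
        else:
            print(f"[{type(log).__name__}] {log.message}")


def split_logs(logs: Sequence[BaseLog]) -> Tuple[List[ErrorLog], List[InfoLog]]:
    errors = [log for log in logs if isinstance(log, ErrorLog)]
    infos = [log for log in logs if isinstance(log, InfoLog)]
    return errors, infos


sink = LogSink()
with sink.collect() as logs:
    sink.add(ErrorLog("replicated ranks differ"))
    sink.add(InfoLog("aux tensor missing, falling back"))
errors, infos = split_logs(logs)
print(len(errors), len(infos))
```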

Comment on lines +157 to +175
def _maybe_load_tokenizer(args: argparse.Namespace) -> Any:
    tokenizer_path: Optional[str] = getattr(args, "tokenizer", None)

    if tokenizer_path is None:
        for directory in [Path(args.baseline_path), Path(args.target_path)]:
            tokenizer_path = read_tokenizer_path(directory)
            if tokenizer_path is not None:
                break

    if tokenizer_path is None:
        return None

    try:
        from transformers import AutoTokenizer

        return AutoTokenizer.from_pretrained(tokenizer_path)
    except Exception:
        return None


medium

The function _maybe_load_tokenizer is defined twice in this file. This second definition appears to be a copy of the one defined on lines 137-155. This duplication should be removed.
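Python accepts a repeated definition without complaint: the later `def` simply rebinds the name, so the earlier body becomes dead code. The toy example below uses a hypothetical function name to show the effect; linters such as pyflakes report this as F811.

```python
def load(path: str) -> str:
    """First definition, unreachable once the name is rebound."""
    return f"first:{path}"


def load(path: str) -> str:  # noqa: F811 (intentional redefinition for the demo)
    """Second definition, which silently replaces the first."""
    return f"second:{path}"


result = load("x")
print(result)  # second:x
```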

# Conflicts:
#	python/sglang/srt/debug_utils/comparator/dims.py
#	test/registered/debug_utils/comparator/test_dims.py
@fzyzcjy fzyzcjy merged commit 6980416 into sgl-project:main Mar 2, 2026
26 of 34 checks passed
Kangyan-Zhou pushed a commit to Kangyan-Zhou/sglang that referenced this pull request Mar 4, 2026
magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
