Merged
485 commits
ebb3d92
Revert "_kv_pairs_to_dict: support quoted values for filter expressio…
fzyzcjy Feb 27, 2026
9bab665
Merge branch 'ac8420/19' into ac8420/20
fzyzcjy Feb 27, 2026
e38aed2
Revert "_kv_pairs_to_dict: support quoted values for filter expressio…
fzyzcjy Feb 27, 2026
8e1f604
Merge branch 'ac8420/20' into ac8420/21
fzyzcjy Feb 27, 2026
efd701a
fix: put extra_imports before user preamble in source patcher
fzyzcjy Feb 27, 2026
d1b828e
comparator: add DP support to aux_loader and token aligner types
fzyzcjy Feb 27, 2026
fe43d07
comparator: add DP grouping to bundle_comparator
fzyzcjy Feb 27, 2026
562994b
comparator: add tests for DP support
fzyzcjy Feb 27, 2026
86317b8
fix formatting: isort, black, ruff auto-fixes for DP support
fzyzcjy Feb 27, 2026
02d55d9
fix isort: merge duplicate imports in test_dp_support
fzyzcjy Feb 27, 2026
5fe0ae4
comparator: extract DP logic into separate files
fzyzcjy Feb 27, 2026
7cb0c8f
Revert "comparator: extract DP logic into separate files"
fzyzcjy Feb 27, 2026
c190525
Revert "fix isort: merge duplicate imports in test_dp_support"
fzyzcjy Feb 27, 2026
2342364
Revert "fix formatting: isort, black, ruff auto-fixes for DP support"
fzyzcjy Feb 27, 2026
77addbc
Revert "comparator: add tests for DP support"
fzyzcjy Feb 27, 2026
b653142
Revert "comparator: add DP grouping to bundle_comparator"
fzyzcjy Feb 27, 2026
946cdba
Revert "comparator: add DP support to aux_loader and token aligner ty…
fzyzcjy Feb 27, 2026
e66c328
add dp_utils.py: filter_to_non_empty_dp_rank for DP support
fzyzcjy Feb 27, 2026
3efc11a
bundle_comparator: apply DP filter after loading values
fzyzcjy Feb 27, 2026
b5812c5
aux_loader: apply DP filter after loading aux tensors
fzyzcjy Feb 27, 2026
0c62a31
add test_dp_filter.py: unit + integration tests for DP filtering
fzyzcjy Feb 27, 2026
4aac27e
fix dp_utils: skip filtering for non-tensor items
fzyzcjy Feb 27, 2026
13c2bb2
fix formatting: isort + black auto-fixes
fzyzcjy Feb 27, 2026
e8834c9
reorganize DP tests: one test file per source file
fzyzcjy Feb 27, 2026
d33469f
fix: replicated_mismatch warnings should not fail category
fzyzcjy Feb 27, 2026
f4c8fd3
fix: replicated_mismatch warnings should not fail category
fzyzcjy Feb 27, 2026
873c81a
update test: replicated_mismatch warnings are informational
fzyzcjy Feb 27, 2026
ac51a9b
fix DP E2E tests: use grouping=logical for multi-rank scenarios
fzyzcjy Feb 27, 2026
db4a8c4
update unit tests for replicated_mismatch category change
fzyzcjy Feb 27, 2026
f3751d1
fix: revert accidental grouping=raw -> logical in existing tests
fzyzcjy Feb 27, 2026
460e42f
fix formatting: black auto-fix in test_entrypoint.py
fzyzcjy Feb 27, 2026
b922d59
Revert "fix: replicated_mismatch warnings should not fail category"
fzyzcjy Feb 27, 2026
f645bc6
fix stale replicated-axis tests after category logic change
fzyzcjy Feb 27, 2026
47f4b3b
Merge branch 'ac8420/21' into ac8420/22
fzyzcjy Feb 27, 2026
1733577
dp_utils: only handle dp_rank/dp_size, drop attn_dp support
fzyzcjy Feb 27, 2026
b885d67
revert: replicated_mismatch warnings must fail the record
fzyzcjy Feb 27, 2026
a1ae47f
add E2E test: DP=2 both ranks non-empty raises AssertionError
fzyzcjy Feb 27, 2026
11fc3c6
add ReplicatedCheckResult as first-class field in ComparisonRecord
fzyzcjy Feb 27, 2026
d53be34
fix AnyWarning: use plain type alias instead of Union+Discriminator w…
fzyzcjy Feb 27, 2026
1f29e83
fix aligner entrypoint tests for new tuple return types
fzyzcjy Feb 27, 2026
1d1e01c
fix reorderer tests for UnsharderResult return type
fzyzcjy Feb 27, 2026
0d1757e
fix reorderer planner test for UnsharderResult return type
fzyzcjy Feb 27, 2026
35d0202
fix aux_loader to unpack execute_sub_plans tuple return
fzyzcjy Feb 27, 2026
16f6560
fix formatting: isort, ruff, black auto-fixes
fzyzcjy Feb 27, 2026
548379c
add comparator/patch_config.py: dims override rules for post-hoc patc…
fzyzcjy Feb 27, 2026
84af4d9
wire dims override into comparator entrypoint
fzyzcjy Feb 27, 2026
813d3cc
inject dims override into bundle_comparator pipeline
fzyzcjy Feb 27, 2026
d2cac0f
add test_patch_config.py: unit + integration tests for dims override
fzyzcjy Feb 27, 2026
9d59d06
fix formatting: isort, black, ruff auto-fixes
fzyzcjy Feb 27, 2026
d509ef2
replace getattr with direct attr access for dims override args
fzyzcjy Feb 27, 2026
59df3bb
add missing dims override tests: multi-tensor, multi-CLI, per-side pa…
fzyzcjy Feb 27, 2026
f0b54da
rename patch_config→override_config, DimsOverrideRule→MetaOverrideRul…
fzyzcjy Feb 27, 2026
aaaf7e0
fix black formatting in test_override_config.py
fzyzcjy Feb 27, 2026
73c7d47
fix test_non_tensor_unaffected_by_override: include tensor alongside …
fzyzcjy Feb 27, 2026
9d70bcb
rename override_config→meta_overrider, DimsOverrider→MetaOverrider, O…
fzyzcjy Feb 27, 2026
656e6b3
refactor MetaOverrideRule: replace dims/baseline_dims/target_dims wit…
fzyzcjy Feb 28, 2026
ea28c33
fix apply_to_metas: first-match-wins per side, not globally
fzyzcjy Feb 28, 2026
2f9805c
add apply_to_value_with_meta_pair to MetaOverrider, simplify bundle_c…
fzyzcjy Feb 28, 2026
eade867
reduce TestEntrypointMetaOverride duplication: parametrize per-side t…
fzyzcjy Feb 28, 2026
16aedc6
rename YAML top-level key from dims to overrides, update docstrings f…
fzyzcjy Feb 28, 2026
5fde75a
remove manual re.compile cache — re module caches internally
fzyzcjy Feb 28, 2026
d4dc5ee
deduplicate from_args_and_config CLI rule construction
fzyzcjy Feb 28, 2026
f014faa
simplify MetaOverrider to single-value API: apply_to_meta / apply_to_…
fzyzcjy Feb 28, 2026
fb97c00
remove apply_to_value, inline _parse_cli_override_args, parametrize s…
fzyzcjy Feb 28, 2026
a949183
fix isort and black formatting
fzyzcjy Feb 28, 2026
e88a354
token_aligner: dual-mode --token-aligner {smart,concat}
fzyzcjy Feb 28, 2026
aaf6d9a
black formatting fixes
fzyzcjy Feb 28, 2026
5ce43f6
fix mock.patch target for moved aux_loader module
fzyzcjy Feb 28, 2026
afc67bf
fix execute_concat: resolve token dim from named dims, add tests
fzyzcjy Feb 28, 2026
b2dded0
make TokenAlignerMode internal to token_aligner package
fzyzcjy Feb 28, 2026
b44ef8f
black formatting fix for concat.py
fzyzcjy Feb 28, 2026
ee367e8
fix alignment tests: use token_aligner="smart" for aux-dependent tests
fzyzcjy Feb 28, 2026
bd78f7b
fix THD zigzag tests: use token_aligner="smart" for aux-dependent tests
fzyzcjy Feb 28, 2026
95329f3
more
fzyzcjy Feb 28, 2026
c3af3ec
more
fzyzcjy Feb 28, 2026
457f56e
add comprehensive E2E tests for concat token-aligner mode
fzyzcjy Feb 28, 2026
f9a2876
more
fzyzcjy Feb 28, 2026
c9b832f
refactor execute_token_aligner_concat to use Pair.map
fzyzcjy Feb 28, 2026
179759b
refactor TestEntrypointConcatMode: extract _make_dirs, _create_both_s…
fzyzcjy Feb 28, 2026
cec56ea
fix e2e test: use valid Python syntax for DUMPER_FILTER expression
fzyzcjy Feb 28, 2026
6d852fe
dp-attn v2: add DP enum, ParallelState, colon syntax to dims.py
fzyzcjy Feb 28, 2026
b9a47af
dp-attn v2: concated axes excluded from unsharder plan
fzyzcjy Feb 28, 2026
c37f51d
dp-attn v2: unit tests for DP enum, colon syntax, concated axes
fzyzcjy Feb 28, 2026
115473d
dp-attn v2: E2E tests for dp concated + tp sharded scenario
fzyzcjy Feb 28, 2026
e4ca1a4
dp-attn v2: black formatting fixes
fzyzcjy Feb 28, 2026
3d4be19
dp-attn v2: neutralize dp_size after dp_utils filtering
fzyzcjy Feb 28, 2026
f3f89b8
dp-attn v2: DP is non-shardable, not neutralized in dp_utils
fzyzcjy Feb 28, 2026
4b97052
more
fzyzcjy Feb 28, 2026
b33549c
dp-attn v2: rename all_axes to shardable_axes
fzyzcjy Feb 28, 2026
37c6352
more
fzyzcjy Feb 28, 2026
6f40c6a
dp-attn v2: always return parallel_state in colon modifier parsing
fzyzcjy Feb 28, 2026
fd17dee
revert dp attn
fzyzcjy Feb 28, 2026
22192a9
emit warning when tensor file load fails in _load_all_values
fzyzcjy Feb 28, 2026
ef822d5
add tests for _load_all_values load failure warnings
fzyzcjy Feb 28, 2026
d65c8f5
comparator: exit non-zero on failed comparisons
fzyzcjy Feb 28, 2026
0409d7e
comparator: add TestExitCode tests, adapt existing tests for new retu…
fzyzcjy Feb 28, 2026
d43caad
more
fzyzcjy Feb 28, 2026
8752c48
comparator: add subprocess e2e tests for exit code behavior
fzyzcjy Feb 28, 2026
545d567
refactor: replace DimSpec single-axis fields with ParallelModifier tuple
fzyzcjy Feb 28, 2026
c9a81b1
refactor: update unsharder planner for multi-modifier DimSpec
fzyzcjy Feb 28, 2026
7bb059a
refactor: update reorderer planner for multi-modifier DimSpec
fzyzcjy Feb 28, 2026
f97b263
migrate all code and tests to colon syntax for dim spec modifiers
fzyzcjy Feb 28, 2026
85ad1c9
style: auto-fix formatting from ruff and black
fzyzcjy Feb 28, 2026
09b7499
cleanup: remove unused _ORDERING_LOOKUP and _REDUCTION_LOOKUP
fzyzcjy Feb 28, 2026
164c857
refactor: use pydantic _FrozenBase for DimSpec types, simplify parser
fzyzcjy Feb 28, 2026
0ec2847
style: auto-fix black formatting in unsharder planner
fzyzcjy Feb 28, 2026
b6f6cdc
refactor: use .get() lookup in _parse_modifier_token
fzyzcjy Feb 28, 2026
315ffb7
test: add multi-axis same-dim tests for t(cp:zigzag,sp)
fzyzcjy Feb 28, 2026
c75e046
test: add entrypoint E2E test for t(cp:zigzag,sp) same-dim multi-axis
fzyzcjy Feb 28, 2026
ca124fe
fix: add cu_seqlens_q + per-seq zigzag to cp_zigzag_sp test helper
fzyzcjy Feb 28, 2026
3e7374d
fix: remove stale token_dim kwarg from test call
fzyzcjy Feb 28, 2026
70fd0c0
fix: use smart token aligner for cp_zigzag_sp entrypoint test
fzyzcjy Feb 28, 2026
ec9181c
fix: use s(cp:zigzag,sp) seq dim instead of t dim for entrypoint test
fzyzcjy Feb 28, 2026
bff7fa5
add moe_dp and attn_cp to SGLang parallel info collection
fzyzcjy Feb 28, 2026
f01d963
feat: add dp group alias support via `// dp:=<group>` in dims
fzyzcjy Feb 28, 2026
7db8c77
test: add tests for dp group alias feature
fzyzcjy Feb 28, 2026
5d569aa
test: verify moe_dp and attn_cp fields in E2E parallel_info dump
fzyzcjy Feb 28, 2026
b68dcca
test: verify all sglang_parallel_info fields in E2E dump
fzyzcjy Feb 28, 2026
617cd11
fix: correct E2E tests for dp group alias
fzyzcjy Feb 28, 2026
079cf62
fix: use realistic single-rank and moe_dp scenarios in E2E tests
fzyzcjy Feb 28, 2026
80804f4
refactor: use Pair.map for dp filter step in bundle_comparator
fzyzcjy Feb 28, 2026
4d4687a
refactor: consolidate parse_dims + extract_dp_group_alias into DimsSpec
fzyzcjy Feb 28, 2026
bfca658
more
fzyzcjy Feb 28, 2026
6830bfd
refactor: use # instead of // as dims declaration separator
fzyzcjy Feb 28, 2026
86842a9
replace --forbid-skip with --allow-skip-pattern for fine-grained skip…
fzyzcjy Feb 28, 2026
9d67f89
style: black formatting for test_entrypoint.py
fzyzcjy Feb 28, 2026
00a6245
feat(comparator): add ReportSink for unified JSONL report output
fzyzcjy Feb 28, 2026
5fe2d8f
style: black formatting for entrypoint.py
fzyzcjy Feb 28, 2026
56e400b
add ReportPathRecord, remove --no-report in favor of --report-path ''
fzyzcjy Feb 28, 2026
6a6c808
more
fzyzcjy Feb 28, 2026
8a7ef20
simplify e2e test: use --allow-skip-pattern, remove JSONL parsing wor…
fzyzcjy Feb 28, 2026
eb7d99c
more
fzyzcjy Feb 28, 2026
93aceb2
fix: restore no_report check in _resolve_report_path and _make_args d…
fzyzcjy Feb 28, 2026
a4ef8e0
fix: restore --no-report CLI argument removed by linter
fzyzcjy Feb 28, 2026
69208d3
refactor: replace --no-report with --report-path '' to disable report
fzyzcjy Feb 28, 2026
42976e6
fix: remove linter-injected ReportPathRecord, use stderr for report path
fzyzcjy Feb 28, 2026
13fe398
style: remove extra blank line
fzyzcjy Feb 28, 2026
abbffc1
Merge branch 'ac8420/26' into ac8420/27
fzyzcjy Feb 28, 2026
4b3fe94
Merge branch 'ac8420/27' into ac8420/28
fzyzcjy Feb 28, 2026
2bb0162
temp
fzyzcjy Feb 28, 2026
b98439a
fix: add explicit parentheses for set operator precedence in load_thd…
fzyzcjy Feb 28, 2026
33d0e54
test: add unit tests for load_thd_seq_lens_only
fzyzcjy Feb 28, 2026
80e9070
test: add concat + THD CP zigzag integration test
fzyzcjy Feb 28, 2026
cccd2eb
refactor: rename "concat" to "concat_steps", move load_thd_seq_lens_o…
fzyzcjy Feb 28, 2026
91267fa
fix: break circular import by importing thd_seq_lens_loader directly
fzyzcjy Feb 28, 2026
b4b55ce
fix: correct double-suffixed function name in test_concat_steps.py
fzyzcjy Feb 28, 2026
1c19b4f
fix: update mock target path after moving load_thd_seq_lens_only
fzyzcjy Feb 28, 2026
6fd811f
fix formatting (black)
fzyzcjy Feb 28, 2026
13166ea
merge ac8420/12 (formatting fix)
fzyzcjy Feb 28, 2026
257f013
merge ac8420/13
fzyzcjy Feb 28, 2026
e223327
merge ac8420/14
fzyzcjy Feb 28, 2026
125197c
merge ac8420/15a
fzyzcjy Feb 28, 2026
2020aa2
merge ac8420/15b
fzyzcjy Feb 28, 2026
0af87c3
merge ac8420/16
fzyzcjy Feb 28, 2026
fc984c4
merge ac8420/17
fzyzcjy Feb 28, 2026
f88eefa
more
fzyzcjy Feb 28, 2026
a45cfa7
more
fzyzcjy Feb 28, 2026
df4f747
refactor: move TestLoadThdSeqLensOnly to test_thd_seq_lens_loader.py
fzyzcjy Feb 28, 2026
d1c8c30
Merge remote-tracking branch 'upstream/main' into ac8420/18
fzyzcjy Feb 28, 2026
3f7d331
Merge branch 'ac8420/25' into ac8420/26
fzyzcjy Feb 28, 2026
03f8a4c
Merge branch 'ac8420/26' into ac8420/27
fzyzcjy Feb 28, 2026
86c2d5d
Merge branch 'ac8420/27' into ac8420/28
fzyzcjy Feb 28, 2026
91c1644
add preset.py for comparator CLI preset system
fzyzcjy Feb 28, 2026
11f4501
token_aligner: remove grouping dependency, use token_aligner=None
fzyzcjy Feb 28, 2026
f3877bf
output_types: add step field to SkipRecord, ComparisonRecord, NonTens…
fzyzcjy Feb 28, 2026
d11f2ad
entrypoint: replace --grouping with --preset and --grouping-skip-keys
fzyzcjy Feb 28, 2026
1ea4228
add CI registry to source_patcher test files
fzyzcjy Feb 28, 2026
74aac85
tests: adapt test_entrypoint for preset/grouping-skip-keys refactor
fzyzcjy Feb 28, 2026
4ea50b2
style: black formatting fixes
fzyzcjy Feb 28, 2026
8c9969a
style: black formatting for unrelated files
fzyzcjy Feb 28, 2026
6bf63c5
fix: add explicit concat args to test_concat_multi_step_cp_unshard
fzyzcjy Feb 28, 2026
adc4dea
fix: add step to skip_keys for test_concat_thd_cp_zigzag
fzyzcjy Feb 28, 2026
7926f8d
fix: unpack _run_and_parse tuple in test_concat_thd_cp_zigzag
fzyzcjy Feb 28, 2026
aec82f0
Merge remote-tracking branch 'upstream/main' into ac8420/18
fzyzcjy Feb 28, 2026
85f579d
fix: skip dim naming when tensor ndim mismatches metadata dims count
fzyzcjy Feb 28, 2026
3bc7d52
Revert "fix: skip dim naming when tensor ndim mismatches metadata dim…
fzyzcjy Feb 28, 2026
2d9d8cf
refactor: move TestExpandPreset to test_preset.py
fzyzcjy Feb 28, 2026
76c9df4
simplify _compute_skip_keys to single expression
fzyzcjy Feb 28, 2026
5eee8d9
refactor: split _parse_args into public parse_args(argv) + private _p…
fzyzcjy Feb 28, 2026
9476263
refactor: introduce RecordLocation, _BundleComparisonRecord base class
fzyzcjy Feb 28, 2026
554f317
refactor: switch test_entrypoint from Namespace to argv-driven via pa…
fzyzcjy Feb 28, 2026
fc9a6ab
style: auto-format from pre-commit (isort, black)
fzyzcjy Feb 28, 2026
4907b6a
more
fzyzcjy Feb 28, 2026
b65159f
more
fzyzcjy Feb 28, 2026
bd20054
remove recompute_status from preset skip keys
fzyzcjy Feb 28, 2026
14f8efa
fix: recompute tests need explicit grouping_skip_keys with recompute_…
fzyzcjy Feb 28, 2026
dfea8f6
refactor: extract _expand_flag helper from expand_preset
fzyzcjy Feb 28, 2026
16e4657
more
fzyzcjy Feb 28, 2026
6add391
more
fzyzcjy Feb 28, 2026
daae289
fix: update error message match in test_preset for _expand_flag
fzyzcjy Feb 28, 2026
76afc1c
fix: _expand_flag return type should be list[str] | None
fzyzcjy Feb 28, 2026
356e034
fix: skip dim naming when tensor ndim mismatches metadata dims count
fzyzcjy Feb 28, 2026
047d9d1
fix: squeeze singleton dims when tensor ndim exceeds dims metadata count
fzyzcjy Feb 28, 2026
c508abe
fix: handle shape mismatch in replicated group verification
fzyzcjy Feb 28, 2026
f4529d6
feat: add --exclude flag to comparator for filename exclusion
fzyzcjy Feb 28, 2026
b077ab2
fix: apply --exclude filter to both baseline and target dataframes
fzyzcjy Feb 28, 2026
7d937f9
fix: treat dim_name_squeeze as informational warning, squeeze singlet…
fzyzcjy Feb 28, 2026
d407dec
fix: treat axis_aligner_dim_mismatch as informational warning
fzyzcjy Feb 28, 2026
b1a849c
revert: remove axis_aligner_dim_mismatch workaround and bidirectional…
fzyzcjy Feb 28, 2026
82a1167
remove _shape_mismatch_diff, use diff=None for shape-mismatch replica…
fzyzcjy Feb 28, 2026
5941aa1
replace --exclude with --allow-fail-pattern in comparator
fzyzcjy Feb 28, 2026
04e68c9
fix tests: pass allow_fail_pattern/failed_names to _compute_exit_code
fzyzcjy Feb 28, 2026
7f5c986
fix: clear error message when dims metadata ndim mismatches tensor
fzyzcjy Feb 28, 2026
ea97fdb
rename allow-skip/fail-pattern to allow-skipped/failed-pattern
fzyzcjy Feb 28, 2026
c726c30
more
fzyzcjy Feb 28, 2026
b6b7acc
refactor: migrate all producers and tests from GeneralWarning/Warning…
fzyzcjy Feb 28, 2026
4d4a849
fmt
fzyzcjy Feb 28, 2026
1ed815a
refactor: extract _is_all_match_pattern from _compute_exit_code
fzyzcjy Feb 28, 2026
1f9c8a7
refactor: extract _check_replicated_pair, deduplicate ReplicatedCheck…
fzyzcjy Feb 28, 2026
8bb60b6
fix: change replicated check detail from "shape mismatch (no diff)" t…
fzyzcjy Feb 28, 2026
ca59ef4
test: add tests for ac8420/30 new features
fzyzcjy Feb 28, 2026
3efa45a
refactor: move compute_exit_code to utils.py, add passed==0 guard
fzyzcjy Feb 28, 2026
9c6849c
style: fix black formatting in comparator tests
fzyzcjy Mar 1, 2026
de727e7
Merge branch 'ac8420/20' into ac8420/21
fzyzcjy Mar 1, 2026
4a6f451
Merge branch 'ac8420/21' into ac8420/22
fzyzcjy Mar 1, 2026
355725e
style: fix black formatting in test_entrypoint.py
fzyzcjy Mar 1, 2026
4ddb61d
Merge branch 'ac8420/22' into ac8420/23
fzyzcjy Mar 1, 2026
1cb579c
Merge branch 'ac8420/23' into ac8420/24
fzyzcjy Mar 1, 2026
de3c50e
Merge branch 'ac8420/24' into ac8420/25
fzyzcjy Mar 1, 2026
186762c
style: fix black formatting in token aligner concat steps
fzyzcjy Mar 1, 2026
594118c
Merge branch 'ac8420/25' into ac8420/26
fzyzcjy Mar 1, 2026
4cac82b
Merge branch 'ac8420/26' into ac8420/27
fzyzcjy Mar 1, 2026
7c99955
style: fix black formatting in test_planner.py
fzyzcjy Mar 1, 2026
7c7ef7b
Merge branch 'ac8420/27' into ac8420/28
fzyzcjy Mar 1, 2026
567e3ea
style: fix black formatting in test_dims.py
fzyzcjy Mar 1, 2026
c4fa9af
fix: unpack _run_and_parse tuple in test_concat_thd_cp_zigzag
fzyzcjy Mar 1, 2026
58a9b48
fix: unpack _run_and_parse tuple in test_concat_thd_cp_zigzag
fzyzcjy Mar 1, 2026
ced943a
Merge branch 'ac8420/26' into ac8420/27
fzyzcjy Mar 1, 2026
fc16725
Merge branch 'ac8420/27' into ac8420/28
fzyzcjy Mar 1, 2026
9586bc8
Merge remote-tracking branch 'upstream/main' into ac8420/28
fzyzcjy Mar 1, 2026
9ec901c
fix: CI lint and test collection errors after upstream merge
fzyzcjy Mar 1, 2026
568b84d
Merge branch 'ac8420/18' into ac8420/19
fzyzcjy Mar 1, 2026
ce9b1b1
Merge remote-tracking branch 'upstream/main' into ac8420/19
fzyzcjy Mar 1, 2026
8cc0e4b
Merge branch 'ac8420/19' into ac8420/20
fzyzcjy Mar 1, 2026
7828eaa
Merge branch 'ac8420/20' into ac8420/21
fzyzcjy Mar 1, 2026
66e4ce7
Merge branch 'ac8420/21' into ac8420/22
fzyzcjy Mar 1, 2026
f098936
Merge branch 'ac8420/22' into ac8420/23
fzyzcjy Mar 1, 2026
930f44d
Merge branch 'ac8420/23' into ac8420/24
fzyzcjy Mar 1, 2026
8e6f790
Merge branch 'ac8420/24' into ac8420/25
fzyzcjy Mar 1, 2026
cb74bae
Merge branch 'ac8420/25' into ac8420/26
fzyzcjy Mar 1, 2026
b783fa0
Merge branch 'ac8420/26' into ac8420/27
fzyzcjy Mar 1, 2026
16c945e
Merge branch 'ac8420/27' into ac8420/28
fzyzcjy Mar 1, 2026
619050b
Merge remote-tracking branch 'upstream/main' into ac8420/28
fzyzcjy Mar 1, 2026
10b8b70
fmt
fzyzcjy Mar 1, 2026
b03ec80
Merge branch 'ac8420/28' into ac8420/29
fzyzcjy Mar 1, 2026
67f2ea7
Merge branch 'ac8420/29' into ac8420/30
fzyzcjy Mar 1, 2026
a01d2d9
style: reformat preset.py (black)
fzyzcjy Mar 2, 2026
9c4bba6
Merge branch 'ac8420/29' into ac8420/30
fzyzcjy Mar 2, 2026
1c86491
style: reformat with black
fzyzcjy Mar 2, 2026
80f12f7
Merge remote-tracking branch 'upstream/main' into ac8420/30
fzyzcjy Mar 2, 2026
Original file line number Diff line number Diff line change
@@ -9,8 +9,8 @@
     _SingletonDimUtil,
     parse_dims,
 )
+from sglang.srt.debug_utils.comparator.log_sink import log_sink
 from sglang.srt.debug_utils.comparator.utils import Pair, _FrozenBase
-from sglang.srt.debug_utils.comparator.warning_sink import warning_sink

 # --- types ---

@@ -70,10 +70,10 @@ def _resolve_target_order(
     if set(x_names) != set(y_names):
         # Local import to avoid circular dependency:
         # output_types -> aligner/entrypoint/types -> axis_aligner -> output_types
-        from sglang.srt.debug_utils.comparator.output_types import GeneralWarning
+        from sglang.srt.debug_utils.comparator.output_types import ErrorLog

-        warning_sink.add(
-            GeneralWarning(
+        log_sink.add(
+            ErrorLog(
                 category="axis_aligner_dim_mismatch",
                 message=(
                     f"AxisAligner: dim name sets differ (x={x_names}, y={y_names}), "
Original file line number Diff line number Diff line change
@@ -25,9 +25,9 @@
     TokenAlignerPlan,
     TokenAlignerSeqsInfo,
 )
-from sglang.srt.debug_utils.comparator.output_types import GeneralWarning
+from sglang.srt.debug_utils.comparator.log_sink import log_sink
+from sglang.srt.debug_utils.comparator.output_types import InfoLog
 from sglang.srt.debug_utils.comparator.utils import Pair
-from sglang.srt.debug_utils.comparator.warning_sink import warning_sink

 _NONE_THD: Pair[Optional[dict[int, list[int]]]] = Pair(x=None, y=None)

@@ -66,8 +66,8 @@ def compute_maybe_token_aligner_result(
         )
     elif token_aligner_mode == "smart":
         if not (has_aux_tensors(dfs.x) and has_aux_tensors(dfs.y)):
-            warning_sink.add(
-                GeneralWarning(
+            log_sink.add(
+                InfoLog(
                     category="aux_tensors_missing",
                     message="Aux tensors missing, skipping token alignment",
                 )
@@ -102,8 +102,8 @@ def _build_smart_result(
     )

     if baseline_aux is None or target_aux is None:
-        warning_sink.add(
-            GeneralWarning(
+        log_sink.add(
+            InfoLog(
                 category="framework_detection_failed",
                 message="Framework detection failed, skipping token alignment",
             )
Original file line number Diff line number Diff line change
@@ -31,8 +31,8 @@
     resolve_dim_names,
 )
 from sglang.srt.debug_utils.comparator.dp_utils import filter_to_non_empty_dp_rank
-from sglang.srt.debug_utils.comparator.output_types import GeneralWarning
-from sglang.srt.debug_utils.comparator.warning_sink import warning_sink
+from sglang.srt.debug_utils.comparator.log_sink import log_sink
+from sglang.srt.debug_utils.comparator.output_types import ErrorLog, InfoLog
 from sglang.srt.debug_utils.dump_loader import ValueWithMeta, filter_rows

 # re-export for existing callers
@@ -181,8 +181,8 @@ def _load_non_tensor_aux(
     first_value = loaded[0].value
     for i, item in enumerate(loaded[1:], start=1):
         if item.value != first_value:
-            warning_sink.add(
-                GeneralWarning(
+            log_sink.add(
+                ErrorLog(
                     category=f"{name}_mismatch",
                     message=(
                         f"{name} mismatch across ranks: rank 0 has {first_value}, "
@@ -244,8 +244,8 @@ def _load_and_align_aux_tensor(
         assert result is not None
         return result.rename(None)  # strip named dims before returning to plugin

-    warning_sink.add(
-        GeneralWarning(
+    log_sink.add(
+        InfoLog(
             category="aux_no_dims",
             message=(
                 f"aux tensor '{name}' has {len(tensors)} ranks "
Original file line number Diff line number Diff line change
@@ -12,8 +12,8 @@
     TokenAlignerStepAux,
 )
 from sglang.srt.debug_utils.comparator.dims import TokenLayout
-from sglang.srt.debug_utils.comparator.output_types import GeneralWarning
-from sglang.srt.debug_utils.comparator.warning_sink import warning_sink
+from sglang.srt.debug_utils.comparator.log_sink import log_sink
+from sglang.srt.debug_utils.comparator.output_types import InfoLog

 # ── plugin ABC ─────────────────────────────────────────────────────

@@ -227,8 +227,8 @@ def detect_layout(self, raw: dict[int, dict[str, object]]) -> TokenLayout:
         if isinstance(input_ids, torch.Tensor) and input_ids.ndim == 2:
             return TokenLayout.BS

-        warning_sink.add(
-            GeneralWarning(
+        log_sink.add(
+            InfoLog(
                 category="layout_detection_fallback",
                 message=(
                     "Megatron layout detection: no qkv_format or 2D input_ids found, "
Original file line number Diff line number Diff line change
@@ -96,29 +96,49 @@ def _verify_replicated_group(
     group_index: int,
 ) -> list[ReplicatedCheckResult]:
     baseline: torch.Tensor = ordered_tensors[0].rename(None).float()
-    checks: list[ReplicatedCheckResult] = []

-    for i in range(1, len(ordered_tensors)):
-        other: torch.Tensor = ordered_tensors[i].rename(None).float()
+    return [
+        _check_replicated_pair(
+            baseline=baseline,
+            other=ordered_tensors[i],
+            axis=axis,
+            group_index=group_index,
+            compared_index=i,
+        )
+        for i in range(1, len(ordered_tensors))
+    ]
+
+
+def _check_replicated_pair(
+    *,
+    baseline: torch.Tensor,
+    other: torch.Tensor,
+    axis: ParallelAxis,
+    group_index: int,
+    compared_index: int,
+) -> ReplicatedCheckResult:
+    other_float: torch.Tensor = other.rename(None).float()
+
+    if baseline.shape != other_float.shape:
+        passed = False
+        diff_info = None
+    else:
         diff_info = compute_diff(
             x_baseline=baseline,
-            x_target=other,
+            x_target=other_float,
             diff_threshold=_REPLICATED_ATOL,
         )
-        passed: bool = diff_info.max_abs_diff <= _REPLICATED_ATOL
-        checks.append(
-            ReplicatedCheckResult(
-                axis=axis.value,
-                group_index=group_index,
-                compared_index=i,
-                baseline_index=0,
-                passed=passed,
-                atol=_REPLICATED_ATOL,
-                diff=diff_info,
-            )
-        )
-
-    return checks
+        passed = diff_info.max_abs_diff <= _REPLICATED_ATOL
+
+    return ReplicatedCheckResult(
+        axis=axis.value,
+        group_index=group_index,
+        compared_index=compared_index,
+        baseline_index=0,
+        passed=passed,
+        atol=_REPLICATED_ATOL,
+        diff=diff_info,
+    )


 def _thd_concat(
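Review note: the refactor above compares every replica against replica 0 and records a shape mismatch as a failed check with no diff, rather than attempting a diff on incompatible tensors. The following is a minimal torch-free sketch of that control flow, not the PR's code: `CheckResult`, `check_pair`, `verify_group`, and the `ATOL` value are hypothetical stand-ins (the real `_REPLICATED_ATOL` value is not shown in the diff), and plain float lists replace tensors.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Sequence

ATOL = 1e-6  # assumed tolerance; the real _REPLICATED_ATOL is not visible in this diff


@dataclass(frozen=True)
class CheckResult:
    compared_index: int
    passed: bool
    max_abs_diff: Optional[float]  # None when the shapes differ and no diff exists


def check_pair(
    baseline: Sequence[float], other: Sequence[float], compared_index: int
) -> CheckResult:
    # A length mismatch plays the role of the tensor shape mismatch:
    # fail the check and record no diff.
    if len(baseline) != len(other):
        return CheckResult(compared_index, passed=False, max_abs_diff=None)

    max_abs_diff = max((abs(a - b) for a, b in zip(baseline, other)), default=0.0)
    return CheckResult(
        compared_index, passed=max_abs_diff <= ATOL, max_abs_diff=max_abs_diff
    )


def verify_group(replicas: Sequence[Sequence[float]]) -> list[CheckResult]:
    # Replica 0 is the baseline; every other replica is compared against it.
    baseline = replicas[0]
    return [check_pair(baseline, replicas[i], i) for i in range(1, len(replicas))]


results = verify_group([[1.0, 2.0], [1.0, 2.0], [1.0, 2.5], [1.0]])
```

In this run, replica 1 matches, replica 2 fails on its values, and replica 3 fails on shape with `max_abs_diff` left as `None`.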
18 changes: 10 additions & 8 deletions python/sglang/srt/debug_utils/comparator/bundle_comparator.py
Original file line number Diff line number Diff line change
@@ -26,18 +26,19 @@
     resolve_dim_names,
 )
 from sglang.srt.debug_utils.comparator.dp_utils import filter_to_non_empty_dp_rank
+from sglang.srt.debug_utils.comparator.log_sink import log_sink
 from sglang.srt.debug_utils.comparator.meta_overrider import MetaOverrider
 from sglang.srt.debug_utils.comparator.output_types import (
-    GeneralWarning,
+    ErrorLog,
     NonTensorComparisonRecord,
     SkipComparisonRecord,
     TensorComparisonRecord,
+    _split_logs,
 )
 from sglang.srt.debug_utils.comparator.tensor_comparator.comparator import (
     compare_tensor_pair,
 )
 from sglang.srt.debug_utils.comparator.utils import Pair
-from sglang.srt.debug_utils.comparator.warning_sink import warning_sink
 from sglang.srt.debug_utils.dump_loader import LOAD_FAILED, ValueWithMeta

 _FAILED_SIDE_MAP: dict[str, str] = {"x": "baseline", "y": "target"}
@@ -59,7 +60,7 @@ def compare_bundle_pair(
     compute_per_token: bool = False,
     meta_overrider: Optional[MetaOverrider] = None,
 ) -> Union[TensorComparisonRecord, SkipComparisonRecord, NonTensorComparisonRecord]:
-    with warning_sink.context() as collected_warnings:
+    with log_sink.context() as collected_logs:
         result = _compare_bundle_pair_inner(
             name=name,
             filenames_pair=filenames_pair,
@@ -74,7 +75,8 @@ def compare_bundle_pair(
             meta_overrider=meta_overrider,
         )

-    return result.model_copy(update={"warnings": collected_warnings})
+    errors, infos = _split_logs(collected_logs)
+    return result.model_copy(update={"errors": errors, "infos": infos})


 def _compare_bundle_pair_inner(
@@ -267,8 +269,8 @@ def _try_generate_viz(
             output_path=output_path,
         )
     except Exception as exc:
-        warning_sink.add(
-            GeneralWarning(
+        log_sink.add(
+            ErrorLog(
                 category="visualizer",
                 message=f"Visualization failed for {name}: {exc}",
             )
@@ -332,8 +334,8 @@ def _load_all_values(filenames: list[str], base_path: Path) -> list[ValueWithMet
     for f in filenames:
         item: ValueWithMeta = ValueWithMeta.load(base_path / f)
         if item.value is LOAD_FAILED:
-            warning_sink.add(
-                GeneralWarning(
+            log_sink.add(
+                ErrorLog(
                     category="load_failed",
                     message=f"Failed to load tensor file: {f}",
                 )
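Review note: `compare_bundle_pair` now collects everything emitted during the inner comparison through `log_sink.context()` and splits the result into `errors` and `infos` on the record. Neither `log_sink` nor `_split_logs` is shown in this diff, so the sketch below only illustrates the pattern under assumptions: `LogSink`, `split_logs`, and the dataclass versions of `ErrorLog` and `InfoLog` are hypothetical stand-ins, and dropping logs emitted outside a context is a choice made for this sketch only.

```python
from __future__ import annotations

from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass(frozen=True)
class ErrorLog:
    category: str
    message: str


@dataclass(frozen=True)
class InfoLog:
    category: str
    message: str


AnyLog = Union[ErrorLog, InfoLog]


class LogSink:
    """Hypothetical stand-in for log_sink: collects logs emitted inside a context."""

    def __init__(self) -> None:
        self._stack: list[list[AnyLog]] = []

    @contextmanager
    def context(self) -> Iterator[list[AnyLog]]:
        collected: list[AnyLog] = []
        self._stack.append(collected)
        try:
            yield collected
        finally:
            self._stack.pop()

    def add(self, log: AnyLog) -> None:
        # Assumption for this sketch: logs emitted outside any context are dropped.
        if self._stack:
            self._stack[-1].append(log)


def split_logs(logs: list[AnyLog]) -> tuple[list[ErrorLog], list[InfoLog]]:
    # Partition by severity so a record can expose errors and infos separately.
    errors = [log for log in logs if isinstance(log, ErrorLog)]
    infos = [log for log in logs if isinstance(log, InfoLog)]
    return errors, infos


sink = LogSink()
with sink.context() as collected:
    sink.add(ErrorLog(category="load_failed", message="Failed to load tensor file: a.pt"))
    sink.add(InfoLog(category="aux_tensors_missing", message="Aux tensors missing"))

errors, infos = split_logs(collected)
```

Separating the two severities lets informational events such as `aux_tensors_missing` travel with the record without being confused with real errors such as `load_failed`. How the PR maps each severity to the record's pass/fail category is not visible in these hunks.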
6 changes: 6 additions & 0 deletions python/sglang/srt/debug_utils/comparator/dims.py
Original file line number Diff line number Diff line change
@@ -233,6 +233,12 @@ def resolve_dim_by_name(tensor: torch.Tensor, name: str) -> int:


 def apply_dim_names(tensor: torch.Tensor, dim_names: list[str]) -> torch.Tensor:
+    if tensor.ndim != len(dim_names):
+        raise ValueError(
+            f"dims metadata mismatch: tensor has {tensor.ndim} dims (shape {list(tensor.shape)}) "
+            f"but dims string specifies {len(dim_names)} names {dim_names}. "
+            f"Please fix the dims string in the dumper.dump() call to match the actual tensor shape."
+        )
     return tensor.refine_names(*dim_names)


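Review note: the new guard in `apply_dim_names` fails early with a message naming both the tensor shape and the dims string, instead of letting `refine_names` raise a less specific error. A minimal torch-free sketch of the same check follows; `name_dims` is a hypothetical helper that takes a shape tuple rather than a tensor and returns a name-to-size mapping purely for illustration.

```python
from __future__ import annotations

from typing import Sequence


def name_dims(shape: Sequence[int], dim_names: Sequence[str]) -> dict[str, int]:
    """Pair each dim name with its size, rejecting a rank mismatch up front."""
    if len(shape) != len(dim_names):
        raise ValueError(
            f"dims metadata mismatch: tensor has {len(shape)} dims (shape {list(shape)}) "
            f"but dims string specifies {len(dim_names)} names {list(dim_names)}."
        )
    return dict(zip(dim_names, shape))


named = name_dims((4, 8), ["t", "h"])

try:
    name_dims((4, 8, 1), ["t", "h"])
except ValueError as exc:
    error_text = str(exc)
```

The second call fails because a rank-3 shape is described by only two names, and the message carries both sides of the mismatch.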
46 changes: 20 additions & 26 deletions python/sglang/srt/debug_utils/comparator/entrypoint.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
 from __future__ import annotations

 import argparse
-import re
 import sys
 from pathlib import Path
 from typing import Any, Iterator, Optional, Union
@@ -38,7 +37,7 @@
     generate_per_token_heatmap,
 )
 from sglang.srt.debug_utils.comparator.preset import PRESETS, expand_preset
-from sglang.srt.debug_utils.comparator.utils import Pair
+from sglang.srt.debug_utils.comparator.utils import Pair, compute_exit_code
 from sglang.srt.debug_utils.dump_loader import read_meta, read_tokenizer_path

 _DEFAULT_SKIP_KEYS: set[str] = {"dump_index", "filename"}
@@ -112,38 +111,23 @@ def run(args: argparse.Namespace) -> int:
             compute_per_token=visualize_per_token is not None,
             meta_overrider=meta_overrider,
         )
-        summary, skipped_names = _consume_comparison_records(
+        summary, skipped_names, failed_names = _consume_comparison_records(
             comparison_records=comparison_records,
             visualize_per_token=visualize_per_token,
         )
-        return _compute_exit_code(
+        return compute_exit_code(
             summary,
-            allow_skip_pattern=args.allow_skip_pattern,
+            allow_skipped_pattern=args.allow_skipped_pattern,
             skipped_names=skipped_names,
+            allow_failed_pattern=args.allow_failed_pattern,
+            failed_names=failed_names,
         )
     finally:
         report_sink.close()
         if report_path is not None:
             print(f"Report: {report_path}", file=sys.stderr)


-def _compute_exit_code(
-    summary: SummaryRecord,
-    *,
-    allow_skip_pattern: str,
-    skipped_names: list[str],
-) -> int:
-    if summary.failed > 0:
-        return 1
-
-    pattern: re.Pattern[str] = re.compile(allow_skip_pattern)
-    forbidden: list[str] = [n for n in skipped_names if not pattern.fullmatch(n)]
-    if forbidden:
-        return 1
-
-    return 0
-
-
 def _resolve_report_path(args: argparse.Namespace) -> Optional[Path]:
     if args.report_path is not None:
         return Path(args.report_path) if args.report_path else None
@@ -261,16 +245,19 @@ def _consume_comparison_records(
         Union[TensorComparisonRecord, SkipComparisonRecord, NonTensorComparisonRecord]
     ],
     visualize_per_token: Optional[Path] = None,
-) -> tuple[SummaryRecord, list[str]]:
+) -> tuple[SummaryRecord, list[str], list[str]]:
     counts: dict[str, int] = {"passed": 0, "failed": 0, "skipped": 0}
     collected_comparisons: list[TensorComparisonRecord] = []
     skipped_names: list[str] = []
+    failed_names: list[str] = []

     for record in comparison_records:
         counts[record.category] += 1
         report_sink.add(record)
         if isinstance(record, SkipComparisonRecord) and record.category == "skipped":
             skipped_names.append(record.name)
+        if record.category == "failed":
+            failed_names.append(record.name)
         if visualize_per_token is not None and isinstance(
             record, TensorComparisonRecord
         ):
@@ -285,7 +272,7 @@ def _consume_comparison_records(
             output_path=visualize_per_token,
         )

-    return summary, skipped_names
+    return summary, skipped_names, failed_names


 def parse_args(argv: list[str]) -> argparse.Namespace:
@@ -299,7 +286,7 @@ def parse_args(argv: list[str]) -> argparse.Namespace:
     parser.add_argument("--end-step", type=int, default=1000000)
     parser.add_argument("--diff-threshold", type=float, default=1e-3)
     parser.add_argument(
-        "--filter", type=str, default=None, help="Regex to filter filenames"
+        "--filter", type=str, default=None, help="Regex to filter filenames (include)"
     )
     parser.add_argument(
         "--output-format",
@@ -383,12 +370,19 @@ def parse_args(argv: list[str]) -> argparse.Namespace:
         help="Path to YAML override config file (dims overrides, etc.)",
     )
     parser.add_argument(
-        "--allow-skip-pattern",
+        "--allow-skipped-pattern",
         type=str,
         default=".*",
         help="Regex pattern for tensor names allowed to be skipped. "
         "Default '.*' allows all skips. Use '^$' to forbid all skips.",
     )
+    parser.add_argument(
+        "--allow-failed-pattern",
+        type=str,
+        default=None,
+        help="Regex pattern for tensor names allowed to fail without affecting exit code. "
+        "Default None (all failures affect exit code).",
+    )

     # Report output
     parser.add_argument(
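Review note: the body of the relocated `compute_exit_code` in `utils.py` is not part of the hunks shown here, so the sketch below only illustrates how the two flags could combine. It is an assumption, not the PR's implementation. It keeps the `fullmatch` semantics of the removed `_compute_exit_code` and treats `allow_failed_pattern=None` as "every failure counts", following the help text. The `passed == 0` guard is guessed from the commit title alone, and `exit_code_sketch` and `all_match` are hypothetical names.

```python
from __future__ import annotations

import re
from typing import Optional


def all_match(pattern: str, names: list[str]) -> bool:
    """True when every name fully matches the regex (vacuously true for no names)."""
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(name) is not None for name in names)


def exit_code_sketch(
    *,
    passed: int,
    failed_names: list[str],
    skipped_names: list[str],
    allow_skipped_pattern: str = ".*",
    allow_failed_pattern: Optional[str] = None,
) -> int:
    # Assumed guard: a run in which nothing passed is treated as a failure.
    if passed == 0:
        return 1

    # Failures are tolerated only when a pattern is given and covers all of them.
    if failed_names and (
        allow_failed_pattern is None
        or not all_match(allow_failed_pattern, failed_names)
    ):
        return 1

    # Skips are tolerated only when every skipped name matches the pattern.
    if not all_match(allow_skipped_pattern, skipped_names):
        return 1

    return 0


ok = exit_code_sketch(passed=3, failed_names=[], skipped_names=["aux"])
strict_skip = exit_code_sketch(
    passed=3, failed_names=[], skipped_names=["aux"], allow_skipped_pattern="^$"
)
tolerated = exit_code_sketch(
    passed=3, failed_names=["logits"], skipped_names=[], allow_failed_pattern="logits|probs"
)
```

With the defaults, any failure yields exit code 1 and any skip is accepted. Passing `^$` as the skipped pattern forbids all skips, which matches the `--allow-skipped-pattern` help text above.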