[draft][compile][graph_partition] Add tensor size handling #32747
fxdawnn wants to merge 29 commits into vllm-project:main from
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; you can ask your reviewers to trigger select CI tests. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
The pull request introduces functionality to handle tensor size operations (sym_size) across graph split boundaries in PyTorch's FX graph. This is crucial for preventing issues where torch.Size is not fully supported as a submodule output when sym_size is in one subgraph and its consumer is in another. The changes include adding helper functions _is_sym_size_op and _move_sym_size_nodes_for_split in vllm/compilation/backends.py, and integrating the latter into the split_graph function. New tests in tests/compile/test_graph_partition.py validate this behavior. The overall approach is sound and addresses a known limitation in PyTorch 2.
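The helper names above come from the PR itself. As a torch-free illustration of what a sym_size check in the spirit of `_is_sym_size_op` might look like, here is a minimal sketch; the `FakeNode` class and the string-based target matching are assumptions for this example, not the actual torch.fx or vLLM API.

```python
# Toy stand-in for a torch.fx.Node: the real helper would compare node.target
# against the aten sym_size overload rather than a string.
class FakeNode:
    def __init__(self, op, target):
        self.op = op          # e.g. "call_function", "placeholder", ...
        self.target = target  # the callable (or here, its name) being invoked

def is_sym_size_op(node):
    # A node computes a symbolic size if it is a call_function whose target
    # is the sym_size.int op (spelled as a string in this toy version).
    return node.op == "call_function" and str(node.target).endswith("sym_size.int")

print(is_sym_size_op(FakeNode("call_function", "aten.sym_size.int")))  # True
print(is_sym_size_op(FakeNode("call_function", "aten.mm")))            # False
```

The key point is that the check is purely local to a node, which is what makes a simple pre-pass over the graph sufficient.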
Hi @fxdawnn, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.
Signed-off-by: Xiao Fu <xiaofu@meta.com>
ProExpertProg
left a comment
Thanks for this fix!
Signed-off-by: Xiao Fu <xiaofu@meta.com>
ProExpertProg
left a comment
Looks overall good to me but will defer to torch.compile folks
This reverts commit a409cf4.
Signed-off-by: Xiao Fu <xiaofu@meta.com>
Signed-off-by: Xiao Fu <xiaofu@meta.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: Xiao <31429901+fxdawnn@users.noreply.github.com>
Signed-off-by: Xiao Fu <xiaofu@meta.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Xiao Fu <xiaofu@meta.com>
Signed-off-by: Xiao Fu <xiaofu@meta.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: Xiao <31429901+fxdawnn@users.noreply.github.com>
Signed-off-by: Xiao Fu <xiaofu@meta.com>
…nsor_size Signed-off-by: Xiao Fu <xiaofu@meta.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Xiao Fu <xiaofu@meta.com>
Purpose
Fix #31043
Summary
`split_graph` partitions an FX graph into subgraphs at splitting ops (e.g., attention, sigmoid). On `main`, `sym_size.int` nodes (from `tensor.shape[dim]`) can end up in consumer subgraphs after a split boundary. This causes the original tensor to be passed as an input to the consumer just for `.size()` calls, keeping it alive unnecessarily.

This PR adds a pre-pass that moves `sym_size.int` nodes to right after their tensor operand before subgraph assignment. The normal sequential assignment then places them in the producer subgraph. `split_module` threads the SymInt result to consumers automatically.

main (current):

This PR:
Why this works

`split_module` already threads SymInt values across subgraph boundaries correctly. Once `sym_size.int` computes the result in the producer, `split_module` passes that SymInt to any consumer that needs it; no custom SymInt handling code is required.

This follows the same pattern as the existing `getitem` hoisting (PR #28533), which moves `getitem` nodes to the producer to avoid passing tuples across boundaries.

Change
One pre-pass added to `split_graph` (~10 lines):

Test plan
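As a rough, torch-free sketch of the pre-pass idea described above: each sym_size-like node is relocated to immediately after its tensor operand in the node order, so the subsequent sequential subgraph assignment places it with the producer. The simplified `Node` class and the `move_sym_size_nodes` name are illustrative assumptions, not the real FX or vLLM API.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    target: str
    args: tuple = ()

def move_sym_size_nodes(order):
    # Pre-pass: move every sym_size node to right after its tensor operand,
    # mirroring what the real pass does on the FX graph's node list.
    out = list(order)
    for node in order:
        if node.target != "sym_size.int":
            continue
        operand = node.args[0]            # the tensor whose size is queried
        out.remove(node)
        out.insert(out.index(operand) + 1, node)
    return out

# producer op -> split boundary op -> consumer queries t.shape[0]
t = Node("t", "mm")
attn = Node("attn", "attention")          # splitting op (the boundary)
size = Node("size", "sym_size.int", (t, 0))

print([n.name for n in move_sym_size_nodes([t, attn, size])])
# -> ['t', 'size', 'attn']: sym_size now precedes the boundary
```

After the move, the ordinary sequential partitioning never sees a `sym_size` node on the consumer side of the boundary, which is exactly what the producer-subgraph test checks.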
- `test_sym_size_in_producer_subgraph`: verifies `sym_size` nodes are in the producer subgraph, not the consumer, and that functional correctness is preserved
- `test_symint_crosses_split_boundary`: verifies SymInt from `torch.compile` + `mark_dynamic` crosses split boundaries without errors
- Existing tests (`test_getitem_moved_to_producer_subgraph`, `test_no_tuple_inputs_with_multiple_consumers`, `test_consecutive_ops_in_split`) continue to pass
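To see why node order alone determines subgraph placement, here is a hedged toy model of sequential partition assignment: a new partition starts at each splitting op, so whether `sym_size` sits before or after the boundary decides which subgraph owns it. The `assign_partitions` function is invented for illustration and is not the real `split_graph` logic.

```python
def assign_partitions(order, is_split):
    # Assign nodes to partitions in order; a splitting op opens a new one.
    parts, current = {}, 0
    for name in order:
        if is_split(name):
            current += 1
        parts[name] = current
    return parts

split = lambda n: n == "attention"

# On main: sym_size sits after the boundary, so it lands in the consumer.
before = assign_partitions(["mm", "attention", "sym_size"], split)
# After the pre-pass: sym_size sits right after "mm", its tensor operand.
after = assign_partitions(["mm", "sym_size", "attention"], split)

print(before["sym_size"], after["sym_size"])  # 1 0
```

In the "before" layout the consumer partition must receive the tensor just to compute its size; in the "after" layout only the SymInt crosses the boundary, which `split_module` already handles.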