
[draft][compile][graph_partition] Add tensor size handling #32747

Open
fxdawnn wants to merge 29 commits into vllm-project:main from fxdawnn:tensor_size

Conversation

@fxdawnn
Contributor

@fxdawnn fxdawnn commented Jan 21, 2026

Purpose

Fix #31043

Summary

split_graph partitions an FX graph into subgraphs at splitting ops (e.g., attention, sigmoid). On main, sym_size.int nodes (from tensor.shape[dim]) can end up in consumer subgraphs after a split boundary. This causes the original tensor to be passed as an input to the consumer just for .size() calls, keeping it alive unnecessarily.

This PR adds a pre-pass that moves sym_size.int nodes to right after their tensor operand before subgraph assignment. The normal sequential assignment then places them in the producer subgraph. split_module threads the SymInt result to consumers automatically.

main (current):

submod_0: (empty or other ops)
submod_1: sigmoid(x)                                              # split point
submod_2: sym_size.int(x, 0), sym_size.int(x, 1), view(y, ...)   # x passed for .size()

This PR:

submod_0: sym_size.int(x, 0), sym_size.int(x, 1)   # shape computed in producer
submod_1: sigmoid(x)                                 # split point
submod_2: view(y, s0, s1)                            # only s0, s1 (SymInt) cross boundary
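
The layouts above follow directly from sequential assignment: whichever side of the split point a node sits on decides its subgraph. A toy plain-Python model (hypothetical node strings, not vLLM's actual implementation) makes that order-dependence concrete:

```python
# Toy model of sequential subgraph assignment (not vLLM's real code):
# nodes are strings, each splitting op is isolated in its own
# submodule, and every other node joins the current submodule.
SPLIT_OPS = {"sigmoid"}

def assign(nodes):
    subgraphs, current = [], []
    for node in nodes:
        if node.split("(")[0] in SPLIT_OPS:
            subgraphs.append(current)   # close the producer submodule
            subgraphs.append([node])    # split op gets its own submodule
            current = []
        else:
            current.append(node)
    subgraphs.append(current)
    return subgraphs

# main: sym_size nodes trail the split point, so they land in the consumer
main_order = ["sigmoid(x)", "sym_size(x,0)", "sym_size(x,1)", "view(y,s0,s1)"]
# this PR: the pre-pass moved them ahead of the split point
pr_order = ["sym_size(x,0)", "sym_size(x,1)", "sigmoid(x)", "view(y,s0,s1)"]

print(assign(main_order))
# [[], ['sigmoid(x)'], ['sym_size(x,0)', 'sym_size(x,1)', 'view(y,s0,s1)']]
print(assign(pr_order))
# [['sym_size(x,0)', 'sym_size(x,1)'], ['sigmoid(x)'], ['view(y,s0,s1)']]
```

With the hoisted ordering, the sym_size nodes fall in submod_0 and only their results need to reach submod_2.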

Why this works

split_module already threads SymInt values across subgraph boundaries correctly. Once sym_size.int computes the result in the producer, split_module passes that SymInt to any consumer that needs it — no custom SymInt handling code required.
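
As an illustration only (hand-written stand-ins on nested lists, not what split_module actually generates), the threading amounts to the producer returning the SymInts and the consumer taking them as ordinary inputs:

```python
# Hand-written stand-ins for the three submodules after the hoist
# (illustrative only; split_module generates the real wiring, and
# the ops below mimic sym_size / sigmoid / view on nested lists).
def submod_0(x):
    # producer: the sym_size.int results, computed next to x
    return len(x), len(x[0])

def submod_1(x):
    # split point: stand-in for sigmoid
    return [[v * 0.5 for v in row] for row in x]

def submod_2(y, s0, s1):
    # consumer: stand-in for view(y, s0, s1); only SymInts cross, never x
    flat = [v for row in y for v in row]
    return [flat[i * s1:(i + 1) * s1] for i in range(s0)]

def run(x):
    s0, s1 = submod_0(x)   # split_module threads these into submod_2
    y = submod_1(x)
    return submod_2(y, s0, s1)

print(run([[1.0, 2.0], [3.0, 4.0]]))  # [[0.5, 1.0], [1.5, 2.0]]
```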

This follows the same pattern as the existing getitem hoisting (PR #28533), which moves getitem nodes to the producer to avoid passing tuples across boundaries.

Change

One pre-pass added to split_graph (~10 lines):

for node in list(graph.graph.nodes):
    # Hoist each sym_size.int call so it sits immediately after the
    # node that produces its tensor operand; sequential subgraph
    # assignment then places it in the producer subgraph.
    if (node.op == "call_function"
            and node.target == torch.ops.aten.sym_size.int):
        tensor_node = node.args[0]
        with graph.graph.inserting_after(tensor_node):
            new_node = graph.graph.call_function(
                torch.ops.aten.sym_size.int, args=node.args)
            new_node.meta = node.meta.copy()
        # Redirect all consumers to the hoisted copy, then drop the
        # original node from its old position.
        node.replace_all_uses_with(new_node)
        graph.graph.erase_node(node)
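
A plain-Python mock of the same reorder (toy (name, op, arg) tuples, not the FX API) shows the effect on node order:

```python
# Toy mock of the pre-pass: each node is (name, op, arg). Every
# sym_size node is dropped from its old position and re-emitted
# immediately after the node that produces its tensor operand.
# (Illustrative only; the real pass operates on torch.fx nodes.)
def hoist_sym_size(nodes):
    deferred = {}
    for name, op, arg in nodes:
        if op == "sym_size":
            deferred.setdefault(arg, []).append((name, op, arg))
    out = []
    for node in nodes:
        if node[1] == "sym_size":
            continue                           # removed from old position
        out.append(node)
        out.extend(deferred.get(node[0], []))  # re-emitted after producer
    return out

nodes = [
    ("x", "placeholder", None),
    ("sig", "sigmoid", "x"),   # split point
    ("s0", "sym_size", "x"),
    ("s1", "sym_size", "x"),
    ("v", "view", "sig"),
]
print([n[0] for n in hoist_sym_size(nodes)])  # ['x', 's0', 's1', 'sig', 'v']
```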

Test plan

  • test_sym_size_in_producer_subgraph — verifies sym_size nodes are in the producer subgraph, not the consumer, and functional correctness is preserved
  • test_symint_crosses_split_boundary — verifies SymInt from torch.compile + mark_dynamic crosses split boundaries without errors
  • Existing tests (test_getitem_moved_to_producer_subgraph, test_no_tuple_inputs_with_multiple_consumers, test_consecutive_ops_in_split) continue to pass


@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. They only run fastcheck CI, a small, essential subset of tests that quickly catches errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces functionality to handle tensor size operations (sym_size) across graph split boundaries in PyTorch's FX graph. This is crucial for preventing issues where torch.Size is not fully supported as a submodule output when sym_size is in one subgraph and its consumer is in another. The changes include adding helper functions _is_sym_size_op and _move_sym_size_nodes_for_split in vllm/compilation/backends.py, and integrating the latter into the split_graph function. New tests in tests/compile/test_graph_partition.py validate this behavior. The overall approach is sound and addresses a known limitation in PyTorch 2.

@mergify

mergify bot commented Jan 21, 2026

Hi @fxdawnn, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Collaborator

@ProExpertProg ProExpertProg left a comment


Thanks for this fix!

@vllm-project vllm-project deleted a comment Jan 22, 2026
Collaborator

@ProExpertProg ProExpertProg left a comment


Looks overall good to me but will defer to torch.compile folks

fxdawnn and others added 6 commits January 23, 2026 12:10
@mergify

mergify bot commented Jan 27, 2026

Hi @fxdawnn, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.


@mergify

mergify bot commented Jan 31, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @fxdawnn.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 31, 2026
fxdawnn and others added 11 commits February 2, 2026 10:35
@mergify mergify bot removed the needs-rebase label Feb 2, 2026
@mergify

mergify bot commented Feb 6, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @fxdawnn.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 6, 2026

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

[BugFix]: move torch.Size across graphs in split_graph

4 participants