
[draft][compile][graph_partition] Add tensor size handling #32747

Open
fxdawnn wants to merge 29 commits into vllm-project:main from fxdawnn:tensor_size

Conversation

@fxdawnn
Contributor

@fxdawnn fxdawnn commented Jan 21, 2026

Purpose

Fix #31043

Summary

split_graph partitions an FX graph into subgraphs at splitting ops (e.g., attention, sigmoid). On main, sym_size.int nodes (from tensor.shape[dim]) can end up in consumer subgraphs after a split boundary. This causes the original tensor to be passed as an input to the consumer just for .size() calls, keeping it alive unnecessarily.

This PR adds a pre-pass that moves sym_size.int nodes to right after their tensor operand before subgraph assignment. The normal sequential assignment then places them in the producer subgraph. split_module threads the SymInt result to consumers automatically.

main (current):

submod_0: (empty or other ops)
submod_1: sigmoid(x)                                              # split point
submod_2: sym_size.int(x, 0), sym_size.int(x, 1), view(y, ...)   # x passed for .size()

This PR:

submod_0: sym_size.int(x, 0), sym_size.int(x, 1)   # shape computed in producer
submod_1: sigmoid(x)                                 # split point
submod_2: view(y, s0, s1)                            # only s0, s1 (SymInt) cross boundary
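
The layouts above follow directly from sequential assignment: whichever side of the split point a node sits on decides its subgraph. A toy plain-Python model (hypothetical node strings, not vLLM's actual implementation) makes that order-dependence concrete:

```python
# Toy model of sequential subgraph assignment (not vLLM's real code):
# nodes are strings, each splitting op is isolated in its own
# submodule, and every other node joins the current submodule.
SPLIT_OPS = {"sigmoid"}

def assign(nodes):
    subgraphs, current = [], []
    for node in nodes:
        if node.split("(")[0] in SPLIT_OPS:
            subgraphs.append(current)   # close the producer submodule
            subgraphs.append([node])    # split op gets its own submodule
            current = []
        else:
            current.append(node)
    subgraphs.append(current)
    return subgraphs

# main: sym_size nodes trail the split point, so they land in the consumer
main_order = ["sigmoid(x)", "sym_size(x,0)", "sym_size(x,1)", "view(y,s0,s1)"]
# this PR: the pre-pass moved them ahead of the split point
pr_order = ["sym_size(x,0)", "sym_size(x,1)", "sigmoid(x)", "view(y,s0,s1)"]

print(assign(main_order))
# [[], ['sigmoid(x)'], ['sym_size(x,0)', 'sym_size(x,1)', 'view(y,s0,s1)']]
print(assign(pr_order))
# [['sym_size(x,0)', 'sym_size(x,1)'], ['sigmoid(x)'], ['view(y,s0,s1)']]
```

With the hoisted ordering, the sym_size nodes fall in submod_0 and only their results need to reach submod_2.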

Why this works

split_module already threads SymInt values across subgraph boundaries correctly. Once sym_size.int computes the result in the producer, split_module passes that SymInt to any consumer that needs it — no custom SymInt handling code required.
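
As an illustration only (hand-written stand-ins on nested lists, not what split_module actually generates), the threading amounts to the producer returning the SymInts and the consumer taking them as ordinary inputs:

```python
# Hand-written stand-ins for the three submodules after the hoist
# (illustrative only; split_module generates the real wiring, and
# the ops below mimic sym_size / sigmoid / view on nested lists).
def submod_0(x):
    # producer: the sym_size.int results, computed next to x
    return len(x), len(x[0])

def submod_1(x):
    # split point: stand-in for sigmoid
    return [[v * 0.5 for v in row] for row in x]

def submod_2(y, s0, s1):
    # consumer: stand-in for view(y, s0, s1); only SymInts cross, never x
    flat = [v for row in y for v in row]
    return [flat[i * s1:(i + 1) * s1] for i in range(s0)]

def run(x):
    s0, s1 = submod_0(x)   # split_module threads these into submod_2
    y = submod_1(x)
    return submod_2(y, s0, s1)

print(run([[1.0, 2.0], [3.0, 4.0]]))  # [[0.5, 1.0], [1.5, 2.0]]
```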

This follows the same pattern as the existing getitem hoisting (PR #28533), which moves getitem nodes to the producer to avoid passing tuples across boundaries.

Change

One pre-pass added to split_graph (~10 lines):

for node in list(graph.graph.nodes):
    # Hoist each sym_size.int call so it sits immediately after the
    # node that produces its tensor operand; sequential subgraph
    # assignment then places it in the producer subgraph.
    if (node.op == "call_function"
            and node.target == torch.ops.aten.sym_size.int):
        tensor_node = node.args[0]
        with graph.graph.inserting_after(tensor_node):
            new_node = graph.graph.call_function(
                torch.ops.aten.sym_size.int, args=node.args)
            new_node.meta = node.meta.copy()
        # Redirect all consumers to the hoisted copy, then drop the
        # original node from its old position.
        node.replace_all_uses_with(new_node)
        graph.graph.erase_node(node)
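
A plain-Python mock of the same reorder (toy (name, op, arg) tuples, not the FX API) shows the effect on node order:

```python
# Toy mock of the pre-pass: each node is (name, op, arg). Every
# sym_size node is dropped from its old position and re-emitted
# immediately after the node that produces its tensor operand.
# (Illustrative only; the real pass operates on torch.fx nodes.)
def hoist_sym_size(nodes):
    deferred = {}
    for name, op, arg in nodes:
        if op == "sym_size":
            deferred.setdefault(arg, []).append((name, op, arg))
    out = []
    for node in nodes:
        if node[1] == "sym_size":
            continue                           # removed from old position
        out.append(node)
        out.extend(deferred.get(node[0], []))  # re-emitted after producer
    return out

nodes = [
    ("x", "placeholder", None),
    ("sig", "sigmoid", "x"),   # split point
    ("s0", "sym_size", "x"),
    ("s1", "sym_size", "x"),
    ("v", "view", "sig"),
]
print([n[0] for n in hoist_sym_size(nodes)])  # ['x', 's0', 's1', 'sig', 'v']
```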

Test plan

  • test_sym_size_in_producer_subgraph — verifies sym_size nodes are in the producer subgraph, not the consumer, and functional correctness is preserved
  • test_symint_crosses_split_boundary — verifies SymInt from torch.compile + mark_dynamic crosses split boundaries without errors
  • Existing tests (test_getitem_moved_to_producer_subgraph, test_no_tuple_inputs_with_multiple_consumers, test_consecutive_ops_in_split) continue to pass


@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. They only run fastcheck CI, a small, essential subset of tests that quickly catches errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces functionality to handle tensor size operations (sym_size) across graph split boundaries in PyTorch's FX graph. This is crucial for preventing issues where torch.Size is not fully supported as a submodule output when sym_size is in one subgraph and its consumer is in another. The changes include adding helper functions _is_sym_size_op and _move_sym_size_nodes_for_split in vllm/compilation/backends.py, and integrating the latter into the split_graph function. New tests in tests/compile/test_graph_partition.py validate this behavior. The overall approach is sound and addresses a known limitation in PyTorch 2.

@mergify

mergify bot commented Jan 21, 2026

Hi @fxdawnn, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Collaborator

@ProExpertProg ProExpertProg left a comment


Thanks for this fix!

@vllm-project vllm-project deleted a comment Jan 22, 2026
Collaborator

@ProExpertProg ProExpertProg left a comment


Looks overall good to me but will defer to torch.compile folks

fxdawnn and others added 6 commits January 23, 2026 12:10
@mergify

mergify bot commented Jan 27, 2026

Hi @fxdawnn, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.


@mergify

mergify bot commented Jan 31, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @fxdawnn.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 31, 2026
fxdawnn and others added 11 commits February 2, 2026 10:35
@mergify mergify bot removed the needs-rebase label Feb 2, 2026
@mergify

mergify bot commented Feb 6, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @fxdawnn.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 6, 2026

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

[BugFix]: move torch.Size across graphs in split_graph

4 participants