
[compile][cuda_graph] Add sym_size handling by folding them to constant #32960

Closed
fxdawnn wants to merge 2 commits into vllm-project:main from fxdawnn:fold_sym_size_const

Conversation

@fxdawnn
Contributor

@fxdawnn fxdawnn commented Jan 23, 2026

Purpose

Fix #31043

Test Plan

  • Comparing against the repro in the issue after reverting the previous temporary fix
  • Adding a local test

Test Result

Pass

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

…t to allow graph transferring

Signed-off-by: Xiao Fu <xiaofu@meta.com>
@mergify

mergify bot commented Jan 23, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @fxdawnn.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.



else:
break

fold_sym_size_to_constants(self.graph, concrete_inputs)


Graph mutation causes wrong constants for subsequent compilations

High Severity

The fold_sym_size_to_constants function mutates the shared self.graph by calling node.replace_all_uses_with(const_value), which replaces all uses of sym_size nodes with constants. After the first single-size compilation, these nodes have zero users (as the test explicitly verifies). Subsequent calls for different sizes find the same sym_size nodes but replace_all_uses_with has no effect since there are no users left to replace. This causes all single-size compilations after the first to use incorrect constant values from the initial compilation.

Additional Locations (1)

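The failure mode Bugbot describes can be reproduced outside vLLM. The sketch below is hypothetical code, not the PR's implementation: it uses a plain `call_method "size"` node as a stand-in for `sym_size`, and a `fold_sizes_to_constants` helper invented for illustration in place of the PR's `fold_sym_size_to_constants`. It shows why folding in place is not idempotent (after the first fold, the size nodes have zero users, so a second fold is a silent no-op), and how folding on a per-size copy of the graph avoids the stale constant:

```python
import copy
import torch
import torch.fx as fx
from torch.fx.node import map_arg

class Model(torch.nn.Module):
    def forward(self, x):
        n = x.size(0)            # traced as a call_method "size" node (stand-in for sym_size)
        return x.reshape(n, -1)

def fold_sizes_to_constants(graph: fx.Graph, example: torch.Tensor) -> None:
    """Replace every use of an `x.size(dim)` node with the concrete dim of `example`.

    Mutates `graph` in place: after the first call the size nodes have zero
    users, so a second call with a different `example` silently does nothing.
    """
    for node in list(graph.nodes):
        if node.op == "call_method" and node.target == "size" and len(node.args) == 2:
            const = example.size(node.args[1])
            for user in list(node.users):
                # Swap the size node for the concrete int in the user's args.
                user.args = map_arg(user.args, lambda a: const if a is node else a)

gm = fx.symbolic_trace(Model())
fold_sizes_to_constants(gm.graph, torch.randn(2, 4))   # first size: folds n -> 2
fold_sizes_to_constants(gm.graph, torch.randn(3, 4))   # no users left: silent no-op
gm.recompile()
stale = gm(torch.randn(3, 4))                          # reshape(2, -1) -> shape (2, 6), wrong

# Fix: fold each size on its own copy of the traced graph module.
base = fx.symbolic_trace(Model())
for n in (2, 3):
    per_size = copy.deepcopy(base)
    fold_sizes_to_constants(per_size.graph, torch.randn(n, 4))
    per_size.recompile()
ok = per_size(torch.randn(3, 4))                       # shape (3, 4), as expected
print(stale.shape, ok.shape)
```

Copying before mutating keeps `base` reusable, so every single-size compilation folds against its own size instead of inheriting constants from the first one.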

@fxdawnn fxdawnn marked this pull request as draft January 23, 2026 19:35
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces functionality to fold symbolic sizes to constants within FX graphs, which is crucial for CUDA graph capture. It includes a new test case to validate this folding and integrates the functionality into the piecewise compilation backend. The changes improve the robustness of CUDA graph capture by ensuring sym_size values are constants, preventing potential address mismatch issues during replay. The PR also optimizes debugging by making input address tracking conditional on the debugging mode.

Comment on lines +157 to +161
concrete_inputs: dict[str, torch.Tensor] = {}
for node in gm.graph.nodes:
    if node.op == "placeholder":
        concrete_inputs[node.name] = x
        break

high

The construction of concrete_inputs only adds the first placeholder's concrete value and then breaks. If the model_fn were to accept multiple input tensors (e.g., def model_fn(x, y): ...), this logic would incorrectly only provide the concrete input for x, potentially leading to fold_sym_size_to_constants failing or behaving unexpectedly for y's symbolic size operations. To ensure robustness for multi-input models, all placeholder nodes should be iterated over and their corresponding concrete inputs added.

Suggested change

Before:

concrete_inputs: dict[str, torch.Tensor] = {}
for node in gm.graph.nodes:
    if node.op == "placeholder":
        concrete_inputs[node.name] = x
        break

After:

concrete_inputs: dict[str, torch.Tensor] = {}
for node in gm.graph.nodes:
    if node.op == "placeholder":
        # Assuming all placeholders should receive the same concrete tensor 'x' for this test.
        # If different inputs are needed, this logic would require adjustment.
        concrete_inputs[node.name] = x
assert concrete_inputs, "No placeholder found in the graph."
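The multi-input concern can be checked with a small standalone harness (hypothetical names: `TwoInput` and `examples` are invented for illustration, not taken from the PR). Iterating all placeholder nodes without an early `break` collects one concrete input per graph input:

```python
import torch
import torch.fx as fx

class TwoInput(torch.nn.Module):
    def forward(self, x, y):
        return x + y

gm = fx.symbolic_trace(TwoInput())
examples = {"x": torch.randn(2, 4), "y": torch.randn(2, 4)}

# Collect a concrete tensor for every placeholder -- no early break,
# so a two-input model yields two entries instead of one.
concrete_inputs: dict[str, torch.Tensor] = {}
for node in gm.graph.nodes:
    if node.op == "placeholder":
        concrete_inputs[node.name] = examples[node.name]
assert concrete_inputs, "No placeholder found in the graph."

print(sorted(concrete_inputs))  # ['x', 'y']
```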


Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[BugFix]: move torch.Size across graphs in split_graph

1 participant