[Bugfix] Eliminate tuple inputs to submodules in graph partitioning #28533

Merged
zou3519 merged 7 commits into vllm-project:main from gmagogsfm:cache_key
Nov 13, 2025
Conversation

Contributor

@gmagogsfm gmagogsfm commented Nov 12, 2025

Move getitem operations to the same subgraph as their input to prevent submodules from receiving tuple arguments.

This fixes issues like #24915, where split_graph partitioned the graph at a node that produces a tuple, which was then passed to subsequent submodules as an input. That does not conform to the calling convention of AoTAutograd modules, which requires flattened arguments.

Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
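For readers unfamiliar with the partitioning pass, the idea can be sketched in plain Python. This is a simplified stand-in, not vLLM's actual code: `Node`, `assign_subgraphs`, and the split-point handling are illustrative substitutes for torch.fx nodes and `split_module`.

```python
# Simplified sketch of the getitem-reassignment idea from this PR, using
# minimal stand-in objects instead of real torch.fx nodes. All names here
# (Node, assign_subgraphs, split_points) are illustrative, not vLLM's API.
import operator
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity hashing so Nodes can be dict keys
class Node:
    name: str
    op: str              # "placeholder", "call_function", ...
    target: object = None
    args: tuple = ()


def assign_subgraphs(nodes, split_points):
    """Assign each node a subgraph id, bumping the id at each split point,
    then pull getitem nodes into the same subgraph as their producer so a
    tuple never crosses a submodule boundary."""
    node_to_subgraph_id = {}
    subgraph_id = 0
    for node in nodes:
        if node.op == "placeholder":
            continue  # placeholders are never assigned a subgraph
        if node.name in split_points:
            subgraph_id += 1
        node_to_subgraph_id[node] = subgraph_id

    # Second pass: a getitem must live with the node that produced its input.
    for node in nodes:
        if node.op == "call_function" and node.target is operator.getitem:
            input_node = node.args[0]
            if input_node.op != "placeholder":
                assert input_node in node_to_subgraph_id
                node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
    return node_to_subgraph_id


# Example: an op producing a tuple, a split right after it, then getitems.
x = Node("x", "placeholder")
attn = Node("attn", "call_function", target="fused_attn", args=(x,))
g0 = Node("g0", "call_function", target=operator.getitem, args=(attn, 0))
g1 = Node("g1", "call_function", target=operator.getitem, args=(attn, 1))
mm = Node("mm", "call_function", target="matmul", args=(g0, g1))

ids = assign_subgraphs([x, attn, g0, g1, mm], split_points={"g0"})
# The getitems are pulled back into attn's subgraph (0); only mm is in 1.
print(ids[attn], ids[g0], ids[g1], ids[mm])  # → 0 0 0 1
```

Without the second pass, the split at `g0` would leave `attn`'s tuple output crossing the submodule boundary, which is exactly the condition this PR eliminates.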

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses tuple inputs to submodules in graph partitioning by moving getitem operations into the producer's subgraph. The changes in vllm/compilation/backends.py implement this logic correctly, and the new test file tests/compile/test_graph_partition.py covers both single- and multiple-consumer scenarios. Adding the test to .buildkite/test-pipeline.yaml ensures continuous validation. Overall, the change improves the robustness of graph partitioning for AoTAutograd modules. No critical or high-severity issues were found.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


mergify bot commented Nov 12, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @gmagogsfm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Move getitem operations to the same subgraph as their input to prevent
submodules from receiving tuple arguments. This is done by reassigning
getitem nodes during partition assignment before calling split_module.

PyTorch AoT compile expects graphs that conform to AoTAutograd spec,
which prohibits tuple-type inputs to (sub)modules.

Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Comment on lines +323 to +325
    if input_node in node_to_subgraph_id:
        node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
        continue
Collaborator


This should always be true, right?

Suggested change
    if input_node in node_to_subgraph_id:
        node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
        continue
    assert input_node in node_to_subgraph_id
    node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
    continue

Contributor Author


I think placeholder nodes would not be in the node_to_subgraph_id map.

Collaborator

@zou3519 zou3519 Nov 12, 2025


Hmm, I think the claim is that we should never call getitem on a placeholder node or an output node, so we can just assert that input_node is in node_to_subgraph_id?

Because Dynamo produces a graph where all placeholders should be Tensors or symints or scriptobjects

So I agree with Luka's suggestion

Contributor Author


I see, changed.

Contributor Author


@zou3519 @ProExpertProg I had to relax the assert to apply only when the producer node is not a placeholder, because getitem is also used to fetch an item/slice from a tensor. In that case there can be legitimate getitem calls on placeholders. This is seen in qwen2.

PTAL.
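For context on why the relaxed assert is needed: `operator.getitem` serves double duty in FX graphs, unpacking tuples and indexing tensors. A hypothetical plain-Python illustration (using a list as a stand-in for a tensor placeholder):

```python
# operator.getitem serves double duty: on a tuple it unpacks an element,
# on a tensor-like object it indexes/slices. The stand-in values below are
# illustrative; in a real graph these would be fx node outputs.
import operator

fused_out = ("out_tensor", "aux_tensor")  # stand-in for a tuple-producing op
first = operator.getitem(fused_out, 0)    # tuple unpacking -> "out_tensor"

matrix = [[1, 2], [3, 4]]                 # stand-in for a tensor placeholder
row = operator.getitem(matrix, 1)         # tensor-style indexing -> [3, 4]

print(first, row)  # → out_tensor [3, 4]
```

The second form is the legitimate getitem-on-a-placeholder case seen in qwen2, so an unconditional assert would fire there.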

Collaborator


oh no, this is dynamo IR we're talking about... let me think about this

Collaborator


Do we have a good way of distinguishing between getitem calls on Tensors and getitem calls on Tuples?

Contributor Author

@gmagogsfm gmagogsfm Nov 13, 2025


Checking the type of node.meta['val'] is the way. However, I don't think there is a strong guarantee about its presence.

I think the current assert is strict and correct because it is by design that all placeholder nodes are not in node_to_subgraph_id, and this check asserts exactly that.

Contributor Author


OK, I tested with node.meta['val']. It is not always available, so we can't rely on it to tell whether something is a tuple or a tensor.

The next best thing is what is currently implemented, i.e. checking input_node.op == "placeholder" and relying on Dynamo doing the right thing, which is currently the case according to CI results.
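The val-based check the thread ruled out could look roughly like this. This is a sketch with a stand-in `Node` class, not torch.fx; it assumes `meta['val']` holds a recorded fake value when present, which, as noted above, is not guaranteed:

```python
# Sketch of the rejected meta['val'] approach: classify a getitem's input by
# the type of its recorded value, with an explicit "unknown" result when the
# value is absent. Node is a minimal stand-in for torch.fx.Node.
class Node:
    def __init__(self, op, meta=None):
        self.op = op
        self.meta = meta or {}


def produces_tuple(node):
    """Best-effort check: True/False when meta['val'] is present, None when
    it is missing and we cannot decide."""
    if "val" not in node.meta:
        return None  # no guarantee 'val' exists, so we cannot classify
    return isinstance(node.meta["val"], (tuple, list))


print(produces_tuple(Node("call_function", {"val": (1, 2)})))  # → True
print(produces_tuple(Node("placeholder", {"val": object()})))  # → False
print(produces_tuple(Node("placeholder")))                     # → None
```

The `None` branch is exactly the gap that made this approach unreliable, which is why the merged fix checks `input_node.op == "placeholder"` instead.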

Collaborator


I am fine with loosening the assert. I ran into this a while ago with something in standard torch.compile

Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Collaborator

@zou3519 zou3519 left a comment


Code looks reasonable; I think Luka's comment makes sense.

Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
@zou3519 zou3519 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 13, 2025
@zou3519 zou3519 enabled auto-merge (squash) November 13, 2025 00:07
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
auto-merge was automatically disabled November 13, 2025 05:40

Head branch was pushed to by a user without write access

@zou3519 zou3519 merged commit 262d263 into vllm-project:main Nov 13, 2025
48 checks passed
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025

Labels

ci/build, ready (ONLY add when PR is ready to merge/full CI is needed)
