[Bugfix] Eliminate tuple inputs to submodules in graph partitioning#28533
zou3519 merged 7 commits into vllm-project:main
Conversation
Code Review
This pull request effectively addresses the issue of tuple inputs to submodules in graph partitioning by moving getitem operations to the producer's subgraph. The changes in vllm/compilation/backends.py correctly implement this logic, and the new test file tests/compile/test_graph_partition.py provides good coverage for both single and multiple consumer scenarios, ensuring the fix works as intended. The addition of the test to the .buildkite/test-pipeline.yaml ensures continuous validation. The overall change improves the robustness of the graph partitioning logic for AoT Autograd modules. No critical or high-severity issues were found.
💡 Codex Review
Here are some automated review suggestions for this pull request.
This pull request has merge conflicts that must be resolved before it can be merged.
Move getitem operations to the same subgraph as their input to prevent submodules from receiving tuple arguments. This is done by reassigning getitem nodes during partition assignment before calling split_module. PyTorch AoT compile expects graphs that conform to AoTAutograd spec, which prohibits tuple-type inputs to (sub)modules. Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
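To make the reassignment step concrete, here is a minimal pure-Python sketch (the `Node` stand-in and `reassign_getitems` helper are hypothetical; the actual implementation in vllm/compilation/backends.py operates on torch.fx nodes during partition assignment):

```python
import operator
from dataclasses import dataclass

@dataclass(eq=False)  # keep default identity hashing so nodes can be dict keys
class Node:
    # Minimal stand-in for a torch.fx.Node, for illustration only.
    name: str
    op: str              # e.g. "placeholder" or "call_function"
    target: object = None
    args: tuple = ()

def reassign_getitems(nodes, node_to_subgraph_id):
    """Assign each getitem node to its producer's subgraph, so that
    split_module never has to pass a tuple across a partition boundary."""
    for node in nodes:
        if node.op != "call_function" or node.target is not operator.getitem:
            continue
        producer = node.args[0]
        if producer.op == "placeholder":
            # getitem on a graph input can be a legitimate tensor
            # slice (seen in qwen2), so leave its assignment alone.
            continue
        assert producer in node_to_subgraph_id
        node_to_subgraph_id[node] = node_to_subgraph_id[producer]

# A tuple-producing op in subgraph 0 whose getitem was initially
# assigned to subgraph 1 gets pulled back into subgraph 0.
prod = Node("attn_out", "call_function", target=tuple)
gi = Node("gi0", "call_function", target=operator.getitem, args=(prod, 0))
mapping = {prod: 0, gi: 1}
reassign_getitems([prod, gi], mapping)
print(mapping[gi])  # prints 0

# A getitem on a placeholder (tensor slice) keeps its assignment.
ph = Node("x", "placeholder")
gi2 = Node("sl", "call_function", target=operator.getitem, args=(ph, 0))
mapping2 = {gi2: 2}
reassign_getitems([ph, gi2], mapping2)
print(mapping2[gi2])  # prints 2
```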
vllm/compilation/backends.py
Outdated
    if input_node in node_to_subgraph_id:
        node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
        continue
This should always be true, right?
Suggested change:
    - if input_node in node_to_subgraph_id:
    -     node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
    -     continue
    + assert input_node in node_to_subgraph_id
    + node_to_subgraph_id[node] = node_to_subgraph_id[input_node]
    + continue
I think placeholder nodes would not be in the node_to_subgraph_id map.
Hmm, I think the claim is that we should never call getitem on a placeholder node or an output node, so we can just assert that input_node is in node_to_subgraph_id?
Dynamo produces a graph where all placeholders should be Tensors, symints, or script objects.
So I agree with Luka's suggestion
@zou3519 @ProExpertProg I had to relax the assert to fire only when the producer node is not a placeholder, because getitem is also used to fetch an item or slice from a tensor. In that case, there can be legitimate getitem calls on placeholders. This is seen in qwen2.
PTAL.
oh no, this is dynamo IR we're talking about... let me think about this
Do we have a good way of distinguishing between getitem calls on Tensors and getitem calls on Tuples?
Checking the type of node.meta['val'] is the way. However, I don't think there is a strong guarantee about its presence.
I think the current assert is strict and correct, because it is by design that placeholder nodes are not in node_to_subgraph_id, and this check asserts exactly that.
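For illustration, a best-effort version of that metadata check might look like this (the `_N` stand-in and `getitem_indexes_tuple` helper are hypothetical, not vLLM code; as noted, 'val' metadata is not always populated, so the check returns False whenever it is missing):

```python
import operator

class _N:
    # Minimal stand-in for a torch.fx.Node (illustration only).
    def __init__(self, op, target=None, args=(), meta=None):
        self.op, self.target, self.args = op, target, args
        self.meta = meta if meta is not None else {}

def getitem_indexes_tuple(node):
    """Best-effort: does this getitem index into a tuple (vs. slice a
    tensor)? Falls back to False when 'val' metadata is absent, which
    is why the PR instead checks op == "placeholder" on the producer."""
    if node.op != "call_function" or node.target is not operator.getitem:
        return False
    producer = node.args[0]
    return isinstance(producer.meta.get("val"), tuple)

tuple_prod = _N("call_function", meta={"val": (1, 2)})  # example value recorded
bare_prod = _N("placeholder")                           # no 'val' recorded
gi_a = _N("call_function", operator.getitem, (tuple_prod, 0))
gi_b = _N("call_function", operator.getitem, (bare_prod, 0))
print(getitem_indexes_tuple(gi_a), getitem_indexes_tuple(gi_b))  # True False
```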
OK, I tested with node.meta['val']. It is not always available, so we can't rely on it to tell us if something is a tuple or tensor.
The next best thing we can do is what is currently implemented, i.e. checking input_node.op == "placeholder" and relying on Dynamo doing the right thing, which CI results confirm is currently the case.
I am fine with loosening the assert. I ran into this a while ago with something in standard torch.compile
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
zou3519
left a comment
Code looks reasonable; I think Luka's comment makes sense.
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Head branch was pushed to by a user without write access
…llm-project#28533) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Move getitem operations to the same subgraph as their input to prevent submodules from receiving tuple arguments. This fixes issues like #24915, where split_graph partitioned the graph at a node that produces a tuple, which was then passed to subsequent submodules as an input. This unfortunately does not conform to the calling convention of AoT Autograd modules, which requires flattened arguments.
Signed-off-by: Yanan Cao gmagogsfm@gmail.com
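Schematically, the calling-convention problem can be illustrated with plain Python (toy functions, not actual generated submodules):

```python
# Before the fix: the partition boundary falls right after a
# tuple-producing op, so the next submodule receives a tuple argument,
# which violates the flattened calling convention AoT Autograd expects.
def submodule_a_before(x):
    return (x + 1, x * 2)      # tuple crosses the boundary

def submodule_b_before(pair):
    return pair[0] - pair[1]   # getitem happens in the consumer

# After the fix: the getitems live in the producer's subgraph, so only
# flat values cross the boundary.
def submodule_a_after(x):
    t = (x + 1, x * 2)
    return t[0], t[1]          # unpacked before leaving the subgraph

def submodule_b_after(a, b):   # flat arguments only
    return a - b

print(submodule_b_before(submodule_a_before(3)))   # prints -2
print(submodule_b_after(*submodule_a_after(3)))    # prints -2, same result
```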