Add option to use unbacked and backed_size_oblivious dynamic shapes for more sound compilation #26199
zou3519 merged 1 commit into vllm-project:main
Conversation
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request introduces a significant and valuable refactoring of the torch.compile integration, particularly around handling dynamic shapes. The new TorchCompileGuardsStripWrapper and the introduction of shape invariants provide a much cleaner and more robust approach to compilation. The changes are well-structured and the new tests are a great addition.
I've found a few issues that need to be addressed: a critical bug in the new DynamicShapesConfig that would cause a runtime error, a minor bug in hash computation, and an opportunity to strengthen the new dynamic shapes test.
hmellor
left a comment
The conflicts are caused by our migration to ruff. Please see https://vllm-dev.slack.com/archives/C07R5Q1Q2BB/p1759663228844749 which contains detailed instructions to make updating your branch as painless as possible.
@ProExpertProg
Sorry, I thought you were still pushing, but the CI already ran anyway.
Just addressed Richard's comment on the debug section. @zou3519 looking good?
@@ -151,6 +151,76 @@ To avoid this, please either:
2. wrap the branching logic into a custom operator. TorchDynamo does not
   trace into custom operators.

## Debugging constraint violations and dynamic shapes guards issues
this is a bit of a repeat of the previous section. Let's resolve it in a future PR, figuring out what to write is a bit annoying
Ah, I see:
## Debugging Dynamic Shape full graph capture
We probably should merge the two, I agree.
@ProExpertProg the failing job is a timeout in building docs. Is there a way to retry?
Signed-off-by: Laith Sakka <lsakka@meta.com>
Dummy update to force running all tests.
- BACKED: Default PyTorch behavior with potential guards ignored.
- UNBACKED: No guards guaranteed (most sound) but may throw
  data dependent errors.
- BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to
  backed/unbacked.
Tiny nit to get the help text to render nicely. Should be done in a follow-up to save CI.
Suggested change (append `\n` so each bullet renders on its own line):
- BACKED: Default PyTorch behavior with potential guards ignored.\n
- UNBACKED: No guards guaranteed (most sound) but may throw
  data dependent errors.\n
- BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to
  backed/unbacked.
The docs failure/timeout was likely related to the Python docs (which our docs reference) being down. It appears to have resolved itself.
…re sounds compilation. (vllm-project#26199) Signed-off-by: Laith Sakka <lsakka@meta.com>
…re sounds compilation. (vllm-project#26199) Signed-off-by: Laith Sakka <lsakka@meta.com> Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
Purpose
Dynamic shape guards are dropped unsoundly in vLLM; they can be added during both Dynamo and Inductor compilation. The proper way to compile a sound graph is to use unbacked dynamic shapes, or possibly "backed size oblivious" (see the note in the diff content). This PR adds a config that allows choosing between backed, unbacked, and backed_size_oblivious, as explained in the comment in the content of the PR.
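The three modes can be sketched as an enum along the lines of the new DynamicShapesConfig. This is an illustrative sketch only; the actual names and values in vLLM may differ:

```python
from enum import Enum


class DynamicShapesMode(Enum):
    # Hypothetical sketch of the three modes this PR's config chooses between;
    # not the exact vLLM definition.
    BACKED = "backed"  # default PyTorch behavior, guards may be dropped
    UNBACKED = "unbacked"  # no guards (most sound), may hit data-dependent errors
    BACKED_SIZE_OBLIVIOUS = "backed_size_oblivious"  # experimental middle ground


# Lookup by config-string value, as a config parser might do.
print(DynamicShapesMode("unbacked").name)  # → UNBACKED
```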
When using unbacked, users might want to provide invariants about the model's input shapes. A lambda argument is added to the support_torch_compile decorator where users can provide a lambda that asserts invariants on the model input shapes. These are needed to avoid data-dependent errors (DDE) and to be able to trace the model with unbacked symints.
Example: see the Qwen example in the code.
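A minimal sketch of the shape-invariant idea. The real support_torch_compile decorator lives in vLLM and its signature may differ; in vLLM the invariants would become hints for the unbacked-symint tracer, while plain asserts stand in for them here, and the invariant itself is hypothetical:

```python
def support_torch_compile(shape_invariants=None):
    """Toy stand-in for vLLM's decorator: run user-supplied shape
    invariants before every forward call."""

    def wrap(cls):
        orig_forward = cls.forward

        def forward(self, *args, **kwargs):
            if shape_invariants is not None:
                shape_invariants(*args, **kwargs)  # fail fast on violated invariants
            return orig_forward(self, *args, **kwargs)

        cls.forward = forward
        return cls

    return wrap


def check_shapes(input_ids, positions):
    # Assumed invariant for illustration: one position per token, so the
    # traced graph never needs a data-dependent guard on the two lengths.
    assert len(input_ids) == len(positions), "token/position length mismatch"


@support_torch_compile(shape_invariants=check_shapes)
class ToyModel:
    def forward(self, input_ids, positions):
        return [t + p for t, p in zip(input_ids, positions)]


print(ToyModel().forward([1, 2], [10, 20]))  # → [11, 22]
```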
Test Plan
Added a unit test.
Note: this only works with torch 2.10+.
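Since the test plan notes this only works with torch 2.10+, a version gate along these lines could guard the feature (hypothetical helper, not from the PR; assumes a plain "major.minor…" version string):

```python
def supports_unbacked_modes(torch_version: str) -> bool:
    """Return True when the torch version is at least 2.10.

    Parses only the leading "major.minor" and compares as an int tuple,
    so "2.10" correctly sorts after "2.9"; suffixes like ".0.dev" are ignored.
    """
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    return (major, minor) >= (2, 10)


print(supports_unbacked_modes("2.10.0"))  # → True
print(supports_unbacked_modes("2.9.1"))   # → False
```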
Perf testing
command
Qwen/Qwen2-1.5B-Instruct
Will look into some of these perf issues in the future, but the long-term vision with pre_compile is that this will be a fallback mode. Results are still better than eager in any case.
backed
backed size oblivious
unbacked
eager