[2/N][torch.compile] make compilation cfg part of vllm cfg #10383
Signed-off-by: youkaichao <[email protected]>
    capture_sizes: List[int] = PrivateAttr

    def model_post_init(self, __context: Any) -> None:
        self.level = envs.VLLM_TORCH_COMPILE_LEVEL
Currently the level is still read from the env var so that the API server can set it. Later we will move it to the CLI args.
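As a rough illustration of the pattern in this comment, the sketch below reads the env var once, in a post-init hook, and stores the result on the config object. It is not the PR's code: it swaps in a stdlib dataclass and `__post_init__` for the pydantic model and `model_post_init` shown in the diff, the class name `CompilationConfigSketch` is hypothetical, and the fallback value of `0` when the variable is unset is an assumption.

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class CompilationConfigSketch:
    """Hypothetical stand-in for the compilation config in the diff."""

    level: int = 0
    capture_sizes: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Read the env var exactly once, when the config is built.
        # The default of "0" for an unset variable is an assumption.
        self.level = int(os.environ.get("VLLM_TORCH_COMPILE_LEVEL", "0"))


# The API server (or a test) sets the env var before the config is built.
os.environ["VLLM_TORCH_COMPILE_LEVEL"] = "2"
cfg = CompilationConfigSketch()

# Later changes to the environment do not affect the existing config object.
os.environ["VLLM_TORCH_COMPILE_LEVEL"] = "3"
level_after_env_change = cfg.level
```

Because the value is captured at construction, replacing the env var with a CLI argument later only changes where the config is built, not the code that consumes it.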
A dumb question: how does this work with TP? It seems we are relying on more global variables that are used by the model executor. Does it add any new assumptions or restrictions to the TP implementation?
It works with TP naturally, because every TP worker (process) has its own model executor and plugins module.
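The point about per-process state can be seen in a small sketch: a module-level global set in one process is invisible to another. This is only an illustration using plain child interpreters launched with `subprocess`; it is not how vLLM starts its TP workers, and `_current_config`, `set_config`, and `run_worker` are hypothetical names.

```python
import subprocess
import sys

# Source run inside each child interpreter. The module-level global
# below exists separately in every process that executes this code.
CHILD_SOURCE = """
import sys

_current_config = None  # hypothetical per-process global

def set_config(value):
    global _current_config
    _current_config = value

set_config(sys.argv[1])
print(_current_config)
"""


def run_worker(rank: int) -> str:
    """Start one child process and return the global it ended up with."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, f"config-for-rank-{rank}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Each "worker" sets its own copy of the global; neither sees the other's.
results = [run_worker(rank) for rank in range(2)]
```

Since each process owns its copy of the global, setting it in one worker places no constraint on the others.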
LGTM. Please fix the CI error before merge.
The CI error was a Hugging Face timeout. It passes now.
…ect#10383) Signed-off-by: youkaichao <[email protected]> Signed-off-by: Linkun Chen <[email protected]>
I know this was merged but I had a few questions
…ect#10383) Signed-off-by: youkaichao <[email protected]>
…ect#10383) Signed-off-by: youkaichao <[email protected]> Signed-off-by: Maxime Fournioux <[email protected]>
…ect#10383) Signed-off-by: youkaichao <[email protected]> Signed-off-by: rickyx <[email protected]>
…ect#10383) Signed-off-by: youkaichao <[email protected]> Signed-off-by: Tyler Michael Smith <[email protected]>
Continuation of #10237.
Move `vllm.compilation.config` into `vllm.config`.
Compilation is still controlled by the env var, but nothing in the core code should read the env var directly.
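One way to picture the intended layering is the sketch below: the env var is read in a single place at the boundary, and core code receives the nested config as an argument. All names here (`VllmConfigSketch`, `CompilationConfigSketch`, `config_from_env`, `should_compile`) are hypothetical, stdlib dataclasses stand in for the real config classes, and treating any level above `0` as "compile" is an assumption made for the example.

```python
import os
from dataclasses import dataclass, field


@dataclass
class CompilationConfigSketch:
    """Hypothetical compilation settings."""

    level: int = 0


@dataclass
class VllmConfigSketch:
    """Hypothetical top-level config that owns the compilation config."""

    compilation_config: CompilationConfigSketch = field(
        default_factory=CompilationConfigSketch
    )


def config_from_env() -> VllmConfigSketch:
    # Boundary code: the only place that touches the env var.
    level = int(os.environ.get("VLLM_TORCH_COMPILE_LEVEL", "0"))
    return VllmConfigSketch(CompilationConfigSketch(level=level))


def should_compile(vllm_config: VllmConfigSketch) -> bool:
    # Core code: reads the config object, never os.environ.
    # Assumption for this sketch: any level above 0 enables compilation.
    return vllm_config.compilation_config.level > 0


os.environ["VLLM_TORCH_COMPILE_LEVEL"] = "3"
env_config = config_from_env()
default_config = VllmConfigSketch()
```

With this shape, tests can build a `VllmConfigSketch` directly and never need to modify the environment.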
TODO: