[Frontend] speed up import time of vllm.config #18036
aarnphm merged 1 commit into vllm-project:main
Conversation
Force-pushed from c1b18c2 to 6921702
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from bb4c8d5 to 054f562
@aarnphm this is ready for review, thanks! cc @simon-mo @Chen-0210
aarnphm left a comment:
I'm a bit hesitant to optimize this file lazily, given that it touches a lot of components within vLLM.
Also, let's try to keep type-hint changes to a minimum.
This PR will require running the whole test suite to make sure it doesn't introduce any regressions.
Force-pushed from 1ec7328 to 6a65c3d
Head branch was pushed to by a user without write access.
speed up import time of vllm.config by changing submodules to lazily import expensive modules like `vllm.model_executor.layers.quantization`, or by importing them only for type checkers when they are not used at runtime.

Contributes to vllm-project#14924

Signed-off-by: David Xia <david@davidxia.com>
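The type-checker-only import pattern mentioned in the commit message can be sketched like this (a minimal illustration using a stdlib module as a stand-in for a heavy dependency; `format_price` is hypothetical, not code from this PR):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers (mypy, pyright);
    # never executed at runtime, so it adds no import cost.
    from decimal import Decimal  # stand-in for an expensive dependency


def format_price(value: "Decimal") -> str:
    # The annotation is a string, so no runtime import is needed.
    return f"{value:.2f}"
```

At runtime the annotation is never evaluated, so any object supporting the format spec works, while type checkers still see the precise type.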
@aarnphm thanks for reviewing again. I rebased away the conflict and fixed the pre-commit Python formatting check. All checks pass now and it's ready for another review. 🙏
On my M1 Mac with 64 GB memory, py312, editable install of vLLM following these docs:
before: with master commit 3443aaf
after: with master commit 7108934
~2.15% speedup in the average import time ((4.332 − 4.239) ÷ 4.332)
by changing some modules in vllm/multimodal to lazily import expensive modules like transformers, or only importing them for type checkers when not used during runtime.

Contributes to #14924
I ran `python -X importtime -c 'import vllm' 2> import.log && tuna import.log` on the main branch. The visualized call tree shows vllm.config accounts for the majority of the total import time at 55.5%. On this branch, vllm.config's share decreased to 52.5%.

I also timed `python -c 'import vllm'` on a Google Compute Engine a2-highgpu-1g (12 vCPUs, 85 GB memory) instance with 1 A100 GPU: ~3% decrease in mean time.
before (main branch commit 94d8ec8)
after (my PR commit 054f562)
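Mean import times like the ones above can be reproduced by timing `python -c 'import <module>'` in fresh subprocesses, since a module import is cached within a single interpreter. A sketch (`time_import` is a hypothetical helper, not part of this PR):

```python
import statistics
import subprocess
import sys
import time


def time_import(module: str, runs: int = 5) -> float:
    """Mean wall-clock time to import `module` in a fresh interpreter."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new interpreter per run avoids sys.modules caching effects.
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)
```

Note this includes interpreter startup time in every sample, so it is best used for before/after comparisons of the same module rather than absolute numbers.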