[Rebase] Rebase to vllm v0.18.0 #2037
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,7 +36,7 @@ steps: | |
| - label: "Diffusion Model Test" | ||
| depends_on: upload-ready-pipeline | ||
| commands: | ||
| - timeout 20m pytest -s -v tests/e2e/offline_inference/test_t2i_model.py -m "core_model and diffusion" --run-level "core_model" | ||
|
Collaborator
The same question.
Collaborator
Author
Same timeout problem.
||
| - timeout 30m pytest -s -v tests/e2e/offline_inference/test_t2i_model.py -m "core_model and diffusion" --run-level "core_model" | ||
| agents: | ||
| queue: "gpu_1_queue" | ||
| plugins: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,10 +11,29 @@ RUN apt-get update && \ | |
| apt-get clean && \ | ||
| rm -rf /var/lib/apt/lists/* | ||
|
|
||
| # Install vllm-omni into the same uv-managed Python environment used by the base image. | ||
| # Use bash -c so that $(python3 -c ...) is expanded inside the container. | ||
| RUN uv pip install --system --no-cache-dir ".[dev]" | ||
| RUN uv pip uninstall --system -y vllm || true | ||
|
|
||
| # Install vLLM from precompiled wheel at the selected commit. | ||
| # Must use direct URL because the wheel has a PEP 440 local version identifier | ||
| # (e.g. +g0a0a1a198) which pip/uv refuse to install from a PEP 503 package index. | ||
| ENV VLLM_PRECOMPILED_WHEEL_COMMIT=89138b21cc246ae944c741d5c399c148e2b770ab | ||
| RUN VLLM_WHEEL_URL=$(python3 -c "import urllib.request,re; \ | ||
| html=urllib.request.urlopen('https://wheels.vllm.ai/${VLLM_PRECOMPILED_WHEEL_COMMIT}/vllm/').read().decode(); \ | ||
| m=re.search(r'>(\S+x86_64\.whl)<',html); \ | ||
| print('https://wheels.vllm.ai/${VLLM_PRECOMPILED_WHEEL_COMMIT}/'+m.group(1).replace('+','%2B'))") && \ | ||
| echo "Installing vLLM from: ${VLLM_WHEEL_URL}" && \ | ||
| uv pip install --system --force-reinstall "${VLLM_WHEEL_URL}" | ||
|
|
||
| RUN uv pip install --system ".[dev]" | ||
|
|
||
| RUN uv pip install --system --upgrade \ | ||
| "flashinfer-cubin==0.6.6" \ | ||
| "nvidia-cublas-cu12==12.9.1.4" \ | ||
|
Member
Do we still need the cublas version upgrade?
||
| "numpy==2.2.6" | ||
|
|
||
| RUN uv pip install --system --upgrade \ | ||
| "flashinfer-jit-cache==0.6.6" \ | ||
| --index-url https://flashinfer.ai/whl/cu129 | ||
| RUN ln -sf /usr/bin/python3 /usr/bin/python | ||
|
|
||
| ENTRYPOINT [] | ||
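For readers following the wheel-resolution step in the Dockerfile above, here is a minimal sketch of the same logic as the inline `python3 -c` snippet, unrolled into a function. It is illustrative only and not part of the PR: the function name `wheel_url_from_index`, the explicit error on a missing match, and the sample wheel filename are assumptions. It takes the index HTML as a string instead of fetching it, so it can be checked without network access.

```python
import re

WHEEL_HOST = "https://wheels.vllm.ai"


def wheel_url_from_index(html: str, commit: str) -> str:
    """Build a direct wheel URL from the index page listing for a commit.

    Mirrors the Dockerfile one-liner: take the first x86_64 wheel name that
    appears as link text, then percent-encode '+' so the PEP 440 local
    version segment (e.g. +g0a0a1a198) survives in the URL.
    """
    match = re.search(r">(\S+x86_64\.whl)<", html)
    if match is None:
        # Assumption: fail loudly here; the original one-liner would instead
        # raise AttributeError on m.group(1) when nothing matches.
        raise ValueError("no x86_64 wheel found in index page")
    filename = match.group(1)
    return f"{WHEEL_HOST}/{commit}/" + filename.replace("+", "%2B")


if __name__ == "__main__":
    # Hypothetical index snippet and wheel filename, for illustration only.
    sample_html = (
        '<a href="x">vllm-0.18.0+g0a0a1a198-cp38-abi3-manylinux1_x86_64.whl</a>'
    )
    print(wheel_url_from_index(sample_html, "89138b21cc246ae944c741d5c399c148e2b770ab"))
```

In the Dockerfile the HTML comes from `urllib.request.urlopen` on the per-commit index page, and the resulting URL is passed to `uv pip install --system --force-reinstall`.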
Why are we modifying these settings?
The running time is slightly longer than 20 minutes. I guess this is because the Docker image is much larger than the one on the main branch.