Remove use of VLLM_USE_V1.#1083

Merged
QiliangCui merged 1 commit into main from fix-use-vllm
Nov 12, 2025
Conversation


@QiliangCui QiliangCui commented Nov 12, 2025

Description

Remove VLLM_USE_V1 because it was removed from vLLM main: vllm-project/vllm#28204 deleted the "VLLM_USE_V1" environment variable.

FIXES: b/460101498
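
The backward-compatibility behavior exercised by test #2 below can be sketched like this (a minimal hypothetical helper, not code from this PR): the now-removed VLLM_USE_V1 variable is still accepted in the environment, but it is treated as a deprecated no-op rather than being read.

```python
import os
import warnings


def warn_if_legacy_env_set() -> None:
    """Sketch: tolerate a removed env var instead of failing on it.

    If VLLM_USE_V1 is still set (as in test #2 below), emit a
    deprecation warning and otherwise ignore it, so old launch
    scripts keep working unchanged.
    """
    if os.environ.get("VLLM_USE_V1") is not None:
        warnings.warn(
            "VLLM_USE_V1 is deprecated and ignored; the V1 engine "
            "is now the only engine.",
            DeprecationWarning,
            stacklevel=2,
        )
```

A launch script that still exports VLLM_USE_V1=1 would then trigger the warning once and proceed normally.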

Tests

Manual launch

# 1: run without VLLM_USE_V1 set
TPU_BACKEND_TYPE=jax vllm serve Qwen/Qwen2.5-1.5B-Instruct \
 --seed 42 \
 --disable-log-requests \
 --max-num-seqs 512 \
 --max-num-batched-tokens 4096 \
 --tensor-parallel-size 1 \
 --no-enable-prefix-caching \
 --download_dir /mnt/disks/persist \
 --max-model-len 2048

vllm bench serve \
    --backend vllm \
    --model "Qwen/Qwen2.5-1.5B-Instruct" \
    --dataset-name "sonnet" \
    --dataset-path "benchmarks/sonnet_4x.txt" \
    --sonnet-input-len 1200 \
    --sonnet-output-len 1 \
    --num-prompts 1 \
    --request-rate inf \
    --percentile-metrics "ttft,tpot,itl,e2el" \
    --ignore-eos

# 2: run with VLLM_USE_V1=1 still set, to make sure it remains backward compatible
VLLM_USE_V1=1 TPU_BACKEND_TYPE=jax vllm serve Qwen/Qwen2.5-1.5B-Instruct \
 --seed 42 \
 --disable-log-requests \
 --max-num-seqs 512 \
 --max-num-batched-tokens 4096 \
 --tensor-parallel-size 1 \
 --no-enable-prefix-caching \
 --download_dir /mnt/disks/persist \
 --max-model-len 2048

vllm bench serve \
    --backend vllm \
    --model "Qwen/Qwen2.5-1.5B-Instruct" \
    --dataset-name "sonnet" \
    --dataset-path "benchmarks/sonnet_4x.txt" \
    --sonnet-input-len 1200 \
    --sonnet-output-len 1 \
    --num-prompts 1 \
    --request-rate inf \
    --percentile-metrics "ttft,tpot,itl,e2el" \
    --ignore-eos

CIT

https://buildkite.com/tpu-commons/tpu-inference-ci/builds/5249

The two broken tests are known issues tracked by other bugs: b/460112990 and b/459771818.

@github-actions

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

  • why is this change being made,
  • the problem being solved and any relevant context,
  • why this is a good solution,
  • some information about the specific implementation,
  • shortcomings of the solution and possible future improvements.

If the change fixes a bug or a Github issue, please include a link, e.g.,:
FIXES: b/123456
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.


@wdhongtw wdhongtw left a comment


LGTM. Just a small indentation issue, and we need some noqa comments (for the ruff tool) if we want to minimize the number of changed lines.
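
For context, a ruff suppression comment looks like this (hypothetical line, not from this PR; F401 is ruff's unused-import rule):

```python
# A trailing "# noqa: <rule>" comment tells ruff to skip that rule on
# this one line. Hypothetical example: keep an otherwise-unused import
# without ruff reporting F401, so the diff stays minimal.
import os  # noqa: F401
```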

@QiliangCui QiliangCui force-pushed the fix-use-vllm branch 2 times, most recently from 00441ac to 7f03966 Compare November 12, 2025 16:54

@wdhongtw wdhongtw left a comment


LGTM

Signed-off-by: Qiliang Cui <derrhein@gmail.com>
@QiliangCui QiliangCui merged commit cb8734c into main Nov 12, 2025
3 checks passed
@wdhongtw wdhongtw deleted the fix-use-vllm branch April 7, 2026 09:53