[WIP] Multimodal model support for V1 TPU by mgoin · Pull Request #12133 · vllm-project/vllm

mgoin · 2025-01-16T22:26:34Z

Based on and requires #11936

Currently only focused on usability and correctness, not performance.

This PR does not pre-compile the encoder forward pass, so whenever the model receives an image, video, or audio input with a shape it has not seen before, compilation is triggered at runtime.
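To illustrate why an unseen input shape costs a compilation, here is a minimal toy sketch. It assumes the compiler keeps one compiled program per input shape. The class `ShapeKeyedCompileCache`, its methods, and the example shapes are hypothetical and are not vLLM or XLA code.

```python
from typing import Callable, Dict, Iterable, Tuple

Shape = Tuple[int, ...]


class ShapeKeyedCompileCache:
    """Toy stand-in for a compiler cache keyed by input shape (hypothetical)."""

    def __init__(self) -> None:
        self._compiled: Dict[Shape, Callable[[Shape], str]] = {}
        self.compile_count = 0

    def _compile(self, shape: Shape) -> Callable[[Shape], str]:
        # On a real device this would be the expensive step
        # (tracing and compiling the encoder for this exact shape).
        self.compile_count += 1
        return lambda s: f"encoded{s}"

    def precompile(self, shapes: Iterable[Shape]) -> None:
        # Hypothetical warm-up pass: compile known shapes before serving.
        for shape in shapes:
            if shape not in self._compiled:
                self._compiled[shape] = self._compile(shape)

    def run(self, shape: Shape) -> str:
        # A shape that was never compiled forces compilation at runtime.
        if shape not in self._compiled:
            self._compiled[shape] = self._compile(shape)
        return self._compiled[shape](shape)


# Without warm-up: each new shape compiles on first use.
cache = ShapeKeyedCompileCache()
cache.run((1, 3, 336, 336))  # first shape: compiles at runtime
cache.run((1, 3, 336, 336))  # same shape: reuses the compiled entry
cache.run((1, 3, 448, 448))  # new shape: compiles again at runtime
runtime_compiles = cache.compile_count

# With warm-up: the same requests trigger no further compilation.
warm = ShapeKeyedCompileCache()
warm.precompile([(1, 3, 336, 336), (1, 3, 448, 448)])
compiles_after_warmup = warm.compile_count
warm.run((1, 3, 336, 336))
warm.run((1, 3, 448, 448))
extra_compiles = warm.compile_count - compiles_after_warmup
```

In this sketch the cold cache compiles once per distinct shape during serving, while the warmed cache pays the same cost up front. That warm-up step is what this PR leaves out for the encoder.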

Tested Examples

Image:

VLLM_USE_V1=1 python llava_tpu.py
...
Prompt 1: What do you see in this image?
Response:  The image features a tall tower with a spire, surrounded by a beautiful cherry blossom tree. The tree is filled with pink flowers, creating a stunning contrast against the tower. The blossoms are scattered throughout the tree, with some closer to the top and others near the bottom. The scene

Prompt 2: What colors are most prominent in this image?
Response:  The most prominent colors in this image are pink and white, as they are associated with the cherry blossoms and the sky.

Audio:

VLLM_USE_V1=1 python examples/offline_inference/offline_inference_audio_language.py --model-type qwen2_audio
Processed prompts: 100%|████████████████████████████████████| 1/1 [00:33<00:00, 33.90s/it, est. speed input: 12.80 toks/s, output: 1.42 toks/s]
The recited content in the audio is: 'First words I spoke in the original coronavirus a little feat of practical poetry Mary had a little lamb its fleece was white as snow and everywhere that Mary went the lamb was sure to go.'

Signed-off-by: mgoin <mgoin@redhat.com>

github-actions · 2025-01-16T22:26:46Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

robertgshaw2-redhat · 2025-01-16T23:05:06Z

cc @bvrockwell - FYI

mergify · 2025-01-22T22:39:14Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mgoin.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

bvrockwell · 2025-01-31T02:24:26Z

cc @yaochengji could you please take a look?

alexm-redhat and others added 27 commits January 9, 2025 17:00

TPU rebase from Rob's PR - in process

c63fc49

finished tpu model runner

ae3c487

add tpu worker

35d139d

add files

2656fb2

add executor

6a7633a

store tmp

56621b4

finished rebase

a9fc408

remove tmp files

fda64cb

fix refs

774a112

add files

d534ecf

add test

e9057a7

tmp not working yet

d40ef18

made progress

422aecc

more progress

6065fac

runs, no correctness yet

f1da4b0

fixes

6ea94b0

tmp wip

cefce4a

works!

fca7765

enforce DYNAMO_ONCE compilation level for TPU

9064c84

adjust scheduler params

6a14317

remove uniproc_tpu_executor

7d43d7b

cleanups

75ba559

refactor to use worker_base for both cuda and tpu workers

a6074b9

refactor to avoid code duplications

b65ed98

Add TP support

d25ec0e

Multimodal model support for V1 TPU

f658a50

Signed-off-by: mgoin <mgoin@redhat.com>

Fix

c3c3145

Signed-off-by: mgoin <mgoin@redhat.com>

alexm-redhat force-pushed the tpu_v1 branch from d25ec0e to b65ed98 January 20, 2025 14:13

alexm-redhat force-pushed the tpu_v1 branch 5 times, most recently from dea6afd to c6f526c January 22, 2025 22:38

mergify bot added needs-rebase ci/build labels Jan 22, 2025

alexm-redhat force-pushed the tpu_v1 branch 4 times, most recently from 1392a46 to 39c4a4c January 28, 2025 23:09

mgoin closed this Feb 18, 2025

mergify bot added the v1 label Feb 18, 2025

mgoin mentioned this pull request Feb 18, 2025

[V1][TPU] TPU multimodal model support #13496

Closed

mgoin wants to merge 27 commits into vllm-project:tpu_v1 from neuralmagic:tpu_v1_vlm


4 participants
