[Model] Move multimodal_cpu_fields definition to field config (#30181)
Merged by DarkLight1337: 3 commits into vllm-project:main
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Code Review
This pull request refactors the mechanism for specifying CPU-only multimodal fields by introducing a keep_on_cpu flag in MultiModalFieldConfig and deprecating the old multimodal_cpu_fields attribute. The changes are well-implemented and consistently applied across model definitions and the core multimodal input processing logic. This improves the API by making the configuration more explicit and localized. I have one suggestion to improve a developer-facing error message.
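As a toy illustration of the refactor described above (hypothetical class and attribute names, not the actual vLLM API), the change moves a per-model set of CPU-only field names onto the field config itself:

```python
class OldStyleProcessor:
    # Before: CPU-only fields were listed in a separate class attribute,
    # detached from the field definitions themselves.
    multimodal_cpu_fields = {"image_grid_thw"}


class FieldConfig:
    # After: the flag lives on the field config, next to the field it
    # describes, so per-model bookkeeping sets are no longer needed.
    def __init__(self, name: str, keep_on_cpu: bool = False):
        self.name = name
        self.keep_on_cpu = keep_on_cpu


new_style = {
    "image_grid_thw": FieldConfig("image_grid_thw", keep_on_cpu=True),
    "pixel_values": FieldConfig("pixel_values"),
}

# The set of CPU-only fields can now be derived from the configs.
cpu_fields = {name for name, cfg in new_style.items() if cfg.keep_on_cpu}
```

Deriving the set from per-field flags keeps the declaration local to each field, which is what makes the configuration "more explicit and localized".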
```python
if device is not None and self.keep_on_cpu:
    device = "cpu"
```
Should we instead not call `_nested_tensors_h2d` at all if the tensor should stay on CPU? Just wondering, since we set `non_blocking=True` in the copy, which can lead to problems when transferring to CPU. It might not be an issue in practice, since the tensors are always already on CPU, but it might be nicer to be safe here.
That is true; we haven't handled the case where the original tensors are on GPU and we want to move them to CPU (though I don't expect that to be needed any time soon). Feel free to open a PR if you have time!
Nice! Thanks for adding it to
### What this PR does / why we need it?
### Does this PR introduce _any_ user-facing change?
1. fix vllm-project/vllm#27938
2. fix vllm-project/vllm#27145: pooling models now support chunked prefill and prefix caching
3. fix vllm-project/vllm#30181: define the CPU fields in the field config where they really belong
4. fix vllm-project/vllm#28168: define the CPU fields in the field config where they really belong
5. fix vllm-project/vllm#30201: some module renames
6. fix vllm-project/vllm#29067: FusedMoE module refactor
7. fix vllm-project/vllm#29066: FusedMoE module refactor
8. fix vllm-project/vllm#29624
### How was this patch tested?
- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e
Signed-off-by: wangli <wangli858794774@gmail.com>
Purpose
Redesign of #28168: we now define the CPU fields in the field config where they really belong.

To avoid mixins, I had to make `MultiModalFieldConfig` `kw_only=True` and update the serialization accordingly to use dicts instead of tuples. This adds roughly 10 serialized bytes per item, which is negligible compared to the tensor data.

Since GLM4V uses the field config from the Qwen2-VL model, I also updated it to support CPU fields.
cc @lgeiger
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.