[Misc] Various cleanups for MM input processing #29970

Merged
DarkLight1337 merged 7 commits into vllm-project:main from DarkLight1337:cleanup-misc
Dec 4, 2025

@DarkLight1337
Member

@DarkLight1337 DarkLight1337 commented Dec 3, 2025

Purpose

  • Add vllm.utils.serial_utils.tensor2base64 convenience function
  • Fix some type annotations for MM embedding input parsing
  • Fix the Terratorch test, which was not being run because it was incorrectly marked async, and fix the test failure by allowing tokenizer=None in _preprocess_chat
  • Remove unused tokenizer argument for MultiModalItemTracker classes, and by extension parse_chat_messages functions
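
For readers unfamiliar with the pattern behind a helper like tensor2base64, here is a minimal sketch of serializing an object into an in-memory buffer and base64-encoding the bytes, mirroring the function tail quoted in the review thread (buf.seek(0), buf.read(), base64.b64encode). This is not the vLLM implementation: the names to_base64 and from_base64 are hypothetical, and pickle stands in for the tensor serializer so the sketch runs without PyTorch. With PyTorch installed, passing torch.save as the dump argument would presumably give the tensor variant.

```python
import base64
import io
import pickle
from typing import Any, BinaryIO, Callable


def to_base64(obj: Any, dump: Callable[[Any, BinaryIO], Any] = pickle.dump) -> str:
    """Serialize obj into an in-memory buffer and return it as a base64 string.

    `dump` is any callable with the (obj, file) signature. pickle.dump is an
    assumption used here as a stand-in for a tensor serializer.
    """
    buf = io.BytesIO()
    dump(obj, buf)
    buf.seek(0)  # rewind so read() returns everything that was written
    binary_data = buf.read()
    return base64.b64encode(binary_data).decode("utf-8")


def from_base64(data: str, load: Callable[[BinaryIO], Any] = pickle.load) -> Any:
    """Inverse of to_base64: decode the string and deserialize the bytes."""
    return load(io.BytesIO(base64.b64decode(data)))


# Round trip with a plain dict standing in for tensor data.
payload = {"shape": [2, 3], "values": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]}
encoded = to_base64(payload)
restored = from_base64(encoded)
```

The encoded string is plain ASCII, so it can be embedded in a JSON request body. Only deserialize data from trusted sources, since pickle-based loaders can execute arbitrary code.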

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 3, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@mergify

mergify bot commented Dec 3, 2025

Documentation preview: https://vllm--29970.org.readthedocs.build/en/29970/

@mergify mergify bot added documentation Improvements or additions to documentation frontend multi-modality Related to multi-modality (#4194) v1 labels Dec 3, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several valuable cleanups for multimodal inputs. The addition of the tensor2base64 utility function is a great improvement, simplifying code that serializes tensors to base64 and reducing duplication across documentation, examples, and tests. The removal of the unused tokenizer argument from parse_chat_messages and parse_chat_messages_futures cleans up the API nicely. I also appreciate the fix for the Terratorch test and the addition of missing type annotations.

I have one minor suggestion to improve the new utility function further. Overall, this is a solid set of changes that improves code quality and maintainability.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several cleanups and fixes related to multi-modal inputs. A new convenience function tensor2base64 is added to simplify tensor serialization, and it's consistently used across documentation, examples, and tests. A significant refactoring removes the unused tokenizer argument from MultiModalItemTracker and related functions, which cleans up the API. The PR also includes fixes for incorrect type annotations in multi-modal data parsers and corrects a test that was improperly marked as asynchronous. Overall, these changes improve code quality, consistency, and correctness. The implementation looks solid, and I have no further suggestions.
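
As general background on how an async-marked test can end up not exercising its body, the sketch below shows that calling an `async def` function only creates a coroutine object; the body runs only when something drives it, such as an event loop. Whether this is the exact mechanism behind the Terratorch test is an assumption, and the function name check_something is hypothetical.

```python
import asyncio

calls = []


async def check_something():
    # Stand-in for a test body; records that it actually executed.
    calls.append("ran")


# Calling the function does not execute the body; it returns a coroutine object.
coro = check_something()
ran_without_await = list(calls)
coro.close()  # close explicitly to avoid a "never awaited" warning

# Driving the coroutine with an event loop executes the body.
asyncio.run(check_something())
ran_with_event_loop = list(calls)
```

A test runner without async support treats such a function similarly: the coroutine is created but never driven, so no assertion inside it can fail.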

@DarkLight1337 DarkLight1337 changed the title [Misc] Various cleanups for MM inputs [Misc] Various cleanups for MM input processing Dec 3, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) December 3, 2025 15:56
buf.seek(0)
binary_data = buf.read()

return base64.b64encode(binary_data).decode("utf-8")
Contributor

Should we also use pybase64 here, as introduced in #19984, or do you think it's not worth it for this case?

Member Author

Let's do that in a separate PR

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 merged commit 9ae2f60 into vllm-project:main Dec 4, 2025
57 checks passed
@DarkLight1337 DarkLight1337 deleted the cleanup-misc branch December 4, 2025 06:22
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

Labels

documentation Improvements or additions to documentation frontend multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed v1

3 participants