[Renderer] Introduce Renderer for processing chat messages (using RendererConfig)#30198
DarkLight1337 wants to merge 1 commit into vllm-project:main

Conversation
Code Review
This pull request introduces a Renderer abstraction to encapsulate chat template processing and tokenization logic. This is a significant and positive architectural refactoring, moving model-specific rendering logic out of the core engine and into dedicated renderer classes. The changes are extensive, touching many parts of the codebase to replace direct tokenizer usage with the new renderer interface. The implementation appears mostly correct and consistent. However, I found a bug in the MistralToolParser where an instance variable is not initialized, which will lead to an AttributeError.
```diff
@@ -112,6 +114,8 @@ def __init__(self, tokenizer: TokenizerLike):
                 "the tokenizer!"
             )

+        self.prev_tool_call_arr: list[dict[str, Any]]
```
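To see why the flagged line is a bug: in Python, a bare annotation on an instance attribute is not an assignment, so the attribute never comes into existence. A minimal sketch (a hypothetical class, not the actual `MistralToolParser`):

```python
from typing import Any


class ToolParser:
    def __init__(self) -> None:
        # Annotation only: no value is ever bound to the instance, so the
        # attribute does not exist after __init__ returns.
        self.prev_tool_call_arr: list[dict[str, Any]]


parser = ToolParser()
try:
    parser.prev_tool_call_arr
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The fix is to initialize the attribute, e.g. `self.prev_tool_call_arr: list[dict[str, Any]] = []`.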
```diff
@@ -106,7 +104,7 @@ async def _preprocess(
             ctx.engine_prompts = []
             return None

-        renderer = self._get_renderer(ctx.tokenizer)
+        renderer = self._get_renderer(self.renderer.tokenizer)
```
Note that vllm.renderers.Renderer (self.renderer) is currently for chat only, and is not to be confused with the Renderer inside vllm.entrypoints.renderer.CompletionRenderer (the result of self._get_renderer). The two implementations will be merged in a later PR.
```diff
@@ -281,16 +267,13 @@ def __init__(
         self.request_logger = request_logger
         self.return_tokens_as_token_ids = return_tokens_as_token_ids
-        self._tokenizer_executor = ThreadPoolExecutor(max_workers=1)
```
This has been moved to the Mistral renderer.
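The pattern being relocated here is worth spelling out: tokenization is CPU-bound, so it is run on a thread pool rather than blocking the event loop, and `max_workers=1` serializes access since many tokenizers are not thread-safe. A hedged sketch of that pattern (class and method names are illustrative, not vLLM's actual API):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class MistralRenderer:
    def __init__(self) -> None:
        # A single worker serializes tokenizer access.
        self._tokenizer_executor = ThreadPoolExecutor(max_workers=1)

    def _encode(self, text: str) -> list[int]:
        # Placeholder for the real tokenizer call.
        return [ord(c) for c in text]

    async def encode_async(self, text: str) -> list[int]:
        # Offload the CPU-bound call so the event loop stays responsive.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            self._tokenizer_executor, self._encode, text
        )


async def main() -> None:
    renderer = MistralRenderer()
    print(await renderer.encode_async("hi"))


asyncio.run(main())
```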
```python
request: RequestT
raw_request: Request | None = None
model_name: str
request_id: str
created_time: int = field(default_factory=lambda: int(time.time()))
lora_request: LoRARequest | None = None

# Shared across most requests
```
Prefer using self.renderer to simplify the code.
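One detail in the fields above that trips people up is `field(default_factory=...)`: mutable or time-dependent defaults must be deferred to instantiation time, otherwise every instance would share the value computed at class-definition time. A minimal, self-contained sketch (the `RequestContext` name and the reduced field set are illustrative):

```python
import time
from dataclasses import dataclass, field


@dataclass
class RequestContext:
    model_name: str
    request_id: str
    # default_factory runs at instantiation, so each request records
    # its own creation timestamp rather than a shared module-load time.
    created_time: int = field(default_factory=lambda: int(time.time()))


ctx = RequestContext(model_name="m", request_id="r1")
print(ctx.created_time)
```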
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Force-pushed from 0db05b1 to 47c1f05
Superseded by #30200
Purpose
- Introduce `vllm.renderers.RendererLike` to process chat messages into engine inputs.
- Add `RendererRegistry`, which lazily registers renderers to avoid the circular import problem in `vllm.renderers`.
- Rename `tokenizer_mode` to `renderer_mode`, and use a specific tokenizer implementation for each renderer, deprecating `TokenizerRegistry` in the process.

Towards #22880 and #23873
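The lazy-registration idea mentioned above is a standard way to break import cycles: the registry stores import paths as strings and only imports a module when its entry is first resolved. A hedged sketch in that spirit (this is an illustration, not vLLM's actual `RendererRegistry`):

```python
import importlib


class LazyRegistry:
    def __init__(self) -> None:
        # Maps a registry key to (module path, attribute name); nothing
        # is imported at registration time, so no cycle is triggered.
        self._paths: dict[str, tuple[str, str]] = {}

    def register(self, name: str, module: str, qualname: str) -> None:
        self._paths[name] = (module, qualname)

    def resolve(self, name: str):
        # The import happens here, on first use.
        module, qualname = self._paths[name]
        return getattr(importlib.import_module(module), qualname)


registry = LazyRegistry()
# Register a stand-in class by its import path rather than the class itself.
registry.register("ordered", "collections", "OrderedDict")
print(registry.resolve("ordered").__name__)
```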
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.