
feat(frontend): add --default-chat-template-kwargs CLI argument#31343

Merged
chaunceyjiang merged 4 commits into vllm-project:main from effortprogrammer:feat/default-chat-template-kwargs
Dec 30, 2025

Conversation

@effortprogrammer (Contributor) commented Dec 25, 2025

Fixes #28070

Purpose

Add server-level default chat_template_kwargs to control reasoning model behavior at deployment time. Request-level kwargs override these defaults.

Test Plan

This PR allows explicit control of reasoning/non-reasoning mode at the vllm serve command level using --default-chat-template-kwargs.

For reasoning models like Qwen3, you can now disable thinking mode server-wide by setting {"enable_thinking": false} as a default, eliminating the need to specify it in every request. Request-level chat_template_kwargs will override these server defaults when provided.

Manual test command:

vllm serve Qwen/Qwen3-8B --tensor-parallel-size 2 --served-model-name xionic-test --host 0.0.0.0 --port 8000
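To exercise the server-side default path, the same server can be relaunched with the new flag. This is a sketch of a hypothetical invocation (the JSON value is the example from this PR's description); the flag takes a JSON object, so it should be single-quoted to keep the shell from mangling it:

```shell
# Same server as above, but with thinking disabled server-wide
# via the new flag (value is a JSON object).
vllm serve Qwen/Qwen3-8B \
    --tensor-parallel-size 2 \
    --served-model-name xionic-test \
    --host 0.0.0.0 --port 8000 \
    --default-chat-template-kwargs '{"enable_thinking": false}'
```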

Minimal Python test script:

from openai import OpenAI
BASE_URL = "http://localhost:8000/v1"  # Change to your server
MODEL = "xionic-test"
client = OpenAI(api_key="EMPTY", base_url=BASE_URL)
# Same request for both configurations
messages = [{"role": "user", "content": "What is 2+2?"}]
print("=" * 60)
print("WITHOUT --default-chat-template-kwargs (thinking enabled)")
print("=" * 60)
resp1 = client.chat.completions.create(model=MODEL, messages=messages, max_tokens=200)
print(resp1.choices[0].message.content)
print("\n" + "=" * 60)
print("WITH --default-chat-template-kwargs (thinking disabled)")
print("Or using client-side override (current workaround):")
print("=" * 60)
resp2 = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    max_tokens=200,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(resp2.choices[0].message.content)
print("\n" + "=" * 60)
print("SUMMARY")
print("=" * 60)
print(f"Response 1 length: {len(resp1.choices[0].message.content)} chars")
print(f"Response 2 length: {len(resp2.choices[0].message.content)} chars")
print(f"Has <think> tag in resp1: {'<think>' in resp1.choices[0].message.content}")
print(f"Has <think> tag in resp2: {'<think>' in resp2.choices[0].message.content}")

Test Result

WITHOUT --default-chat-template-kwargs (thinking enabled):

Result:
Okay, the user is asking "What is 2+2?" That seems straightforward, but maybe they want a detailed explanation. Let me think. First, I should confirm the basic arithmetic. 2 plus 2 is 4. But maybe they're testing if I know the answer or if there's a trick. Sometimes people ask simple questions to see if the AI is reliable.

Wait, could there be a different interpretation? Like in some contexts, 2+2 might not be 4? For example, in modular arithmetic, if we're working modulo 3, 2+2 would be 1. But the question doesn't specify any context, so the default is standard arithmetic.

Also, maybe they want to know the steps involved. Let me break it down. Starting with two units and adding another two units. So 2 + 2 equals 4. But perhaps they want a more detailed explanation, like using number lines or visual aids.

WITH --default-chat-template-kwargs (thinking disabled):

Result: 2 + 2 equals 4.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.


@mergify mergify bot added the frontend label Dec 25, 2025
@gemini-code-assist bot left a comment

Code Review

This pull request introduces a new CLI argument --default-chat-template-kwargs to set server-level default keyword arguments for the chat template renderer. The implementation correctly adds the argument and passes it through to the serving layer. However, there is a logic issue in how these default arguments are merged with request-level arguments, which could lead to server defaults incorrectly overriding request parameters. I've provided a suggestion to fix the merge order.
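The merge-order issue flagged here comes down to which side of the dict merge wins. A minimal illustration, with hypothetical names that do not mirror the actual patch:

```python
# Hypothetical illustration of the merge-order bug flagged in review;
# the variable names do not reflect the actual vLLM code.
server_defaults = {"enable_thinking": False}
request_kwargs = {"enable_thinking": True}  # the client explicitly opts in

# Buggy order: defaults are unpacked last and silently clobber the request.
buggy = {**request_kwargs, **server_defaults}
print(buggy)  # {'enable_thinking': False} -- request value lost

# Fixed order: request kwargs are unpacked last, so they win.
fixed = {**server_defaults, **request_kwargs}
print(fixed)  # {'enable_thinking': True}
```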

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@effortprogrammer effortprogrammer marked this pull request as ready for review December 25, 2025 11:27

@mergify

mergify bot commented Dec 25, 2025

Hi @effortprogrammer, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@effortprogrammer effortprogrammer force-pushed the feat/default-chat-template-kwargs branch from 8731e9a to cce2494 Compare December 25, 2025 11:39
@effortprogrammer
Contributor Author

@DarkLight1337 @chaunceyjiang

I made the changes requested in review. Please check if there are any more issues!

@chaunceyjiang
Collaborator

@effortprogrammer You need to sign off your commits to pass the DCO check.

@effortprogrammer effortprogrammer force-pushed the feat/default-chat-template-kwargs branch from 7386263 to 413cd1b Compare December 29, 2025 13:36
@mergify

mergify bot commented Dec 29, 2025

Documentation preview: https://vllm--31343.org.readthedocs.build/en/31343/

@mergify mergify bot added the documentation Improvements or additions to documentation label Dec 29, 2025
@effortprogrammer effortprogrammer force-pushed the feat/default-chat-template-kwargs branch from 61f6f35 to dda72c8 Compare December 29, 2025 13:52
@chaunceyjiang (Collaborator) left a comment

@chaunceyjiang chaunceyjiang enabled auto-merge (squash) December 29, 2025 14:32
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 29, 2025
@effortprogrammer
Contributor Author

@chaunceyjiang @DarkLight1337 It seems the current CI failures are unrelated to my changes. Is there anything I should change?

@chaunceyjiang chaunceyjiang merged commit dc837bc into vllm-project:main Dec 30, 2025
49 checks passed
yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request Dec 30, 2025
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

Labels

  • documentation — Improvements or additions to documentation
  • frontend
  • ready — ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Usage]: Is there a way to control default thinking behaviour of a model?

3 participants