
[Bugfix] llm.chat bos token duplicate #15695

Closed
nike00811 wants to merge 4 commits into vllm-project:main from nike00811:main

Conversation

nike00811 commented Mar 28, 2025

When using llm.chat, the conversation is first rendered into a string and then passed to self.generate for completion. apply_chat_template prepends a BOS token to the rendered prompt_data; converting prompt_data into prompt_token_ids then prepends another BOS token, so the final prompt starts with two consecutive BOS tokens.
This PR changes the default value of tokenize and adds a new parameter to llm.chat, allowing users to decide whether tokenization happens inside the chat template.

>>> from vllm import LLM
>>> llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
>>> tokenizer = llm.get_tokenizer()
>>> conversation = [
...     {"role": "system", "content": "You are a helpful assistant"},
...     {"role": "user", "content": "Hello"},
...     {"role": "assistant", "content": "Hello! How can I assist you today?"},
...     {"role": "user", "content": "Write an essay about the importance of higher education."},
... ]
>>> output = llm.chat(conversation)[0]  # llm.chat returns a list of RequestOutput

Before the fix (apply_chat_template called with tokenize=False):

>>> print(output.prompt_token_ids[:10])
[128000, 128000, 128006, 9125, 128007, 271, 38766, 1303, 33025, 2696]
>>> print('{!r}'.format(tokenizer.decode(output.prompt_token_ids[:10])))
'<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date'

After the fix (apply_chat_template called with tokenize=True):

>>> print(output.prompt_token_ids[:10])
[128000, 128006, 9125, 128007, 271, 38766, 1303, 33025, 2696, 25]
>>> print('{!r}'.format(tokenizer.decode(output.prompt_token_ids[:10])))
'<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date:'
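
For reference, the duplication can be reproduced with the Hugging Face tokenizer alone. This is a minimal sketch of the root cause; the printed token ids assume the same Llama-3.1 chat template as above:

>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
>>> messages = [{"role": "user", "content": "Hello"}]
>>> # tokenize=False renders a string that already starts with the BOS token
>>> prompt = tokenizer.apply_chat_template(messages, tokenize=False)
>>> prompt.startswith("<|begin_of_text|>")
True
>>> # re-encoding that string prepends BOS (id 128000) a second time
>>> print(tokenizer(prompt).input_ids[:3])
[128000, 128000, 128006]
>>> # tokenizing inside apply_chat_template yields a single BOS
>>> print(tokenizer.apply_chat_template(messages, tokenize=True)[:3])
[128000, 128006, 9125]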

github-actions (bot)

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, a small and essential subset of tests that quickly catches errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mergify bot added the frontend label Mar 28, 2025
DarkLight1337 (Member) commented Mar 28, 2025

This is a somewhat complicated issue; see #9519 and #11432 for discussion. Can you add tests to avoid future regressions?
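
For example, a regression test along these lines could cover it (a sketch only; the model choice and bare test function are assumptions, not the project's existing test scaffolding):

from vllm import LLM

def test_chat_prompt_has_single_bos():
    # Tokenize a chat prompt end-to-end and check the BOS token count.
    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
    bos = llm.get_tokenizer().bos_token_id
    outputs = llm.chat([{"role": "user", "content": "Hello"}])
    prompt_ids = outputs[0].prompt_token_ids
    # Exactly one BOS at the start; a duplicate would show up at index 1.
    assert prompt_ids[0] == bos
    assert prompt_ids[1] != bos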

njhill (Member) commented Apr 25, 2025

Thanks @nike00811, this will be addressed by #16081.

njhill closed this Apr 25, 2025