
Add option in OpenAI API to disable special tokens when tokenizing#2263

Closed
liuyanyi wants to merge 1 commit into vllm-project:main from liuyanyi:main

Conversation

@liuyanyi
Contributor

liuyanyi commented Dec 26, 2023

Following the instructions in the Hugging Face docs, I use a Llama-style chat template for my own model:

template = (
    "{% if messages[0]['role'] == 'system' %}"
    "{% set loop_messages = messages[1:] %}"  # Extract system message if it's present
    "{% set system_message = messages[0]['content'] %}"
    "{% elif USE_DEFAULT_PROMPT == true and not '<<SYS>>' in messages[0]['content'] %}"
    "{% set loop_messages = messages %}"  # Or use the default system message if the flag is set
    "{% set system_message = 'DEFAULT_SYSTEM_MESSAGE' %}"
    "{% else %}"
    "{% set loop_messages = messages %}"
    "{% set system_message = false %}"
    "{% endif %}"
    "{% for message in loop_messages %}"  # Loop over all non-system messages
    "{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}"
    "{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}"
    "{% endif %}"
    "{% if loop.index0 == 0 and system_message != false %}"  # Embed system message in first message
    "{% set content = '<<SYS>>\\n' + system_message + '\\n<</SYS>>\\n\\n' + message['content'] %}"
    "{% else %}"
    "{% set content = message['content'] %}"
    "{% endif %}"
    "{% if message['role'] == 'user' %}"  # After all of that, handle messages/roles in a fairly normal way
    "{{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}"
    "{% elif message['role'] == 'system' %}"
    "{{ '<<SYS>>\\n' + content.strip() + '\\n<</SYS>>\\n\\n' }}"
    "{% elif message['role'] == 'assistant' %}"
    "{{ ' '  + content.strip() + ' ' + eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)

The template already emits bos_token, so after tokenization in the OpenAI API server there are two BOS tokens at the start of the prompt.
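A minimal sketch of the duplication (using a toy stand-in tokenizer, not vLLM or Hugging Face code): the template renders bos_token into the prompt text, and the tokenizer then prepends BOS again because add_special_tokens defaults to True.

```python
BOS = "<s>"

def render_template(user_msg: str) -> str:
    # Mimics the Llama-style template above: it emits bos_token itself.
    return f"{BOS}[INST] {user_msg} [/INST]"

def tokenize(text: str, add_special_tokens: bool = True) -> list:
    # Toy stand-in for a real tokenizer: it optionally prepends BOS,
    # as Hugging Face tokenizers do when add_special_tokens=True.
    tokens = text.replace(BOS, f" {BOS} ").split()
    if add_special_tokens:
        tokens = [BOS] + tokens
    return tokens

prompt = render_template("Hello")
print(tokenize(prompt))                            # two BOS at the start
print(tokenize(prompt, add_special_tokens=False))  # single BOS, from the template
```

With the default behavior the prompt starts with two `<s>` tokens; disabling special tokens leaves only the one the template wrote.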

So I added an option to the OpenAI API server to control whether the tokenizer adds special tokens.
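The shape of the change can be sketched as threading a per-request flag down to the tokenizer call (the field name and request class here are illustrative assumptions, not the exact PR diff; Hugging Face tokenizers do accept add_special_tokens in encode):

```python
from dataclasses import dataclass

@dataclass
class CompletionRequest:
    # Hypothetical request model: the add_special_tokens field is the
    # proposed opt-out for chat templates that already emit bos_token.
    prompt: str
    add_special_tokens: bool = True

def encode_prompt(request: CompletionRequest, tokenizer) -> list:
    # tokenizer stands in for a Hugging Face tokenizer, whose encode()
    # supports the add_special_tokens keyword.
    return tokenizer.encode(
        request.prompt, add_special_tokens=request.add_special_tokens
    )
```

A client whose template already contains bos_token would set add_special_tokens=False on the request and get exactly one BOS.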

This may fix #2012

@DarkLight1337
Member

Closed as superseded by #4688



Development

Successfully merging this pull request may close these issues.

"/v1/chat/completions" tokenization issue
