[Tokenizer] Add tokenizer mode by WoosukKwon · Pull Request #298 · vllm-project/vllm

WoosukKwon · 2023-06-28T18:48:02Z

Closes #281

This PR adds a tokenizer_mode argument, which can be either auto or slow. When it is slow, vLLM uses the slow tokenizer even if a fast tokenizer is available. This is required for some popular models, e.g., OpenLLaMA.
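A minimal sketch of how such a mode could be mapped onto Hugging Face `transformers` follows. It is an illustration under stated assumptions, not the code merged in this PR: the helper names `tokenizer_kwargs` and `load_tokenizer` are hypothetical, and the only external API relied on is the `use_fast` keyword of `AutoTokenizer.from_pretrained`.

```python
from typing import Any, Dict

# The two modes named in the PR description.
VALID_TOKENIZER_MODES = ("auto", "slow")


def tokenizer_kwargs(tokenizer_mode: str = "auto", **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: translate a tokenizer mode into loader kwargs."""
    if tokenizer_mode not in VALID_TOKENIZER_MODES:
        raise ValueError(
            f"Unknown tokenizer mode: {tokenizer_mode!r}. "
            f"Expected one of {VALID_TOKENIZER_MODES}."
        )
    if tokenizer_mode == "slow":
        # An explicit use_fast=True contradicts the slow mode.
        if kwargs.get("use_fast", False):
            raise ValueError("Cannot use the fast tokenizer in slow tokenizer mode.")
        kwargs["use_fast"] = False
    # In "auto" mode nothing is forced, so the loader's own default applies.
    return kwargs


def load_tokenizer(name: str, tokenizer_mode: str = "auto", **kwargs: Any):
    """Hypothetical loader; importing lazily keeps the helper above dependency-free."""
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(name, **tokenizer_kwargs(tokenizer_mode, **kwargs))
```

With this sketch, `tokenizer_kwargs("slow")` forces `use_fast=False`, while `tokenizer_kwargs("auto")` leaves the choice to `transformers`, which prefers a fast tokenizer when one is available.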

zhuohan123

LGTM! Thanks

Upstream sync 2024 06 11 (neuralmagic#288)

SUMMARY:
* Merge commits from vllm-project@1197e02 to vllm-project@114332b
* Our GCP test instances do not have gcc or clang installed. All of the triton kernels rely on the gcc and clang to generate JITs. These are still disabled (cc @andy-neuma). All are marked with:

```python
@pytest.mark.skip("C compiler not installed in NM automation. "
                  "This codepath follows a triton pathway, which "
                  "JITs using clang or gcc. Since neither are installed "
                  "in our test instances, we need to skip this for now.")
```

Note that vllm-project@1197e02 is NOT included in this merge.

COMPARE vs UPSTREAM: https://github.com/neuralmagic/nm-vllm/compare/upstream-sync-2024-06-11..vllm-project:vllm:v0.5.0

---------

Signed-off-by: Ye Cao <caoye.cao@alibaba-inc.com>
Signed-off-by: kevin <kevin@anyscale.com>
Co-authored-by: Daniele <d.trifiro@me.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Ye Cao <952129620@qq.com>
Co-authored-by: Nadav Shmayovits <45605409+NadavShmayo@users.noreply.github.com>
Co-authored-by: chenqianfzh <51831990+chenqianfzh@users.noreply.github.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Daniil Arapov <59310708+Delviet@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Avinash Raj <avistylein3105@gmail.com>
Co-authored-by: Divakar Verma <137818590+divakar-amd@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Yuan <yuan.zhou@intel.com>
Co-authored-by: Kaiyang Chen <48289729+Kaiyang-Chen@users.noreply.github.com>
Co-authored-by: Kevin H. Luu <kevin@anyscale.com>
Co-authored-by: Breno Faria <breno@veltefaria.de>
Co-authored-by: Toshiki Kataoka <tos.lunar@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: zifeitong <zifei.tong@parasail.io>
Co-authored-by: Jie Fu (傅杰) <fujie_email@sina.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: DriverSong <31926998+DriverSong@users.noreply.github.com>
Co-authored-by: qiujiawei9 <qiujiawei9@jd.com>
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Alex Wu <alexanderwu@berkeley.edu>
Co-authored-by: Breno Faria <breno.faria@intrafind.com>
Co-authored-by: liuyhwangyh <liuyhwangyh@163.com>
Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>
Co-authored-by: Matthew Goldey <matthew.goldey@gmail.com>
Co-authored-by: Jie Fu (傅杰) <jiefu@tencent.com>
Co-authored-by: Itay Etelis <92247226+Etelis@users.noreply.github.com>
Co-authored-by: limingshu <61349199+JamesLim-sy@users.noreply.github.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Calvinn Ng <39899397+Calvinnncy97@users.noreply.github.com>
Co-authored-by: team <calvinn.ng@ahrefs.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Benjamin Kitor <bkitor@gmail.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by: Bla_ckB <50193121+BlackBird-Coding@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>

### What this PR does / why we need it?
Triton doesn't work with Ascend, so we should make sure it is uninstalled in the Dockerfile.

Backport: vllm-project/vllm-ascend#298
Closes: vllm-project/vllm-ascend#291

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
CI passed

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>

Triton doesn't work with Ascend, so we should make sure it is uninstalled in the Dockerfile.

Related: vllm-project/vllm-ascend#291

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>

Add tokenizer_mode

d35faf2

WoosukKwon requested a review from zhuohan123 June 28, 2023 18:48

zhuohan123 approved these changes Jun 28, 2023

WoosukKwon merged commit 998d9d1 into main Jun 28, 2023

WoosukKwon deleted the tokenizer-mode branch June 28, 2023 21:19

michaelfeil pushed a commit to michaelfeil/vllm that referenced this pull request Jul 1, 2023

[Tokenizer] Add tokenizer mode (vllm-project#298)

5673e14

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

[Tokenizer] Add tokenizer mode (vllm-project#298)

6338cc0

cursor Bot pushed a commit to Shirley125/vllm_epd that referenced this pull request Jan 22, 2026

[Feature] Support output modalities control per request (vllm-project…

8722c63

…#298) Signed-off-by: Gaohan123 <hgaoaf@connect.ust.hk>

mickg10 pushed a commit to mickg10/vllm that referenced this pull request Feb 11, 2026

Modify TT-vLLM install script to use uv pip after tt-metal venv chang…

a186bf4

…es (vllm-project#298) Signed-off-by: Salar <skhorasgani@tenstorrent.com>
