fix: use custom_tokenizer to workaround the trtllm + glm5 tokenizer loading issue#20

Merged
ishandhanani merged 1 commit intoNVIDIA:mainfrom
richardhuo-nv:rihuo/fix_glm5_tokenizer_2
Apr 9, 2026

Conversation

@richardhuo-nv
Collaborator

TRT-LLM is still on Transformers v4, while the GLM-5 model was built with Transformers v5. As a result, the GLM-5 tokenizer cannot be loaded directly with `AutoTokenizer` under Transformers v4.

Our current workaround is adapted from TensorRT-LLM’s glm_moe_dsa tokenizer implementation:
https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/tokenizer/glm_moe_dsa/tokenizer.py

This workaround uses the Rust `tokenizers` library to load `tokenizer.json` directly, then initializes a Transformers v4 fast tokenizer with the relevant settings translated from `tokenizer_config.json`.

At the moment, this workaround does not support `chat_template`, so chat templating has to be disabled for now:

```yaml
benchmark:
  custom_tokenizer: "glm_moe_dsa"
  use_chat_template: false
```
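The approach described above can be sketched as follows. This is a minimal illustration, not the actual TRT-LLM implementation: the special-token names and the `model_dir` layout are assumptions, and a real port would translate all of the settings found in `tokenizer_config.json`, not just the two shown here.

```python
# Sketch: bypass AutoTokenizer (which requires the Transformers v5-only
# tokenizer class) by loading tokenizer.json with the Rust `tokenizers`
# library and wrapping it in a v4-compatible fast tokenizer.
from tokenizers import Tokenizer
from transformers import PreTrainedTokenizerFast


def load_glm5_tokenizer(model_dir: str) -> PreTrainedTokenizerFast:
    # Load the raw Rust tokenizer straight from tokenizer.json; this path
    # does not touch the version-specific Python tokenizer class at all.
    rust_tok = Tokenizer.from_file(f"{model_dir}/tokenizer.json")

    # Re-create a Transformers v4 wrapper around it, hand-translating the
    # settings that would normally come from tokenizer_config.json.
    # The token strings below are illustrative placeholders.
    return PreTrainedTokenizerFast(
        tokenizer_object=rust_tok,
        eos_token="<|endoftext|>",
        pad_token="<|endoftext|>",
    )
```

Note that a tokenizer built this way has no `chat_template` attached, which is why the benchmark config above disables chat templating.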

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@e93856b). Learn more about missing BASE report.

Additional details and impacted files
```
@@           Coverage Diff           @@
##             main      #20   +/-   ##
=======================================
  Coverage        ?   60.13%
=======================================
  Files           ?       48
  Lines           ?     4079
  Branches        ?        0
=======================================
  Hits            ?     2453
  Misses          ?     1626
  Partials        ?        0
```

☔ View full report in Codecov by Sentry.

@ishandhanani ishandhanani merged commit 129c6fc into NVIDIA:main Apr 9, 2026
5 checks passed
richardhuo-nv added a commit that referenced this pull request Apr 20, 2026
…#47)

* fix tokenizer for glm5 (#20)

fix

* add nvidia pre-release url (#22)


3 participants