
Remove all_special_tokens_extended from tokenizer code#29686

Merged
DarkLight1337 merged 1 commit into vllm-project:main from hmellor:remove-tokenizer-field
Nov 28, 2025

Conversation

@hmellor (Member) commented Nov 28, 2025

This attribute is:

Should resolve the failures in https://buildkite.com/vllm/ci/builds/41020/steps/canvas?sid=019ac943-dca1-43d6-ad64-ca0187f645bc

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@hmellor force-pushed the remove-tokenizer-field branch from 4449139 to 56f2482 on November 28, 2025, 15:47
@hmellor added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) on Nov 28, 2025
@chatgpt-codex-connector

💡 Codex Review

https://github.com/vllm-project/vllm/blob/44491395c266dc41b892fd8a98f25fa50b83d7ee/vllm/transformers_utils/config.py#L460-L461
P1: Handle legacy rope_theta on transformers>=5

In patch_rope_parameters the transformers>=5 branch now no-ops, so legacy configs that still expose config.rope_theta (common for custom transformers v4 models) are no longer converted into rope_parameters or validated. When running with transformers v5, such configs keep rope_parameters unset, so later consumers (e.g., prepare_hf_config and max length scaling) drop the provided RoPE scaling/θ information and may initialize RoPE with incorrect defaults. Please keep the standardization/validation here for v5 the same way the v4 branch still does.
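The conversion the review asks to keep can be sketched as follows. This is a hypothetical, minimal sketch only: the class `LegacyConfig` and the function `standardize_rope` are made-up stand-ins, not the actual vLLM `patch_rope_parameters` implementation; only the field names (`rope_theta`, `rope_scaling`, `rope_parameters`) follow the review's wording.

```python
# Hypothetical sketch -- NOT the real vLLM implementation.
# Shows folding a legacy rope_theta / rope_scaling pair into the
# newer rope_parameters dict, so v5 consumers don't lose RoPE info.

class LegacyConfig:
    """Stand-in for a custom transformers v4-style model config."""
    def __init__(self, rope_theta=None, rope_scaling=None):
        self.rope_theta = rope_theta
        self.rope_scaling = rope_scaling
        self.rope_parameters = None  # new-style field, unset by default

def standardize_rope(config):
    """Fold legacy rope_theta / rope_scaling into rope_parameters."""
    if getattr(config, "rope_parameters", None) is not None:
        return  # already new-style; nothing to do
    theta = getattr(config, "rope_theta", None)
    if theta is None:
        return  # no legacy RoPE info to convert
    # Start from any legacy scaling dict, then fold in theta.
    params = dict(getattr(config, "rope_scaling", None) or {})
    params.setdefault("rope_type", "default")
    params["rope_theta"] = theta
    config.rope_parameters = params

cfg = LegacyConfig(rope_theta=10000.0,
                   rope_scaling={"rope_type": "linear", "factor": 2.0})
standardize_rope(cfg)
print(cfg.rope_parameters)
```

Under these assumptions, the branch taken for transformers>=5 would run the same standardization as v4 instead of no-op'ing, which is what the review requests.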


@DarkLight1337 (Member) left a comment


I was planning on removing it as well since it's unused in vLLM, thanks for doing it!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) November 28, 2025 15:59
@hmellor (Member, Author) commented Nov 28, 2025

I've enabled the Transformers nightly tests to verify that the errors relating to this attribute are gone: https://buildkite.com/vllm/ci/builds/41093/steps/canvas?sid=019acb27-ad2f-4514-ab1e-68141500585c

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly removes the all_special_tokens_extended attribute, which simplifies the tokenizer-related code and improves forward compatibility with Transformers v5. The changes are applied consistently across tests and implementations. However, the pull request also includes undocumented changes to RoPE parameter handling in vllm/transformers_utils/config.py and vllm/transformers_utils/configs/qwen3_next.py. Specifically, a block of code for Transformers v5 compatibility has been commented out. While this might be intentional, leaving commented-out code reduces maintainability. I've added a suggestion to remove it. It would be beneficial to update the pull request description to cover all changes for better clarity.

@DarkLight1337 DarkLight1337 merged commit fecae12 into vllm-project:main Nov 28, 2025
52 checks passed
@hmellor hmellor deleted the remove-tokenizer-field branch November 28, 2025 20:31
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
…t#29686)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
amd-hhashemi pushed a commit to amd-hhashemi/vllm that referenced this pull request Dec 2, 2025
…t#29686)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…t#29686)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

Labels: ready · 3 participants