[Bugfix] Pass hf_token through config loading paths for gated model support #37920
Merged
yewentao256 merged 2 commits into vllm-project:main on Mar 24, 2026
Conversation
Contributor
Code Review
The pull request introduces support for passing a HuggingFace token (hf_token) through the vLLM engine configuration. This token is now accepted as an argument in create_engine_config and maybe_override_with_speculators, and is subsequently used when calling PretrainedConfig.get_config_dict to enable authenticated access for fetching model configurations from HuggingFace.
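The forwarding pattern the review describes can be sketched as follows. The function bodies below are simplified stand-ins (not the real vLLM or transformers code); they only show how an hf_token accepted at the config entry point travels down the call chain and reaches the HF config fetch as a token= keyword argument.

```python
# Simplified stand-in for PretrainedConfig.get_config_dict: in transformers,
# a `token=` kwarg here authenticates the request for gated repositories.
def get_config_dict(model, **kwargs):
    return {"model": model, "token": kwargs.get("token")}

# After the fix, the token is forwarded instead of being dropped.
def maybe_override_with_speculators(model, hf_token=None):
    return get_config_dict(model, token=hf_token)

# Top-level entry point: accepts hf_token and threads it downward.
def create_engine_config(model, hf_token=None):
    return maybe_override_with_speculators(model, hf_token=hf_token)

cfg = create_engine_config("org/gated-model", hf_token="hf_example_token")
print(cfg["token"])  # the token survived the whole call chain
```

With hf_token left as None, the downstream call receives token=None, which is the pre-existing behavior for non-gated models.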
Force-pushed from abe3183 to 2206515
…upport

Fixes vllm-project#31894

The hf_token argument provided to the LLM instance was not forwarded to HuggingFace API calls during config loading, causing gated models to fail authentication.

- Add hf_token parameter to maybe_override_with_speculators() and pass it as token= to PretrainedConfig.get_config_dict()
- Pass token=self.hf_token from ModelConfig to get_config() so it flows through to all HF config loading calls via **kwargs
- Add hf_token parameter to try_get_generation_config() and pass it to GenerationConfig.from_pretrained() and the get_config() fallback

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Force-pushed from 2206515 to 139af70
yewentao256 (Member) reviewed on Mar 24, 2026:
Please merge from main to fix the pre-commit issue
yewentao256 (Member) approved these changes on Mar 24, 2026:
LGTM, thanks for the work!
RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request on Mar 26, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
malaiwah pushed a commit to malaiwah/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Michel Belleau <michel.belleau@malaiwah.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
nithinvc pushed a commit to nithinvc/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request on Mar 28, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request on Mar 30, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Purpose
Fixes #31894
The hf_token argument provided to the LLM instance was not forwarded to HuggingFace API calls during config loading, causing gated models to fail authentication when hf_token is passed explicitly (rather than via environment variable).

Three gaps existed:

- maybe_override_with_speculators() called PretrainedConfig.get_config_dict() without the token, so speculators auto-detection failed for gated models.
- ModelConfig.__post_init__() called get_config() without forwarding self.hf_token, so the main config loading path (PretrainedConfig.get_config_dict, AutoConfig.from_pretrained, config_class.from_pretrained) also lacked authentication.
- try_get_generation_config() called GenerationConfig.from_pretrained() without the token, so generation config loading failed for gated models.

This fix threads hf_token through all three paths, using token= at the HuggingFace API boundary (matching the convention at config.py L1031).

Note: PR #31974 attempted a similar fix but was closed with 0 commits.
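The three fixed paths can be illustrated with a miniature model of the call graph. The names below mirror the functions named in this PR, but the bodies are simplified stand-ins (they only record which loader received which token), not the real vLLM or transformers implementations.

```python
calls = []  # records (loader_name, token) for each HF-boundary call

# Stand-ins for the HuggingFace API boundary.
def get_config_dict(model, **kwargs):
    calls.append(("PretrainedConfig.get_config_dict", kwargs.get("token")))

def generation_config_from_pretrained(model, **kwargs):
    calls.append(("GenerationConfig.from_pretrained", kwargs.get("token")))

# Path 2: get_config() receives token= via **kwargs and forwards it.
def get_config(model, **kwargs):
    get_config_dict(model, **kwargs)

# Path 1: explicit hf_token parameter, forwarded as token=.
def maybe_override_with_speculators(model, hf_token=None):
    get_config_dict(model, token=hf_token)

# Path 3: forwarded to generation-config loading.
def try_get_generation_config(model, hf_token=None):
    generation_config_from_pretrained(model, token=hf_token)

class ModelConfig:
    def __init__(self, model, hf_token=None):
        self.hf_token = hf_token
        maybe_override_with_speculators(model, hf_token=self.hf_token)
        get_config(model, token=self.hf_token)
        try_get_generation_config(model, hf_token=self.hf_token)

ModelConfig("org/gated-model", hf_token="hf_abc")
print(all(tok == "hf_abc" for _, tok in calls))  # every path got the token
```

Before the fix, each of the three inner calls would have been made without token=, so only environment-based credentials could authenticate them.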
Test Plan
Test Result
Non-gated models work as before. With hf_token=None (the default), behavior is unchanged: HuggingFace Hub falls back to environment-based token lookup internally.
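The fallback behavior described above can be sketched as follows, assuming huggingface_hub's convention that an explicit token wins and None defers to environment-based lookup (the hub also consults the cached login, which is omitted here). resolve_token is a hypothetical helper for illustration, not a real vLLM or huggingface_hub function.

```python
import os

def resolve_token(token=None):
    # Explicit token takes precedence; None falls back to the environment,
    # loosely mirroring how huggingface_hub resolves credentials.
    if token is not None:
        return token
    return os.environ.get("HF_TOKEN")

os.environ["HF_TOKEN"] = "hf_from_env"
print(resolve_token())               # environment token is used
print(resolve_token("hf_explicit"))  # explicit token takes precedence
```

This is why passing token=None at the HF boundary is safe: it preserves the pre-existing environment-variable workflow for users who never set hf_token explicitly.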