[Bugfix] Pass hf_token through config loading paths for gated model support#37920

Merged
yewentao256 merged 2 commits into vllm-project:main from javierdejesusda:fix/hf-token-speculators
Mar 24, 2026

Conversation

Contributor

@javierdejesusda commented Mar 23, 2026

Purpose

Fixes #31894

The hf_token argument provided to the LLM instance was not forwarded to HuggingFace API calls during config loading, causing gated models to fail authentication when hf_token is passed explicitly (rather than via environment variable).

Three gaps existed:

  1. maybe_override_with_speculators() called PretrainedConfig.get_config_dict() without the token, so speculators auto-detection failed for gated models.
  2. ModelConfig.__post_init__() called get_config() without forwarding self.hf_token, so the main config loading path (PretrainedConfig.get_config_dict, AutoConfig.from_pretrained, config_class.from_pretrained) also lacked authentication.
  3. try_get_generation_config() called GenerationConfig.from_pretrained() without the token, so generation config loading failed for gated models.

This fix threads hf_token through all three paths, using token= at the HuggingFace API boundary (matching the convention at config.py L1031).

Note: PR #31974 attempted a similar fix but was closed with 0 commits.

Test Plan

# Non-gated model (no regression)
python -c "from vllm.transformers_utils.config import maybe_override_with_speculators; \
  print(maybe_override_with_speculators('gpt2', None, False))"

# Gated model with explicit token (speculators path)
python -c "from vllm.transformers_utils.config import maybe_override_with_speculators; \
  print(maybe_override_with_speculators('meta-llama/Llama-2-7b-hf', None, False, hf_token='hf_...'))"

Test Result

Non-gated models work as before. With hf_token=None (default), behavior is unchanged — HuggingFace Hub falls back to environment-based token lookup internally.



@mergify mergify bot added the bug Something isn't working label Mar 23, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces support for passing a HuggingFace token (hf_token) through the vLLM engine configuration. This token is now accepted as an argument in create_engine_config and maybe_override_with_speculators, and is subsequently used when calling PretrainedConfig.get_config_dict to enable authenticated access for fetching model configurations from HuggingFace.

@javierdejesusda javierdejesusda force-pushed the fix/hf-token-speculators branch 2 times, most recently from abe3183 to 2206515 on March 23, 2026 19:57
@javierdejesusda javierdejesusda changed the title [Bugfix] Pass hf_token to maybe_override_with_speculators for gated model support [Bugfix] Pass hf_token through config loading paths for gated model support Mar 23, 2026
…upport

Fixes vllm-project#31894

The hf_token argument provided to the LLM instance was not forwarded
to HuggingFace API calls during config loading, causing gated models
to fail authentication.

- Add hf_token parameter to maybe_override_with_speculators() and pass
  it as token= to PretrainedConfig.get_config_dict()
- Pass token=self.hf_token from ModelConfig to get_config() so it flows
  through to all HF config loading calls via **kwargs
- Add hf_token parameter to try_get_generation_config() and pass it to
  GenerationConfig.from_pretrained() and the get_config() fallback

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
@javierdejesusda javierdejesusda force-pushed the fix/hf-token-speculators branch from 2206515 to 139af70 on March 23, 2026 20:41
Member

@yewentao256 yewentao256 left a comment

Please merge from main to fix the pre-commit issue

@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 24, 2026
Member

@yewentao256 yewentao256 left a comment

LGTM, thanks for the work!

@yewentao256 yewentao256 merged commit 54b0578 into vllm-project:main Mar 24, 2026
58 of 59 checks passed
RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request Mar 26, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request Mar 27, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
malaiwah pushed a commit to malaiwah/vllm that referenced this pull request Mar 27, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Signed-off-by: Michel Belleau <michel.belleau@malaiwah.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request Mar 27, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
nithinvc pushed a commit to nithinvc/vllm that referenced this pull request Mar 27, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>

Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026
…upport (vllm-project#37920)

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Signed-off-by: Vinay Damodaran <vrdn@hey.com>

Labels

bug Something isn't working ready ONLY add when PR is ready to merge/full CI is needed


Development

Successfully merging this pull request may close these issues.

[Bug] hf_token argument to LLM in Python SDK ignored in vllm.transformer_utils.config

2 participants