[Bugfix] Pass hf_token through config loading paths for gated model support #37920
Merged
yewentao256 merged 2 commits into vllm-project:main on Mar 24, 2026
Conversation
Contributor
Code Review
The pull request introduces support for passing a HuggingFace token (hf_token) through the vLLM engine configuration. This token is now accepted as an argument in create_engine_config and maybe_override_with_speculators, and is subsequently used when calling PretrainedConfig.get_config_dict to enable authenticated access for fetching model configurations from HuggingFace.
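The forwarding pattern the review describes can be sketched as follows. The function bodies below are simplified stand-ins (not the real vLLM or transformers code); they only show how an hf_token accepted at the config entry point travels down the call chain and reaches the HF config fetch as a token= keyword argument.

```python
# Simplified stand-in for PretrainedConfig.get_config_dict: in transformers,
# a `token=` kwarg here authenticates the request for gated repositories.
def get_config_dict(model, **kwargs):
    return {"model": model, "token": kwargs.get("token")}

# After the fix, the token is forwarded instead of being dropped.
def maybe_override_with_speculators(model, hf_token=None):
    return get_config_dict(model, token=hf_token)

# Top-level entry point: accepts hf_token and threads it downward.
def create_engine_config(model, hf_token=None):
    return maybe_override_with_speculators(model, hf_token=hf_token)

cfg = create_engine_config("org/gated-model", hf_token="hf_example_token")
print(cfg["token"])  # the token survived the whole call chain
```

With hf_token left as None, the downstream call receives token=None, which is the pre-existing behavior for non-gated models.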
Force-pushed from abe3183 to 2206515
…upport

Fixes vllm-project#31894

The hf_token argument provided to the LLM instance was not forwarded to HuggingFace API calls during config loading, causing gated models to fail authentication.

- Add hf_token parameter to maybe_override_with_speculators() and pass it as token= to PretrainedConfig.get_config_dict()
- Pass token=self.hf_token from ModelConfig to get_config() so it flows through to all HF config loading calls via **kwargs
- Add hf_token parameter to try_get_generation_config() and pass it to GenerationConfig.from_pretrained() and the get_config() fallback

Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Force-pushed from 2206515 to 139af70
yewentao256 (Member) reviewed on Mar 24, 2026:
Please merge from main to fix the pre-commit issue
yewentao256 (Member) approved these changes on Mar 24, 2026:
LGTM, thanks for the work!
RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request on Mar 26, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
malaiwah pushed a commit to malaiwah/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Michel Belleau <michel.belleau@malaiwah.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
nithinvc pushed a commit to nithinvc/vllm that referenced this pull request on Mar 27, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request on Mar 28, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request on Mar 30, 2026
…upport (vllm-project#37920) Signed-off-by: javierdejesusda <javier.dejesusj9@gmail.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Purpose
Fixes #31894
The hf_token argument provided to the LLM instance was not forwarded to HuggingFace API calls during config loading, causing gated models to fail authentication when hf_token is passed explicitly (rather than via environment variable).

Three gaps existed:

- maybe_override_with_speculators() called PretrainedConfig.get_config_dict() without the token, so speculators auto-detection failed for gated models.
- ModelConfig.__post_init__() called get_config() without forwarding self.hf_token, so the main config loading path (PretrainedConfig.get_config_dict, AutoConfig.from_pretrained, config_class.from_pretrained) also lacked authentication.
- try_get_generation_config() called GenerationConfig.from_pretrained() without the token, so generation config loading failed for gated models.

This fix threads hf_token through all three paths, using token= at the HuggingFace API boundary (matching the convention at config.py L1031).

Note: PR #31974 attempted a similar fix but was closed with 0 commits.
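The three fixed paths can be illustrated with a miniature model of the call graph. The names below mirror the functions named in this PR, but the bodies are simplified stand-ins (they only record which loader received which token), not the real vLLM or transformers implementations.

```python
calls = []  # records (loader_name, token) for each HF-boundary call

# Stand-ins for the HuggingFace API boundary.
def get_config_dict(model, **kwargs):
    calls.append(("PretrainedConfig.get_config_dict", kwargs.get("token")))

def generation_config_from_pretrained(model, **kwargs):
    calls.append(("GenerationConfig.from_pretrained", kwargs.get("token")))

# Path 2: get_config() receives token= via **kwargs and forwards it.
def get_config(model, **kwargs):
    get_config_dict(model, **kwargs)

# Path 1: explicit hf_token parameter, forwarded as token=.
def maybe_override_with_speculators(model, hf_token=None):
    get_config_dict(model, token=hf_token)

# Path 3: forwarded to generation-config loading.
def try_get_generation_config(model, hf_token=None):
    generation_config_from_pretrained(model, token=hf_token)

class ModelConfig:
    def __init__(self, model, hf_token=None):
        self.hf_token = hf_token
        maybe_override_with_speculators(model, hf_token=self.hf_token)
        get_config(model, token=self.hf_token)
        try_get_generation_config(model, hf_token=self.hf_token)

ModelConfig("org/gated-model", hf_token="hf_abc")
print(all(tok == "hf_abc" for _, tok in calls))  # every path got the token
```

Before the fix, each of the three inner calls would have been made without token=, so only environment-based credentials could authenticate them.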
Test Plan
Test Result
Non-gated models work as before. With hf_token=None (the default), behavior is unchanged: HuggingFace Hub falls back to environment-based token lookup internally.
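The fallback behavior described above can be sketched as follows, assuming huggingface_hub's convention that an explicit token wins and None defers to environment-based lookup (the hub also consults the cached login, which is omitted here). resolve_token is a hypothetical helper for illustration, not a real vLLM or huggingface_hub function.

```python
import os

def resolve_token(token=None):
    # Explicit token takes precedence; None falls back to the environment,
    # loosely mirroring how huggingface_hub resolves credentials.
    if token is not None:
        return token
    return os.environ.get("HF_TOKEN")

os.environ["HF_TOKEN"] = "hf_from_env"
print(resolve_token())               # environment token is used
print(resolve_token("hf_explicit"))  # explicit token takes precedence
```

This is why passing token=None at the HF boundary is safe: it preserves the pre-existing environment-variable workflow for users who never set hf_token explicitly.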