[CI] Enable all hf transformers baselines in test_hybrid #23936
Merged
heheda12345 merged 6 commits into vllm-project:main on Sep 2, 2025
Conversation
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Contributor
Code Review
This pull request enables Hugging Face Transformers baselines for all hybrid models in the test suite. This is made possible by a recent fix in transformers v4.55.3 that resolves issues with Mamba-related models. The changes involve removing the HF_UNSUPPORTED_MODELS list and updating the conditions in tests to always run the baseline comparison. Additionally, the minimum required transformers version for BambaForCausalLM and JambaForCausalLM has been correctly updated to 4.55.3. The changes are straightforward, correct, and improve test coverage.
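For illustration, a minimal sketch of the pattern this removes (the helper names and the denylist contents are assumptions for the sketch, not the actual diff):

```python
# Illustrative sketch only; helper names and list contents are assumptions.
from typing import Optional

# Before: Mamba-related models were excluded from the HF baseline.
HF_UNSUPPORTED_MODELS = ["ibm-ai-platform/Bamba-9B", "ai21labs/Jamba-tiny-dev"]

def run_hf_model(model: str, prompts: list[str]) -> list[str]:
    # Stand-in for generating with HF transformers; details omitted.
    return [f"{model}: {p}" for p in prompts]

def maybe_run_hf_baseline(model: str, prompts: list[str]) -> Optional[list[str]]:
    if model in HF_UNSUPPORTED_MODELS:
        return None  # baseline skipped under transformers < 4.55.3
    return run_hf_model(model, prompts)

# After: the denylist is deleted and the baseline always runs, relying on
# the Mamba fix shipped in transformers >= 4.55.3.
def run_hf_baseline(model: str, prompts: list[str]) -> list[str]:
    return run_hf_model(model, prompts)
```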
heheda12345 (Collaborator) reviewed on Aug 29, 2025 and left a comment
Can you also remove the `if hf_outputs is not None` checks?
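A hedged sketch of what that cleanup looks like (function names are assumptions, not the actual test code): once the baseline always runs, the guard around the comparison can go.

```python
# Sketch only; names are assumptions, not the actual test code.

# Before: the comparison was guarded because hf_outputs could be None
# for models on the old denylist.
def check_outputs_guarded(hf_outputs, vllm_outputs) -> None:
    if hf_outputs is not None:
        assert hf_outputs == vllm_outputs

# After: hf_outputs is always produced, so the check is unconditional.
def check_outputs(hf_outputs, vllm_outputs) -> None:
    assert hf_outputs == vllm_outputs
```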
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Author (Member)
@heheda12345 done
845473182 pushed a commit to 845473182/vllm that referenced this pull request on Sep 3, 2025
* 'main' of https://github.com/845473182/vllm: (457 commits)
  * [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  * [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  * [Doc]: fix typos in Python comments (vllm-project#24115)
  * [Doc]: fix typos in Python comments (vllm-project#24093)
  * [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  * fix some typos (vllm-project#24071)
  * [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  * Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  * Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  * [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  * [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  * [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  * [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  * [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  * [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  * Fix weights loading for Apertus (vllm-project#24100)
  * [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  * [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  * Run ruff format on a few files. (vllm-project#24075)
  * [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  * ...
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request on Sep 9, 2025
[CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request on Sep 25, 2025
[CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Purpose
HF transformers recently released v4.55.3, which contains a fix for the Mamba-related issues that prevented us from comparing against transformers as a baseline in the hybrid model tests. I also checked that the two models we listed in `HF_UNSUPPORTED_MODELS` now seem to work fine. This is a useful step towards removing V0 code: once V0 is gone, we will no longer be able to use V0 output as a baseline for the V1 output, so we need to be able to rely on transformers for that.
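A minimal sketch of what the version bump for `BambaForCausalLM` and `JambaForCausalLM` might look like in the test model registry (the dataclass and field names here are assumptions modeled on vLLM's test registry, and the model IDs are assumptions too, not the exact diff):

```python
# Sketch only; the dataclass, field names, and model IDs are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class HfExampleInfo:
    default: str                               # HF model ID used in tests
    min_transformers_version: Optional[str] = None

TEST_MODELS = {
    # Floor raised to 4.55.3, which carries the Mamba fix.
    "BambaForCausalLM": HfExampleInfo(
        "ibm-ai-platform/Bamba-9B", min_transformers_version="4.55.3"),
    "JambaForCausalLM": HfExampleInfo(
        "ai21labs/Jamba-tiny-dev", min_transformers_version="4.55.3"),
}
```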
cc @heheda12345
Test Plan
I will trigger the Hybrid test in CI.
Test Result
Passing.
Essential Elements of an Effective PR Description Checklist
(Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model.