[Bugfix] Fix Sparse24 Compressed Tensors models by kylesayrs · Pull Request #33446 · vllm-project/vllm

kylesayrs · 2026-01-30T21:10:46Z

Purpose

Fix sparse24 compressed tensors models

Background

#30141 introduced a check against config_groups in the compressed tensors config. However, the config_groups field is not populated in the case of sparse24 models.
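A minimal sketch of the failure mode described above, assuming the filtering step reads the config as a plain dictionary. The function name `filter_attention_groups`, the `targets` lookup, and the sample config values are hypothetical and only illustrate the guard; this is not vLLM's actual implementation.

```python
from typing import Any


def filter_attention_groups(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` without groups that target "Attention".

    Hypothetical helper: a sparse-only config may have no `config_groups`
    key at all, so the key is looked up with .get() instead of indexing.
    """
    filtered = dict(config)
    groups = config.get("config_groups")
    if groups is None:
        # Nothing to filter (e.g. a sparse24-only model); return unchanged.
        return filtered
    filtered["config_groups"] = {
        name: group
        for name, group in groups.items()
        if "Attention" not in group.get("targets", [])
    }
    return filtered


# Illustrative configs only; the field values are assumptions.
sparse_only = {"sparsity_config": {"sparsity_structure": "2:4"}}
quantized = {
    "config_groups": {
        "group_0": {"targets": ["Linear"]},
        "group_1": {"targets": ["Attention"]},
    }
}
```

Indexing `config["config_groups"]` directly on `sparse_only` would raise a `KeyError`, which matches the crash noted in the code review; the guarded version passes such configs through untouched.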

This regression should have been caught by tests/quantization/test_compressed_tensors.py::test_compressed_tensors_2of4_sparse, but the CI skips this test because it runs with CC<90. A broader discussion should be had as to whether quantization tests should be run with higher CC, likely accompanied by a pruning of existing quantization tests.

Changes

Check for the existence of the config_groups field before filtering out Attention groups
Change the sparse24 CC logic to check for exactly Hopper (sparse24 kernels are not forward compatible)
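The second change can be sketched as follows, assuming compute capability is encoded as an integer (major * 10 + minor), in line with the "CC<90" notation used above. The names `HOPPER_CC`, `to_int_capability`, and `sparse24_supported` are hypothetical and chosen for illustration only.

```python
HOPPER_CC = 90  # compute capability 9.0 encoded as major * 10 + minor


def to_int_capability(major: int, minor: int) -> int:
    """Encode a (major, minor) compute capability as a single integer."""
    return major * 10 + minor


def sparse24_supported(capability: int) -> bool:
    """Accept only Hopper.

    An equality check is used instead of `>=` because the sparse24 kernels
    are not forward compatible with newer architectures.
    """
    return capability == HOPPER_CC
```

With a `>=` comparison, a device reporting compute capability 10.0 would be treated as supported; the equality check rejects it.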

Testing

test_compressed_tensors_2of4_sparse and test_compressed_tensors_2of4_sparse_compressed previously failed but now pass on a Hopper machine
Sanity-tested with nm-testing/TinyLlama-1.1B-Chat-v1.0-sparse2of4_only-e2e on a Hopper machine

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

gemini-code-assist

Code Review

This pull request addresses a regression that caused a KeyError when loading sparse24 compressed tensor models. The fix involves adding a check to ensure the config_groups key exists in the configuration dictionary before attempting to process it. This is a correct and necessary change that prevents the crash. The implementation is clear and directly solves the issue. I approve these changes.

kylesayrs · 2026-01-30T22:52:05Z

@dsikka @tlrmchlsmth

mgoin · 2026-01-30T23:29:14Z

Can we just remove 24 sparsity instead?

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: Eldar Kurtic <research@neuralmagic.com>

check config_groups

0f55bc8

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

mergify bot added the "bug" label (Something isn't working) Jan 30, 2026

gemini-code-assist bot reviewed Jan 30, 2026

kylesayrs added 3 commits January 30, 2026 16:18

populate config_groups so that sparse24 model compressor can work

b007afc

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

do not pipe if sparse24

c3b4979

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

update error

c7ad3ac

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

mergify bot added the nvidia label Jan 30, 2026

github-project-automation bot added this to NVIDIA Jan 30, 2026

change to be equal

474c4da

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs marked this pull request as ready for review January 30, 2026 22:42

kylesayrs requested review from 22quinn, mgoin, pavanimajety, robertgshaw2-redhat, tlrmchlsmth and yewentao256 as code owners January 30, 2026 22:42

mgoin approved these changes Feb 9, 2026

github-project-automation bot moved this to Ready in NVIDIA Feb 9, 2026

mgoin added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Feb 9, 2026

mgoin and others added 2 commits February 9, 2026 10:27

Merge branch 'main' into kylesayrs/check-sparse-only-models

017638d

Merge branch 'main' into kylesayrs/check-sparse-only-models

d4a7027

vllm-bot merged commit e9cd691 into vllm-project:main Feb 12, 2026
104 of 109 checks passed

github-project-automation bot moved this from Ready to Done in NVIDIA Feb 12, 2026

warichet pushed a commit to warichet/vllm that referenced this pull request Feb 12, 2026

[Bugfix] Fix Sparse24 Compressed Tensors models (vllm-project#33446)

12a24dd

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>

kylesayrs deleted the kylesayrs/check-sparse-only-models branch February 16, 2026 15:43

llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026

[Bugfix] Fix Sparse24 Compressed Tensors models (vllm-project#33446)

6b98631

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>

tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026

[Bugfix] Fix Sparse24 Compressed Tensors models (vllm-project#33446)

dbcd2db

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>

vllm-bot merged 7 commits into vllm-project:main from neuralmagic:kylesayrs/check-sparse-only-models