
[Compressed Tensors] Allow configs with non-explicit ignores #41965

Merged
vllm-bot merged 3 commits into vllm-project:main from neuralmagic:kylesayrs/ct-non-explicit-ignore
May 7, 2026

Conversation

Contributor

@kylesayrs kylesayrs commented May 7, 2026

Purpose

  • Allow more flexible configs by no longer requiring every unquantized layer to be listed in the ignore list
    • Any layer that does not match a scheme target is assumed to be ignored

Previously:

"quantization_config": {
  "config_groups": {
    "group_0": {
      "targets": [
        "re:.*attn.*_proj$"
      ]
    }
  },
  "ignore": [
    "layers.2.attn.indexer.weights_proj",
    "layers.4.attn.indexer.weights_proj",
    "layers.6.attn.indexer.weights_proj",
    "layers.8.attn.indexer.weights_proj",
    "layers.10.attn.indexer.weights_proj"
  ]
},

Now:

"quantization_config": {
  "config_groups": {
    "group_0": {
      "targets": [
        "re:.*attn.*_proj$"
      ]
    }
  },
  "ignore": []
},

Changes

  • Allow find_matched_target to return None, and treat a layer that matches no scheme target as an ignored layer
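
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual vLLM or compressed-tensors code: match_target and is_ignored are hypothetical stand-ins, and the sketch assumes targets are compared only against layer names, with a "re:" prefix marking a regular expression. The layer names in the usage lines are also made up.

```python
import re
from typing import Iterable, Optional


def match_target(layer_name: str, targets: Iterable[str]) -> Optional[str]:
    """Return the first target matching layer_name, or None if none match.

    Hypothetical stand-in for find_matched_target: targets prefixed with
    "re:" are treated as regexes, everything else as an exact layer name.
    """
    for target in targets:
        if target.startswith("re:"):
            if re.match(target[3:], layer_name):
                return target
        elif target == layer_name:
            return target
    return None  # no match is a normal outcome, not an error


def is_ignored(layer_name: str, targets: Iterable[str], ignore: Iterable[str]) -> bool:
    """A layer is ignored if it is listed in ignore or matches no target."""
    if match_target(layer_name, ignore) is not None:
        return True
    return match_target(layer_name, targets) is None


targets = ["re:.*attn.*_proj$"]
print(is_ignored("layers.0.attn.q_proj", targets, ignore=[]))  # False
print(is_ignored("layers.0.mlp.gate", targets, ignore=[]))     # True
```

Returning None rather than raising lets the caller decide what a non-match means; here it is simply read as "leave this layer unquantized".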

Testing

  • Added test_find_matched_target_returns_none_on_no_match
  • Added test_get_scheme_dict_returns_none_on_no_match

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors find_matched_target to return None instead of raising a ValueError when a target is not found, updating the calling logic in compressed_tensors.py to handle these cases explicitly. Feedback was provided regarding a potential AttributeError in get_scheme_dict if the retrieved scheme dictionary is None, along with a suggestion to avoid in-place modifications of the configuration dictionary.
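
One way to address both review points is sketched below: guard against a missing scheme before using it, and hand callers a copy so the stored config is never modified in place. This is an assumption-laden sketch, not the code in compressed_tensors.py: lookup_scheme is a hypothetical helper, and the scheme contents ("weights", "num_bits") are illustrative only.

```python
import copy
import re
from typing import Any, Dict, Optional


def lookup_scheme(
    layer_name: str, target_scheme_map: Dict[str, Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Return a copy of the scheme for layer_name, or None if nothing matches.

    Hypothetical helper: "re:" targets are regexes, others are exact names.
    """
    for target, scheme in target_scheme_map.items():
        if target.startswith("re:"):
            pattern = target[3:]
        else:
            pattern = re.escape(target) + "$"
        if re.match(pattern, layer_name):
            # Deep copy so callers cannot mutate the shared config.
            return copy.deepcopy(scheme)
    return None


scheme_map = {"re:.*attn.*_proj$": {"weights": {"num_bits": 8}}}

# Unmatched layer: check for None before touching the result.
scheme = lookup_scheme("layers.0.mlp.gate", scheme_map)
if scheme is None:
    print("unquantized layer")

# Matched layer: edits to the returned copy leave scheme_map untouched.
matched = lookup_scheme("layers.0.attn.q_proj", scheme_map)
matched["weights"]["num_bits"] = 4
print(scheme_map["re:.*attn.*_proj$"]["weights"]["num_bits"])  # 8
```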

Member

@mgoin mgoin left a comment

LGTM Kyle, thanks for cleaning up the ValueError. Can you add a unit test to solidify this expected behavior?

Contributor

@dsikka dsikka left a comment

LGTM but we should probably add a basic smoke test for this.

kylesayrs added 2 commits May 7, 2026 11:57
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@mgoin mgoin added the "ready" (ONLY add when PR is ready to merge/full CI is needed) and "quantization" labels May 7, 2026
@vllm-bot vllm-bot merged commit c1819ca into vllm-project:main May 7, 2026
66 of 71 checks passed
@kylesayrs kylesayrs deleted the kylesayrs/ct-non-explicit-ignore branch May 7, 2026 21:53
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
…oject#41965)

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>
