Skip to content

[Bugfix][CI/Build] Fix failing pooling models test due to Triton kernel accuracy diff#31776

Merged
noooop merged 3 commits intovllm-project:mainfrom
Isotr0py:fix-pooling-test
Jan 6, 2026
Merged

[Bugfix][CI/Build] Fix failing pooling models test due to Triton kernel accuracy diff#31776
noooop merged 3 commits intovllm-project:mainfrom
Isotr0py:fix-pooling-test

Conversation

@Isotr0py
Copy link
Copy Markdown
Member

@Isotr0py Isotr0py commented Jan 6, 2026

Purpose

Test Plan

pytest -s -v tests/models/language/pooling/test_token_classification.py::test_modernbert_models[float-disham993/electrical-ner-ModernBERT-base]

Test Result

Can't reproduce locally, hope this make CI green. 🙏


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
@Isotr0py Isotr0py requested a review from noooop as a code owner January 6, 2026 05:56
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a failing test for pooling models by increasing the numerical tolerance in an assertion. While this is a pragmatic fix for the immediate CI failure, I have raised a high-severity concern. The relaxed tolerance, especially when combined with the existing absolute tolerance, could potentially mask future regressions in the model's output. My review comment suggests adding explanatory comments to the code and exploring whether the tolerances can be defined more precisely to ensure the test remains as strict as possible while accounting for expected numerical differences. This is crucial for maintaining the integrity and reliability of the test suite.

hf_output = hf_output.detach().clone().cpu().float()
vllm_output = vllm_output.detach().clone().cpu().float()
assert torch.allclose(hf_output, vllm_output, atol=1e-2)
assert torch.allclose(hf_output, vllm_output, atol=1e-2, rtol=1e-3)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While this change fixes the immediate CI failure, relaxing test tolerances, especially on top of an existing atol=1e-2, increases the risk of masking future regressions. To ensure the test remains as strict as possible while accounting for the numerical noise from the Triton kernel, please consider the following:

  1. Add a brief code comment explaining why this specific test for ModernBERT requires this rtol, unlike the other tests in this file. This provides vital context for future maintenance.
  2. If the discrepancy is primarily relative, could the absolute tolerance atol be tightened? A more precise test might use something like atol=1e-5, rtol=1e-3, which would be stricter for small-magnitude outputs.

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
@Isotr0py Isotr0py added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 6, 2026
@Isotr0py Isotr0py changed the title [Bugfix][CI/Build] Fix failing pooling models test due to Trion kernel accuracy diff [Bugfix][CI/Build] Fix failing pooling models test due to Triton kernel accuracy diff Jan 6, 2026
@noooop noooop enabled auto-merge (squash) January 6, 2026 06:17
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
@noooop noooop merged commit ee2e69d into vllm-project:main Jan 6, 2026
19 checks passed
@Isotr0py Isotr0py deleted the fix-pooling-test branch January 6, 2026 11:53
LucasWilkinson pushed a commit to neuralmagic/vllm that referenced this pull request Jan 6, 2026
…el accuracy diff (vllm-project#31776)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
…el accuracy diff (vllm-project#31776)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
MitchLewis930 added a commit to Signal65/vllm-code-review that referenced this pull request Jan 14, 2026
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
…el accuracy diff (vllm-project#31776)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…el accuracy diff (vllm-project#31776)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
…el accuracy diff (vllm-project#31776)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants