
test: use .float() in F.cosine_similarity() in bmm_fp8 test #2266

Merged
bkryu merged 1 commit into flashinfer-ai:main from yongwww:fix_bmm_test
Dec 25, 2025


Conversation

@yongwww
Member

@yongwww yongwww commented Dec 24, 2025

📌 Description

Saw some test failures on Blackwell boards after #2261; all the failing assertions involve the large dim 10304.

Use .float() to reduce precision loss during the cosine_similarity (dot(x, y) / (||x|| * ||y||)) check.

FAILED tests/gemm/test_bmm_fp8.py::test_bmm_fp8[True-cutlass-res_dtype1-mat2_dtype0-input_dtype0-256-10304-128-16] - AssertionError: assert tensor(0., device='cuda:0') > 0.99
2025-12-24T07:00:08.299846Z 01O FAILED tests/gemm/test_bmm_fp8.py::test_bmm_fp8[False-cudnn-res_dtype1-mat2_dtype0-input_dtype1-256-10304-128-16] - AssertionError: assert tensor(0., device='cuda:0') > 0.99
... # the failure occurs for all backends (cutlass, cudnn, etc.)

cc: @zihaoye @bkryu
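The change can be sketched as follows; the tensor names, shapes, and tolerance here are illustrative stand-ins rather than the exact code in tests/gemm/test_bmm_fp8.py:

```python
import torch
import torch.nn.functional as F

# Illustrative stand-ins for the test's reference and result tensors;
# the real test compares a bmm_fp8 output against a reference matmul.
torch.manual_seed(0)
reference = torch.randn(256, 10304, dtype=torch.bfloat16)
res = reference + 0.01 * torch.randn_like(reference)

# The fix: cast to float32 before cosine_similarity so the dot product
# and norm reductions are accumulated in full precision.
sim = F.cosine_similarity(
    reference.float().reshape(-1), res.float().reshape(-1), dim=0
)
assert sim > 0.99, f"cosine similarity too low: {sim}"
```

Without the .float() cast, the same reductions run in the tensors' low-precision dtype, which the PR identifies as the source of the reported tensor(0.) assertions at dim 10304.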

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Tests
    • Improved test accuracy by ensuring tensor comparisons use floating-point precision for cosine similarity calculations.


@coderabbitai
Contributor

coderabbitai bot commented Dec 24, 2025

📝 Walkthrough

Walkthrough

A test file was modified to explicitly cast both reference and result tensors to float before computing cosine similarity. This ensures the comparison metric is calculated in floating-point precision rather than the tensors' original data types.

Changes

Cohort / File(s) Summary
Test Precision Fix
tests/gemm/test_bmm_fp8.py
Cast reference and result tensors to float before computing cosine similarity to ensure floating-point precision in test comparisons

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through tests with glee,
Float casting brings precision free!
No more surprises, just clean math,
Similarity flows down the true path.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: casting tensors to float in F.cosine_similarity() call within the bmm_fp8 test to address precision-related test failures.
Description check ✅ Passed PR description includes problem statement, solution explanation, test failures evidence, and completed checklist items, meeting the template requirements.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


@gemini-code-assist
Contributor

Summary of Changes

Hello @yongwww, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves intermittent test failures observed on Blackwell boards in the bmm_fp8 tests. The issue stemmed from precision loss during the cosine similarity calculation, leading to incorrect assertions. By explicitly casting the tensors to float before the similarity check, the PR ensures more robust and accurate test outcomes, preventing future false negatives related to floating-point precision.

Highlights

  • Test Fix: Addresses AssertionError in bmm_fp8 tests on Blackwell boards by casting tensors to float() before calculating cosine similarity to mitigate precision loss.
  • Precision Improvement: Explicitly converts reference and res tensors to float type when computing F.cosine_similarity to ensure higher precision in the comparison.


@yongwww
Member Author

yongwww commented Dec 24, 2025

/bot run

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses test failures on Blackwell boards by casting the reference and res tensors to float32 before calculating cosine similarity in test_bmm_fp8.py. This is a robust way to prevent precision loss with low-precision dtypes like bfloat16 and float16, especially for large tensors. The change is correct, minimal, well-targeted, and improves the stability of the test suite. I approve this change.
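The overflow mode this review alludes to is easy to demonstrate in isolation; this toy sketch (constants chosen purely for illustration, not taken from the test) shows why a large-dim reduction needs the wider dtype:

```python
import torch
import torch.nn.functional as F

# Two identical fp16 vectors with the failing dim from the report.
x = torch.full((10304,), 4.0, dtype=torch.float16)

# The squared norm is 16 * 10304 = 164864, which exceeds fp16's max
# finite value (~65504), so the fp16 reduction overflows to inf.
sq_norm_fp16 = (x * x).sum()
assert torch.isinf(sq_norm_fp16)

# Casting to float32 keeps the reduction representable; the similarity
# of a vector with itself then comes out ≈ 1, as the test expects.
sim = F.cosine_similarity(x.float(), x.float(), dim=0)
assert sim > 0.99
```

bfloat16 has a much wider exponent range, so its failure mode is lost mantissa precision during accumulation rather than outright overflow, but the float32 cast guards both cases.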

@flashinfer-bot
Collaborator

GitLab MR !217 has been created, and the CI pipeline #40763568 is currently running. I'll report back once the pipeline job completes.

@flashinfer-bot
Collaborator

[FAILED] Pipeline #40763568: 9/20 passed

@bkryu bkryu merged commit 43ec6c7 into flashinfer-ai:main Dec 25, 2025
4 checks passed
@yongwww yongwww deleted the fix_bmm_test branch December 25, 2025 00:51


3 participants