bump version to 0.6.6 #2724

Merged
yzh119 merged 10 commits into flashinfer-ai:main from aleozlx:version_bump
Mar 9, 2026

Conversation

@aleozlx
Collaborator

@aleozlx aleozlx commented Mar 9, 2026

📌 Description

🔍 "Gated-by" PR list

https://github.com/flashinfer-ai/flashinfer/pulls?q=is%3Aopen+is%3Apr+label%3Av0.6.6

#2730 Resolving gated_delta_rule_mtp breaking change before release

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

API changes review

```diff
$ git diff v0.6.5 | grep -A20 @flashinfer_api
 @flashinfer_api
 def gated_delta_rule_decode_pretranspose(
     q: torch.Tensor,
     k: torch.Tensor,
     v: torch.Tensor,
-    state: torch.Tensor,
+    state: Optional[torch.Tensor],
     A_log: torch.Tensor,
     a: torch.Tensor,
     dt_bias: torch.Tensor,
@@ -951,6 +113,8 @@ def gated_delta_rule_decode_pretranspose(
     scale: Optional[float] = None,
     output: Optional[torch.Tensor] = None,
     use_qk_l2norm: bool = True,
+    initial_state: Optional[torch.Tensor] = None,
+    initial_state_indices: Optional[torch.Tensor] = None,
 ) -> Tuple[torch.Tensor, torch.Tensor]:
     r"""Gated Delta Rule Decode kernel for single-token generation.

@@ -964,10 +128,11 @@ def gated_delta_rule_decode_pretranspose(
             Current key of shape ``[B, 1, H, K]``. Must be float16/bfloat16.
--
 @flashinfer_api
 def gated_delta_rule_mtp(
     q: torch.Tensor,
@@ -2398,7 +487,7 @@ def gated_delta_rule_mtp(
     scale: Optional[float] = None,
     output: Optional[torch.Tensor] = None,
     intermediate_states_buffer: Optional[torch.Tensor] = None,
-    disable_state_update: bool = True,
+    disable_state_update: bool = False,
     use_qk_l2norm: bool = True,
 ) -> Tuple[torch.Tensor, torch.Tensor]:
     """
```
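The two added parameters in the diff above are additive: since they default to `None`, call sites written against v0.6.5 keep working. A toy sketch of that pattern (hypothetical names and integer arithmetic only; the real `gated_delta_rule_decode_pretranspose` operates on CUDA tensors and has a different signature):

```python
from typing import Optional

# Hypothetical stand-in mirroring the shape of the diff, NOT the real kernel.
def decode_pretranspose(
    q: int,
    k: int,
    v: int,
    state: Optional[int] = None,
    initial_state: Optional[int] = None,
    initial_state_indices: Optional[int] = None,
) -> int:
    # If no explicit state is passed, fall back to the newly added initial_state.
    if state is None and initial_state is not None:
        state = initial_state
    return q + k + v + (state if state is not None else 0)

# Pre-0.6.6-style call sites keep working because the new kwargs default to None:
old_style = decode_pretranspose(1, 2, 3)                 # -> 6
# New call sites can seed the recurrent state explicitly:
seeded = decode_pretranspose(1, 2, 3, initial_state=10)  # -> 16
```

Because both new keywords are `None`-defaulted, this half of the diff is backward compatible; the `disable_state_update` default flip in `gated_delta_rule_mtp` is the part flagged as breaking.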

Summary by CodeRabbit

  • Chores
    • Version bump to 0.6.6

@coderabbitai
Contributor

coderabbitai bot commented Mar 9, 2026

📝 Walkthrough

This pull request bumps the version number in version.txt from 0.6.5 to 0.6.6. The change is a single-line version metadata update with no modifications to code logic, structure, or functionality.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Version Bump (`version.txt`) | Version number incremented from 0.6.5 to 0.6.6 |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~1 minute

Possibly related PRs

Suggested labels

run-ci

Suggested reviewers

  • yzh119
  • yongwww

Poem

🐰 A tiny hop in digits bright,
From five-point-six, point-five takes flight,
To six it bounces, small but true,
Version whispers what code will do! 🎉

🚥 Pre-merge checks (✅ 2 passed | ❌ 1 warning)

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description lacks the required brief summary of changes in the 'Description' section, leaving it as a template comment placeholder. | Fill in the 'Description' section with a brief explanation of why the version is being bumped to 0.6.6 and summarize the key changes being released. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'bump version to 0.6.6' directly and clearly describes the main change in the changeset, which is a version bump from 0.6.5 to 0.6.6. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on updating the version number to 0.6.6 and incorporates API modifications related to the gated delta rule functions. It is intended to bundle several related changes marked with the v0.6.6 label.

Highlights

  • Version Bump: This PR increments the version number from 0.6.5 to 0.6.6 in the version.txt file.
  • Gated-by PRs: This PR is gated by a list of other PRs, all labeled with v0.6.6, ensuring that related changes are merged together.
  • API Changes: The PR includes API changes, specifically related to the gated_delta_rule_decode_pretranspose and gated_delta_rule_mtp functions, as highlighted in the reviewer notes.


Changelog
  • flashinfer_api
    • Modified `gated_delta_rule_decode_pretranspose` to accept optional `state`, `initial_state`, and `initial_state_indices`
    • Modified `gated_delta_rule_mtp` to default `disable_state_update` to `False`
  • version.txt
    • Updated version number to 0.6.6
Activity
  • The PR includes a version bump from 0.6.5 to 0.6.6.
  • API changes are introduced, affecting gated_delta_rule_decode_pretranspose and gated_delta_rule_mtp.
  • The PR is gated by other PRs with the label v0.6.6.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the version from 0.6.5 to 0.6.6. Based on the API changes noted in the pull request description, particularly a changed default parameter value that could be a breaking change, I've suggested considering a minor version bump to 0.7.0 instead, in line with semantic versioning principles.

Note: Security Review has been skipped due to the limited scope of the PR.

```diff
@@ -1 +1 @@
-0.6.5
+0.6.6
```

medium

Given the API changes mentioned in the pull request description, specifically the change of the default value for disable_state_update in gated_delta_rule_mtp from True to False, this could be a breaking change for users who were relying on the previous default behavior. According to semantic versioning practices for projects in version 0.x, introducing breaking changes should result in a minor version bump. A patch version is typically for backward-compatible bug fixes. Therefore, a version bump to 0.7.0 might be more appropriate for this release.

0.7.0
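To make the reviewer's concern concrete, here is a toy sketch (hypothetical stand-ins with integer state, not the real flashinfer API, which returns tensors and updates recurrent state buffers) of how flipping a keyword default silently changes behavior for callers that omit the keyword:

```python
# Hypothetical stand-ins for the 0.6.5 and 0.6.6 signatures of
# gated_delta_rule_mtp; only the keyword default differs.
def mtp_v065(state: int, disable_state_update: bool = True) -> int:
    # Old default: state is left untouched unless the caller opts in.
    return state if disable_state_update else state + 1

def mtp_v066(state: int, disable_state_update: bool = False) -> int:
    # New default: state is updated unless the caller opts out.
    return state if disable_state_update else state + 1

# A caller that omitted the keyword now observes different behavior:
before = mtp_v065(0)  # -> 0 (state untouched)
after = mtp_v066(0)   # -> 1 (state updated)

# Passing the keyword explicitly keeps behavior stable across releases:
pinned = mtp_v066(0, disable_state_update=True)  # -> 0
```

This is why callers who depended on the old default either need a code change or, per #2730, the default flip has to be resolved before the release.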

@aleozlx aleozlx mentioned this pull request Mar 9, 2026
@aleozlx aleozlx marked this pull request as ready for review March 9, 2026 18:00
@aleozlx
Collaborator Author

aleozlx commented Mar 9, 2026

/bot run

@aleozlx aleozlx added the run-ci label Mar 9, 2026
@flashinfer-bot
Collaborator

GitLab MR !392 has been created, and the CI pipeline #45728223 is currently running. I'll report back once the pipeline job completes.

@flashinfer-bot
Collaborator

[SUCCESS] Pipeline #45728223: 10/20 passed

@yzh119 yzh119 merged commit a15c357 into flashinfer-ai:main Mar 9, 2026
33 checks passed
frankwang28 pushed a commit to frankwang28/flashinfer that referenced this pull request Mar 18, 2026
ameynaik-hub pushed a commit to ameynaik-hub/flashinfer that referenced this pull request Mar 18, 2026