bump version to 0.6.6 #2724

Merged
yzh119 merged 10 commits into flashinfer-ai:main from aleozlx:version_bump
Mar 9, 2026

Conversation

@aleozlx
Collaborator

@aleozlx aleozlx commented Mar 9, 2026

📌 Description

🔍 "Gated-by" PR list

https://github.com/flashinfer-ai/flashinfer/pulls?q=is%3Aopen+is%3Apr+label%3Av0.6.6

#2730 Resolving gated_delta_rule_mtp breaking change before release

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

API changes review

```diff
$ git diff v0.6.5 | grep -A20 @flashinfer_api
 @flashinfer_api
 def gated_delta_rule_decode_pretranspose(
     q: torch.Tensor,
     k: torch.Tensor,
     v: torch.Tensor,
-    state: torch.Tensor,
+    state: Optional[torch.Tensor],
     A_log: torch.Tensor,
     a: torch.Tensor,
     dt_bias: torch.Tensor,
@@ -951,6 +113,8 @@ def gated_delta_rule_decode_pretranspose(
     scale: Optional[float] = None,
     output: Optional[torch.Tensor] = None,
     use_qk_l2norm: bool = True,
+    initial_state: Optional[torch.Tensor] = None,
+    initial_state_indices: Optional[torch.Tensor] = None,
 ) -> Tuple[torch.Tensor, torch.Tensor]:
     r"""Gated Delta Rule Decode kernel for single-token generation.

@@ -964,10 +128,11 @@ def gated_delta_rule_decode_pretranspose(
             Current key of shape ``[B, 1, H, K]``. Must be float16/bfloat16.
--
 @flashinfer_api
 def gated_delta_rule_mtp(
     q: torch.Tensor,
@@ -2398,7 +487,7 @@ def gated_delta_rule_mtp(
     scale: Optional[float] = None,
     output: Optional[torch.Tensor] = None,
     intermediate_states_buffer: Optional[torch.Tensor] = None,
-    disable_state_update: bool = True,
+    disable_state_update: bool = False,
     use_qk_l2norm: bool = True,
 ) -> Tuple[torch.Tensor, torch.Tensor]:
     """
```
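The two added parameters in the diff above are additive: since they default to `None`, call sites written against v0.6.5 keep working. A toy sketch of that pattern (hypothetical names and integer arithmetic only; the real `gated_delta_rule_decode_pretranspose` operates on CUDA tensors and has a different signature):

```python
from typing import Optional

# Hypothetical stand-in mirroring the shape of the diff, NOT the real kernel.
def decode_pretranspose(
    q: int,
    k: int,
    v: int,
    state: Optional[int] = None,
    initial_state: Optional[int] = None,
    initial_state_indices: Optional[int] = None,
) -> int:
    # If no explicit state is passed, fall back to the newly added initial_state.
    if state is None and initial_state is not None:
        state = initial_state
    return q + k + v + (state if state is not None else 0)

# Pre-0.6.6-style call sites keep working because the new kwargs default to None:
old_style = decode_pretranspose(1, 2, 3)                 # -> 6
# New call sites can seed the recurrent state explicitly:
seeded = decode_pretranspose(1, 2, 3, initial_state=10)  # -> 16
```

Because both new keywords are `None`-defaulted, this half of the diff is backward compatible; the `disable_state_update` default flip in `gated_delta_rule_mtp` is the part flagged as breaking.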

Summary by CodeRabbit

  • Chores
    • Version bump to 0.6.6

@coderabbitai
Contributor

coderabbitai bot commented Mar 9, 2026

📝 Walkthrough

This pull request bumps the version number in version.txt from 0.6.5 to 0.6.6. The change is a single-line version metadata update with no modifications to code logic, structure, or functionality.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Version Bump (`version.txt`) | Version number incremented from 0.6.5 to 0.6.6 |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~1 minute

Possibly related PRs

Suggested labels

run-ci

Suggested reviewers

  • yzh119
  • yongwww

Poem

🐰 A tiny hop in digits bright,
From five-point-six, point-five takes flight,
To six it bounces, small but true,
Version whispers what code will do! 🎉

🚥 Pre-merge checks (✅ 2 passed | ❌ 1 warning)

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description lacks the required brief summary of changes in the 'Description' section, leaving it as a template comment placeholder. | Fill in the 'Description' section with a brief explanation of why the version is being bumped to 0.6.6 and summarize the key changes being released. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'bump version to 0.6.6' directly and clearly describes the main change in the changeset, which is a version bump from 0.6.5 to 0.6.6. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on updating the version number to 0.6.6 and incorporates API modifications related to the gated delta rule functions. It is intended to bundle several related changes marked with the v0.6.6 label.

Highlights

  • Version Bump: This PR increments the version number from 0.6.5 to 0.6.6 in the version.txt file.
  • Gated-by PRs: This PR is gated by a list of other PRs, all labeled with v0.6.6, ensuring that related changes are merged together.
  • API Changes: The PR includes API changes, specifically related to the gated_delta_rule_decode_pretranspose and gated_delta_rule_mtp functions, as highlighted in the reviewer notes.


Changelog
  • flashinfer_api
    • Modified `gated_delta_rule_decode_pretranspose` to accept optional `state`, `initial_state`, and `initial_state_indices`
    • Modified `gated_delta_rule_mtp` to default `disable_state_update` to `False`
  • version.txt
    • Updated version number to 0.6.6
Activity
  • The PR includes a version bump from 0.6.5 to 0.6.6.
  • API changes are introduced, affecting gated_delta_rule_decode_pretranspose and gated_delta_rule_mtp.
  • The PR is gated by other PRs with the label v0.6.6.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the version from 0.6.5 to 0.6.6. Based on the API changes noted in the pull request description, particularly a changed default parameter value that could be a breaking change, I've suggested considering a minor version bump to 0.7.0 instead, in line with semantic versioning principles.

Note: Security Review has been skipped due to the limited scope of the PR.

```diff
@@ -1 +1 @@
-0.6.5
+0.6.6
```

medium

Given the API changes mentioned in the pull request description, specifically the change of the default value for disable_state_update in gated_delta_rule_mtp from True to False, this could be a breaking change for users who were relying on the previous default behavior. According to semantic versioning practices for projects in version 0.x, introducing breaking changes should result in a minor version bump. A patch version is typically for backward-compatible bug fixes. Therefore, a version bump to 0.7.0 might be more appropriate for this release.

0.7.0
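To make the reviewer's concern concrete, here is a toy sketch (hypothetical stand-ins with integer state, not the real flashinfer API, which returns tensors and updates recurrent state buffers) of how flipping a keyword default silently changes behavior for callers that omit the keyword:

```python
# Hypothetical stand-ins for the 0.6.5 and 0.6.6 signatures of
# gated_delta_rule_mtp; only the keyword default differs.
def mtp_v065(state: int, disable_state_update: bool = True) -> int:
    # Old default: state is left untouched unless the caller opts in.
    return state if disable_state_update else state + 1

def mtp_v066(state: int, disable_state_update: bool = False) -> int:
    # New default: state is updated unless the caller opts out.
    return state if disable_state_update else state + 1

# A caller that omitted the keyword now observes different behavior:
before = mtp_v065(0)  # -> 0 (state untouched)
after = mtp_v066(0)   # -> 1 (state updated)

# Passing the keyword explicitly keeps behavior stable across releases:
pinned = mtp_v066(0, disable_state_update=True)  # -> 0
```

This is why callers who depended on the old default either need a code change or, per #2730, the default flip has to be resolved before the release.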

@aleozlx aleozlx mentioned this pull request Mar 9, 2026
@aleozlx aleozlx marked this pull request as ready for review March 9, 2026 18:00
@aleozlx
Collaborator Author

aleozlx commented Mar 9, 2026

/bot run

@aleozlx aleozlx added the run-ci label Mar 9, 2026
@flashinfer-bot
Collaborator

GitLab MR !392 has been created, and the CI pipeline #45728223 is currently running. I'll report back once the pipeline job completes.

@flashinfer-bot
Collaborator

[SUCCESS] Pipeline #45728223: 10/20 passed

@yzh119 yzh119 merged commit a15c357 into flashinfer-ai:main Mar 9, 2026
33 checks passed
frankwang28 pushed a commit to frankwang28/flashinfer that referenced this pull request Mar 18, 2026
ameynaik-hub pushed a commit to ameynaik-hub/flashinfer that referenced this pull request Mar 18, 2026