Fix wrong kernel selection for int32/int64 indices #16912

Merged: Fridge003 merged 1 commit into main from revert-16273-jit_set_kv on Jan 12, 2026

Conversation

@hnyls2002 (Collaborator) commented Jan 11, 2026

A bug introduced in #16273 breaks CI.
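
The thread does not spell out the failure, so the following is only a minimal sketch of why the index width named in the PR title matters, not the actual kernel code from #16273. It assumes a hypothetical dispatch table (`KERNELS`) keyed by dtype name and uses the standard-library `struct` module to stand in for a kernel reading an index buffer. Decoding int64 indices with the int32 variant yields extra, wrong slot numbers.

```python
import struct

# Hypothetical dispatch table: dtype name -> (struct format, width in bytes).
# A real implementation would map to compiled kernel variants instead.
KERNELS = {"int32": ("<i", 4), "int64": ("<q", 8)}


def decode_indices(raw: bytes, dtype: str) -> list[int]:
    """Read little-endian indices from raw using the variant selected by dtype."""
    fmt, width = KERNELS[dtype]
    return [struct.unpack_from(fmt, raw, off)[0] for off in range(0, len(raw), width)]


# Two cache slot indices stored as little-endian int64.
raw = struct.pack("<2q", 3, 1)

correct = decode_indices(raw, "int64")  # variant matches the buffer
wrong = decode_indices(raw, "int32")    # variant does not match the buffer

print(correct)  # [3, 1]
print(wrong)    # [3, 0, 1, 0]
```

With the mismatched variant the second write would land in slot 0 instead of slot 1, which is the kind of silent corruption a dtype check before dispatch is meant to prevent.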

@gemini-code-assist
Contributor

Summary of Changes

Hello @hnyls2002, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fully reverts the previously introduced feature 'Support JIT set kv cache' (#16273). The primary motivation for this revert is to address continuous integration (CI) failures that were occurring after the feature's introduction. The changes systematically remove all components related to the JIT KV-cache, including its custom CUDA kernel, Python bindings, benchmarking, testing, and its integration within the memory management system, aiming to restore stability to the CI pipeline.

Highlights

  • JIT KV-Cache Feature Removal: The core Just-In-Time (JIT) KV-cache kernel (kvcache.cuh) and its Python wrapper (kvcache.py) have been completely removed, effectively undoing the 'Support JIT set kv cache' feature.
  • Memory Pool Integration Reverted: The memory_pool.py file has been updated to remove all integration points for the JIT KV-cache, including the _set_kv_buffer_impl function, related imports, and direct assignment logic for K/V cache updates.
  • Associated Files Deleted: Dedicated benchmark (bench_store_cache.py), test files (test_store_cache.py), and a utility file (utils.py) specifically created for the JIT KV-cache feature have been deleted.
  • Norm Module Cleanup: The norm.py module and its corresponding C++ kernel (qknorm.cuh) have been adjusted. The qknorm.cuh file was removed, and norm.py now points to a more general norm.cuh and uses a renamed internal function _jit_norm_module.

@hnyls2002
Collaborator Author

/rerun-stage stage-c-test-large-4-gpu

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request reverts the feature 'Support JIT set kv cache' due to CI breakages. The changes correctly remove the new JIT kernel for setting KV cache, including its implementation, benchmarks, and tests. The code in memory_pool.py that utilized this feature has been reverted to its previous implementation. The revert appears to be mostly complete. However, I've found a minor issue in a benchmark file that seems to be an incomplete part of the revert, which I've commented on.

@github-actions
Contributor

✅ Triggered stage-c-test-large-4-gpu to run independently (skipping dependencies).

@github-actions
Contributor

🔗 View workflow run

@DarkSharpness force-pushed the revert-16273-jit_set_kv branch from 314e4e8 to 3cd69a1 on January 11, 2026 18:24

@DarkSharpness
Collaborator

/rerun-stage stage-c-test-large-4-gpu

@github-actions
Contributor

✅ Triggered stage-c-test-large-4-gpu to run independently (skipping dependencies).

@github-actions
Contributor

🔗 View workflow run

@DarkSharpness force-pushed the revert-16273-jit_set_kv branch from 3cd69a1 to 1b30c3b on January 11, 2026 21:12

@DarkSharpness
Collaborator

/tag-and-rerun-ci

@DarkSharpness changed the title from Revert "[Feature] Support JIT set kv cache" to Fix "[Feature] Support JIT set kv cache" on Jan 11, 2026
@hnyls2002 changed the title from Fix "[Feature] Support JIT set kv cache" to Fix wrong kernel selection for int32/int64 indices on Jan 12, 2026

@DarkSharpness
Collaborator

/rerun-failed-ci

@Fridge003 merged commit 2b3791e into main on Jan 12, 2026 (446 of 467 checks passed)
@Fridge003 deleted the revert-16273-jit_set_kv branch on January 12, 2026 09:26
whybeyoung pushed a commit to whybeyoung/sglang that referenced this pull request Jan 14, 2026
Co-authored-by: DarkSharpness <2040703891@qq.com>
4 participants