@skyzh
Contributor

@skyzh skyzh commented Sep 19, 2025

Motivation

There seem to be two bugs in the radix cache:

  • When page_size != 1, we might get len(key) == 0 after truncating the key to a page boundary; in that case, we should return an empty match early
  • Updates to self.evictable_size_ when adding and removing a leaf should both use len(value) instead of len(key)
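
The accounting point in the second bullet can be illustrated with a small sketch. This is a toy model, not the radix cache from this repository: the class name `ToyLeafAccounting` and its methods `add_leaf` and `remove_leaf` are hypothetical, and only the counter name `evictable_size_` comes from the description above. It assumes the counter is meant to track how many cached entries are evictable, so both the add path and the remove path measure the same thing, `len(value)`.

```python
class ToyLeafAccounting:
    """Toy model that tracks only an evictable-size counter (hypothetical, not a real radix tree)."""

    def __init__(self):
        self.evictable_size_ = 0
        self.leaves = {}  # maps tuple(key) -> value

    def add_leaf(self, key, value):
        # Count the stored value, so removal can subtract exactly the same amount.
        self.leaves[tuple(key)] = list(value)
        self.evictable_size_ += len(value)

    def remove_leaf(self, key):
        # Subtract len(value), mirroring add_leaf.
        value = self.leaves.pop(tuple(key))
        self.evictable_size_ -= len(value)


cache = ToyLeafAccounting()
cache.add_leaf([1, 2, 3, 4], ["kv0", "kv1", "kv2", "kv3"])
cache.add_leaf([5, 6], ["kv4", "kv5"])
peak = cache.evictable_size_  # size after both leaves are added

cache.remove_leaf([1, 2, 3, 4])
cache.remove_leaf([5, 6])
```

Because both paths use the same quantity, the counter returns to zero once every leaf is removed. If one path used `len(key)` and the other `len(value)`, the counter could drift whenever the two lengths differ.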

@gemini-code-assist
Contributor

Summary of Changes

Hello @skyzh, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses two critical bugs within the radix cache implementation. It refines the prefix matching logic to correctly handle edge cases where keys become empty after page-alignment, preventing potential errors. Additionally, it rectifies the memory accounting mechanism by ensuring that the evictable_size_ is updated based on the actual key length, rather than the value length, for more accurate cache management.

Highlights

  • Radix Cache Matching Logic: Corrected the match_prefix method to handle edge cases where the key becomes empty after page-alignment truncation, ensuring an empty match result is returned to prevent incorrect behavior.
  • Memory Accounting Accuracy: Fixed an issue in the _insert_helper method where evictable_size_ was incorrectly updated using the length of the value instead of the key, leading to more accurate memory accounting for the radix cache.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses two bugs in the radix cache. First, it adds a check to handle cases where a key becomes empty after page-size alignment, improving robustness. The associated refactoring to create an empty_match_result helper function also enhances code clarity. Second, it fixes a memory accounting inconsistency by using len(key) instead of len(value) for evictable_size_ updates during insertion, aligning it with the deletion logic. The changes are well-implemented and improve the overall quality of the code.

@zhaochenyang20
Collaborator

fix the lint plz @skyzh

@skyzh
Contributor Author

skyzh commented Sep 19, 2025

thanks :) lint should work now

@skyzh
Contributor Author

skyzh commented Sep 19, 2025

@Edenzzzz thanks for the reviews :) there's also:

            if node.lock_ref == 0:
                self.evictable_size_ -= len(node.value)
                self.protected_size_ += len(node.value)

in the same file, which sounds like we are moving something from evictable to protected, should I also use node.key here?

@Edenzzzz
Contributor

cc @xiezhq-hermann I feel these two can be used interchangeably?
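
The snippet quoted above moves a node's size from the evictable counter to the protected counter when the node is first locked. A minimal sketch of that bookkeeping, under stated assumptions: `ToyNode`, `ToyLockAccounting`, `track`, `inc_lock_ref`, and `dec_lock_ref` are hypothetical names for illustration, and only `lock_ref`, `evictable_size_`, and `protected_size_` appear in the quoted code. The sketch uses `len(node.value)` throughout; if `len(node.key)` always equals `len(node.value)`, the two would give the same numbers, which is the interchangeability question raised here.

```python
class ToyNode:
    """Hypothetical tree node holding a key segment and its cached value."""

    def __init__(self, key, value, parent=None):
        self.key = key
        self.value = value
        self.parent = parent
        self.lock_ref = 0


class ToyLockAccounting:
    """Toy counters for evictable vs. protected sizes (not the real cache)."""

    def __init__(self, root):
        self.root = root
        self.evictable_size_ = 0
        self.protected_size_ = 0

    def track(self, node):
        # A newly inserted, unlocked node starts out evictable.
        self.evictable_size_ += len(node.value)

    def inc_lock_ref(self, node):
        # Walk up to the root; the first lock moves a node to "protected".
        while node is not self.root:
            if node.lock_ref == 0:
                self.evictable_size_ -= len(node.value)
                self.protected_size_ += len(node.value)
            node.lock_ref += 1
            node = node.parent

    def dec_lock_ref(self, node):
        # Walk up to the root; releasing the last lock moves a node back.
        while node is not self.root:
            if node.lock_ref == 1:
                self.evictable_size_ += len(node.value)
                self.protected_size_ -= len(node.value)
            node.lock_ref -= 1
            node = node.parent


root = ToyNode([], [])
a = ToyNode([1, 2], [10, 11], parent=root)
b = ToyNode([3], [12], parent=a)

acct = ToyLockAccounting(root)
acct.track(a)
acct.track(b)
total = acct.evictable_size_ + acct.protected_size_

acct.inc_lock_ref(b)
locked = (acct.evictable_size_, acct.protected_size_)

acct.dec_lock_ref(b)
unlocked = (acct.evictable_size_, acct.protected_size_)
```

The sum of the two counters stays constant across lock and unlock, as long as every update uses the same length measure.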

@skyzh skyzh mentioned this pull request Sep 23, 2025
4 tasks
        if self.page_size != 1:
            page_aligned_len = len(key) // self.page_size * self.page_size
            key = key[:page_aligned_len]

        if len(key) == 0:
            return empty_match_result()

Collaborator

It's good to have a guardrail here, but is this causing a bug? I believe there is protection within _match_prefix_helper as well.

Contributor Author

@skyzh skyzh Oct 1, 2025

kind of.. this comes from my experiment to enable deterministic radix cache by simply setting the page_size_ in radix cache to the split size of prefill - it yields several issues. So this might just work fine with the current code.

My other argument is that we are going to return empty anyways, so we can skip a bunch of code here and do a shortcircuit path to directly return empty?
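
The short-circuit argument can be sketched as follows. This is an illustration only: `page_align`, `match_prefix_sketch`, and the `slow_path` callback are hypothetical stand-ins, and an empty list stands in for whatever `empty_match_result()` returns. The truncation arithmetic is the same as in the diff under discussion.

```python
def page_align(key, page_size):
    """Truncate key to the largest multiple of page_size, as in the diff."""
    if page_size != 1:
        page_aligned_len = len(key) // page_size * page_size
        key = key[:page_aligned_len]
    return key


def match_prefix_sketch(key, page_size, slow_path):
    """Return early on an empty key; otherwise defer to the slow path."""
    key = page_align(key, page_size)
    if len(key) == 0:
        return []  # stand-in for empty_match_result()
    return slow_path(key)


calls = []


def slow_path(key):
    # Hypothetical placeholder for the tree walk; records that it was reached.
    calls.append(list(key))
    return list(key)


# Three tokens with page_size=4 truncate to zero tokens, so the walk is skipped.
short = match_prefix_sketch([1, 2, 3], page_size=4, slow_path=slow_path)

# Five tokens truncate to one full page of four tokens.
full = match_prefix_sketch([1, 2, 3, 4, 5], page_size=4, slow_path=slow_path)
```

Any key shorter than one page truncates to empty, so the early return skips the slow path entirely, whether or not the helper would also handle an empty key correctly.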

@skyzh skyzh force-pushed the skyzh/radix-cache-fix branch from c68487b to 3306486 Compare October 1, 2025 03:56
Signed-off-by: Alex Chi Z <[email protected]>

fix lint

Signed-off-by: Alex Chi Z <[email protected]>

actually we should use len(value)?

Signed-off-by: Alex Chi Z <[email protected]>
@skyzh skyzh force-pushed the skyzh/radix-cache-fix branch from 3306486 to f41d2ad Compare October 1, 2025 03:58
@skyzh skyzh requested a review from xiezhq-hermann October 1, 2025 03:58
skyzh added 2 commits October 1, 2025 03:59
Signed-off-by: Alex Chi Z <[email protected]>
@skyzh
Contributor Author

skyzh commented Oct 1, 2025

ready for review again :) thanks! cc @xiezhq-hermann

@xiezhq-hermann xiezhq-hermann self-assigned this Oct 1, 2025
@Fridge003 Fridge003 merged commit 1a31229 into sgl-project:main Oct 3, 2025
161 of 183 checks passed
0xtoward pushed a commit to 0xtoward/sglang that referenced this pull request Oct 5, 2025
ch-tiger1 pushed a commit to ch-tiger1/sglang that referenced this pull request Oct 9, 2025
lpc0220 pushed a commit to lpc0220/sglang that referenced this pull request Oct 29, 2025

5 participants