Add dynamic tokens-per-page TRTLLM-GEN GQA kernels by PerkzZheng · Pull Request #3259 · flashinfer-ai/flashinfer

PerkzZheng · 2026-05-07T14:20:40Z

Summary

- Update the TRTLLM-GEN FMHA artifact path and checksum to the newly published cubins.
- Select dynamic tokens-per-page kernels for paged GQA decode and prefill when the page size is at least 128.
- Keep MLA decode kernel selection on dense-mask keys and add Blackwell-only dynamic page-size coverage.
- Relax the fp16 Blackwell attention-sink context tolerance to match observed precision noise.

Tests

- `pre-commit run` on changed files
- `python3 -m pytest -q -n 8 --tb=short --maxfail=5 -k trtllm-gen-mla-filter tests/attention/test_trtllm_gen_mla.py`: 7686 passed
- `python3 -m pytest -q -n 8 --tb=short tests/attention/test_attention_sink_blackwell.py`: 144 passed

Summary by CodeRabbit

Chores
- Updated an artifact path and its checksum manifest.
New Features
- Dynamic page-size selection for generation attention kernels on supported GPUs.
- Per-token variable sparse-MLA top‑K lengths for finer-grained sparse attention.
- Expanded MLA-generation kernel support for additional head/dimension combos.
Tests
- Added GPU-only tests for dynamic page-size prefill/decode scenarios.
- Relaxed float16 numerical tolerance in an attention test.
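
For a sense of scale on the float16 tolerance change, the sketch below computes the gap between adjacent float16 values using only Python's `struct` module. The helper name `fp16_spacing` is made up for this illustration and is not part of the test suite. The thread does not state the magnitude of the attention outputs involved, so this only shows how coarse the float16 grid is near 1.0 and 2.0.

```python
import struct


def fp16_spacing(x: float) -> float:
    """Gap between x (rounded to float16) and the next larger float16 value.

    Hypothetical helper for illustration; valid for positive finite inputs.
    """
    # Round x to float16 and read back its raw 16-bit pattern.
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    # Decode the pattern and its successor back into Python floats.
    (lo,) = struct.unpack("<e", struct.pack("<H", bits))
    (hi,) = struct.unpack("<e", struct.pack("<H", bits + 1))
    return hi - lo


OLD_ATOL = 1e-3  # tolerance before this PR, per the walkthrough
NEW_ATOL = 2e-3  # tolerance after this PR, per the walkthrough

for magnitude in (0.5, 1.0, 2.0):
    step = fp16_spacing(magnitude)
    print(f"|x| = {magnitude}: one float16 step = {step:.3e}, "
          f"within old atol: {step <= OLD_ATOL}, within new atol: {step <= NEW_ATOL}")
```

Under these assumptions, a single float16 step at magnitude 2.0 (2^-9, about 1.95e-3) already exceeds an absolute tolerance of 1e-3 but fits within 2e-3.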

coderabbitai · 2026-05-07T14:21:02Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a1b37111-fcd5-4d5e-9270-caac8390b54f

📥 Commits

Reviewing files that changed from the base of the PR and between dffd4b83dfd4da812afcbe277cee082dfeb1d45b and eb503ea.

📒 Files selected for processing (6)

flashinfer/artifacts.py
include/flashinfer/trtllm/fmha/fmhaKernels.cuh
include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
include/flashinfer/trtllm/fmha/kernelParams.h
tests/attention/test_attention_sink_blackwell.py
tests/attention/test_trtllm_gen_attention.py

🚧 Files skipped from review as they are similar to previous changes (5)

flashinfer/artifacts.py
tests/attention/test_attention_sink_blackwell.py
tests/attention/test_trtllm_gen_attention.py
include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
include/flashinfer/trtllm/fmha/fmhaKernels.cuh

📝 Walkthrough

Walkthrough

Adds dynamic numTokensPerPage selection for TRTLLM FMHA kernels, extends parameter structs for sparse-MLA, updates kernel selection and hash encoding (including MLA Dense-mask handling), bumps the TRTLLM_GEN_FMHA artifact path and checksum, and adds Blackwell-gated tests plus a float16 tolerance tweak.

Changes

Dynamic numTokensPerPage Selection for TRTLLM FMHA

| Layer / File(s) | Summary |
| --- | --- |
| Parameter Contracts `include/flashinfer/trtllm/fmha/fmhaRunnerParams.h`, `include/flashinfer/trtllm/fmha/kernelParams.h` | `TllmGenSelectKernelParams` adds `mDynamicNumTokensPerPage` (initialized false). `KernelParams` adds `tmaKSlidingWindowKvPool_` and `ptrSparseMlaTopKLens`; `setKernelParams` initializes `ptrSparseMlaTopKLens` to `nullptr`. |
| Kernel Selection Logic `include/flashinfer/trtllm/fmha/fmhaKernels.cuh` | Adds constants and `selectNumTokensPerPage` with eligibility rules; allows `numTokensPerPage==0` in hash validation and encodes log2 as 0 when 0; centralizes page selection before heuristic probing; sets dynamic flag for kernel-keyed generated cubins; forces Dense mask for MLA generation; skips sliding-window/chunked causal mask override for MLA generation; debug/hash string includes `dynamicNumTokensPerPage`. |
| Artifact Versioning `flashinfer/artifacts.py` | Updates `ArtifactPath.TRTLLM_GEN_FMHA` and `CheckSumHash.TRTLLM_GEN_FMHA` to new directory hash and checksum manifest SHA256. |
| Tests / Tolerance `tests/attention/test_trtllm_gen_attention.py`, `tests/attention/test_attention_sink_blackwell.py` | Adds `_skip_if_not_blackwell()` and Blackwell-gated parametrized tests `test_trtllm_batch_prefill_dynamic_page_size_gqa` and `test_trtllm_batch_decode_dynamic_page_size_gqa`; increases float16 `atol` from `1e-3` to `2e-3` in the sink test; removes an earlier head-dim-512 prefill block. |
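
The page-size selection and hash encoding summarized in the table can be pictured with a small sketch. This is not the code from `fmhaKernels.cuh`: the class `SelectParams`, the function names, and the `is_paged_kv` flag are hypothetical, and the only rules modeled are the ones stated in this thread (dynamic kernels for paged layouts when the page size is at least 128, and a log2 encoding of 0 when `numTokensPerPage` is 0). That an eligible request is re-keyed with a page size of 0 is an assumption made here to connect those two statements; the real eligibility rules may include more conditions.

```python
from dataclasses import dataclass, replace

# Threshold taken from the PR summary ("page size is at least 128").
MIN_DYNAMIC_PAGE_SIZE = 128


@dataclass(frozen=True)
class SelectParams:
    """Hypothetical stand-in for TllmGenSelectKernelParams."""
    num_tokens_per_page: int
    is_paged_kv: bool = True
    dynamic_num_tokens_per_page: bool = False  # mirrors mDynamicNumTokensPerPage


def select_num_tokens_per_page(p: SelectParams) -> SelectParams:
    """Switch eligible requests to a dynamic page-size kernel key.

    Assumption: the dynamic kernel is looked up with page size 0.
    """
    if p.is_paged_kv and p.num_tokens_per_page >= MIN_DYNAMIC_PAGE_SIZE:
        return replace(p, num_tokens_per_page=0, dynamic_num_tokens_per_page=True)
    return p


def num_tokens_per_page_log2(n: int) -> int:
    """Encode the page size for the kernel hash; 0 is allowed and maps to 0."""
    if n == 0:
        return 0
    if n < 0 or n & (n - 1):
        raise ValueError(f"page size must be 0 or a power of two, got {n}")
    return n.bit_length() - 1


for page_size in (16, 64, 128, 256):
    chosen = select_num_tokens_per_page(SelectParams(page_size))
    print(page_size, chosen.dynamic_num_tokens_per_page,
          num_tokens_per_page_log2(chosen.num_tokens_per_page))
```

Keeping the selection in one function that runs before the hash is built matches the walkthrough's note that page selection is centralized before heuristic probing, so every candidate hash sees the same page-size value.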

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Selector
  participant Hasher
  participant KernelPicker
  Caller->>Selector: selectNumTokensPerPage(params)
  Selector-->>Caller: set mNumTokensPerPage, mDynamicNumTokensPerPage
  Caller->>Hasher: build/validate kernel hash (numTokensPerPageLog2)
  Hasher-->>KernelPicker: candidate hashes
  KernelPicker-->>Caller: selected cubin (Dense mask for MLA gen)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

flashinfer-ai/flashinfer#1912: Modifies the same TRTLLM_GEN_FMHA artifact path/checksum entries.
flashinfer-ai/flashinfer#1967: Related edits to ArtifactPath and CheckSumHash definitions and TRTLLM_GEN_FMHA entries.
flashinfer-ai/flashinfer#2062: Overlapping FMHA kernel selection/hash and KernelParams changes.

Suggested labels

run-ci

Suggested reviewers

sricketts
aleozlx
yzh119
cyx-6
bkryu
samuellees
yongwww
nv-yunzheq
jimmyzho
kahyunnam
yyihuang

Poem

🐰 I hopped through hashes, pages, and kernels bright,
Dynamic tokens per page now take flight.
Dense masks for MLA, tests gated with care,
Artifacts bumped — new cubins are there!
🥕✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description lacks the required template sections (Description, Related Issues, Checklist items) and doesn't follow the specified format, though it does contain relevant technical content. | Reformat the description to follow the repository's template, including the Description section, Related Issues section, and completion of the Pre-commit Checks and Tests checklists. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately describes the main change: adding dynamic tokens-per-page TRTLLM-GEN GQA kernels, which aligns with the primary objective and code modifications. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

gemini-code-assist · 2026-05-07T14:25:17Z

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

PerkzZheng · 2026-05-07T14:33:39Z

/bot run

flashinfer-bot · 2026-05-07T14:34:32Z

GitLab MR !644 has been created, and the CI pipeline #50558261 is currently running. I'll report back once the pipeline job completes.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@include/flashinfer/trtllm/fmha/kernelParams.h`:
- Around line 47-48: The field CUtensorMap tmaKSlidingWindowKvPool_ is declared
but never initialized or used; either remove this dead member or initialize it
in setKernelParams by calling buildNdTmaDescriptor in the same sparse-MLA
conditional where tmaK_ and tmaV_ are built (mirror their construction/guards),
ensuring the descriptor parameters match the DSv4 sparse MLA sliding-window KV
pool layout if you intend to support that path; update any related cleanup/usage
accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: efb8b457-2332-4fab-9813-3e51975b0e44

📥 Commits

Reviewing files that changed from the base of the PR and between 3ddae5562724a157a2f9e06ba150a05f72a32cca and 8b17dc664bacc1e5685345ba66e92b4411465d84.

📒 Files selected for processing (6)

flashinfer/artifacts.py
include/flashinfer/trtllm/fmha/fmhaKernels.cuh
include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
include/flashinfer/trtllm/fmha/kernelParams.h
tests/attention/test_attention_sink_blackwell.py
tests/attention/test_trtllm_gen_attention.py

🚧 Files skipped from review as they are similar to previous changes (4)

include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
flashinfer/artifacts.py
include/flashinfer/trtllm/fmha/fmhaKernels.cuh
tests/attention/test_trtllm_gen_attention.py

coderabbitai · 2026-05-07T14:37:02Z

+  // TMA descriptor for DSv4 sparse MLA sliding-window KV pool. Same format as tmaK_.
+  CUtensorMap tmaKSlidingWindowKvPool_;


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all sites that set tmaKSlidingWindowKvPool_ to verify caller-side initialization.
rg -n "tmaKSlidingWindowKvPool_" --type cpp --type h -C 4

Repository: flashinfer-ai/flashinfer

Length of output: 800

🏁 Script executed:

# Find the setKernelParams function and buildNdTmaDescriptor calls
rg -n "setKernelParams|buildNdTmaDescriptor" --type cpp --type h -A 3 -B 1

Repository: flashinfer-ai/flashinfer

Length of output: 9188

🏁 Script executed:

# Check the memset call around line 639 in kernelParams.h
head -660 include/flashinfer/trtllm/fmha/kernelParams.h | tail -50

Repository: flashinfer-ai/flashinfer

Length of output: 2785

🏁 Script executed:

# Search for where tmaKSlidingWindowKvPool_ is actually used/read
rg -n "tmaKSlidingWindowKvPool_" --type cpp --type cu

Repository: flashinfer-ai/flashinfer

Length of output: 95

🏁 Script executed:

# Read the full setKernelParams function (roughly lines 634-850)
sed -n '634,850p' include/flashinfer/trtllm/fmha/kernelParams.h

Repository: flashinfer-ai/flashinfer

Length of output: 11415

🏁 Script executed:

# Search for any usage/reads of tmaKSlidingWindowKvPool_ without file type restriction
rg "tmaKSlidingWindowKvPool_" -C 4

Repository: flashinfer-ai/flashinfer

Length of output: 773

🏁 Script executed:

# Look for sparse MLA related code to understand if this field is relevant
rg "sparse.*MLA|SlidingWindow" -C 2

Repository: flashinfer-ai/flashinfer

Length of output: 6475

🏁 Script executed:

# Search for tmaKSlidingWindowKvPool_ in all files, focusing on kernel code
find . -type f \( -name "*.cuh" -o -name "*.cu" \) -exec grep -l "tmaKSlidingWindowKvPool_" {} \;

Repository: flashinfer-ai/flashinfer

Length of output: 50

🏁 Script executed:

# Also search in any generated or header files for kernel access
rg "tmaKSlidingWindowKvPool_" . --type-list

Repository: flashinfer-ai/flashinfer

Length of output: 458

🏁 Script executed:

# Look for the sparse MLA kernel to see if it uses this field
rg -n "sparse.*kernel|slidingWindow" --type-list 2>/dev/null | head -20

Repository: flashinfer-ai/flashinfer

Length of output: 50

🏁 Script executed:

# Search for kernel code more broadly - look for .cuh and .cu files
find . -type f \( -name "*.cuh" -o -name "*.cu" \) | head -20

Repository: flashinfer-ai/flashinfer

Length of output: 668

🏁 Script executed:

# Search all files for any reference to this field with broader patterns
rg "SlidingWindowKvPool" .

Repository: flashinfer-ai/flashinfer

Length of output: 156

🏁 Script executed:

# Check git history to see if this field was recently added
git log --oneline --all -- "include/flashinfer/trtllm/fmha/kernelParams.h" | head -10

Repository: flashinfer-ai/flashinfer

Length of output: 842

🏁 Script executed:

# See the actual commit that added this field
git log -p --all -S "tmaKSlidingWindowKvPool_" -- "include/flashinfer/trtllm/fmha/kernelParams.h" | head -100

Repository: flashinfer-ai/flashinfer

Length of output: 50

🏁 Script executed:

# Look at recent commits to kernelParams.h to find when this field was added
git log -p --all -- "include/flashinfer/trtllm/fmha/kernelParams.h" | grep -B 10 -A 10 "tmaKSlidingWindowKvPool_" | head -50

Repository: flashinfer-ai/flashinfer

Length of output: 50

🏁 Script executed:

# Also check if there are any TODOs or comments about this field being unused
rg -B 5 -A 5 "tmaKSlidingWindowKvPool_"

Repository: flashinfer-ai/flashinfer

Length of output: 893

🏁 Script executed:

# Search for any test or reference that might use sparse MLA with sliding window
rg "sparse.*sliding|slidingWindow.*sparse" -C 3

Repository: flashinfer-ai/flashinfer

Length of output: 633

🏁 Script executed:

# Check the fmhaKernels.cuh file that handles sparse MLA selection
cat csrc/trtllm_fmha_kernel_launcher.cu | grep -A 20 -B 5 "sparse_mla"

Repository: flashinfer-ai/flashinfer

Length of output: 8495

Remove tmaKSlidingWindowKvPool_ or implement its initialization for sparse MLA sliding-window support.

The field is declared but never populated via buildNdTmaDescriptor and is unused throughout the codebase. Either remove it as dead code, or if intended for future sparse MLA sliding-window support, add the corresponding buildNdTmaDescriptor call inside setKernelParams (guarded by the sparse-MLA path, similar to how tmaK_ and tmaV_ are constructed).

Not sure if I'm missing something, but what's the reason for this never actually being initialized?

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for any TODO/FIXME near this field or sparse MLA sliding window
rg -n "TODO|FIXME|tmaKSlidingWindowKvPool_|SlidingWindowKvPool|sparse.*sliding|sliding.*sparse" -C 3
# Also check if the cubin kernels might reference struct offsets that imply this field must exist
rg -n "sparseMla|sparse_mla|mSparseMla" --type cpp --type h -C 3
# Check the full sparse MLA block in setKernelParams
grep -n "sparseMla\|SlidingWindow" include/flashinfer/trtllm/fmha/kernelParams.h

Length of output: 210871

@saltyminty Great question! The root cause is binary ABI compatibility with the precompiled cubins.

KernelParams is the struct passed directly to the TRTLLM-GEN GPU kernels, so its memory layout (field offsets) must exactly match what the precompiled cubins expect. This PR updates the cubin artifacts to a new version (as noted in the PR description), and the new cubins appear to include tmaKSlidingWindowKvPool_ in their expected KernelParams layout. If the field were absent from the host-side struct, every field declared after it would be at the wrong byte offset, causing silent data corruption in the kernel.

The memset zero-initialization acts as a safe "disabled" sentinel — the kernel can check whether the descriptor is valid before using it, and a zeroed CUtensorMap signals that this feature path is inactive.

You can see the same pattern with ptrSparseMlaTopKLens (line 122): it's declared, zero-initialized by memset, and explicitly pinned to nullptr at line 867 — all without a full host-side implementation. Both fields are effectively forward-ported struct layout stubs that maintain cubin ABI compatibility while the actual initialization logic is deferred to a follow-up.

So to summarize: the field must exist to keep the struct binary-compatible with the new cubins, and the zero-initialization from memset is the correct default for the "feature not active" case.
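
The layout argument above can be made concrete with a small `ctypes` model. This is only an illustration under stated assumptions: the two structs below are hypothetical and contain just three members, in an order chosen for the example, whereas the real `KernelParams` has many more fields. The one external fact relied on is that `CUtensorMap` is an opaque 128-byte object (16 64-bit words) in the CUDA driver API; the stub ignores its 64-byte alignment requirement.

```python
import ctypes


class TensorMapStub(ctypes.Structure):
    """Stand-in for CUtensorMap: an opaque 128-byte blob."""
    _fields_ = [("opaque", ctypes.c_uint64 * 16)]


class CubinLayout(ctypes.Structure):
    """Hypothetical layout the new cubins would expect (field order assumed)."""
    _fields_ = [
        ("tmaK_", TensorMapStub),
        ("tmaKSlidingWindowKvPool_", TensorMapStub),
        ("tmaV_", TensorMapStub),
        ("ptrSparseMlaTopKLens", ctypes.c_void_p),
    ]


class StaleHostLayout(ctypes.Structure):
    """The same struct if the host side omitted the new descriptor."""
    _fields_ = [
        ("tmaK_", TensorMapStub),
        ("tmaV_", TensorMapStub),
        ("ptrSparseMlaTopKLens", ctypes.c_void_p),
    ]


def field_offsets(layout):
    """Map each field name to its byte offset within the struct."""
    return {name: getattr(layout, name).offset for name, _ in layout._fields_}


expected = field_offsets(CubinLayout)
stale = field_offsets(StaleHostLayout)
for name in stale:
    # Every field after the missing member lands 128 bytes too early.
    print(f"{name}: expected offset {expected[name]}, stale host offset {stale[name]}")

# ctypes zero-initializes new instances, similar to the memset in setKernelParams.
params = CubinLayout()
print(bytes(params.tmaKSlidingWindowKvPool_) == bytes(128))  # descriptor is all zeros
print(params.ptrSparseMlaTopKLens is None)                   # null pointer
```

In this model every field declared after the omitted descriptor shifts by `sizeof(TensorMapStub)`, which is the kind of silent offset mismatch the comment describes.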

This is actually used for a new kind of kernel; see #3269.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/attention/test_attention_sink_blackwell.py (1)
215-215: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove debug print statement.

Line 215 (print(q.shape, k.shape, v.shape)) is a debug artifact that produces noise in CI output. It should be removed before merging.
🐛 Proposed fix
-    print(q.shape, k.shape, v.shape)
-
     o_ref = sink_attention_unified(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/attention/test_attention_sink_blackwell.py` at line 215, Remove the
debug print that prints tensor shapes — delete the statement printing q.shape,
k.shape, v.shape (the `print(q.shape, k.shape, v.shape)` line) from the test so
it no longer emits noisy output during CI; leave the surrounding test logic and
any assertions intact.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f785676b-3ed8-4591-90db-4b98cb5c0c67

📥 Commits

Reviewing files that changed from the base of the PR and between 8b17dc664bacc1e5685345ba66e92b4411465d84 and 92c6075b657b93a1c865835647f534560e9137c1.

📒 Files selected for processing (6)

flashinfer/artifacts.py
include/flashinfer/trtllm/fmha/fmhaKernels.cuh
include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
include/flashinfer/trtllm/fmha/kernelParams.h
tests/attention/test_attention_sink_blackwell.py
tests/attention/test_trtllm_gen_attention.py

✅ Files skipped from review due to trivial changes (2)

flashinfer/artifacts.py
include/flashinfer/trtllm/fmha/fmhaKernels.cuh

🚧 Files skipped from review as they are similar to previous changes (2)

include/flashinfer/trtllm/fmha/fmhaRunnerParams.h
tests/attention/test_trtllm_gen_attention.py

qsang-nv

LGTM

saltyminty

Approved but see comment about tmaKSlidingWindowKvPool_

saltyminty · 2026-05-08T17:26:24Z

+  // TMA descriptor for DSv4 sparse MLA sliding-window KV pool. Same format as tmaK_.
+  CUtensorMap tmaKSlidingWindowKvPool_;


Not sure if I'm missing something, but what's the reason for this never actually being initialized?

PerkzZheng · 2026-05-09T02:02:18Z

/bot run

flashinfer-bot · 2026-05-09T02:02:50Z

GitLab MR !644 has been updated with latest changes, and the CI pipeline #50739882 is currently running. I'll report back once the pipeline job completes.

PerkzZheng · 2026-05-11T00:40:47Z

@qsang-nv @saltyminty The CI failures appear to be caused by machine-allocation issues. Feel free to merge if it looks good to you. Thanks!

The B300 tests passed, so it should be good.

qsang-nv · 2026-05-11T02:43:33Z

/bot run

flashinfer-bot · 2026-05-11T02:43:39Z

GitLab MR !644 has been created, and the CI pipeline #50869866 is currently running. I'll report back once the pipeline job completes.

PerkzZheng requested review from aleozlx, bkryu, cyx-6, jimmyzho, kahyunnam, nv-yunzheq, qsang-nv, saltyminty, samuellees, sricketts, yongwww, yyihuang and yzh119 as code owners May 7, 2026 14:20

flashinfer-bot added the op: attention label May 7, 2026

PerkzZheng force-pushed the port-dynamic-tokens-per-page-gqa branch from 3ddae55 to 8b17dc6 Compare May 7, 2026 14:32

coderabbitai Bot reviewed May 7, 2026

PerkzZheng mentioned this pull request May 8, 2026

Add DeepSeek V4 sparse MLA TRTLLM-GEN kernels #3269

Merged

qsang-nv reviewed May 8, 2026

Comment thread include/flashinfer/trtllm/fmha/fmhaKernels.cuh

PerkzZheng force-pushed the port-dynamic-tokens-per-page-gqa branch 2 times, most recently from 92c6075 to dffd4b8 Compare May 8, 2026 07:45

coderabbitai Bot reviewed May 8, 2026

qsang-nv approved these changes May 8, 2026

saltyminty approved these changes May 8, 2026

Add dynamic tokens per page FMHA kernels

eb503ea

PerkzZheng force-pushed the port-dynamic-tokens-per-page-gqa branch from dffd4b8 to eb503ea Compare May 9, 2026 02:02

qsang-nv added the run-ci label May 11, 2026

qsang-nv merged commit 4f072c4 into flashinfer-ai:main May 11, 2026
42 of 43 checks passed

coderabbitai Bot mentioned this pull request May 14, 2026

Update trtllm FMHA cubins #3317

Merged

4 participants
