Fix batch prefill example script for ragged kv cache #73

Merged
demandal25 merged 2 commits into ROCm:amd-integration from demandal25:fix-batch-prefill-script on Nov 21, 2025

@demandal25
Collaborator

This PR fixes the batch prefill example script for ragged kv cache. The batch prefill examples with ragged kv cache in the script now pass, which supports the basic correctness of PR #50.

More exhaustive tests are in the pytest script for batch prefill; those should also pass before this can be called a full victory.

Copilot AI review requested due to automatic review settings November 21, 2025 03:11
Copilot AI left a comment

Pull Request Overview

This PR fixes the batch prefill example script to properly support ragged KV cache functionality. The changes remove naive attention reference comparisons, add support for non-contiguous KV cache testing, improve LSE (log-sum-exp) output verification, and update test configurations.

Key changes:

  • Added contiguous_kv parameter to test non-contiguous memory layouts for KV cache
  • Enhanced verification to include LSE output comparison when return_lse=True
  • Removed naive attention comparisons and unused backend parameter
  • Updated test cases with new parameter values
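The LSE check listed among the key changes can be illustrated with a small reference computation. This is a minimal sketch in plain Python, not the code from `examples/batch_prefill_example.py`: the helper names `attention_with_lse` and `allclose` are hypothetical, a single query and a single head are assumed, and the natural logarithm is used, which may differ from the base the kernel reports.

```python
import math


def attention_with_lse(q, keys, values, sm_scale):
    """Reference attention for one query vector (hypothetical helper).

    Returns the attention output and the log-sum-exp of the scaled scores.
    """
    scores = [sm_scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    denom = sum(weights)
    lse = m + math.log(denom)
    head_dim = len(values[0])
    out = [
        sum(w * v[d] for w, v in zip(weights, values)) / denom
        for d in range(head_dim)
    ]
    return out, lse


def allclose(a, b, rtol=1e-3, atol=1e-3):
    """Elementwise tolerance check in the style of common allclose helpers."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# Two identical keys give uniform weights, so the output is the mean of the values.
out, lse = attention_with_lse(
    q=[1.0, 0.0],
    keys=[[0.0, 0.0], [0.0, 0.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
    sm_scale=1.0,
)
print(out, lse)
```

When a kernel also returns LSE, comparing both the output and the LSE against a reference like this catches normalization mistakes that an output-only comparison might miss.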

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/attention_reference.py | Added clarifying comments about NHD layout for K and V tensor parameters |
| examples/batch_prefill_example.py | Refactored batch prefill functions to support non-contiguous KV cache, added LSE verification, removed naive attention comparisons, and updated test configurations |
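To make the NHD layout and the ragged, possibly non-contiguous KV cache more concrete, here is a minimal sketch in plain Python. It assumes NHD means a `[token, head, dim]` ordering, that requests are packed back to back along the token axis and delimited by an index-pointer array, and that a non-contiguous cache can be modeled as a token stride larger than `num_heads * head_dim`. The names `build_indptr`, `request_tokens`, and `nhd_offset` are hypothetical; the source does not show how the example script builds its tensors.

```python
from itertools import accumulate


def build_indptr(seq_lens):
    """Prefix sums over per-request lengths: request i owns tokens [indptr[i], indptr[i+1])."""
    return [0, *accumulate(seq_lens)]


def request_tokens(indptr, i):
    """Token indices in the packed (ragged) buffer that belong to request i."""
    return range(indptr[i], indptr[i + 1])


def nhd_offset(token, head, dim, num_heads, head_dim, token_stride=None):
    """Flat offset of element (token, head, dim) in an NHD buffer.

    token_stride defaults to the contiguous value num_heads * head_dim.
    A larger stride models a non-contiguous view, for example a slice of a
    wider buffer (an assumption made for illustration only).
    """
    if token_stride is None:
        token_stride = num_heads * head_dim
    return token * token_stride + head * head_dim + dim


kv_indptr = build_indptr([54, 54, 54])
print(kv_indptr)
print(list(request_tokens(kv_indptr, 1))[:3])
print(nhd_offset(1, 0, 0, num_heads=8, head_dim=64))
print(nhd_offset(1, 0, 0, num_heads=8, head_dim=64, token_stride=1024))
```

A kernel that silently assumes the contiguous stride would read the wrong elements in the last case, which is the kind of bug a non-contiguous test configuration is meant to expose.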

@demandal25 demandal25 requested a review from rtmadduri November 21, 2025 03:12
@demandal25 demandal25 marked this pull request as draft November 21, 2025 03:17
@demandal25 demandal25 marked this pull request as ready for review November 21, 2025 03:59
Copilot AI review requested due to automatic review settings November 21, 2025 03:59
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.



  # Basic ragged KV cache test with causal masking
  batch_prefill_with_ragged_kv_cache_example(
-     12, 54, 37, 8, 8, 128, True, "NONE", 0.0, False
+     12, 54, 37, 8, 8, 64, False, "NONE", 0.0, False
Copilot AI Nov 21, 2025

The comment on line 427 states 'Basic ragged KV cache test with causal masking' but the function call passes causal=False. Either update the comment to match the actual parameter or change the parameter to True.

Suggested change
- 12, 54, 37, 8, 8, 64, False, "NONE", 0.0, False
+ 12, 54, 37, 8, 8, 64, True, "NONE", 0.0, False
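The difference between the two values of the causal flag can be shown with a short sketch. It is not taken from the example script: `visible_keys` is a hypothetical helper, the mask is assumed to be aligned so that the last query token sees every key (other alignments exist), and 37 query tokens against 54 key/value tokens are used purely as illustrative numbers, since the source does not name the positional arguments.

```python
def visible_keys(qo_idx, qo_len, kv_len, causal):
    """Number of key positions a query token may attend to (hypothetical helper).

    Assumes a causal mask aligned to the end of the sequence: the last query
    token sees all kv_len keys, and each earlier query token sees one fewer.
    """
    if not causal:
        return kv_len
    return max(0, min(kv_len, qo_idx + kv_len - qo_len + 1))


qo_len, kv_len = 37, 54  # illustrative lengths only
for causal in (False, True):
    first = visible_keys(0, qo_len, kv_len, causal)
    last = visible_keys(qo_len - 1, qo_len, kv_len, causal)
    print(f"causal={causal}: first query sees {first} keys, last query sees {last}")
```

With the flag off, every query token attends to all keys, so a test labelled as causal but run with `False` would not exercise the masking path at all.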

@demandal25 demandal25 merged commit 1476ab8 into ROCm:amd-integration Nov 21, 2025
1 check passed
diptorupd pushed a commit that referenced this pull request Dec 5, 2025
Upgrade both the base and CI dockers in `docker/Dockerfile.rocm_ci` to
the new version:

- rocm6.4
- ubuntu24.04
- py3.12
- torch2.7.1
diptorupd pushed a commit that referenced this pull request Dec 5, 2025
This PR fixes the batch prefill example script for ragged kv cache. The
examples for batch prefill with ragged kv cache in the script are now
passing. This reinforces the basic correctness of the PR #50

More exhaustive tests are in the pytest script for batch prefill and
should pass to call it a full victory.
@demandal25 demandal25 deleted the fix-batch-prefill-script branch January 5, 2026 23:51
zhenhantech pushed a commit to zhenhantech/flashinfer that referenced this pull request Jan 9, 2026
zhenhantech pushed a commit to zhenhantech/flashinfer that referenced this pull request Jan 9, 2026
diptorupd pushed a commit to diptorupd/flashinfer that referenced this pull request Jan 28, 2026
diptorupd pushed a commit to diptorupd/flashinfer that referenced this pull request Jan 28, 2026