@qjia7 (Contributor) commented Sep 4, 2025

Description

This PR unifies the present_sequence_length in flash attention and removes the dependency on total_sequence_length. This is in preparation for supporting graph capture (#25868).

@qjia7 requested review from fs-eire and guschmue September 4, 2025 09:18
@guschmue added the ep:WebGPU (ort-web webgpu provider) label Sep 4, 2025
@guschmue merged commit 2132530 into main Sep 15, 2025
90 of 92 checks passed
@guschmue deleted the present_sequence_length branch September 15, 2025 16:13
3 participants