Skip to content

Conversation

@qjia7
Copy link
Contributor

@qjia7 qjia7 commented Apr 23, 2025

Fixed the bug in #24228 which causes the incorrect result for phi models when flash attention is disabled.

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Apr 23, 2025
@qjia7 qjia7 marked this pull request as ready for review April 24, 2025 01:29
@qjia7 qjia7 requested review from fs-eire and guschmue April 24, 2025 01:34
@guschmue
Copy link
Contributor

lgtm.

@qjia7 qjia7 merged commit 5c014e2 into main Apr 25, 2025
87 of 89 checks passed
@qjia7 qjia7 deleted the fix_bug_in_1d_dispatch branch April 25, 2025 05:22
vraspar pushed a commit that referenced this pull request Apr 28, 2025
Fixed the bug in #24228 which causes the incorrect result for phi models
when flash attention is disabled.
jywu-msft pushed a commit that referenced this pull request Apr 30, 2025
### Description

Cherry pick the following into
[rel-1.22.0](https://github.com/microsoft/onnxruntime/tree/rel-1.22.0)


- (#24487)
- (#24466)
- (#24493)
- (#24484)
- (#24494)
- (#24489)
- (#24504)
- (#24510)
- (#24456)
- (#24537)
- (#24501)
- (#24519)
- (#24513)
- (#24539)
- (#24514)
- (#24542)
- (#24585)

Not added:

Planning to cherry pick Cuda Matmulnbits PRs once the fix for failing
cuda pipeline is ready
- (#24491)
- (#24509)
- (#24564)

---------

Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: minfhong-quic <[email protected]>
Co-authored-by: minfhong-quic <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: Prathik Rao <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Ankan Banerjee <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: iraut <[email protected]>
Co-authored-by: Hrishikesh Manohar <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Jiajia Qin <[email protected]>
Co-authored-by: kunal-vaishnavi <[email protected]>
Co-authored-by: xhcao <[email protected]>
jatinwadhwa921 pushed a commit to intel/onnxruntime that referenced this pull request Apr 30, 2025
### Description

Cherry pick the following into
[rel-1.22.0](https://github.com/microsoft/onnxruntime/tree/rel-1.22.0)


- (microsoft#24487)
- (microsoft#24466)
- (microsoft#24493)
- (microsoft#24484)
- (microsoft#24494)
- (microsoft#24489)
- (microsoft#24504)
- (microsoft#24510)
- (microsoft#24456)
- (microsoft#24537)
- (microsoft#24501)
- (microsoft#24519)
- (microsoft#24513)
- (microsoft#24539)
- (microsoft#24514)
- (microsoft#24542)
- (microsoft#24585)

Not added:

Planning to cherry pick Cuda Matmulnbits PRs once the fix for failing
cuda pipeline is ready
- (microsoft#24491)
- (microsoft#24509)
- (microsoft#24564)

---------

Co-authored-by: vraspar <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: minfhong-quic <[email protected]>
Co-authored-by: minfhong-quic <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: Prathik Rao <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Ankan Banerjee <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: iraut <[email protected]>
Co-authored-by: Hrishikesh Manohar <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Jiajia Qin <[email protected]>
Co-authored-by: kunal-vaishnavi <[email protected]>
Co-authored-by: xhcao <[email protected]>
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request May 12, 2025
Fixed the bug in microsoft#24228 which causes the incorrect result for phi models
when flash attention is disabled.
@snnn
Copy link
Contributor

snnn commented Sep 5, 2025

This PR has been included in the rel-1.22.0 branch. Removing the release:1.22.0 label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ep:WebGPU ort-web webgpu provider

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants