[CUDA] Run FlashAttention regression test only when FlashAttention is available #27206
Merged
hariharans29 merged 3 commits into main on Feb 4, 2026
Conversation
Contributor
Pull request overview
This PR adds a conditional skip decorator to the TestGQARegressions test class to prevent FlashAttention regression tests from running when FlashAttention is not available on the system. This unblocks CI build pipelines on machines that lack the necessary CUDA support or hardware.
Changes:
- Added `@unittest.skipIf` decorator to the `TestGQARegressions` class to skip tests when FlashAttention is unavailable (a sketch of the pattern follows)
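A minimal sketch of the resulting pattern, assuming a hypothetical `has_flash_attention()` helper; the PR's actual helper name and skip message may differ:

```python
import unittest


def has_flash_attention() -> bool:
    # Hypothetical stand-in for the test suite's real availability check;
    # a fuller version is sketched in the Description below.
    try:
        import torch
        return torch.cuda.is_available()
    except ImportError:
        return False


@unittest.skipIf(not has_flash_attention(), "FlashAttention is not available on this system")
class TestGQARegressions(unittest.TestCase):
    def test_gqa_regression(self):
        # Real tests exercise GroupQueryAttention kernels; placeholder here.
        self.assertTrue(has_flash_attention())


if __name__ == "__main__":
    unittest.main()
```

Because `skipIf` is applied at class level, every test method in `TestGQARegressions` is skipped together, with one uniform reason, whenever the check fails.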
tianleiwu approved these changes on Jan 29, 2026
titaiwangms approved these changes on Jan 30, 2026
tianleiwu pushed a commit that referenced this pull request on Feb 4, 2026
… available (#27206)

### Description
As title. The FlashAttention availability check covers whether torch has CUDA support, whether the system has the right device to run FlashAttention, and so on.

### Motivation and Context
Fix Windows CUDA CI failures.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
tianleiwu added a commit that referenced this pull request on Feb 4, 2026
#27021: Disable matmul 1d tests on DML (1afc8bc)
#27206: [CUDA] Run FlashAttention regression test only when FlashAttention is… (4d95d97)
#27120: POWER: Fix build failure due to unsupported cpuinfo on ppc64le (2843ec0)

Co-authored-by: Ti-Tai Wang <titaiwang@microsoft.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: BODAPATIMAHESH <148746454+BODAPATIMAHESH@users.noreply.github.com>
Description
As title.
The FlashAttention availability check includes whether torch has CUDA support, whether the system has the right device to run FlashAttention, etc.
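A hedged sketch of what such a layered check could look like, using torch as the probe; the helper name and the exact capability threshold are assumptions, not the test suite's verbatim code:

```python
def has_flash_attention() -> bool:
    """Best-effort check that FlashAttention can actually run here (sketch only)."""
    try:
        import torch  # probe CUDA through torch, as the description suggests
    except ImportError:
        return False
    # torch must be built with CUDA support and see at least one GPU.
    if not torch.cuda.is_available():
        return False
    # Assumption: FlashAttention v2 kernels generally require an Ampere-class
    # (compute capability 8.0) or newer device.
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8
```

Layering the conditions this way means the cheapest failure (no torch, no CUDA build) short-circuits first, and the skip reason stays accurate on CI machines without a suitable GPU.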
Motivation and Context
Fix Windows CUDA CI failures