[rocBLAS] Users/torrezuk/swdev 568158 syrk ex tolerance fix#2851
Merged
TorreZuk merged 2 commits on Nov 24, 2025
TorreZuk (Contributor) commented on Nov 21, 2025:
- fix syrk_ex tolerance due to reference conversions and add double precision reference function
- add gfx11 tolerance template for f32 with f64 compute
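As a rough illustration of the idea behind the first bullet (this is not the code from this PR), the sketch below compares an f32-accumulated SYRK result against a reference accumulated in double, then checks the difference against a tolerance of the assumed form `k * eps(float) * scale`. All names here (`syrk_lower_ref`, `max_abs_error`, `syrk_tolerance`, `demo_syrk_error`) and the tolerance model itself are hypothetical and chosen only for illustration; the rocBLAS test suite may compute its tolerances differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical reference SYRK (not rocBLAS code): lower triangle of
// C = alpha * A * A^T + beta * C, column-major, A is n x k.
// Acc selects the accumulation precision (float or double).
template <typename Acc>
std::vector<Acc> syrk_lower_ref(int n, int k, float alpha,
                                const std::vector<float>& A, int lda,
                                float beta,
                                const std::vector<float>& C, int ldc)
{
    assert(lda >= n && ldc >= n);
    std::vector<Acc> out(C.begin(), C.end()); // upper triangle is left as-is
    for(int j = 0; j < n; ++j)
        for(int i = j; i < n; ++i)
        {
            Acc sum = 0;
            for(int p = 0; p < k; ++p)
                sum += static_cast<Acc>(A[i + p * lda]) * static_cast<Acc>(A[j + p * lda]);
            out[i + j * ldc] = static_cast<Acc>(alpha) * sum
                               + static_cast<Acc>(beta) * static_cast<Acc>(C[i + j * ldc]);
        }
    return out;
}

// Largest absolute difference between an f32 result and the f64 reference,
// taken over the lower triangle only.
double max_abs_error(int n, const std::vector<float>& got,
                     const std::vector<double>& ref, int ldc)
{
    double err = 0.0;
    for(int j = 0; j < n; ++j)
        for(int i = j; i < n; ++i)
            err = std::max(err, std::abs(static_cast<double>(got[i + j * ldc]) - ref[i + j * ldc]));
    return err;
}

// Assumed tolerance model for illustration: k * eps(float) * scale,
// where scale bounds the magnitude of the accumulated terms.
double syrk_tolerance(int k, double scale)
{
    return k * static_cast<double>(std::numeric_limits<float>::epsilon()) * scale;
}

// Builds deterministic inputs, runs both precisions, and returns the error
// of the f32-accumulated result measured against the f64 reference.
double demo_syrk_error(int n, int k)
{
    std::vector<float> A(n * k), C(n * n, 0.5f);
    for(int idx = 0; idx < n * k; ++idx)
        A[idx] = 0.1f * static_cast<float>((idx % 7) + 1);
    auto got = syrk_lower_ref<float>(n, k, 1.0f, A, n, 1.0f, C, n);
    auto ref = syrk_lower_ref<double>(n, k, 1.0f, A, n, 1.0f, C, n);
    return max_abs_error(n, got, ref, n);
}
```

Calling `demo_syrk_error(8, 64)` and comparing the result with `syrk_tolerance(64, 32.0)` shows the intended use: the scale of 32 is an upper bound on the sum of absolute terms for these inputs, so the f32 result should fall within the tolerance when measured against the double reference.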
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #2851 +/- ##
============================================
+ Coverage 54.54% 67.07% +12.53%
============================================
Files 14 362 +348
Lines 3768 51073 +47305
Branches 578 5837 +5259
============================================
+ Hits 2055 34255 +32200
- Misses 1468 13157 +11689
- Partials 245 3661 +3416
amcamd approved these changes on Nov 24, 2025
assistant-librarian (Bot) pushed a commit to ROCm/rocBLAS that referenced this pull request on Nov 24, 2025:
[rocBLAS] Users/torrezuk/swdev 568158 syrk ex tolerance fix (#2851)
* fix syrk_ex tolerance due to reference conversions and add double precision reference function
* add gfx11 tolerance template using f32 for f64 compute
TorreZuk added a commit that referenced this pull request on Nov 24, 2025:
* fix syrk_ex tolerance due to reference conversions and add double precision reference function
* add gfx11 tolerance template using f32 for f64 compute
(cherry picked from commit d471513)
TorreZuk added a commit that referenced this pull request on Nov 24, 2025:
* fix syrk_ex tolerance due to reference conversions and add double precision reference function
* add gfx11 tolerance template using f32 for f64 compute
(cherry picked from commit d471513)
idass1990 pushed a commit that referenced this pull request on Nov 27, 2025
vamovsik pushed a commit that referenced this pull request on Nov 28, 2025
tfalders pushed a commit to tfalders/rocm-libraries that referenced this pull request on Jan 21, 2026:
ROCm#2851)
* [CK_TILE] Add sequence padding and variable length support in fmha (and v3)
  - Group Mode Padding: Introduces the `-s_qpad` argument to support physically padded layouts. Kernels now use padded start pointers (`seqstart_padded_*_ptr`) for memory addressing.
  - Batch Mode Variable Length: Adds `-q_eff_lens` and `-kv_eff_lens` arguments for efficient processing of variable-length sequences by passing cumulative effective lengths (`cu_seqlen_*_ptr`) to the kernel.
  - FMHA examples: Support padding and variable length in both group and batch mode. The dispatcher is updated as well (dispatch to the kPadSeqLenK-enabled pipeline).
  - New padding test cases: Add padding test cases to `smoke_test_fwd.sh`, and add benchmarks to `benchmark_fwd.sh` and `benchmark_fwd_v3.sh` as well. These test cases and benchmarks specifically validate and benchmark the new padding and variable-length functionality in both group and batch modes.
* [CK_TILE] Fix build error in fmha unit tests

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
Co-authored-by: Yi DING <yi.ding@amd.com>
ammallya pushed a commit that referenced this pull request on Feb 3, 2026:
#2851)
* [CK_TILE] Add sequence padding and variable length support in fmha (and v3)
  - Group Mode Padding: Introduces the `-s_qpad` argument to support physically padded layouts. Kernels now use padded start pointers (`seqstart_padded_*_ptr`) for memory addressing.
  - Batch Mode Variable Length: Adds `-q_eff_lens` and `-kv_eff_lens` arguments for efficient processing of variable-length sequences by passing cumulative effective lengths (`cu_seqlen_*_ptr`) to the kernel.
  - FMHA examples: Support padding and variable length in both group and batch mode. The dispatcher is updated as well (dispatch to the kPadSeqLenK-enabled pipeline).
  - New padding test cases: Add padding test cases to `smoke_test_fwd.sh`, and add benchmarks to `benchmark_fwd.sh` and `benchmark_fwd_v3.sh` as well. These test cases and benchmarks specifically validate and benchmark the new padding and variable-length functionality in both group and batch modes.
* [CK_TILE] Fix build error in fmha unit tests

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
Co-authored-by: Yi DING <yi.ding@amd.com>
[ROCm/composable_kernel commit: 86dd59c]