Fix MIArchVgpr=1 + LSU=2 + SK and fix grouped gemm solution selection#345

Merged
AlexBrownAMD merged 3 commits into ROCm:develop from AlexBrownAMD:FixBugs on Jun 24, 2025

Conversation

@AlexBrownAMD
Contributor

Fix incorrect assembly code generated for the stream-k fixup step when LSU=2 and MIArchVgpr=1, which caused results to be written out in the wrong order. This fixes the failing test:

nightly_matmul_deepbench_f16_rf16_rf16_rf16_rf32_r_NN_576_128_12544_1_576_12544_0_576_576_1

Also fix an issue in grouped GEMM solution selection that could cause a divide-by-zero fault. This fixes the test:

pre_checkin_matmul_groupedgemm_zero_n_f16_rf16_rf16_rf16_rf32_r_NT_127_127_127_1_127_127_0_127_127_1_GG7
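
For readers unfamiliar with this class of bug, here is a minimal sketch of how a zero-sized group can trip a division during solution selection, and one way to guard against it. It is purely illustrative: the PR description does not show the actual change, and every name here (`Problem`, `tile_efficiency`, `pick_tile`) and the scoring heuristic itself are hypothetical, not the library's code.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Problem:
    """One GEMM in a grouped GEMM (hypothetical type, for illustration only)."""
    m: int
    n: int
    k: int


def tile_efficiency(problem, tile_m, tile_n):
    """Fraction of the launched tile area that does useful work.

    Returns None for an empty problem (m == 0 or n == 0). Without this
    guard, a zero dimension gives zero tiles and the division below
    raises ZeroDivisionError.
    """
    if problem.m == 0 or problem.n == 0:
        return None
    tiles_m = ceil(problem.m / tile_m)
    tiles_n = ceil(problem.n / tile_n)
    return (problem.m * problem.n) / (tiles_m * tile_m * tiles_n * tile_n)


def pick_tile(problems, candidate_tiles):
    """Pick the candidate tile with the best mean efficiency over non-empty groups."""
    def score(tile):
        effs = [e for p in problems
                if (e := tile_efficiency(p, *tile)) is not None]
        # If every group is empty, all candidates score the same.
        return sum(effs) / len(effs) if effs else 0.0
    return max(candidate_tiles, key=score)


# A grouped GEMM in which one group has n == 0, in the spirit of the zero_n test.
groups = [Problem(127, 127, 127), Problem(127, 0, 127)]
best = pick_tile(groups, [(256, 256), (128, 128)])
```

The design choice in this sketch is to skip empty groups when scoring rather than treat them as an error, since a group with a zero dimension is a valid input that simply has no work to do.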

bnemanich previously approved these changes Jun 23, 2025
aliry95amd previously approved these changes Jun 23, 2025
KKyang previously approved these changes Jun 23, 2025
@AlexBrownAMD AlexBrownAMD dismissed stale reviews from KKyang, aliry95amd, and bnemanich via c2f6613 June 23, 2025 22:39
@jichangjichang
Contributor

gfx908 got tox errors.

@aliry95amd aliry95amd mentioned this pull request Jun 24, 2025
@AlexBrownAMD AlexBrownAMD merged commit e65dce7 into ROCm:develop Jun 24, 2025
13 of 19 checks passed
assistant-librarian Bot pushed a commit to ROCm/hipBLASLt that referenced this pull request Jun 24, 2025
AlexBrownAMD added a commit to AlexBrownAMD/rocm-libraries that referenced this pull request Jun 24, 2025
AlexBrownAMD added a commit to AlexBrownAMD/rocm-libraries that referenced this pull request Jun 27, 2025
evetsso pushed a commit to evetsso/rocm-libraries that referenced this pull request Dec 31, 2025
…ashot_091825

Merge rahulc/develop_snpashot_091825 to gfx1250 after solver migration PR#192
5 participants