
[Bug fix] GridwiseGemm_k0mk1_k0nk1_mn_xdlops_v2r3 k0 main loop #45

Merged
asroy merged 1 commit into develop from bug_fix_grid_gemm_k0_main_loop
Oct 21, 2021

Conversation

asroy (Contributor) commented Oct 21, 2021

In GridwiseGemm_k0mk1_k0nk1_mn_xdlops_v2r3, sometimes there is no main K0 loop. The host needs to detect that case so that the kernel does not run the main K0 loop.

GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4 has the same issue and needs to be fixed in a separate PR.
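
The host-side detection described above can be sketched roughly as follows. This is a simplified, CPU-only illustration and not the code from this PR: the names `has_main_k0_block_loop`, `run_gemm_k0_tiles`, and `launch_gemm` are hypothetical, and the loop structure (a do-while main loop followed by a tail step) is an assumption about how such a pipelined kernel is typically laid out. The function only counts how many K0 tiles would be consumed, to show why a do-while main loop is wrong when K0 fits in a single tile.

```cpp
#include <cassert>

// Hypothetical host-side check (assumed name): a main K0 loop exists only
// when K0 spans more than one K0PerBlock tile.
constexpr bool has_main_k0_block_loop(int K0, int K0PerBlock)
{
    return (K0 / K0PerBlock) > 1;
}

// Simulated kernel body (assumed structure). Returns the number of K0 tiles
// consumed. The flag is a template parameter, so the choice is made at
// compile time, as it would be for a real kernel instantiation.
template <bool HasMainK0BlockLoop>
int run_gemm_k0_tiles(int K0, int K0PerBlock)
{
    int tiles_consumed = 0;

    if(HasMainK0BlockLoop)
    {
        int k0_begin = 0;
        // A do-while body always runs at least once, so entering this loop
        // when K0 == K0PerBlock would consume one tile too many.
        do
        {
            ++tiles_consumed;
            k0_begin += K0PerBlock;
        } while(k0_begin < K0 - K0PerBlock);
    }

    // Tail: the last tile is always processed outside the main loop.
    ++tiles_consumed;
    return tiles_consumed;
}

// Hypothetical host-side dispatch: pick the instantiation that matches K0.
int launch_gemm(int K0, int K0PerBlock)
{
    return has_main_k0_block_loop(K0, K0PerBlock)
               ? run_gemm_k0_tiles<true>(K0, K0PerBlock)
               : run_gemm_k0_tiles<false>(K0, K0PerBlock);
}
```

With K0 = 4 and K0PerBlock = 4, `launch_gemm` selects the `false` instantiation and consumes exactly one tile, whereas forcing `run_gemm_k0_tiles<true>` on the same sizes would consume two.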

asroy added the bug (Something isn't working) label Oct 21, 2021
asroy requested a review from ltqin October 21, 2021 16:01
asroy closed this Oct 21, 2021
asroy reopened this Oct 21, 2021
asroy merged commit d5297ab into develop Oct 21, 2021
junliume deleted the bug_fix_grid_gemm_k0_main_loop branch October 21, 2023 06:09
asroy added a commit that referenced this pull request Dec 1, 2023
carlushuang pushed a commit that referenced this pull request Apr 26, 2024
* Use kernel shape to define encoding

* Remove redundant blockSize

* Save mean and invStd

* Refine SAVE_MEAN_INV_STD

* Refine template parameter name

* Let mean var tensor easier to get

* Add InvSqrt() function, prepare to add fast InvSqrt algorithm

* Refine function name

* remove useless reference

* Rename kXNumWarps to kXWarpPerBlock

* Refine InvSqrt()

* remove if condition, global write is warp level, if condition is useless here

Labels

bug Something isn't working
