Create tensor descriptor inside kernel to improve performance of small/tiny gemm cases #596

Closed
carlushuang wants to merge 16 commits into develop from simplified_karg

Contributor

@carlushuang carlushuang commented Feb 22, 2023

CK_EXPERIMENTAL_USE_BUFFER_ATOMIC_ADD_OOB_CHECK_OFFSET_TRICK has a bug when OOB checking is used with atomics, so it has to be switched to 0.
=> #616
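
As a configuration sketch, the switch described above might look like the line below. This assumes the macro is a plain 0/1 compile-time switch defined in include/ck/ck.hpp (the file the review thread in this PR points at); the surrounding definitions are omitted and the exact location is an assumption.

```cpp
// Sketch only: disable the offset trick for buffer atomic add until the
// OOB bug tracked in #616 is fixed. Assumed to live in include/ck/ck.hpp.
#define CK_EXPERIMENTAL_USE_BUFFER_ATOMIC_ADD_OOB_CHECK_OFFSET_TRICK 0
```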

For 2D GEMM, the original karg is 292 bytes and the simplified karg is 68 bytes.
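
To illustrate why building the descriptor inside the kernel shrinks the kernel argument, here is a minimal host-only sketch. All names (`Desc2D`, `FatKarg`, `SlimKarg`, `make_desc`, `make_c_desc_in_kernel`, `slim_matches_fat`) are hypothetical and are not CK types. The struct sizes do not reproduce the 292/68 byte figures above; they only show the same direction of change.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 2D tensor descriptor: lengths, strides, and a
// derived element-space size. CK's real descriptors carry more state.
struct Desc2D {
    std::array<std::int32_t, 2> lengths;
    std::array<std::int32_t, 2> strides;
    std::int32_t element_space_size;

    // Linear offset of element (i, j).
    std::int64_t offset(std::int32_t i, std::int32_t j) const {
        return std::int64_t(i) * strides[0] + std::int64_t(j) * strides[1];
    }
};

// Build a row-major descriptor from raw sizes and one stride.
inline Desc2D make_desc(std::int32_t rows, std::int32_t cols, std::int32_t row_stride) {
    Desc2D d{};
    d.lengths = {rows, cols};
    d.strides = {row_stride, 1};
    d.element_space_size = (rows - 1) * row_stride + cols;
    return d;
}

// "Fat" argument: the host builds every descriptor and passes them by value.
struct FatKarg {
    const float* a;
    const float* b;
    float* c;
    Desc2D a_desc, b_desc, c_desc;
};

// "Slim" argument: only pointers, problem sizes, and strides are passed.
struct SlimKarg {
    const float* a;
    const float* b;
    float* c;
    std::int32_t M, N, K;
    std::int32_t stride_a, stride_b, stride_c;
};

// The slim argument is smaller because derived descriptor data is not passed.
static_assert(sizeof(SlimKarg) < sizeof(FatKarg), "slim karg should be smaller");

// What a kernel would do at entry with the slim argument: rebuild the
// descriptor from raw sizes instead of reading it from the argument buffer.
inline Desc2D make_c_desc_in_kernel(const SlimKarg& k) {
    return make_desc(k.M, k.N, k.stride_c);
}

// Check that both paths describe the same C tensor for packed row-major
// A[M,K], B[K,N], C[M,N].
inline bool slim_matches_fat(std::int32_t M, std::int32_t N, std::int32_t K) {
    assert(M > 0 && N > 0 && K > 0);
    FatKarg fat{nullptr, nullptr, nullptr,
                make_desc(M, K, K), make_desc(K, N, N), make_desc(M, N, N)};
    SlimKarg slim{nullptr, nullptr, nullptr, M, N, K, K, N, N};
    const Desc2D c = make_c_desc_in_kernel(slim);
    return c.offset(M - 1, N - 1) == fat.c_desc.offset(M - 1, N - 1) &&
           c.element_space_size == fat.c_desc.element_space_size;
}
```

Calling `slim_matches_fat(4, 8, 16)` from a test or `main` confirms that the descriptor rebuilt from the slim argument addresses the same elements as the prebuilt one. The trade-off in this sketch is a few extra integer operations per kernel launch in exchange for a smaller argument buffer, which plausibly matters most when the GEMM itself is tiny.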

Up to 10%+ performance improvement:
[image]

@zjing14
Contributor

zjing14 commented Mar 2, 2023

@carlushuang Could you measure the argument size with the improvement?

Comment thread include/ck/ck.hpp Outdated
@carlushuang
Contributor Author

For 2D GEMM, the original karg is 292 bytes and the simplified karg is 68 bytes.

@zjing14 Updated in the PR description.

zjing14 and others added 6 commits March 6, 2023 01:22
* clean up

* fast gelu using builtin function

* clean

* clean

* clean

* clean:

* clean

* fix compilation

* clean

* clean

---------

Co-authored-by: zjing14 <zhangjing14@gmail.com>
* fix a bug blocking wmma_gemm_multipleD

* Utilize matrix padder in device_wmma_op

* cosmetic change for gemmpadding format

* clang format

* Change gridwise gemm from FIFO to KMN loop fashion
* suppress the reserved-identifier warnings

* keep BUILD_DEV=On and use -Werror by default
* add new parallel stage on navi node

* dont run performance tests on navi, get rid of 9110 compiler

* only run navi build when not doing QA

* fix syntax

* use navi21 label

* dont stash profiler on navi nodes, scp deb package to ginger

* disable tests on navi nodes

* test posting a binary to ginger

* add sshpass and use it to copy deb package

* fix the scp example

* fix syntax

* debug the scp issues

* add jenkins user to docker

* dont try whoami

* change jenkins uid and add user with uid=1002

* try scp from the last stage on micimaster

* rename and stash the package, scp from micimaster
@carlushuang
Contributor Author

carlushuang commented Mar 6, 2023

Depends on #616, which fixes the atomic bug.

@zjing14 zjing14 requested a review from asroy March 9, 2023 14:08
Contributor

@asroy asroy left a comment

This PR needs to be refactored:

@carlushuang
Contributor Author

Closing this PR since the change has been refactored into #644.

@carlushuang carlushuang deleted the simplified_karg branch March 23, 2023 14:02

5 participants