3D conv heuristics (KTN part) #3918

Merged
amd-bartgips merged 21 commits into miopenff/3d-heuristic from bartgips/3d-ktn
Jul 29, 2025
@amd-bartgips

Proposed changes

This PR introduces a new feature: a model that provides heuristics for tuning the CK kernels used for 3D convolutions. Initially it covers only gfx942:

  • gfx942_ConvHipImplicitGemm3DGroupWrwXdlops
  • gfx942_ConvHipImplicitGemm3DGroupBwdXdlops
  • gfx942_ConvHipImplicitGemm3DGroupFwdXdlops

These models differ from the current KTN implementation: they do not use an LSTM architecture, but instead rely on two TunaNet models:

  • input_encoder.tn.model: projects the input (conv parameters extracted from fdb_key) to latent space
  • kernel_config_encoder.tn.model: projects all possible kernel configurations (kernel parameters + split_k) for a particular fdb_key to the same latent space.

The kernel configs are then scored through a dot product with the latent representation of the input, and the kernel config with the highest score is selected.
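To illustrate the selection step, here is a minimal sketch in plain Python. It is not the code added by this PR: the function names, the toy latent vectors, and the latent dimension are all hypothetical, and the real encoders are the two TunaNet models listed above. It only shows the idea of scoring each candidate kernel config by its dot product with the input's latent vector and picking the highest score.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("latent vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def select_kernel_config(
    input_latent: Sequence[float],
    config_latents: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Score every candidate config against the input and return
    (index of the highest-scoring config, list of all scores)."""
    if not config_latents:
        raise ValueError("need at least one candidate kernel config")
    scores = [dot(input_latent, c) for c in config_latents]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy data (hypothetical): stands in for the outputs of the input encoder
# and the kernel config encoder, which project into the same latent space.
input_latent = [1.0, 0.0, 2.0]
config_latents = [
    [0.5, 1.0, 0.0],  # candidate 0
    [1.0, 0.0, 1.0],  # candidate 1
    [0.0, 3.0, 0.5],  # candidate 2
]

best_index, scores = select_kernel_config(input_latent, config_latents)
print(best_index, scores)
```

Because both encoders map into the same latent space, the input only needs to be encoded once per fdb_key, and each candidate config costs a single dot product to score.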

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added automated tests relevant to the introduced functionality
  • I have sufficient test coverage for the changes, and code coverage hasn't decreased as a result of my PR
  • I have run the tests, and they are all passing locally
  • I have added relevant documentation for the changes
  • I have removed the stale documentation which is no longer relevant after this pull request
  • I have run make format & make check_format to ensure the changes have been formatted

TODO:
* check if metadata is correct.
* Implement relevant cpp functions to load models and run them
… the conv_hip_implicit_gemm_3d_grouped_wrw_xdlops solver
* Forward declaration of EncodeKernelParams.
Removed some redundant code and unused vars to avoid most warnings during build (left unused var "arch" in for future use)
@amd-bartgips
Author

Based on only reading the code and fixing any issues that came up when building MIOpen, we are ready for the next steps:

  • make sure the 3D TunaNet works, so we get routed to the relevant CK kernels
  • test with MIOpenDriver
    Note that for now we have only implemented this for gfx942_ConvHipImplicitGemm3DGroupWrwXdlops, so we should test a 3D conv command with --forw 4.
  • expand this code to cover the other two directions (bwd, fwd)
    Note that these three kernels can probably share a lot of code, so avoid simply duplicating the same code across all three solvers.

@amd-bartgips amd-bartgips marked this pull request as ready for review July 29, 2025 12:23
@amd-bartgips amd-bartgips merged commit 3a97cec into miopenff/3d-heuristic Jul 29, 2025
4 of 37 checks passed
@amd-bartgips amd-bartgips deleted the bartgips/3d-ktn branch July 29, 2025 12:24