Add 3D convolution AI heuristics implementation#3925
Conversation
- Add Gfx942Model_3D class for 3D convolution predictions - Implement metadata_3d.cpp for 3D feature extraction - Add trained 3D TunaNet models for gfx942 - Update CMakeLists.txt for 3D components
|
|
||
| namespace miopen { | ||
| namespace ai { | ||
|
|
There was a problem hiding this comment.
I like this refactor to miopen::ai::common, makes sense to me.
There was a problem hiding this comment.
A question from a C++ beginner: Why would we place these function definitions in the .hpp file instead of the .cpp file?
There was a problem hiding this comment.
It seems that MIOpen wants have the option to distribute the library as a header-only library, where the user can compile the library with their own code. When the implementation is in header files, this is possible. If you would place the implementation to .cpp files, you would need to distribute compiled libraries. This create problems with compiler/runtime versions since users of MIOpen would need to have versions that are compatible with the compiled binaries. Header-only library is free of such dependencies.
There was a problem hiding this comment.
Actually, since you already have the .cpp file, I would move all implementations to the .cpp file. The hybrid is just confusing.
There was a problem hiding this comment.
So only declarations (signature) in the header file and the implementation in the .cpp file.
There was a problem hiding this comment.
Related to this: I seem to recall John Shumway saying something about being header-only in our meeting yesterday... or was that only about CK?
There was a problem hiding this comment.
I created metadata_3d.cpp, which needed to use ReverseMap and LoadJSON functions that were originally in hpp file. To make these functions accessible to both ai_heuristics.cpp and metadata_3d.cpp, I moved them to a header file (.hpp).
| /** | ||
| * @brief 3D-specific metadata handler for TunaNet3D models | ||
| * | ||
| * This class provides a simple interface for accessing 3D convolution metadata. |
There was a problem hiding this comment.
did you need to alter the metadata json structure at all? Or did you simply use the current file format that our python code spits out?
There was a problem hiding this comment.
I used the JSON structure generated from the Python code as is.
| } | ||
|
|
||
| bool IsProblemSupported(const conv::ProblemDescription& problem, | ||
| const ExecutionContext& ctx) const override |
There was a problem hiding this comment.
I receive a warning when building, since ctx is not used, perhaps do:
| const ExecutionContext& ctx) const override | |
| [[maybe_unused]] const ExecutionContext& ctx) const override |
There was a problem hiding this comment.
It is easier to simply write
bool IsProblemSupported(const conv::ProblemDescription& problem,
const ExecutionContext&) const override
If you are not using the ctx. This should remove the compiler warning.
There was a problem hiding this comment.
Thank you Bart and Ville, I corrected.
There was a problem hiding this comment.
As Ville rightfully pointed out. Perhaps we should make these git-lfs files. Can you perhaps check if this is the case with the other tunanet files?
There was a problem hiding this comment.
Currently in Git LFS:
- Only *.kdb.bz2 files
NOT in Git LFS
- All *.tn.model files (TunaNet models) - ranging from 2KB to 1.1MB
- All *.ktn.model files (Kernel Tuning models) - ranging from 827 bytes to
1.1MB
If we need to track git-lfs for model files then
git lfs track "*.tn.model"
git lfs track "*.ktn.model"
git add .gitattributes
git add src/kernels/*.tn.model src/kernels/*.ktn.model
git commit -m "Move TunaNet models to Git LFS"
|
When I run your commands suggested above it is better to set the log level higher: So now it's easier to see that your new stuff is properly being called :) Any idea why it gives us two solutions here at the end? (ConvHipImplicitGemm3DGroupFwdXdlops, GemmFwdRest). Shouldn't the tunanet only give us one of them? Or is one of the two a hard-coded fallback? Besides this, as I said yesterday: it would be nice to have explicit test functions for you specific functions :) |
| * | ||
| * MIT License | ||
| * | ||
| * Copyright (c) 2023 Advanced Micro Devices, Inc. |
| } catch (...) { | ||
| success = false; | ||
| return {}; | ||
| } |
There was a problem hiding this comment.
Perhaps a better pattern is to return std::optional<std::vector<std::string>> where std::nullopt indicates failure. This would apply to all LoadXXX methods. Then you don't need additional success variable. Generally, the functional style of programming where functions do not have side-effects (such as modifying value of an input variable) is safer.
There was a problem hiding this comment.
I didn't know the std::optional. Thank you. I've fixed the code.
There was a problem hiding this comment.
I will keep in mind "So only declarations (signature) in the header file and the implementation in the .cpp file."
Since this was existing code, I'll probably need to refactor it after merging it with Bart. If I modify the existing code, Bart's code will likely stop working.
Thank you!!
|
|
||
| namespace miopen { | ||
| namespace ai { | ||
|
|
There was a problem hiding this comment.
Actually, since you already have the .cpp file, I would move all implementations to the .cpp file. The hybrid is just confusing.
|
|
||
| namespace miopen { | ||
| namespace ai { | ||
|
|
There was a problem hiding this comment.
So only declarations (signature) in the header file and the implementation in the .cpp file.
- Add 13 unit tests covering 3D AI heuristics functionality - Test metadata loading, model creation, and solver predictions - Add MIOpenDriver equivalent test (BartTest) for real-world validation - Enhance logging to show solver predictions with scores - Verify correct solver selection for 3D convolution problems
There was a problem hiding this comment.
I think perhaps this could be slightly renamed to be more accurate? I.e. this is only to test the "tunanet" part, right? So maybe name it something like that? Or something with "solver_selection".
This is just to make sure that this does not test the heuristics related to kernel tuning/parameter selection.
There was a problem hiding this comment.
Otherwise, this all looks good! All tests pass on my fresh build 🥳
…tor of Metadata3D, since the compiler did not like this.
|
Merging this, next step: #3923 |
@amd-bartgips , @vpietila-amd , @amd-ahyttine
3D tunanet code PR
Test script