[hipblaslt] Add support for gfx950 mxfp4 #6499
Merged
251 commits
148d196
Contraction and DataType rebase
javier-amd 3b41f20
client modification
javier-amd 1bc4230
Parameters and LocalRead rebased
wen-des 32d700f
rocisa support and other changes
javier-amd 81c40d2
Components rebased
wen-des 28f2602
writer related change and yaml
javier-amd d89ad91
Conversion rebased
wen-des 2822693
Fixed errors in compiling
wen-des 0df3258
Fixed python space issues
wen-des 089cd5a
Bugfixed in python files and generated kernel successfully
wen-des 74d6373
Disable swap address for mxsa/mxsb
wen-des e7cb8f6
Committed some missing fixes
wen-des f56cb78
Fixed mxsa/mxsb address offset
wen-des 0649b1e
Added TODO memo for later consideration
wen-des b81161d
bpe function fix
javier-amd ee1714e
Bugfixed for the wrong address offset calculation
wen-des c89efb4
MX F8 functional tests passed in tensilelite
wen-des cf1251a
Updated f8 yaml file
wen-des 8478d05
Removed the mx f6 yaml files for mx f6 is not ready by now
wen-des 67b50bd
Updated f4 yaml file for test coverage
wen-des 045f9ec
Standardize kernel names with MX types (#4363)
AlexBrownAMD 9cb5440
Fix some errors breaking non-mx tests on mx branch (#4616)
AlexBrownAMD de7dee5
Fix for gfx950 mxfp4 DirectToLds (#4644)
nakajee e0a7991
[hipBLASLt] Enable MX data generation for Tensile host and support ca…
amd-chunxlin 9e0422c
[hipBLASLt] Add block size into predicate for correct solution select…
amd-chunxlin 7afd6fb
[Tensilelite] Add MXFP4 data generator for Tensile (#4597)
archana-ramalingam e91ecf3
Enable DirectToLds for MXSA/B and re-enable LdsPad for MXFP4 + Direc…
nakajee e0e6ecc
Fix data initialization (#4827)
bnemanich dab0b9c
Fix a verification fail with MXFP4 + non DTL (#4715)
nakajee a3654aa
[hipblaslt] Fixing build issues for gfx_950_mx_rebase (#4465)
NineKa fd621eb
[TensileLite] Fix MX FP4 scale data overwrite in initializeCPUInputs …
archana-ramalingam a2ce1ab
Fix stream-k with mx scaling (#4388)
AlexBrownAMD 1c2fe0e
[hipblaslt] Fix fails with dtl.yaml and xfp32.yaml on gfx950_mx_rebas…
nakajee 3b3c84b
Merge commit '4ffdf58b7d36b29ad86806c642e8d7aa930deeaf' into users/ho…
NineKa 613ccdb
add kernel["ProblemType"]["Sparse"] to condition
NineKa a4a6368
Merge commit '0db944b2e05878e30d441fb1b32421096107ddf5' into users/ho…
nakajee 337dbbe
fix dependency issues for tensilelite clients
NineKa 20e4cb1
Merge commit '70b16b75e53a69200142bf27fa6f90771a0ba0c9' into users/ho…
NineKa 85d98aa
fix computeInputType in tensilelite
NineKa e1a5bfb
Merge commit '7c3a3e5c044b8abbf77aaf97c2b93f303e763fff' into users/ho…
nakajee d5b8ff8
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 9c46a42
fix computeInputType issue in ReferenceValidator.cpp
NineKa 84d18f9
[hipblaslt] fix unit tests for gfx950_mx_rebase (#4912)
NineKa 0a645bd
Merge branch 'gfx950_mx_rebase' into users/hongjche/gfx950_mx_rebase_…
NineKa 9a3591d
[hipblaslt] Fix a verification fail with spmm_i8hs.yaml (#5034)
nakajee 0f5c904
initial set of testcase for MXFP4 (#4739)
pdhirajkumarprasad 8724f2f
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 01f29ad
[Tensilelite] Add regression test for MX FP4 scale buffer determinism…
archana-ramalingam f83ef8e
Merge branch 'gfx950_mx_rebase' into users/hongjche/gfx950_mx_rebase_…
NineKa 24c36e1
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 024ba23
UseF32XEmulation in forceLrvwTile1 for B tensor (#5143)
talumbau 8de6b1a
Merge branch 'users/hongjche/gfx950_mx_rebase_sync' into gfx950_mx_re…
NineKa ede3a2a
[hipblaslt] Enable StoreSwapAddr for MXFP4, plus add GRVWMXSA/B adust…
nakajee 34bca88
[Tensilelite] Fix UserArgs struct stride mismatch in grouped GEMM (#…
archana-ramalingam 946988a
[hipBLASLt] Disable failed mx f8 problem sizes (#5105)
amd-chunxlin dcc90b5
[hipblaslt] Scheduling related fixes for MXFP4 (#5169)
nakajee 20b1923
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa e09725d
remove explicit constructor from variable_value class
NineKa faa7dc7
fix return statement in hipDataType_to_tensile_type and add type chec…
NineKa ae18131
Merge branch 'users/hongjche/gfx950_mx_rebase_sync' into gfx950_mx_re…
NineKa 2a4b814
[Tensilelite] Shuffle mx scaling data in Tensile (#4864)
archana-ramalingam b690c88
[hipblaslt] Fix fail with kringshift.yaml (#5228)
nakajee b967536
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 59349d6
[hipblaslt] Optimize StoreSwapAddr (#5217)
nakajee 945fd17
[hipblaslt] Enable MXFP4 + DtlPlusLdsBuf (#5251)
nakajee 5babee6
Fix gfx12 build error with integer cast
73b68cf
[hipblaslt] Fix SIA3 issues with MXFP4 (#5245)
nakajee 87181ea
[hipBLASLt] Fix CI failures for gfx942 (#5216)
amd-chunxlin b6c3d45
Make the usage side’s logic consistent with allocation side (tPackM) …
tomchengchitang 1ee50eb
[hipblaslt] Fix fail with gfx942+dtv/dtl.yaml (#5349)
nakajee e20a090
[hipblaslt] disable mxDataGenerator for windows builds (#5298)
NineKa 30665e0
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 086a3f2
fix build errors of merge
NineKa c44133d
Fix: add MacDataTypeA to mock kernel (#5351)
talumbau 9701b31
[hipblaslt] Fix tox test fp8_gfx12 failed when dtva1=1 or dtvb1=1 (#5…
tomchengchitang b3a0b98
Revert "[hipblaslt] disable mxDataGenerator for windows builds (#5298)"
nakajee c488af4
[hipBLASLt] Fix failed swizzle tests (#5400)
amd-chunxlin 4339bba
[hipblaslt] Fix tailLoop errors in GLOBAL_OFFSET_{A or B} for fp16_gf…
tomchengchitang 76433a2
[hipblaslt] disable mxDataGenerator for windows builds (#5414)
NineKa 4263c28
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 7c50394
[hipblaslt] Add F4/F6/BF6 to instTypeToDataType (#5457)
nakajee bf1d16b
[hipblaslt] remove MXFP4 TN logic file (#5487)
nakajee 9a9655a
[hipblaslt] Use64bShadowLimitMX support (#5499)
nakajee 92d332e
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 7b3a5e5
cleanup redundant lines of code
NineKa e6785bb
[hipBLASLt] Fix failed rocRoller test (#5529)
amd-chunxlin b96a10a
Revert "cleanup redundant lines of code"
NineKa f85dc43
[hipsparselt] Fix numSplitMetadata logic (#5608)
tomchengchitang c377746
[hipblaslt] const GRInc support (#5526)
nakajee a620792
Merge branch 'develop' into gfx950_mx_rebase
bnemanich 456c3dd
Fix merge
bnemanich 9a1bd95
Fix tensilelite build error due to merge conflict
nakajee 637f05b
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_cleanup
NineKa 52b8654
fix various issue in review
NineKa 03a359b
[hipblaslt] Reject MX + nonDTL + UnrollLoopSwapGlobalReadOrder (#5794)
nakajee ae8f301
[hipsparselt] Restore to develop logic and fix mistakenly used PackKF…
tomchengchitang cff72f1
[hipsparelt] Delete spurious rIdx_ loop for hipsparselt failed tests …
tomchengchitang f0849a4
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa dc3eac5
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 0b5c339
[hipsparselt] Fix metadata vgpr idx calculation (#5920)
tomchengchitang 8a6ed73
[hipblaslt] Add support for MXFP4 + TailLoop (K multiple of 32) (#5692)
nakajee abce258
Merge branch 'develop' into users/hongjche/gfx950_mx_rebase_sync
NineKa 41a35bc
Merge remote-tracking branch 'origin/develop' into gfx950_mx_rebase
nakajee 15c010e
hipBLASLt Tensile: tighten LRVW/GRVW validation and trim Solution.py …
NineKa f8f6070
Merge remote-tracking branch 'origin/gfx950_mx_rebase' into gfx950_mx…
nakajee 90fbfa8
Add conversion for MX types for Origami (#6271)
yenong-amd 3dd2873
Merge remote-tracking branch 'origin/gfx950_mx_rebase' into gfx950_mx…
nakajee a377e5b
Merge remote-tracking branch 'origin/develop' into users/nakajee/gfx9…
nakajee 136cbde
Update Tensile.py
pdhirajkumarprasad 4b32be2
new line formatting fix
pdhirajkumarprasad ff3bc60
Resolve merge conflicts involving mx-block-a
bnemanich 9a23f56
Merge conflicts in DataType.py
bnemanich 3c3bc42
Fix SoftmaxGenerator.py SolutionLibrary.py ClientProblemFactory.hpp R…
amd-chunxlin 4e8e337
conflict resolution
0ed0a44
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of https://github…
4a1054c
Resolve conflicts in client/include/DataInitialization.hpp
vinayakdsci b2313cc
Resolve conflicts in client/src/Reference.cpp
vinayakdsci 7083d99
Resolve conflicts in client/include/TypedId.hpp
vinayakdsci a039d38
Resolve conflicts in Tensile/AsmAddressCalculation.py
NineKa 382df2c
Solved merge conflict for KernelWriter.py
nakajee 56166f5
Fix ContractionProblem.hpp
amd-chunxlin a66f3fa
Fix Serialization/ContractionPredicates.hpp
amd-chunxlin 301467f
Resolve conflicts in Tensile/Components/SIA.py
NineKa 33a4a67
Resolve merge conflicts in GSU.py
bnemanich 9dba4aa
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich d531271
Removed unnecessary change in KernelWriter.py
nakajee 5ec2a35
Resolve conflicts for tensilelite/include/Tensile/DataTypes_BFloat6.hpp
NineKa 949193c
HipUtils.hpp/ContractionSolution.cpp conflicts
792f16a
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of https://github…
84f949e
conflict resolution
pdhirajkumarprasad 8a67779
Fix rocisa files conflicts
66fe871
Resolve conflicts in DataTypes.cpp and enum
8a708eb
Resolve conflicts in ContractionProblem.cpp and related functions call
5ab3ff8
Retain both mxfp4/6 types, with and without "_EXT"
archana-ramalingam 4b38bda
fix conflict in computeLoadSrd
pdhirajkumarprasad 4fb414d
Fix conflicts in DataTypes_Float6.hpp
CurtisFu1002 6df6e78
Remove Float6x16 and Float6x16_Storage
CurtisFu1002 d69aa7d
Resolve conflicts in include/Tensile/DataTypes.hpp
vinayakdsci 996091c
Resolve conflicts in include/Tensile/Contraction{Solution,ProblemPred…
vinayakdsci 4c56ae9
fix the conflict
pdhirajkumarprasad 2c9cb4c
Resolve conflicts in Tensile/SolutionStructs/Validators/MatrixInstruc…
vinayakdsci 67c7caf
Use single-character keys (e.g., S) when MacDataTypeA equals MacDataT…
pdhirajkumarprasad 412e1b9
Fix TensorDescriptor.hpp
amd-chunxlin 98c723a
Fix merge conflicts in testing_matmul.hpp
bnemanich 13a0005
Resolve conflicts in tensile_host except prob scaleAType part
cff3f03
conflict res LSU.py
0ae609e
Fix setUseScaleAB conflict
amd-chunxlin e3bcdba
Fix rocsparselt/src/tensile_host.cpp
amd-chunxlin 1bea41a
Resolve conflicts for tensilelite/include/Tensile/DataTypes_Float4.hpp
NineKa f616292
Matching implementations in gfx950_mx_rebase
NineKa d70475d
Fix conflicts for rocblaslt's tensile_host.cpp
archana-ramalingam 43a0a6d
Replace _EXT with non _EXT data types
archana-ramalingam 9716cf2
Resolve conflicts in projects/hipblaslt/tensilelite/Tensile/KernelWri…
nakajee 7b78a84
Merge remote-tracking branch 'origin/users/nakajee/gfx950_mx_rebase_m…
nakajee ff741a3
Resolve conflicts in projects/hipblaslt/tensilelite/Tensile/SolutionS…
nakajee af13b27
Resolve conflicts in projects/hipblaslt/tensilelite/Tensile/Component…
nakajee 0b38012
Fix conflicts for DataInitialization.cpp
CurtisFu1002 1efc795
Resolve initializeConstantInputs in DataInitialization.cpp
5114c7a
Resolve macro guards in testing_matmul.hpp
9fe02d4
Fix bugs in DataTypes.cpp and ContractionProblem.cpp
4a51b17
Resolve 6x16 <-> 6x32 and redefinition issues
3cd7f37
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 6991994
Fix merge issues
bnemanich dd01c69
fix regression due to merge conflict resolution
pdhirajkumarprasad f05171e
Fix for errors with mx32f8_tn.yaml
nakajee 86c625e
Fix for mx32f8_tn.yaml (#6641)
bnemanich 0a4346f
Fix sk_mx32f4_quick
bnemanich 00a0ed7
Revert "Fix sk_mx32f4_quick"
bnemanich d54a6b3
Fix unsupported GEMM problem
bnemanich 26b175e
Removed duplicated vgpr allocation code for MX
nakajee c13b6cc
Fix sk_mx32f4_quick and remove duplicate code
amd-chunxlin 252b894
More fixes for mx32f4_tn.yaml
nakajee 3fa7a7c
More fix for sk_mx32f4_quick.yaml
nakajee 4fdb314
Fix sparse yaml tests (#6657)
tomchengchitang 516aa5f
Merge remote-tracking branch 'origin/develop' into users/nakajee/gfx9…
nakajee d2e3e21
Resolve merge conflicts
bnemanich ba8a34d
Resolve conflicts SIA.py
archana-ramalingam fd41203
Fix merge conflicts in Reference.cpp
bnemanich 934908f
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich a98061b
Resolve merge conflicts for KernelWriterAssembly.py
nakajee 3c91a21
Fix StreamK merge issues
bnemanich 0ab6aff
Resolve merge conflicts with KernelWriter.py
nakajee 17a6b08
Merge remote-tracking branch 'origin/users/nakajee/gfx950_mx_rebase_m…
nakajee 44d0580
Resolve merge conflicts with Solution.py
nakajee 5cf99a6
Add scale type to generateMXInput calls in TensileLite client
bnemanich 42a7b59
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 8406315
Resolve merge conflicts with mfma.hpp
nakajee 7e4ea08
Merge remote-tracking branch 'origin/users/nakajee/gfx950_mx_rebase_m…
nakajee 385deeb
Resolve testing_matmul.hpp
archana-ramalingam 7de1e69
Resolve HIP_R_8F_E5M3_EXT build error
archana-ramalingam ebb576f
fix the typo in form of missing endif block
pdhirajkumarprasad 1c2e31f
Fix build break when building tensilelite-client in ffm
CurtisFu1002 c55c690
Change hardcoded dtype for DataTypeMXS{A,B}
vinayakdsci 8599dcc
mark these test as xfail for 1250 as mix type is not supported yet
pdhirajkumarprasad ddc0023
Fixes for fails with sk_mx32f4_quick and sk_mx32f8_quick
nakajee 4b80342
Remove MXScale
bnemanich a5c478b
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 671db66
Fix fallthrough issue due to missing semicolon
amd-chunxlin fccb7bb
Merge origin/develop (stinkytofu MUBUF off / gfx1250 tooling)
bnemanich d9ff981
Fix fails with sk_mx32f8_quick.yaml
nakajee 8583829
Removed redundant code for MX.
nakajee 52e1131
Fix fails with mxfp4_mxfp4_fp32_tn_act.yaml (Tailloop fix)
nakajee e801d80
Fix for gfx950 mxfp4 + GSU
nakajee b0b764a
More fix for gfx950+mx+Tailloop
nakajee 639bb30
Fix namespace errors when compiling using amdclang 23
NineKa 4ebb168
Fix for gfx950 mx + DTL2 or 3
nakajee 57e67ea
Merge remote-tracking branch 'origin/develop' into users/nakajee/gfx9…
nakajee 9472aaf
fix mxf4_gfx1250 + mxf8_gfx1250 yaml files and more (#6767)
tomchengchitang 1e35325
Small change for PR6767
nakajee c9557bb
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 51695e7
Fix hipsparselt builds
bnemanich e44c608
Fix Windows bug
bnemanich a95702c
Fix headers
bnemanich 5a53cfd
Fix issues in hipsparselt builds
bnemanich 28619a2
Fix fail with gfx942 xfp32.yaml
nakajee f84cfe2
Merge branch 'develop' of github.com:ROCm/rocm-libraries into users/n…
bnemanich 0f4d05e
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 1e0e2b7
Fix for gfx950 mx fail due to previous fix for gfx1250
nakajee 1cf9a9d
Removed duplicated mx code + redundant new line
nakajee e35aebc
Merge remote-tracking branch 'origin/develop' into users/nakajee/gfx9…
nakajee 2f49f22
More small refactoring
nakajee c530e75
added skip-1250 for 950 specific yaml
pdhirajkumarprasad 5527082
Fix gfx950 MX code review findings
bnemanich 6df7f01
Merge branch 'develop' into users/nakajee/gfx950_mx_rebase_merge
bnemanich c76bd54
Fix datatype macro guards
bnemanich 8122284
Fix datatype macro guards
bnemanich e6f99d4
Revert TENSILE_USE_{FP4,FP6,BF6} guard to fix gfx942 regression
bnemanich a65b1ee
Address review comments
bnemanich e4fb0e0
Replace MacDataTypeA with DataType
amd-chunxlin 00759a0
Added missing B8 + F4/F6
nakajee 5fa9df7
Added missing F8/B8 + B6
nakajee 12e62d4
Fix srdShiftLeft MXSA/B and TailLoop MXSB code for gfx1250
nakajee ef26d97
Fix data initialization for mixed precision
bnemanich b197127
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 20905d2
Change xfail-gfx1250 to skip-gfx1250 for new gfx950 mx test cases
nakajee a9d00e2
Removed duplicated code
nakajee 3e7e2a8
Updated year in copyright header
nakajee ad38d55
Uncomment a valid gemm type
amd-chunxlin 1f85a5c
Address cmake code review requests
bnemanich fedf5b0
Merge branch 'users/nakajee/gfx950_mx_rebase_merge' of github.com:ROC…
bnemanich 089eafb
Fix rotating buffers for mxfp4
bnemanich 2b2ee72
More cmake changes
bnemanich 4181713
Address cmake review comments
bnemanich 7c78dd5
Merge branch 'develop' of github.com:ROCm/rocm-libraries into users/n…
bnemanich fce0e0f
Fix build issue
bnemanich d8b5d3f
Fix init error
bnemanich 3f303bd
Merge branch 'develop' of github.com:ROCm/rocm-libraries into users/n…
bnemanich 1ea7df2
Merge branch 'develop' of github.com:ROCm/rocm-libraries into users/n…
bnemanich 4e89c91
Reduce mxfp4 test time
bnemanich 6256768
Merge branch 'develop' into users/nakajee/gfx950_mx_rebase_merge
bnemanich
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,78 @@ | ||
| add_subdirectory(include) | ||
| add_subdirectory(src) | ||
| # Copyright Advanced Micro Devices, Inc., or its affiliates. | ||
| # SPDX-License-Identifier: MIT | ||
| | ||
| # Owns the `hipblaslt::mxdatagen` STATIC helper. The source and header live | ||
| # under this directory, so the target is defined here rather than at the root. | ||
| # | ||
| # This file is added by the project root via `add_subdirectory(clients/common)` | ||
| # only when: | ||
| # * HIPBLASLT_ENABLE_MXDATAGENERATOR is ON, and | ||
| # * at least one consumer subtree is enabled (HIPBLASLT_ENABLE_CLIENT, | ||
| # TENSILELITE_ENABLE_CLIENT, or TENSILELITE_BUILD_TESTING). | ||
| # | ||
| # Consumers should `target_link_libraries(<tgt> PUBLIC hipblaslt::mxdatagen)` | ||
| # (PRIVATE if no further consumer needs the macro). Linking propagates: | ||
| # * the include path for `<mxDataGen.hpp>` | ||
| # * the link to `roc::mxDataGenerator` | ||
| # * the C++20 requirement | ||
| # * the `HIPBLASLT_ENABLE_MXDATAGENERATOR` macro | ||
| # * `hipblaslt::headers`, which carries the in-tree `<hipblaslt/...>` API | ||
| # headers (the source pulls in `<hipblaslt/hipblaslt-export.h>` and | ||
| # `<hipblaslt/hipblaslt-types.h>`). | ||
| | ||
| if(NOT ROCM_LIBS_SUPERBUILD) | ||
| if(HIPBLASLT_ENABLE_THEROCK) | ||
| find_package(mxDataGenerator REQUIRED) | ||
| else() | ||
| # `${PROJECT_SOURCE_DIR}/../../shared/mxdatagenerator` resolves to the | ||
| # sibling `shared/mxdatagenerator` subtree of the project. The binary | ||
| # dir is anchored under the hipblaslt project's build tree to match | ||
| # the upstream subdirectory's existing layout assumptions. | ||
| add_subdirectory( | ||
| "${PROJECT_SOURCE_DIR}/../../shared/mxdatagenerator" | ||
| "${PROJECT_BINARY_DIR}/mxdatagenerator" | ||
| ) | ||
| endif() | ||
| endif() | ||
| | ||
| add_library(hipblaslt-mxdatagen STATIC | ||
| "${CMAKE_CURRENT_SOURCE_DIR}/src/mxDataGen.cpp" | ||
| ) | ||
| add_library(hipblaslt::mxdatagen ALIAS hipblaslt-mxdatagen) | ||
| | ||
| target_include_directories(hipblaslt-mxdatagen | ||
| PUBLIC | ||
| $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include> | ||
| ) | ||
| | ||
| target_link_libraries(hipblaslt-mxdatagen | ||
| PUBLIC | ||
| hipblaslt::headers | ||
| roc::mxDataGenerator | ||
| PRIVATE | ||
| # `-x hip` (via hip::device's INTERFACE_COMPILE_OPTIONS) makes the | ||
| # translation unit compile through hipcc/clang. mxDataGen.cpp itself | ||
| # does not launch kernels, but the modern ROCm bf16 header it pulls | ||
| # in transitively uses clang-only builtins (e.g. __builtin_elementwise_rint). | ||
| hip::device | ||
| ) | ||
| | ||
| target_compile_features(hipblaslt-mxdatagen PUBLIC cxx_std_20) | ||
| target_compile_definitions(hipblaslt-mxdatagen PUBLIC HIPBLASLT_ENABLE_MXDATAGENERATOR) | ||
| set_target_properties(hipblaslt-mxdatagen | ||
| PROPERTIES POSITION_INDEPENDENT_CODE ON | ||
| ) | ||
| | ||
| # `<hipblaslt/hipblaslt-export.h>` is generated by | ||
| # `library/include/CMakeLists.txt`'s `generate_export_header(hipblaslt ...)` | ||
| # call when HIPBLASLT_ENABLE_HOST=ON. In a tensilelite-only build that target | ||
| # does not exist, so produce the same file from this helper instead. Both | ||
| # code paths emit the file to the same location so `hipblaslt::headers`'s | ||
| # `${CMAKE_BINARY_DIR}/library/include` interface dir resolves it either way. | ||
| if(NOT HIPBLASLT_ENABLE_HOST) | ||
| include(GenerateExportHeader) | ||
| generate_export_header(hipblaslt-mxdatagen | ||
| BASE_NAME hipblaslt | ||
| EXPORT_FILE_NAME "${PROJECT_BINARY_DIR}/library/include/hipblaslt/hipblaslt-export.h" | ||
| ) | ||
| endif() |
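
The header comment in this file describes when the subdirectory is added and how consumers are expected to link the helper. A minimal sketch of that wiring is shown here, assuming the option names from the comment; the consumer target `my-mx-client` and its source `main.cpp` are hypothetical placeholders, and the real root `CMakeLists.txt` may structure the condition differently.

```cmake
# Sketch of the root-level gating described in the header comment:
# the helper is only added when the generator is enabled AND at least
# one consumer subtree is enabled. The exact form in the project root
# is an assumption.
if(HIPBLASLT_ENABLE_MXDATAGENERATOR AND
   (HIPBLASLT_ENABLE_CLIENT OR TENSILELITE_ENABLE_CLIENT OR TENSILELITE_BUILD_TESTING))
    add_subdirectory(clients/common)
endif()

# Hypothetical consumer: `my-mx-client` and `main.cpp` are placeholders.
add_executable(my-mx-client main.cpp)

# Link only if the helper target was actually defined. PRIVATE is enough
# here because nothing links against an executable; a library whose own
# consumers need the macro or include path would use PUBLIC instead.
if(TARGET hipblaslt::mxdatagen)
    target_link_libraries(my-mx-client PRIVATE hipblaslt::mxdatagen)
endif()
```

Because the include directory, the C++20 requirement, and the `HIPBLASLT_ENABLE_MXDATAGENERATOR` definition are all declared `PUBLIC` on the helper, the consumer should not need to repeat any of them.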