Merged
Commits
79 commits
8ec76ac
- Add SBCC partial pass kernel generator.
eng-flavio-teixeira Feb 12, 2025
1bc8bee
Add SBRR partial pass kernel generator.
eng-flavio-teixeira Feb 13, 2025
2b0c5d3
- Remove fixed length-64 partial pass kernels.
eng-flavio-teixeira Feb 14, 2025
3356f9f
- Remove no longer needed code.
eng-flavio-teixeira Feb 14, 2025
4e56a6d
Fix register size in partial pass sbrr
eng-flavio-teixeira Feb 15, 2025
bc281f9
- Fix offset initialization.
eng-flavio-teixeira Feb 18, 2025
ed2afe3
- remove debug code.
eng-flavio-teixeira Feb 18, 2025
97e6379
- Clean up debug code.
eng-flavio-teixeira Feb 18, 2025
e7dceae
- Handle direction in partial-pass twiddle multiply step.
eng-flavio-teixeira Feb 18, 2025
bd3af1b
- Fix partial pass radix include in rtc_stockham_gen.
eng-flavio-teixeira Feb 20, 2025
9c021bf
- Clean up comment
eng-flavio-teixeira Feb 20, 2025
860cc78
- Disable half-lds support in partial-pass SBRR kernels.
eng-flavio-teixeira Feb 21, 2025
ec4eb38
- Fix partial-pass kernel name
eng-flavio-teixeira Feb 22, 2025
d50ae30
Fix formatting.
eng-flavio-teixeira Feb 24, 2025
c6c02d0
- Add partial pass test scripts.
eng-flavio-teixeira Mar 7, 2025
f6a26a0
- Fixes to test scripts
eng-flavio-teixeira Mar 8, 2025
64710e7
WIP
eng-flavio-teixeira Apr 10, 2025
6bd5960
- Merge with develop
eng-flavio-teixeira Apr 23, 2025
de36c9f
- Resolve merge conflict.
eng-flavio-teixeira Apr 23, 2025
b368f0b
- Clean up.
eng-flavio-teixeira Apr 23, 2025
e89612c
- Delete partial-pass test scripts.
eng-flavio-teixeira Apr 23, 2025
6f286b0
- undo test changes.
eng-flavio-teixeira Apr 23, 2025
987a7f9
- Clang formatting.
eng-flavio-teixeira Apr 23, 2025
7b092a1
- Further clean up
eng-flavio-teixeira Apr 24, 2025
5c11172
- Remove no longer needed include
eng-flavio-teixeira Apr 24, 2025
708a98c
- More clean-up.
eng-flavio-teixeira Apr 24, 2025
47cbc4e
- match SBRR configuration.
eng-flavio-teixeira Apr 25, 2025
eecba0a
- Changes to kernel-generator.py and function pool to support partia…
eng-flavio-teixeira Apr 25, 2025
acfe158
WIP
eng-flavio-teixeira May 8, 2025
ce4e4d4
formatting.
eng-flavio-teixeira May 8, 2025
4ae3e02
- Fix ROCFFT_LAYER=8 not displaying the new partial-pass nodes.
eng-flavio-teixeira May 8, 2025
96ee9cf
- WIP: Fixes for dealing with column- to row-major configuration ent…
eng-flavio-teixeira May 15, 2025
64b6cd4
- Merge with develop
eng-flavio-teixeira May 16, 2025
eedc32d
- Resolve merge conflicts.
eng-flavio-teixeira May 16, 2025
834f888
- Formatting.
eng-flavio-teixeira May 16, 2025
66b16bb
- Refactor kernel-generator.py changes.
eng-flavio-teixeira May 16, 2025
e2c142b
- Refactor and improvements.
eng-flavio-teixeira May 16, 2025
304dd85
- Refactor function pool.
eng-flavio-teixeira May 16, 2025
0a53c1e
- Refactoring.
eng-flavio-teixeira May 20, 2025
a37b90c
- Get partial pass off-dim from function pool.
eng-flavio-teixeira May 21, 2025
bbbe036
- Clean up.
eng-flavio-teixeira May 21, 2025
6c0dfb2
- Further partial-pass kernel config validation.
eng-flavio-teixeira May 21, 2025
8bdad2f
Merge with develop
eng-flavio-teixeira May 27, 2025
aae9b33
- Add further validation for kernel-generator.py partial-pass data.
eng-flavio-teixeira May 27, 2025
20f1d3c
- Remove no longer needed field from KernelConfig.
eng-flavio-teixeira May 28, 2025
82d8e3a
- Refactor kernel configuration lists
eng-flavio-teixeira Jun 2, 2025
c739c15
- merge with
eng-flavio-teixeira Jun 2, 2025
d1f241d
- Fix issue with wgs / tpt / tpb in partial-pass kernel configuration.
eng-flavio-teixeira Jun 6, 2025
78dba8f
- Add validation and fix hardcoded offset calculation in partial pas…
eng-flavio-teixeira Jun 9, 2025
e232ad0
- Fixes for local transposition in SBCC partial-pass kernel generator.
eng-flavio-teixeira Jun 23, 2025
348e0a4
- Fix for hardcoded value in offset computation.
eng-flavio-teixeira Jun 24, 2025
a3be0ff
- Further fixes to calculate_offsets in partial-pass SBCC kernel.
eng-flavio-teixeira Jun 26, 2025
f52aefa
- clang format.
eng-flavio-teixeira Jun 26, 2025
3657b31
- Fixes for partial-pass SBCC kernel when using certain wgs and tpt …
eng-flavio-teixeira Jul 11, 2025
eee9bbd
- Fixes for partial-pass step_3_4 lds-to-reg and reg-to-lds generators.
eng-flavio-teixeira Jul 15, 2025
47b7020
- Merge with develop.
eng-flavio-teixeira Jul 15, 2025
25b4fa2
- Further resolved merge conflicts.
eng-flavio-teixeira Jul 15, 2025
a08e2d1
- Remove no longer needed lines from python config files.
eng-flavio-teixeira Jul 15, 2025
36608cd
- Further fixes to partial-pass SBCC calculate_offsets().
eng-flavio-teixeira Jul 17, 2025
3050aaa
- Fixes for partial-pass twiddle table generation.
eng-flavio-teixeira Jul 22, 2025
ab395af
- Add more partial-pass 3D lengths.
eng-flavio-teixeira Jul 23, 2025
3e40ba8
- Clean up.
eng-flavio-teixeira Jul 23, 2025
1061964
- Changes to reduce cost of local transpose in partial-pass steps 3-4.
eng-flavio-teixeira Jul 23, 2025
73d6b8c
- Add accuracy test coverage for new 3D lengths.
eng-flavio-teixeira Jul 23, 2025
28a3689
- Add more lengths to partial-pass perf suite.
eng-flavio-teixeira Jul 24, 2025
40bbe6f
- Merge with develop.
eng-flavio-teixeira Jul 24, 2025
280d78a
- Fix function pool init for partial-pass kernels.
eng-flavio-teixeira Jul 25, 2025
7cdcad5
- Further double-precision restrictions for some of the new partial-…
eng-flavio-teixeira Jul 25, 2025
5a06adb
- Clang format.
eng-flavio-teixeira Jul 29, 2025
a533919
- More clang format.
eng-flavio-teixeira Jul 29, 2025
9e691d9
Merge commit 'a5339195b470dd7138a54ef380e93d776900f32c' into import/d…
assistant-librarian[bot] Jul 31, 2025
163abf7
Merge branch 'develop' into import/develop/eng-flavio-teixeira_rocFFT…
eng-flavio-teixeira Jul 31, 2025
ea4e8f7
- Replace std::accumulate instances with arithmetic helper.
eng-flavio-teixeira Aug 1, 2025
e76ba66
- Address code review suggestions.
eng-flavio-teixeira Aug 1, 2025
3819868
- More code review suggestions.
eng-flavio-teixeira Aug 1, 2025
5205082
- Remove casts from offset calculation.
eng-flavio-teixeira Aug 1, 2025
719c7f8
- Add parent length to partial-pass kernel name.
eng-flavio-teixeira Aug 1, 2025
eef8ff7
Merge commit '5205082bed068e0f85a925cfe48dd4693694601f' into users/en…
eng-flavio-teixeira Aug 1, 2025
434f30d
Merge commit 'f81a85990eff78f08ec6746629d82f47d01b7729' into users/en…
eng-flavio-teixeira Aug 11, 2025
41 changes: 21 additions & 20 deletions projects/rocfft/library/src/include/rtc_stockham_gen.h
@@ -32,26 +32,27 @@
 #include "../device/kernels/common.h"

 // generate name for RTC stockham kernel
-std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,
-                                     const StockhamGeneratorSpecs& specs2d,
-                                     ComputeScheme scheme,
-                                     int direction,
-                                     rocfft_precision precision,
-                                     rocfft_result_placement placement,
-                                     rocfft_array_type inArrayType,
-                                     rocfft_array_type outArrayType,
-                                     bool unitstride,
-                                     size_t largeTwdBase,
-                                     size_t largeTwdSteps,
-                                     bool largeTwdBatchIsTransformCount,
-                                     DirectRegType dir2regMode,
-                                     IntrinsicAccessType intrinsicMode,
-                                     SBRC_TRANSPOSE_TYPE transpose_type,
-                                     CallbackType cbtype,
-                                     BluesteinFuseType fuseBlue,
-                                     PartialPassType ppType,
-                                     const LoadOps& loadOps,
-                                     const StoreOps& storeOps);
+std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,
+                                     const StockhamGeneratorSpecs& specs2d,
+                                     ComputeScheme scheme,
+                                     int direction,
+                                     rocfft_precision precision,
+                                     rocfft_result_placement placement,
+                                     rocfft_array_type inArrayType,
+                                     rocfft_array_type outArrayType,
+                                     bool unitstride,
+                                     size_t largeTwdBase,
+                                     size_t largeTwdSteps,
+                                     bool largeTwdBatchIsTransformCount,
+                                     DirectRegType dir2regMode,
+                                     IntrinsicAccessType intrinsicMode,
+                                     SBRC_TRANSPOSE_TYPE transpose_type,
+                                     CallbackType cbtype,
+                                     BluesteinFuseType fuseBlue,
+                                     PartialPassType ppType,
+                                     const StockhamPartialPassParams& ppParams,
+                                     const LoadOps& loadOps,
+                                     const StoreOps& storeOps);

 // generate source for RTC stockham kernel. transforms_per_block may
 // be nullptr, but if non-null, stockham_rtc stores the number of
2 changes: 2 additions & 0 deletions projects/rocfft/library/src/rocfft_aot_helper.cpp
@@ -301,6 +301,7 @@ void build_stockham_function_pool(CompileQueue& queue)
     cbtype,
     fuseBlue,
     ppType,
+    ppParams,
     {},
     {});
 std::function<std::string(const std::string&)> generate_src
@@ -692,6 +693,7 @@ void build_solution_kernels(CompileQueue& queue)
     cbtype,
     fuseBlue,
     ppType,
+    ppParams,
     {},
     {});

51 changes: 28 additions & 23 deletions projects/rocfft/library/src/rtc_stockham_gen.cpp
@@ -43,26 +43,27 @@ using namespace std::placeholders;
 #include "device/kernel-generator-embed.h"

 // generate name for RTC stockham kernel
-std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,
-                                     const StockhamGeneratorSpecs& specs2d,
-                                     ComputeScheme scheme,
-                                     int direction,
-                                     rocfft_precision precision,
-                                     rocfft_result_placement placement,
-                                     rocfft_array_type inArrayType,
-                                     rocfft_array_type outArrayType,
-                                     bool unitstride,
-                                     size_t largeTwdBase,
-                                     size_t largeTwdSteps,
-                                     bool largeTwdBatchIsTransformCount,
-                                     DirectRegType dir2regMode,
-                                     IntrinsicAccessType intrinsicMode,
-                                     SBRC_TRANSPOSE_TYPE transpose_type,
-                                     CallbackType cbtype,
-                                     BluesteinFuseType fuseBlue,
-                                     PartialPassType ppType,
-                                     const LoadOps& loadOps,
-                                     const StoreOps& storeOps)
+std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,
+                                     const StockhamGeneratorSpecs& specs2d,
+                                     ComputeScheme scheme,
+                                     int direction,
+                                     rocfft_precision precision,
+                                     rocfft_result_placement placement,
+                                     rocfft_array_type inArrayType,
+                                     rocfft_array_type outArrayType,
+                                     bool unitstride,
+                                     size_t largeTwdBase,
+                                     size_t largeTwdSteps,
+                                     bool largeTwdBatchIsTransformCount,
+                                     DirectRegType dir2regMode,
+                                     IntrinsicAccessType intrinsicMode,
+                                     SBRC_TRANSPOSE_TYPE transpose_type,
+                                     CallbackType cbtype,
+                                     BluesteinFuseType fuseBlue,
+                                     PartialPassType ppType,
+                                     const StockhamPartialPassParams& ppParams,
+                                     const LoadOps& loadOps,
+                                     const StoreOps& storeOps)
 {
     std::string kernel_name = "fft_rtc";

@@ -77,10 +78,14 @@ std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,
         break;
     case PPT_SBCC:
     case PPT_SBRR:
-        kernel_name += "_pp";
+        kernel_name += "_partial_pass";
+        kernel_name += "_parent_len";
+        for(auto f : ppParams.parent_length)
+            kernel_name += "_" + std::to_string(f);
         break;
     }
+
-    kernel_name += "_len";
+    kernel_name += "_len_";
     kernel_name += std::to_string(specs.length);
     if(scheme == CS_KERNEL_2D_SINGLE)
         kernel_name += "x" + std::to_string(specs2d.length);
@@ -113,7 +118,7 @@ std::string stockham_rtc_kernel_name(const StockhamGeneratorSpecs& specs,

     if(specs.static_dim)
     {
-        kernel_name += "_dim";
+        kernel_name += "_dim_";
         kernel_name += std::to_string(specs.static_dim);
     }

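For reviewers who want to see what the renamed fragments produce, here is a minimal standalone sketch of only the name pieces touched in this file. It is not the library function: `PartialPassParamsSketch` and `append_partial_pass_fragments` are hypothetical names, the element type of `parent_length` is assumed to be `size_t`, and whatever `stockham_rtc_kernel_name` appends in the parts of the function not shown in this diff is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for StockhamPartialPassParams. Only the
// parent_length member appears in the diff; its element type is assumed.
struct PartialPassParamsSketch
{
    std::vector<size_t> parent_length;
};

// Appends the partial-pass and length fragments in the same order as the
// changed lines. The caller supplies whatever prefix precedes them.
std::string append_partial_pass_fragments(std::string                    kernel_name,
                                          const PartialPassParamsSketch& ppParams,
                                          size_t                         length)
{
    kernel_name += "_partial_pass";
    kernel_name += "_parent_len";
    // one "_<n>" fragment per parent dimension
    for(auto f : ppParams.parent_length)
        kernel_name += "_" + std::to_string(f);

    kernel_name += "_len_";
    kernel_name += std::to_string(length);
    return kernel_name;
}
```

Under these assumptions, calling the helper with the prefix `"fft_rtc"`, a parent length of `{192, 192, 192}` and a length of 64 yields `fft_rtc_partial_pass_parent_len_192_192_192_len_64`. The added underscores keep each number visually separated from its label, and the parent length distinguishes partial-pass kernels that share the same pass length.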
1 change: 1 addition & 0 deletions projects/rocfft/library/src/rtc_stockham_kernel.cpp
@@ -206,6 +206,7 @@ RTCKernel::RTCGenerator RTCKernelStockham::generate_from_node(const LeafNode&
     node.GetCallbackType(enable_callbacks),
     node.fuseBlue,
     ppType,
+    pp_params,
     node.loadOps,
     node.storeOps);
 };