Enable tests for externally-managed work areas by expanding hipfft_params by regan-amd · Pull Request #160 · ROCm/hipFFT

regan-amd · 2025-07-03T06:36:52Z

Summary: changes suggested to enable coverage for features relevant to externally-managed work areas with hipFFT.

Details

In general, hipFFT should assume/require a pointer to an array of $N_\text{GPU}$ size_t values where $N_\text{GPU}$ is the number of GPU devices being used, when it comes to the size_t* workSize arguments of its various APIs. Some implementation changes may be required for multi-gpu usage as only one of those values is written anyways on AMD platforms currently unless I missed something; full testing cannot be done at the moment due to a missing API in my understanding, though.
Anyhow, this point likely motivated some gymnastics currently present within hipfft_params that makes the base class member variable workbuffersize irrelevant/unused/possibly misleading in some cases. I suggest to no longer declare workbuffersize in the base class to let derived classes consistently unroll their own logic, free of that "constraint". This also triggers some required changes for the fft_params::work_buffer_alloc_failure exception, in turn.
The base class is expanded with a member enum variable identifying whether the encapsulated plan is allowed to "auto-allocate" its work area(s) or not; the possible values are basically "on", "off", and "default". For rocfft_params, "default:=off" whereas "default:=on" for hipfft_params.
The hipfft_param class is expanded with a static std::vector<gpubuf> externally_managed_workareas; class member that is meant to be used by its instances if they're required to create plans without auto-allocation. externally_managed_workareas exists independently of any of the hipfft_params instance. Any instance instructed to create plans without auto-allocate will use the member(s) of externally_managed_workareas (after reallocating them if they're too small for their needs) and provide them as work areas to the created plans.
If any plan creation is found to fail due to a failing device allocation at any point, the specific hipfft_params instance will clear externally_managed_workareas and attempt the plan creation again before reporting the failure (or not), given that externally_managed_workareas may contain buffers that are not needed or too large for the instance.
I'm suggesting some implementation changes to hipFFT to prevent it from allocating more buffer(s) if a simple work buffer size or size estimate is queried.

Please let me know if hipfft_params is supposed to be thread-safe as more care would be required w.r.t. externally_managed_workareas in that case.

malcolmroberts · 2025-07-04T20:40:49Z

+                               [](size_t total, const gpubuf& buf) { return total + buf.size(); });
    }

+    bool is_preventing_auto_allocation_at_generation() const


This seems like it's going to affect performance tests, which is also a target for hipfft_params.

AFAIK, the suggested changes should not modify the overall behavior of this class for configurations that could be tested prior (at least not w.r.t. the allocations of work areas). With the suggested changes, any previously-testable configuration remains testable, and their (newly-introduced) member auto_allocate is then fft_auto_allocation_default (note: fft_auto_allocation_on behaves equivalently in case of hipfft_params). In such cases, this specific member function would always return false and the logic flow of create_plan is globally unchanged (work areas are automatically allocated by hipfft at plan generation).

For newly-enabled configurations of hipfft_params though (i.e., cases with auto_allocate set to fft_auto_allocation_off), note that the additions to create_plan will still fully set the required work areas (see set_externally_managed_work_areas) before returning. The changes do not rely on the backend library (e.g., rocfft) allocating its own work area(s) at plan execution if they were not provided with some prior.

Therefore, I do not expect a significant change in measured performance, in either case. I verified with a few various sizes on a gfx1201 (single-process, single-GPU cases), using hipfft-bench on this branch and develop. Please let me know if there is a specific size you'd want me to look into.

malcolmroberts

Does this affect a performance test?

…-bench

regan-amd · 2025-07-07T23:26:53Z

Does this affect a performance test?

I don't think it does.

…scription thereof

malcolmroberts · 2025-07-09T20:10:48Z

+            // hipfftXtSetWorkArea would be required
+            throw std::runtime_error(
+                "cannot request externally-managed work areas with multi-gpu usage");
+        }


This isn't really an error; we just haven't implemented it yet. Normally, one would just skip the test using gtest's mechanism, but we can't do that here because it's in the fft params level.

Thank you. I introduced a dedicated base-class-defined unimplemented_exception along with the corresponding logic in the accuracy test in 517f101: let me know if that's more acceptable o you, or not.

Note: thinking about it, I don't think the (current) lack implementation support warrants an exception to be thrown by validate_fields, so I removed it therefrom.

…nt failure for get_xt_api_execution_type in case of unexepected precision

@regan-amd

Following up here upon completion of ROCm/hipFFT#160. Note: 1. if auto-allocation is used, plans allocate resources at execution for rocFFT whereas they do it at plan generation for hipFFT; 2. "default" auto-allocation flag for rocFFT tests is equivalent to "off", while "default" auto-allocation behavior for hipFFT tests is equivalent to "on", consistently with the above; 3. If "on" auto-allocation flag is used for a rocFFT test, that would force the underlying plan to allocate resources at execution time. It might be best to have a warning printed by bench in that case? --- 🔁 Imported from [ROCm/rocFFT#626](ROCm/rocFFT#626) 🧑‍💻 Originally authored by @regan-amd Co-authored-by: Raphael Egan <Raphael.Egan@amd.com>

Adding auto-allocation logic for rocfft tests MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Following up here upon completion of ROCm/hipFFT#160. Note: 1. if auto-allocation is used, plans allocate resources at execution for rocFFT whereas they do it at plan generation for hipFFT; 2. "default" auto-allocation flag for rocFFT tests is equivalent to "off", while "default" auto-allocation behavior for hipFFT tests is equivalent to "on", consistently with the above; 3. If "on" auto-allocation flag is used for a rocFFT test, that would force the underlying plan to allocate resources at execution time. It might be best to have a warning printed by bench in that case?

…e discarded + clear externally-managed work areas while runtime is initialized (#1388) This fixes failures for `hipfft-test` with CUDA backend for tests added in ROCm/hipFFT#160.

[hipfft-test] do not use nullptrs for pointers to workarea sizes to be discarded + clear externally-managed work areas while runtime is initialized (#1388) This fixes failures for `hipfft-test` with CUDA backend for tests added in #160.

…rams (#160) * Enable (and test for) externally-managed workareas via hipfft_params * Remove dependency on external random_seed and repair build for hipfft-bench * Adding 'auto_allocation' option to hip-bench target and clarifying description thereof * Using ad-hoc exception type for unimplemented cases and avoiding silent failure for get_xt_api_execution_type in case of unexepected precision * 'HipFFT' -> 'hipFFT' [ROCm/hipFFT commit: 3ee8789]

…e discarded + clear externally-managed work areas while runtime is initialized (#1388) This fixes failures for `hipfft-test` with CUDA backend for tests added in #160.

Enable (and test for) externally-managed workareas via hipfft_params

675fe00

regan-amd requested review from af-ayala, eng-flavio-teixeira, evetsso and malcolmroberts as code owners July 3, 2025 06:36

regan-amd changed the title ~~Enable (and test for) externally-managed work areas by expanding hipfft_params~~ Enable tests for externally-managed work areas by expanding hipfft_params Jul 3, 2025

malcolmroberts reviewed Jul 4, 2025

View reviewed changes

malcolmroberts requested changes Jul 4, 2025

View reviewed changes

Remove dependency on external random_seed and repair build for hipfft…

2eda57c

…-bench

regan-amd requested a review from malcolmroberts July 7, 2025 23:26

Adding 'auto_allocation' option to hip-bench target and clarifying de…

f827ad6

…scription thereof

malcolmroberts reviewed Jul 9, 2025

View reviewed changes

Comment thread clients/hipfft_params.h

Using ad-hoc exception type for unimplemented cases and avoiding sile…

517f101

…nt failure for get_xt_api_execution_type in case of unexepected precision

regan-amd requested a review from malcolmroberts July 9, 2025 23:58

malcolmroberts approved these changes Jul 15, 2025

View reviewed changes

Kevonosdiaz reviewed Jul 17, 2025

View reviewed changes

Comment thread clients/bench/bench.cpp Outdated

'HipFFT' -> 'hipFFT'

bc6708a

eng-flavio-teixeira approved these changes Jul 17, 2025

View reviewed changes

Kevonosdiaz approved these changes Jul 17, 2025

View reviewed changes

regan-amd merged commit 3ee8789 into ROCm:develop Jul 18, 2025
9 of 10 checks passed

This was referenced Jul 19, 2025

Adding auto-allocation logic for rocfft tests ROCm/rocFFT#626

Closed

Define static class member of hipfft_params for hipfft_mpi_worker #174

Merged

assistant-librarian Bot mentioned this pull request Jul 31, 2025

Adding auto-allocation logic for rocfft tests ROCm/rocm-libraries#993

Merged

regan-amd mentioned this pull request Aug 27, 2025

[hipfft-test] do not use nullptrs for pointers to workarea sizes to be discarded + clear externally-managed work areas while runtime is initialized ROCm/rocm-libraries#1388

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable tests for externally-managed work areas by expanding hipfft_params#160

Enable tests for externally-managed work areas by expanding hipfft_params#160
regan-amd merged 5 commits into
ROCm:developfrom
regan-amd:externally_managed_workareas_in_hipfftparams

regan-amd commented Jul 3, 2025

Uh oh!

malcolmroberts Jul 4, 2025

Uh oh!

regan-amd Jul 7, 2025

Uh oh!

malcolmroberts left a comment

Uh oh!

regan-amd commented Jul 7, 2025

Uh oh!

malcolmroberts Jul 9, 2025

Uh oh!

regan-amd Jul 9, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

regan-amd commented Jul 3, 2025

Details

Uh oh!

malcolmroberts Jul 4, 2025

Choose a reason for hiding this comment

Uh oh!

regan-amd Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

malcolmroberts left a comment

Choose a reason for hiding this comment

Uh oh!

regan-amd commented Jul 7, 2025

Uh oh!

malcolmroberts Jul 9, 2025

Choose a reason for hiding this comment

Uh oh!

regan-amd Jul 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants