Added Jenkinsfile by amdkila · Pull Request #2 · ROCm/hipCUB

amdkila · 2019-05-05T22:51:18Z

Have to test this in Jenkins

Update thread load/store assembly for GFX12

The "s_waitcnt" instruction is used to avoid data hazards after some load and store instructions. On gfx12, s_waitcnt has been deprecated and replaced with more specific instructions for each individual type of counter (e.g. loadcnt, storecnt). This change updates two locations where the old s_waitcnt instruction is used within inline assembly. In both cases, the instruction is replaced with s_wait_[load/store]cnt_dscnt. The "dscnt" suffix ensures that we also wait for any outstanding local memory operations to complete.
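The choice described in this commit message can be sketched as a small compile-time helper. This is only an illustration: `mem_op` and `wait_mnemonic` are hypothetical names, not hipCUB code, and the full gfx12 mnemonics are assumed to be what the bracketed shorthand in the message expands to. Operands are left out on purpose because the message does not give them.

```cpp
#include <string_view>

// Hypothetical names for illustration only; this is not hipCUB's code.
enum class mem_op { load, store };

// Returns the wait mnemonic that would follow a load or store in inline
// assembly. Before gfx12 a single combined instruction is used; on gfx12
// the counter-specific variant that also waits on dscnt is assumed.
constexpr std::string_view wait_mnemonic(bool targets_gfx12, mem_op op)
{
    if(!targets_gfx12)
    {
        return "s_waitcnt";
    }
    if(op == mem_op::load)
    {
        return "s_wait_loadcnt_dscnt";
    }
    return "s_wait_storecnt_dscnt";
}
```

In real device code the selection would more likely be made with the preprocessor around the inline assembly string; the function form is used here only so the mapping can be checked on the host.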

Added Jenkinsfile [ROCm/hipCUB commit: 7756cb3]

[rocPRIM] Config modernization

## Motivation

Our previous configuration system had become limiting in several ways. Most importantly, it was not able to differentiate between individual GPUs when selecting config parameters. This made proper tuning difficult and prevented future work involving SPIR-V-specific tuning.

In addition, the old approach relied heavily on complex template metaprogramming, which had become difficult to maintain. With the move to C++17, we now have cleaner and more expressive language features available, making this a good opportunity to redesign the system.

## Technical Details

All changes are internal. **There are no API changes for users.**

The majority of the diff in this PR consists of the new configuration definitions themselves, so while the PR appears large, the actual code changes are relatively small.

### New Configuration Structure

Each algorithm now defines a `*_config_picker` templated on the target and value type. Below is a simplified example:

```cpp
template<class Target, class value_type>
constexpr auto <algo_name>_config_picker()
    -> std::enable_if_t<
        std::is_same_v<Target, comp_target<gen::gcn5, target_arch::gfx906, gpu::mi50, rep::amdgcn>>,
        <algo_name>_config_params>
{
    // Tuned configuration #1
    if constexpr (/* condition for this combination */)
    {
        return <algo_name>_config_params{ ... };
    }
    // Tuned configuration #2
    if constexpr (/* condition for this combination */)
    {
        return <algo_name>_config_params{ ... };
    }
    // Default for this target
    return <algo_name>_config_params_base<value_type>();
}
```

Each tuned target provides a similar overload.
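To make the placeholder-based picker above concrete, here is a minimal compilable sketch for a single target. Everything in it is an assumption made for illustration: the enum members, the `scan_` prefix, the fields of `scan_config_params`, and the numeric values are arbitrary stand-ins, not rocPRIM's definitions or real tuning results.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the identifiers used in the PR description.
enum class gen { unknown, gcn5 };
enum class target_arch { unknown, gfx906 };
enum class gpu { generic, mi50 };
enum class rep { amdgcn, spirv };

template<gen G, target_arch A, gpu P, rep R>
struct comp_target {};

using mi50_target = comp_target<gen::gcn5, target_arch::gfx906, gpu::mi50, rep::amdgcn>;

// Hypothetical parameter set; the real structs differ per algorithm.
struct scan_config_params
{
    unsigned int block_size;
    unsigned int items_per_thread;
};

// Default parameters derived only from the value type (placeholder values).
template<class value_type>
constexpr scan_config_params scan_config_params_base()
{
    return scan_config_params{256u, sizeof(value_type) <= 4 ? 8u : 4u};
}

// Overload that participates only when Target is the MI50 target.
template<class Target, class value_type>
constexpr auto scan_config_picker()
    -> std::enable_if_t<std::is_same_v<Target, mi50_target>, scan_config_params>
{
    // "Tuned" entry for 8-byte value types (placeholder values).
    if constexpr(sizeof(value_type) == 8)
    {
        return scan_config_params{128u, 6u};
    }
    else
    {
        // Default for this target.
        return scan_config_params_base<value_type>();
    }
}
```

Because the picker is `constexpr`, a call such as `scan_config_picker<mi50_target, double>()` can be evaluated at compile time and its result used as a template argument or launch bound.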
For untuned or unknown targets, we provide a general fallback:

```cpp
template<class Target, class value_type>
constexpr auto <algo_name>_config_picker()
    -> std::enable_if_t<
        std::is_same_v<Target, comp_target<gen::unknown, target_arch::unknown, gpu::generic, rep::amdgcn>>,
        <algo_name>_config_params>
{
    // Fallback: use a commonly tuned target (often MI100)
    return <algo_name>_config_picker<
        comp_target<gen::cdna1, target_arch::gfx908, gpu::mi100, rep::amdgcn>,
        value_type>();
}
```

All available tuned targets are listed in:

```cpp
using <algo_name>_targets = comp_targets<
    comp_target<gen::gcn5, target_arch::gfx906, gpu::mi50, rep::amdgcn>,
    ...,
    comp_target<gen::unknown, target_arch::unknown, gpu::generic, rep::amdgcn>>;
```

### How Config Selection Works Now

In the new system, kernels are compiled for all tuned targets. At runtime, if the current GPU does not have dedicated tuning, the library uses the `most_common_config` policy to choose the best matching compiled kernel.

The selection policy (tested in `test_config_dispatch.cpp`) attempts to match, in decreasing priority:

1. Exact GPU model
2. Architecture
3. Generation

If no match is found, it falls back to the unknown target. If multiple candidates match, the last one listed in the `comp_targets` type list is chosen, which gives us a controlled and predictable fallback order.

We also pass the selected target into kernel compilation, enabling compile-time specialization based on GPU, architecture, and generation.

### Target struct

The target struct currently stores only:

- GPU generation
- Architecture
- GPU name
- Representation (`rep`), which distinguishes SPIR-V from native AMDGCN

The `rep` field is not yet functional (it requires compiler support), and the dispatch policy does not consider it at the moment. The target struct also makes it relatively easy to store more data in the future.

### Scripts

The Python script changes in this PR update the scripts that use the configs as input or output.
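The matching order described under "How Config Selection Works Now" can be sketched as a small scoring function. This is a rough model under stated assumptions, not the library's `most_common_config` implementation: `target_info`, `match_score`, `select_target`, and the enum members are hypothetical, the unknown target is assumed to be the last list entry, and `rep` is ignored just as the description says the dispatch policy currently does.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for the fields of the target struct.
enum class gen { unknown, gcn5, cdna1, cdna2, rdna3 };
enum class target_arch { unknown, gfx900, gfx906, gfx908, gfx90a, gfx1100 };
enum class gpu { generic, mi50, mi100, mi210, untuned };

struct target_info
{
    gen         generation;
    target_arch arch;
    gpu         model;
};

// 3 = exact GPU model, 2 = same architecture, 1 = same generation, 0 = no match.
constexpr int match_score(const target_info& candidate, const target_info& device)
{
    if(candidate.model == device.model) return 3;
    if(candidate.arch == device.arch) return 2;
    if(candidate.generation == device.generation) return 1;
    return 0;
}

// Returns the index of the chosen entry. Ties go to the later entry, and
// with no match at all the last entry (assumed to be the unknown target) wins.
template<std::size_t N>
constexpr std::size_t select_target(const std::array<target_info, N>& targets,
                                    const target_info&                device)
{
    std::size_t best_index = N - 1;
    int         best_score = 0;
    for(std::size_t i = 0; i < N; ++i)
    {
        const int score = match_score(targets[i], device);
        if(score > 0 && score >= best_score)
        {
            best_index = i;
            best_score = score;
        }
    }
    return best_index;
}

// Example list mirroring the shape of the comp_targets list in the description.
constexpr std::array<target_info, 4> example_targets{{
    {gen::gcn5, target_arch::gfx906, gpu::mi50},
    {gen::cdna1, target_arch::gfx908, gpu::mi100},
    {gen::cdna2, target_arch::gfx90a, gpu::mi210},
    {gen::unknown, target_arch::unknown, gpu::generic},
}};
```

With this list, a device that shares only the gfx90a architecture with a tuned entry resolves to that entry, while a device from a generation that has no entry falls through to the unknown target at the end of the list.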
### Summary of Improvements

- Better differentiation and selection across GPUs
- Cleaner C++17-based implementation
- Easier extension for future SPIR-V tuning
- Improved maintainability of config definitions
- More flexibility for future features

## Test Plan

Some tests were added in `test_config_dispatch.cpp`. These and all the other tests should pass. Everything also needs to be benchmarked to check that the correct configs are chosen.

## Test Result

All tests pass. Benchmarks are still WIP.

## Submission Checklist

- [ ] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

amdkila added 6 commits May 5, 2019 16:47

Added Jenkinsfile

bc877b2

Added Jenkins file with updated build command

90b07b9

Changed repository name to all lowercase

be5cc0d

Adding placeholder README file so Jenkins will notice the docker folder

6c1c31e

Adding docker files from rocPRIM

9002157

Jenkinsfile that passed Jenkins tests

3ad8989

saadrahim merged commit 7756cb3 into ROCm:develop May 7, 2019

amdkila deleted the Jenkins branch May 7, 2019 20:35

stanleytsang-amd added a commit to stanleytsang-amd/hipCUB that referenced this pull request Jun 26, 2024

Fix typo ROCm#2

18f65da

ammallya pushed a commit that referenced this pull request Oct 28, 2025

Merge pull request #2 from akilaMD/Jenkins

3deb99c

saadrahim merged 6 commits into ROCm:develop from amdkila:Jenkins
