Develop stream 2025 05 06 #30
Merged
88 commits
1e13119 Abstract benchmarking loop (MyNameIsTrez)
2c0800c Apply new benchmark abstraction - Part 2 (MyNameIsTrez)
7348a1d Apply new benchmark abstraction - Part 1 (MyNameIsTrez)
43da4c0 Resolve "Introduce device_ptr to Benchmarks Part 3" (yungshengtu)
f6f2195 Resolve "Introduce device_ptr to Benchmarks Part 5" (yungshengtu)
29a6d12 Apply new benchmark abstraction - Part 3 (MyNameIsTrez)
8bffaad Apply new benchmark abstraction - Part 4 (MyNameIsTrez)
a472e80 Resolve "Remove [[deprecated]]float_bit_mask and all uses of it from … (cenxuantian)
38ac5a5 Resolve "Remove short_radix_bits in segmented_radix_sort_config_params" (NB4444)
4d265b7 Remove deprecations (NB4444)
0249ce1 Resolve "Fix "warning: explicit specialization cannot have a storage … (NB4444)
7c664b3 Resolve "Move rocprim::detail::radix_key_codec_base into traits system" (cenxuantian)
e642d8e Apply new benchmark abstraction - Part 8 (MyNameIsTrez)
d6ba6da Apply new benchmark abstraction - Part 6 (MyNameIsTrez)
06ce1d9 Fix merge issues and clang format (NB4444)
79b4655 Resolve "Implement tuning for rocprim::search_n" (cenxuantian)
95ad769 Resolve "Apply new benchmark abstraction - Part 7" (ApoorvaKalyani)
eed8369 Resolve "Introduce device_ptr to BenchmarksPart 8" (yungshengtu)
0c5e3dd Resolve "Introduce device_ptr to Benchmarks Part 2" (yungshengtu)
ca3b0cd Resolve "Add virtual shared memory fallback to device_merge" (NB4444)
cff2e16 Resolve "Add device-level inclusive_scan with initial value support"
a755431 Resolve "Make use of vectorized load in rocprim::transform" (jblok27)
300de6e Fix autotuning in benchmark_device_transform (NB4444)
ffd4887 Apply new benchmark abstraction - Part 9 (MyNameIsTrez)
dacfb1e Resolve "Introduce device_ptr to Benchmarks Part 4" (yungshengtu)
a88ed02 Resolve "Introduce device_ptr to Benchmarks Part 6" (yungshengtu)
2148abe Resolve "Autotune failure on benchmark_device_segmented_radix_sort_pa… (NB4444)
68354b7 Derive 'ROCPRIM_WAVEFRONT_SIZE' from architecture defines (MyNameIsTrez)
5f7accb Resolve "Change default scan accumulator type to be in line with (hip… (ApoorvaKalyani)
b135286 Apply new benchmark abstraction - Part 10 (MyNameIsTrez)
ff1b0c5 feat(arch.hpp): implement mechanism for wavefront size-based dispatching (Naraenda)
c203cff Deduplicate benchmark_device_binary_search (MyNameIsTrez)
684fd34 Put the Google Benchmark state in benchmark_utils::state (MyNameIsTrez)
fd2168f Abstract device_histogram's benchmarking (MyNameIsTrez)
bfd2e5f Resolve "Add SPIR-V to rocPRIM CI" (borysborys)
8825597 Resolve "Introduce device_ptr to Benchmarks Part 7" (yungshengtu)
19947dd Replace benchmark template with regular parameter (MyNameIsTrez)
d91b967 CI Fix spirv build benchmark and tests (NB4444)
6c747d3 Resolve "Fix compilation failure in hipCUB/rocThrust to rocPRIM." (yungshengtu)
53a1bea fix: fix various compile issues when targeting spir-v (Naraenda)
9d08154 Resolve "Check device_transform for cuda parity" (NB4444)
5f42dc3 Rename "tmp" to "unused" for clarity (MyNameIsTrez)
97e8f77 Resolve "Apply new benchmark abstraction to device_search_n" (ApoorvaKalyani)
87b473f Resolve "rocm 6.4 failures in rocprim" (NB4444)
2a54a85 Remove REGISTER_BENCHMARK(), config_autotune_interface, and config_au… (MyNameIsTrez)
b629596 Resolve "Replace all ROCPRIM_IF_CONSTEXPR with constexpr" (ApoorvaKalyani)
fdac589 Output JSON benchmark statistics (MyNameIsTrez)
e16e3da Fix benchmarking assert (MyNameIsTrez)
8c50b54 ci(.gitlab-ci.yml): add tests for spirv target (Naraenda)
1ed975a Resolve "SPIR-V: warp sort" (NB4444)
58e7c0b Initialize total_gbench_iterations and total_size to 0 (MyNameIsTrez)
a69a9bb Fix compile warning in thread_load for the new compiler (NB4444)
9fe68bc Stop repeating tests three times (MyNameIsTrez)
960461d Disable dispatching with macro for usage with spir-v (NB4444)
6ad6c2d Call non-static method properly (MyNameIsTrez)
dd360e9 Fix unintended benchmark JSON format changes (MyNameIsTrez)
c50cd7a Extra warp_sort check in tests (NB4444)
0b58176 Fix benchmarks that call set_throughput() more than once (MyNameIsTrez)
d1beef2 Lower benchmark_device_batch_memcpy from 1 KiB to 0 Bytes (MyNameIsTrez)
4239ef5 Resolve "Match CUB's behavior in rocPRIM for device merge" (sikba)
7caf280 Resolve "device_merge_sort custom_huge_type failing test" (NB4444)
527c24c fix(intrinsics/atomics.hpp): fix atomics when compiler to spirv (Naraenda)
668f913 Resolve "Create tests for rocPRIM's bit_cast" (Saiyang-Zhang)
5815656 fix: improve compatibility with spir-v target in algorithms using 'la… (Naraenda)
ab9dc0a Resolve "SPIR-V: warp reduce/scan" (Saiyang-Zhang)
9f0dcf1 Resolve "SPIR-V: block scan/reduce/RLD" (Saiyang-Zhang)
75820ee Resolve "Temporarily stop running device_partition test for SPIR-V du… (yungshengtu)
54802ef Resolve "fallback to host side input generation for device_run_length… (borysborys)
d3a8911 Resolve "SPIR-V: warp exchange/load/store" (yungshengtu)
cff88f8 Resolve "SPIR-V: block exchange/load/store (and funcs)" (yungshengtu)
b2bb04c Resolve "device_run_length_encode failing test" (borysborys)
9fc6fff test: fix clangd language server errors for tests that use generated … (Naraenda)
cc1c028 Resolve "SPIR-V: block radix rank/sort" (Saiyang-Zhang)
5daf2de Resolve "REVERT: fallback to host side input generation for device_ru… (borysborys)
4784b69 Added generic pragmas and created fallback for atomics (NB4444)
2db9f4e ci(.gitlab-ci.yml): add timeout to spirv tests (Naraenda)
1ed863a Resolve "SPIR-V: lookback_scan_state" (NB4444)
7da3c6a Resolve "Prepare to move 'lookback_scan' to public API" (parbenc)
9d6dd68 Clang format (NB4444)
ab1ef9e fix: skip including the init value in block aggregate for warp and bl… (Naraenda)
d7852b6 Merge commit 'ab1ef9ef1a47ff6799320a809ea969a17ffc15fc' into import/d… (assistant-librarian[bot])
578f5a1 CHANGELOG update (NB4444)
a0dedff Merge commit '578f5a170ed02febd132358675705968390eeb4c' into import/d… (jayhawk-commits)
cff99e3 Fix failing test device_scan (NB4444)
d210bc2 update rocprim version to 4.0.0 (NB4444)
ac6947e Merge commit 'd210bc2d2e0a28396233c4b757326c1a11933797' into import/d… (jayhawk-commits)
8ed4a9a Fix build error in benchmark_utils (NB4444)
d553a23 Merge commit '8ed4a9a79525ddc1fd793a03fa9d2118c5aecd87' into import/d… (jayhawk-commits)