Introduce Thrust benchmarks #534

Merged: 2 commits, Oct 13, 2023


gevtushenko
Collaborator

Description

closes #473

This PR:

  • Ports the existing Thrust benchmarks into the CCCL repository
  • Extends the Thrust benchmarks to work with any given device system (CUDA, CPP, OMP, etc.)
  • Adds optional tests for the benchmark data generators (enabled with CUB_ENABLE_NVBENCH_HELPER_TESTS)
  • Refactors the benchmark data generator API
  • Makes the benchmark data generators heterogeneous
  • Decouples the tuning infrastructure from CUB and associates it with CCCL instead

To run the Thrust benchmarks, it is sufficient to configure the build with benchmarks enabled and invoke the run script:

$: cmake -DCCCL_ENABLE_THRUST=YES \
      -DCCCL_ENABLE_CUB=YES \
      -DCCCL_ENABLE_LIBCUDACXX=NO \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_CUDA_ARCHITECTURES=89 \
      -DCCCL_ENABLE_BENCHMARKS=YES .. 
$: ../benchmarks/scripts/run.py -R 'thrust.*'
&&&& RUNNING bench
 ctk:  12.2.140
cccl:  monorepo-309-g9ce36e3b7
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__16 1.0048000149254221e-05 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__20 0.00013096000475343317 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__24 0.00017212801321875304 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__28 0.0007995199994184077 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I16___Elements_pow2__16 1.0239999937766697e-05 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I16___Elements_pow2__20 0.0001341439929092303 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I16___Elements_pow2__24 0.00021606399968732148 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I16___Elements_pow2__28 0.0014510079054161906 -sec
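The `&&&& PERF <name> <value> -sec` lines above are regular enough to post-process. The following is a minimal sketch, not part of this PR, of how the output could be parsed and turned into a rough throughput figure. The helper names `parse_perf_lines` and `elements_per_second` are hypothetical, and the sketch assumes that an `Elements_pow2__N` suffix means the benchmark ran on 2**N input elements.

```python
import re

# Matches lines such as:
#   &&&& PERF <benchmark_name> <seconds> -sec
PERF_LINE = re.compile(r"^&&&& PERF (\S+) (\S+) -sec$")

# Assumption: an "Elements_pow2__N" suffix means the input held 2**N elements.
ELEMENTS_SUFFIX = re.compile(r"Elements_pow2__(\d+)$")


def parse_perf_lines(text):
    """Return {benchmark_name: seconds} for every PERF line in `text`."""
    results = {}
    for line in text.splitlines():
        match = PERF_LINE.match(line.strip())
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def elements_per_second(name, seconds):
    """Return throughput in elements/s, or None if the name has no size suffix."""
    match = ELEMENTS_SUFFIX.search(name)
    if match is None:
        return None
    return 2 ** int(match.group(1)) / seconds


# Two lines copied from the run above, plus a non-PERF line that is skipped.
sample = """\
&&&& RUNNING bench
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__16 1.0048000149254221e-05 -sec
&&&& PERF thrust_bench_adjacent_difference_basic_base_T_ct__I8___Elements_pow2__28 0.0007995199994184077 -sec
"""

for name, seconds in parse_perf_lines(sample).items():
    print(f"{name}: {elements_per_second(name, seconds):.3e} elements/s")
```

Lines that do not match the PERF pattern, such as the `RUNNING`, `ctk` and `cccl` header lines, are ignored, so the full output of the script can be passed in unchanged.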

The Thrust multiconfig setup is used to build the benchmarks for the different device systems.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@gevtushenko gevtushenko requested review from a team as code owners October 10, 2023 19:31
@gevtushenko gevtushenko requested review from wmaxey, miscco and elstehle and removed request for a team October 10, 2023 19:31
@gevtushenko gevtushenko mentioned this pull request Oct 10, 2023
2 tasks
@jrhemstad jrhemstad requested a review from alliepiper October 11, 2023 16:40
Contributor

@elstehle elstehle left a comment


Thanks for the great work! That's an awesome step up for our benchmarking infrastructure!

Just a few, mostly minor, comments/suggestions.

@gevtushenko gevtushenko merged commit fcbe255 into NVIDIA:main Oct 13, 2023
Linked issue: [FEA]: Port Thrust benchmarks into CCCL repo
5 participants