generate benchmark input in device by karthikeyann · Pull Request #10109 · rapidsai/cudf

karthikeyann · 2022-01-24T11:56:21Z

To speed up benchmark input generation, move all data generation to the device.
This addresses #5773 (comment).
This PR moves the random input generation to the device.
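
As a rough illustration of index-based generation, here is a minimal host-only sketch in standard C++ (it is not the Thrust code from this PR). `value_at_index` and `generate_values` are hypothetical names. The idea, assumed here rather than taken from the diff, is that each element copies the seeded engine and skips ahead by its own index, so every element can be computed independently, the way a `thrust::tabulate` call would on the device.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical functor: produces the value for one element from its index alone.
struct value_at_index {
  std::minstd_rand engine;  // seeded once, copied per element
  double lower;
  double upper;

  double operator()(std::size_t idx) const
  {
    std::minstd_rand local = engine;  // private copy, so no shared state
    local.discard(idx);               // skip ahead by the element index
    std::uniform_real_distribution<double> dist(lower, upper);
    return dist(local);
  }
};

// Hypothetical helper: fills a host vector by calling the functor per index.
// On the device, the loop would be replaced by a parallel tabulate-style call.
std::vector<double> generate_values(std::size_t size, unsigned seed, double lower, double upper)
{
  value_at_index gen{std::minstd_rand(seed), lower, upper};
  std::vector<double> result(size);
  for (std::size_t i = 0; i < size; ++i) {
    result[i] = gen(i);
  }
  return result;
}
```

Because each value depends only on the seed and the index, the same seed reproduces the same data regardless of evaluation order. Skipping ahead with the standard library engine may take time proportional to the index, so this is for illustration only.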

The rest of the original work in this PR was split into multiple PRs, which have been merged:
#10277
#10278
#10279
#10280
#10281
#10300

With all of these changes, a single iteration of all benchmarks runs in under 1000 seconds (down from 3067 s to 964 s, roughly a 3.2x speedup).
Running more iterations should show an even larger benefit, because the benchmark is restarted several times during a run and each restart calls the benchmark input generation code again.

closes #9857

codecov · 2022-01-24T13:34:11Z

Codecov Report

Merging #10109 (373b08d) into branch-22.04 (4596244) will increase coverage by 0.02%.
The diff coverage is 96.86%.

❗ Current head 373b08d differs from pull request most recent head 0810425. Consider uploading reports for the commit 0810425 to get more accurate results

@@               Coverage Diff                @@
##           branch-22.04   #10109      +/-   ##
================================================
+ Coverage         86.13%   86.16%   +0.02%     
================================================
  Files               139      139              
  Lines             22438    22447       +9     
================================================
+ Hits              19328    19341      +13     
+ Misses             3110     3106       -4

Impacted Files	Coverage Δ
python/dask_cudf/dask_cudf/tests/test_accessor.py	`98.41% <ø> (ø)`
python/cudf/cudf/core/_base_index.py	`85.92% <50.00%> (-0.51%)`	⬇️
python/cudf/cudf/core/column/decimal.py	`91.30% <73.68%> (-1.01%)`	⬇️
python/cudf/cudf/core/column/categorical.py	`89.63% <84.61%> (-0.29%)`	⬇️
python/cudf/cudf/core/column/string.py	`88.91% <94.44%> (+0.64%)`	⬆️
python/cudf/cudf/core/column/numerical.py	`95.62% <95.83%> (+0.64%)`	⬆️
python/cudf/cudf/core/frame.py	`91.84% <96.42%> (+0.12%)`	⬆️
python/cudf/cudf/_typing.py	`94.11% <100.00%> (+0.78%)`	⬆️
python/cudf/cudf/api/types.py	`89.79% <100.00%> (ø)`
python/cudf/cudf/core/column/column.py	`89.27% <100.00%> (+0.10%)`	⬆️
... and 26 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4596244...0810425. Read the comment docs.

karthikeyann · 2022-03-16T18:02:25Z

rerun tests

vyasr

Aside from my request for documenting the normal/binomial approximation, I think this is good enough to merge on my end. Thanks again!

vyasr · 2022-03-16T22:16:00Z

@karthikeyann pointed out that the binomial has been removed; I think I was getting linked back to old versions of the code when verifying those parts. I think we're set.

vuule

Concern about the geometric distribution, plus some nitpicks.

cpp/benchmarks/common/generate_input.hpp

cpp/benchmarks/common/random_distribution_factory.hpp

vuule · 2022-03-16T21:37:06Z

cpp/benchmarks/common/random_distribution_factory.hpp

+        [lower_bound, upper_bound, dist = make_normal_dist(diffType{0}, upper_bound - lower_bound)](
+          thrust::minstd_rand& engine, size_t size) -> rmm::device_uvector<T> {
+          rmm::device_uvector<T> result(size, rmm::cuda_stream_default);
+          thrust::tabulate(thrust::device,
+                           result.begin(),
+                           result.end(),
+                           abs_value_generator{lower_bound, upper_bound, engine, dist});


I don't think this emulates a geometric distribution. At least it didn't add up with some sample lower_bound and upper_bound values I tried on paper.
AFAICT we need a normal distribution with mean = 0, so we can use abs to make half of the bell. Then these values need to be moved/inverted so that the tip of the bell is at lower_bound, and probability falls towards upper_bound.
We can maybe leave this as TODO, but it might affect benchmarks in the meantime.

Updated. Added a geometric distribution.
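
For reference, the half-bell construction described in the comment above can be sketched on the host as follows. This is a minimal illustration in standard C++, not the code that was merged. `sample_half_normal` is a hypothetical name, and the standard deviation of a quarter of the range is an arbitrary assumption made only so that most samples fall inside the bounds.

```cpp
#include <algorithm>
#include <cmath>
#include <random>

// Hypothetical helper: draws from a normal distribution with mean 0, folds it
// with abs() to keep half of the bell, then shifts it so the peak sits at
// lower_bound and the probability falls off towards upper_bound.
template <typename Engine>
double sample_half_normal(Engine& engine, double lower_bound, double upper_bound)
{
  double const range = upper_bound - lower_bound;
  if (range <= 0.0) { return lower_bound; }  // degenerate range: nothing to sample

  // Assumption: stddev = range / 4, so values beyond upper_bound are rare.
  std::normal_distribution<double> dist(0.0, range / 4.0);
  double const offset = std::abs(dist(engine));

  // Clamp the rare tail values back into [lower_bound, upper_bound].
  return std::min(std::max(lower_bound + offset, lower_bound), upper_bound);
}
```

With bounds of 10 and 50, for example, the samples cluster near 10 and thin out towards 50, which is the shape the comment asks for. Clamping piles the small tail mass onto `upper_bound`, so this remains an approximation.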

cpp/benchmarks/copying/contiguous_split.cu

cpp/benchmarks/common/generate_input.cu

karthikeyann · 2022-03-18T15:54:17Z

rerun tests

karthikeyann · 2022-03-21T09:01:51Z

rerun tests

vuule

Thank you for addressing all review comments!
Looks 🔥 🔥

cpp/benchmarks/common/generate_input.cu

cpp/benchmarks/common/random_distribution_factory.hpp

karthikeyann · 2022-03-22T13:42:59Z

@gpucibot merge

karthikeyann · 2022-03-22T13:49:12Z

Thank you @vuule, @vyasr and @davidwendt for reviewing this big PR! 💯

karthikeyann added 3 commits January 21, 2022 09:45

rename generate_benchmark_input.cpp to .cu

51fe975

update generator lambdas to use thrust::random

57b0400

use thrust random generators: numeric, chrono, fixed_point, string

df2b986

karthikeyann added feature request New feature or request 2 - In Progress Currently a work in progress tests Unit testing for project cuda libcudf Affects libcudf (C++/CUDA) code. Performance Performance related issue non-breaking Non-breaking change labels Jan 24, 2022

karthikeyann added this to the C++ Benchmark Runtime Improvements milestone Jan 24, 2022

karthikeyann self-assigned this Jan 24, 2022

github-actions bot added the CMake label Jan 24, 2022

karthikeyann added 2 commits January 24, 2022 17:31

rename copy_benchmark.cpp to cu (thrust code)

501bbca

disable debug print, env iterations in gbench fixture

58a6d38

karthikeyann requested a review from davidwendt January 24, 2022 16:50

karthikeyann and others added 13 commits January 25, 2022 11:11

Merge branch 'branch-22.04' into fea-benchmark_speedup2

211778a

fix bug in bounds

759f967

fix static shared_ptr bug

c5f263a

use generator in anyall_benchmark.cpp

1b36384

use generator in minmax_benchmark.cpp

114b9aa

use generator in reduce_benchmark.cpp

8c27985

use thrust::shuffle in string/copy_benchmark.cu

709bf0d

Merge branch 'branch-22.04' of github.com:rapidsai/cudf into fea-benchmark_speedup2

cc1d7dc

rename to copy.cu

a96dd03

remove copy_benchmark.cu

dab9c59

revert generate_input.cpp

5957f58

rename to generate_input.cu

ca04d2e

recheckin generate_input.cu changes with old commits

c31fcc2

karthikeyann added 5 commits March 15, 2022 22:47

Merge branch 'branch-22.04' of github.com:rapidsai/cudf into fea-benchmark_speedup2

38ee90b

add std::clamp to value_generator

156ac3f

static_cast fix

c5b3c47

update more benchmarks input gen to device

7d17ea5

std::optional for null_frequency

de3b6e8

karthikeyann requested review from vuule and vyasr March 16, 2022 00:31

vyasr approved these changes Mar 16, 2022

vuule requested changes Mar 16, 2022

karthikeyann added 3 commits March 18, 2022 14:53

null probablilty float to double

d02c2fb

add geometric distribution approximation

5b59642

more comments addressed

1e393a5

karthikeyann requested a review from vuule March 18, 2022 15:48

vuule approved these changes Mar 21, 2022

davidwendt reviewed Mar 22, 2022

cpp/benchmarks/common/generate_input.cu (outdated, resolved)

davidwendt requested changes Mar 22, 2022

cpp/benchmarks/common/random_distribution_factory.hpp (resolved)

addressing review commments (davidwendt)

0810425

karthikeyann requested a review from davidwendt March 22, 2022 12:19

davidwendt approved these changes Mar 22, 2022

karthikeyann added the 5 - Ready to Merge label and removed the 3 - Ready for Review and 4 - Needs Review labels Mar 22, 2022

rapids-bot bot merged commit 76c772e into rapidsai:branch-22.04 Mar 22, 2022

karthikeyann mentioned this pull request Mar 22, 2022

[DISCUSSION] Lower Google Benchmark Suite Runtime #5773

Closed
