ARROW-10510: [Rust] [DataFusion] Benchmark COUNT(DISTINCT) queries. #8606

drusso · 2020-11-06T18:55:19Z

This change adds benchmarks for COUNT(DISTINCT) queries. This is a small follow-up to ARROW-10043 / #8222. In that PR, a number of implementation ideas were discussed for follow-ups, and having benchmarks will help evaluate them.

There are two benchmarks added:

- wide: all of the values are distinct; this looks at worst-case performance
- narrow: only a handful of distinct values; this is closer to best-case performance

The wide benchmark runs roughly 7x slower than the narrow benchmark.
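To illustrate the difference between the two input shapes, here is a minimal sketch using only the Rust standard library. It is not the benchmark code from this PR and does not use DataFusion: the function names (`wide_values`, `narrow_values`, `count_distinct`, `timed_count_distinct`) and the `HashSet`-based counting are assumptions made for illustration only.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// "Wide" input (hypothetical helper): every value is distinct,
/// which is the worst case for a distinct count.
fn wide_values(n: usize) -> Vec<u64> {
    (0..n as u64).collect()
}

/// "Narrow" input (hypothetical helper): values cycle through only
/// `distinct` different values, which is closer to the best case.
fn narrow_values(n: usize, distinct: u64) -> Vec<u64> {
    assert!(distinct > 0, "need at least one distinct value");
    (0..n as u64).map(|i| i % distinct).collect()
}

/// Count distinct values by collecting them into a hash set.
/// This stands in for COUNT(DISTINCT) and is an assumption of this
/// sketch, not a description of DataFusion's implementation.
fn count_distinct(values: &[u64]) -> usize {
    values.iter().collect::<HashSet<_>>().len()
}

/// Run the count once and report how long it took.
fn timed_count_distinct(values: &[u64]) -> (usize, Duration) {
    let start = Instant::now();
    let count = count_distinct(values);
    (count, start.elapsed())
}
```

Calling `timed_count_distinct(&wide_values(n))` and `timed_count_distinct(&narrow_values(n, 4))` with the same `n` gives a rough feel for the gap between the two cases. A single timed run is noisy, so a real comparison should use a benchmark harness with repeated measurements.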

github-actions · 2020-11-06T19:04:08Z

https://issues.apache.org/jira/browse/ARROW-10510

nevi-me

LGTM

ARROW-10510: [Rust] [DataFusion] Benchmark COUNT(DISTINCT) queries.

57893b4

github-actions bot added Component: Rust - DataFusion Component: Rust labels Nov 6, 2020

drusso mentioned this pull request Nov 6, 2020

ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col) #8222

Closed

nevi-me approved these changes Nov 7, 2020

nevi-me closed this in eb42c50 Nov 7, 2020

asfimport mentioned this pull request Nov 7, 2020

[Rust] [DataFusion] Add benchmarks for COUNT(DISTINCT) #26480

Closed
