@Dandandan (Contributor) commented Dec 31, 2020

As mentioned in #9048, it is wasteful to calculate the column indices for every batch. This PR instead calculates them only once, up front.

As expected, this doesn't seem to have a large impact on performance, but it seems to give a small improvement when using smaller batch sizes.
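
For illustration, the idea can be sketched roughly as follows. This is a simplified, self-contained sketch and not the code in this PR: `ColumnIndex`, `column_indices`, and `build_batch` are hypothetical names, and plain `Vec<i64>` columns stand in for Arrow arrays.

```rust
/// Where an output column comes from: the left or right input, and at which position.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct ColumnIndex {
    is_left: bool,
    index: usize,
}

/// Resolve every output column name to its source side and position.
/// This performs string lookups, so it is the part worth doing only once.
/// Returns `None` if a name is found in neither input.
fn column_indices(output: &[&str], left: &[&str], right: &[&str]) -> Option<Vec<ColumnIndex>> {
    output
        .iter()
        .map(|name| {
            if let Some(index) = left.iter().position(|c| c == name) {
                // Prefer the left side when both inputs contain the name.
                Some(ColumnIndex { is_left: true, index })
            } else {
                right
                    .iter()
                    .position(|c| c == name)
                    .map(|index| ColumnIndex { is_left: false, index })
            }
        })
        .collect()
}

/// Assemble one output batch from the precomputed indices.
/// No name lookups happen here, only positional access.
fn build_batch(indices: &[ColumnIndex], left: &[Vec<i64>], right: &[Vec<i64>]) -> Vec<Vec<i64>> {
    indices
        .iter()
        .map(|ci| {
            if ci.is_left {
                left[ci.index].clone()
            } else {
                right[ci.index].clone()
            }
        })
        .collect()
}
```

In this sketch, a join stream would call `column_indices` once when it is constructed and then pass the result to `build_batch` for every incoming batch, so the per-batch work no longer includes name lookups. The saving per batch is small, which fits the observation that the effect only shows up with smaller batch sizes, where there are more batches to process.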

@codecov-io commented Dec 31, 2020

Codecov Report

Merging #9059 (5b6bf83) into master (cc0ee5e) will decrease coverage by 0.00%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master    #9059      +/-   ##
==========================================
- Coverage   82.55%   82.55%   -0.01%     
==========================================
  Files         203      203              
  Lines       50043    50054      +11     
==========================================
+ Hits        41313    41322       +9     
- Misses       8730     8732       +2     
Impacted Files                                    Coverage Δ
rust/datafusion/src/physical_plan/hash_join.rs    89.68% <85.71%> (-0.26%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cc0ee5e...5b6bf83.

@Dandandan (Contributor, Author) commented

Fixed the linting error in #9061

@kou changed the title from "ARROW-11088: [Rust][DataFusion[ Calculate column indices upfront in hash join" to "ARROW-11088: [Rust][DataFusion] Calculate column indices upfront in hash join" on Dec 31, 2020

@Dandandan (Contributor, Author) commented

This can be combined into #9070 to reduce the number of open PRs.

@Dandandan closed this on Jan 2, 2021