@Dandandan Dandandan commented Dec 27, 2020

This increases the default batch size 8x, from 4096 to 32768, as it improves the performance of quite a few operations.
I increased the size until performance stopped improving on my machine. Note that CSV reading is also faster with larger batches on the larger data sources.
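
One reason a larger batch size can help is that the same table is split into fewer batches, so any fixed per-batch overhead is paid less often. The sketch below only illustrates that arithmetic; it is not code from this PR. The `num_batches` helper and the row count are hypothetical and chosen purely for illustration.

```rust
/// Number of batches needed to cover `rows` rows when each batch
/// holds at most `batch_size` rows (ceiling division).
/// Hypothetical helper, not part of the PR.
fn num_batches(rows: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch size must be positive");
    (rows + batch_size - 1) / batch_size
}

fn main() {
    // Assumed row count, for illustration only.
    let rows = 6_000_000;
    // Old and new default batch sizes from the PR description.
    for batch_size in [4096, 32768] {
        println!(
            "batch_size={}: {} batches",
            batch_size,
            num_batches(rows, batch_size)
        );
    }
}
```

With an 8x larger batch size, the number of batches drops by roughly 8x, which is where the reduction in per-batch overhead would come from.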

This PR:

Loading table 'part' into memory
Loaded table 'part' into memory in 125 ms
Loading table 'supplier' into memory
Loaded table 'supplier' into memory in 10 ms
Loading table 'partsupp' into memory
Loaded table 'partsupp' into memory in 381 ms
Loading table 'customer' into memory
Loaded table 'customer' into memory in 126 ms
Loading table 'orders' into memory
Loaded table 'orders' into memory in 961 ms
Loading table 'lineitem' into memory
Loaded table 'lineitem' into memory in 6382 ms
Loading table 'nation' into memory
Loaded table 'nation' into memory in 2 ms
Loading table 'region' into memory
Loaded table 'region' into memory in 2 ms
Query 12 iteration 0 took 220.2 ms
Query 12 iteration 1 took 223.2 ms
Query 12 iteration 2 took 222.4 ms
Query 12 iteration 3 took 222.2 ms
Query 12 iteration 4 took 221.8 ms
Query 12 iteration 5 took 222.0 ms
Query 12 iteration 6 took 223.1 ms
Query 12 iteration 7 took 223.7 ms
Query 12 iteration 8 took 222.5 ms
Query 12 iteration 9 took 222.9 ms
Query 12 avg time: 222.40 ms

Master:

Loading table 'part' into memory
Loaded table 'part' into memory in 116 ms
Loading table 'supplier' into memory
Loaded table 'supplier' into memory in 7 ms
Loading table 'partsupp' into memory
Loaded table 'partsupp' into memory in 386 ms
Loading table 'customer' into memory
Loaded table 'customer' into memory in 115 ms
Loading table 'orders' into memory
Loaded table 'orders' into memory in 1048 ms
Loading table 'lineitem' into memory
Loaded table 'lineitem' into memory in 7673 ms
Loading table 'nation' into memory
Loaded table 'nation' into memory in 0 ms
Loading table 'region' into memory
Loaded table 'region' into memory in 0 ms
Query 12 iteration 0 took 596.1 ms
Query 12 iteration 1 took 602.0 ms
Query 12 iteration 2 took 608.1 ms
Query 12 iteration 3 took 607.9 ms
Query 12 iteration 4 took 613.5 ms
Query 12 iteration 5 took 615.3 ms
Query 12 iteration 6 took 611.6 ms
Query 12 iteration 7 took 609.8 ms
Query 12 iteration 8 took 615.7 ms
Query 12 iteration 9 took 616.9 ms
Query 12 avg time: 609.68 ms

Query 1 also improves, but by a smaller margin.

This PR:

Query 1 iteration 0 took 653.0 ms
Query 1 iteration 1 took 653.4 ms
Query 1 iteration 2 took 652.3 ms
Query 1 iteration 3 took 658.9 ms
Query 1 iteration 4 took 655.1 ms
Query 1 iteration 5 took 662.0 ms
Query 1 iteration 6 took 659.7 ms
Query 1 iteration 7 took 662.7 ms
Query 1 iteration 8 took 669.0 ms
Query 1 iteration 9 took 665.7 ms
Query 1 avg time: 659.19 ms

Master:

Query 1 iteration 0 took 708.8 ms
Query 1 iteration 1 took 714.5 ms
Query 1 iteration 2 took 700.4 ms
Query 1 iteration 3 took 713.7 ms
Query 1 iteration 4 took 707.5 ms
Query 1 iteration 5 took 727.8 ms
Query 1 iteration 6 took 727.9 ms
Query 1 iteration 7 took 721.3 ms
Query 1 iteration 8 took 717.3 ms
Query 1 iteration 9 took 729.4 ms
Query 1 avg time: 716.85 ms
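
To put the reported averages side by side, the ratio of the master average to the PR average can be computed directly. This is a small sketch using only the four `avg time` values from the logs above; the `speedup` helper is a hypothetical name introduced here.

```rust
/// Speedup factor of a new timing relative to a baseline timing.
/// Hypothetical helper for comparing the averages reported above.
fn speedup(baseline_ms: f64, new_ms: f64) -> f64 {
    baseline_ms / new_ms
}

fn main() {
    // Average times (ms) copied from the benchmark output: master vs. this PR.
    let q12 = speedup(609.68, 222.40);
    let q1 = speedup(716.85, 659.19);
    println!("Query 12 speedup: {:.2}x", q12);
    println!("Query 1 speedup: {:.2}x", q1);
}
```

On these numbers, Query 12 comes out at roughly 2.7x faster and Query 1 at roughly 1.09x faster.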

@jorgecarleitao jorgecarleitao left a comment

LGTM. Thanks @Dandandan
