Skip to content

Transformations of Existential Subqueries using Early-out Joins#18381

Merged
rschlussel merged 2 commits intoprestodb:masterfrom
vivek-bharathan:earlyoutjointransformations
Nov 14, 2022
Merged

Transformations of Existential Subqueries using Early-out Joins#18381
rschlussel merged 2 commits intoprestodb:masterfrom
vivek-bharathan:earlyoutjointransformations

Conversation

@vivek-bharathan
Copy link
Contributor

@vivek-bharathan vivek-bharathan commented Sep 21, 2022

Fixes #17927

Benchmark results on TPC-DS data set with scale factors sf100 and sf1000
Summary - 9/100 queries affected (plans changed) as a result of this feature.
Ignoring variance of less that 5%, we see that queries q58, q83, q95 are consistently improved

sf100 run with Benchto on AWS (r5.4xlarge, 4 workers + coordinator, 16vCPU, 128G RAM).

TPCDS Query Id   eoj enabled (ms) baseline (ms) % improved ((baseline - eoj) *100/baseline)
         
q23_1   27186 27344 1
q23_2   26044 27368 5
q33   2749 2675 -3
q54   5757 6229 8
q56   2720 2665 -2
a58   6383 7527 15
a60   2787 2714 -3
q83   2630 2879 9
q95   4020 9242 57
Sum   80276 88643 9

sf1000 run with Benchto on AWS (r5.4xlarge, 8 workers + coordinator, 128vCPU, 1024G RAM).

TPCDS Query Id   eoj (ms) baseline (ms) % improved ((baseline - eoj) *100/baseline)
         
q23_1   248859 248411 0
q23_2   251982 251775 0
q33   17545 16648 -5
q54   19654 20292 3
q56   17392 16668 -4
a58   49764 57701 14
a60   18155 17694 -3
q83   9614 10280 6
q95   18244 97085 81
Sum   651209 736554 12

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch 2 times, most recently from 1d7469f to 8db47f8 Compare September 24, 2022 01:20
@vivek-bharathan vivek-bharathan marked this pull request as ready for review September 27, 2022 20:07
@vivek-bharathan vivek-bharathan requested a review from a team as a code owner September 27, 2022 20:07
@vivek-bharathan vivek-bharathan requested review from presto-oss and rschlussel and removed request for rschlussel September 27, 2022 20:07
@kaikalur
Copy link
Contributor

kaikalur commented Oct 4, 2022

Can we start by having some benchmarks first? Maybe sqlebenchmark can be a good start to demonstrate the effectiveness of this optimization.

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch 3 times, most recently from 504ecc0 to 2e4d267 Compare October 12, 2022 00:36
@vivek-bharathan
Copy link
Contributor Author

Can we start by having some benchmarks first? Maybe sqlebenchmark can be a good start to demonstrate the effectiveness of this optimization.

I've added results from a tpcds benchmark run in the PR description. Let me know if you still think SQLBenchmarks are preferable

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 2e4d267 to 630275c Compare October 12, 2022 02:11
@ethanyzhang ethanyzhang requested a review from aaneja October 13, 2022 17:20
@ethanyzhang
Copy link
Contributor

Hi @aaneja , can you help take a look at this PR?
@ClarenceThreepwood there are some conflicts in your branch with master.

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 630275c to ef0d23e Compare October 13, 2022 17:58
@mbasmanova
Copy link
Contributor

@ClarenceThreepwood Vivek, thank you for sharing the TPC-DS results. It would be helpful to add % change column to the table. It looks like q95 is 2x faster, but the rest of the queries are about the same or slower (q33, q56, q60). Is this expected? Do you think the slowness of the 3 queries is just noise?

Copy link
Contributor

@aaneja aaneja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there a few other TPCDS queries that use semijoin, namely q14_1, q14_2, q70. Could you check using presto-benchto-benchmarks that these indeed did not have any plan changes and is this expected ?

@vivek-bharathan
Copy link
Contributor Author

@ClarenceThreepwood Vivek, thank you for sharing the TPC-DS results. It would be helpful to add % change column to the table. It looks like q95 is 2x faster, but the rest of the queries are about the same or slower (q33, q56, q60). Is this expected? Do you think the slowness of the 3 queries is just noise?

Good catch @mbasmanova. After more runs of the benchmarks and also running it at a higher scale factor (sf1000), I agree that the minor differences in performance are just noise (variance in S3 read latency also contributes to the noise). I've updated the description above with the latest runs and interpretations.

@mbasmanova
Copy link
Contributor

@ClarenceThreepwood Thank you, Vivek. Would it make sense to update the % comment to show percentages as whole numbers. I don't think we care about fractional percents and having lots of numbers after period makes it a bit difficult to read.

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch 2 times, most recently from 9348706 to 4227bf8 Compare October 24, 2022 21:43
@aaneja
Copy link
Contributor

aaneja commented Oct 25, 2022

LGTM

@ethanyzhang
Copy link
Contributor

@prestodb/committers Could we get a review on this?

@vivek-bharathan
Copy link
Contributor Author

@rschlussel - Would you mind giving this PR a once-over?

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 5564398 to 86cb07e Compare November 1, 2022 21:00
@vivek-bharathan
Copy link
Contributor Author

I see that there are a few cases listed in #17927 where this optimization can/cannot benefit from. Can you write an sql benchmark similar to existing ones in https://github.com/prestodb/presto/tree/master/presto-benchmark/src/main/java/com/facebook/presto/benchmark with all cases to demonstrate the performance change.

I can do that. Just curious what the benefit of doing that is compared to running on standardized benchmarks. Maybe explore/include more query shapes in the sql benchmark?

The benefit is that it gives a handy and lightweight way to check the performance of a specific optimization rule

explore/include more query shapes in the sql benchmark

Can you give a pointer to what sql benchmark you refer to here?

I just meant the AbstractSqlBenchmark inheritors in general vs the standardized benchmarks.

I've added a new suite - SqlEarlyOutJoinsBenchmarks - the results are as follows

Summary:

INFO: benchmarkTransformDistinctInnerJoinToLeftEarlyOutJoin
INFO: Without optimization
early_out_joins :: 1516.799 cpu ms :: 2.49MB peak memory :: in 60.2K, 0B, 39.7K/s, 0B/s :: out 7, 63B, 4/s, 41B/s

INFO: With optimization
early_out_joins :: 59.898 cpu ms :: 0B peak memory :: in 60.2K, 0B, 1.01M/s, 0B/s :: out 7, 63B, 116/s, 1.03KB/s

INFO: benchmarkTransformDistinctInnerJoinToRightEarlyOutJoin
INFO: Without optimization
early_out_joins :: 1618.802 cpu ms :: 8.51MB peak memory :: in 75.2K, 0B, 46.4K/s, 0B/s :: out 60.2K, 2.33MB, 37.2K/s, 1.44MB/s

INFO: With optimization
early_out_joins :: 160.980 cpu ms :: 6.14MB peak memory :: in 75.2K, 0B, 467K/s, 0B/s :: out 60.2K, 2.33MB, 374K/s, 14.5MB/s

INFO: benchmarkInPredicateToDistinctInnerJoin
INFO: Case 1: Rewrite IN predicate to distinct + inner join
INFO: Without optimization
early_out_joins :: 18.525 cpu ms :: 0B peak memory :: in 1, 0B, 53/s, 0B/s :: out 1, 2.37KB, 53/s, 128KB/s
INFO: With optimization: case 1
early_out_joins :: 20.238 cpu ms :: 0B peak memory :: in 1, 0B, 49/s, 0B/s :: out 1, 3.42KB, 49/s, 169KB/s
INFO: With optimization: case 2
early_out_joins :: 20.579 cpu ms :: 0B peak memory :: in 1, 0B, 48/s, 0B/s :: out 1, 3.42KB, 48/s, 166KB/s

Raw Results

Nov 01, 2022 1:55:44 PM com.facebook.airlift.log.Logger info
INFO: benchmarkTransformDistinctInnerJoinToLeftEarlyOutJoin
Nov 01, 2022 1:55:44 PM com.facebook.airlift.log.Logger info
INFO: Without optimization
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:55:46 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:2607357,elapsed_millis:1534,input_rows_per_second:39246,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1533884463,cpu_nanos:1524698000,user_nanos:1512086000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1535,input_rows_per_second:39228,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1534616992,cpu_nanos:1519503000,user_nanos:1510009000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1520,input_rows_per_second:39594,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1520417322,cpu_nanos:1509597000,user_nanos:1506234000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1535,input_rows_per_second:39219,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1534952987,cpu_nanos:1515527000,user_nanos:1508865000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1541,input_rows_per_second:39058,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1541273124,cpu_nanos:1527733000,user_nanos:1518186000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1536,input_rows_per_second:39184,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1536331721,cpu_nanos:1526734000,user_nanos:1522622000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1530,input_rows_per_second:39355,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1529639932,cpu_nanos:1512256000,user_nanos:1505532000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1533,input_rows_per_second:39281,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1532529168,cpu_nanos:1520174000,user_nanos:1516197000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1510,input_rows_per_second:39861,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1510217905,cpu_nanos:1500184000,user_nanos:1496087000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:2607357,elapsed_millis:1531,input_rows_per_second:39314,output_rows_per_second:4,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1531248109,cpu_nanos:1511582000,user_nanos:1505135000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
early_out_joins :: 1516.799 cpu ms :: 2.49MB peak memory :: in 60.2K, 0B, 39.7K/s, 0B/s :: out 7, 63B, 4/s, 41B/s
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: With optimization
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:21 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:0,elapsed_millis:59,input_rows_per_second:1012005,output_rows_per_second:117,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:59485866,cpu_nanos:59280000,user_nanos:58608000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:59,input_rows_per_second:1026269,output_rows_per_second:119,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:58659066,cpu_nanos:58448000,user_nanos:57751000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:57,input_rows_per_second:1061895,output_rows_per_second:123,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:56691062,cpu_nanos:56688000,user_nanos:56121000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:57,input_rows_per_second:1050260,output_rows_per_second:122,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:57319116,cpu_nanos:57168000,user_nanos:56479000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:62,input_rows_per_second:971650,output_rows_per_second:112,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:61956456,cpu_nanos:61107000,user_nanos:59980000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:68,input_rows_per_second:887436,output_rows_per_second:103,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:67835834,cpu_nanos:67226000,user_nanos:66400000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:58,input_rows_per_second:1030694,output_rows_per_second:119,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:58407219,cpu_nanos:58297000,user_nanos:57583000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:58,input_rows_per_second:1039067,output_rows_per_second:120,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:57936561,cpu_nanos:57681000,user_nanos:56990000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:64,input_rows_per_second:937367,output_rows_per_second:108,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:64222400,cpu_nanos:63639000,user_nanos:62520000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
peak_memory:0,elapsed_millis:61,input_rows_per_second:992511,output_rows_per_second:115,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:60654215,cpu_nanos:59443000,user_nanos:58664000,input_rows:60200,input_bytes:0,output_rows:7,output_bytes:63
early_out_joins :: 59.898 cpu ms :: 0B peak memory :: in 60.2K, 0B, 1.01M/s, 0B/s :: out 7, 63B, 116/s, 1.03KB/s
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: benchmarkTransformDistinctInnerJoinToRightEarlyOutJoin
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: Without optimization
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:22 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:8928065,elapsed_millis:1624,input_rows_per_second:46296,output_rows_per_second:37058,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1623765152,cpu_nanos:1611443000,user_nanos:1607848000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1630,input_rows_per_second:46128,output_rows_per_second:36924,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1629668980,cpu_nanos:1619309000,user_nanos:1611305000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1648,input_rows_per_second:45605,output_rows_per_second:36505,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1648368121,cpu_nanos:1638546000,user_nanos:1634748000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1639,input_rows_per_second:45868,output_rows_per_second:36715,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1638926628,cpu_nanos:1618188000,user_nanos:1614510000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1645,input_rows_per_second:45712,output_rows_per_second:36590,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1644526889,cpu_nanos:1626430000,user_nanos:1620587000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1632,input_rows_per_second:46075,output_rows_per_second:36881,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1631543771,cpu_nanos:1614447000,user_nanos:1610549000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1644,input_rows_per_second:45729,output_rows_per_second:36604,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1643896997,cpu_nanos:1629702000,user_nanos:1623215000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1626,input_rows_per_second:46230,output_rows_per_second:37005,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1626103372,cpu_nanos:1611032000,user_nanos:1607450000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1628,input_rows_per_second:46170,output_rows_per_second:36957,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1628196266,cpu_nanos:1615380000,user_nanos:1609349000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:8928065,elapsed_millis:1617,input_rows_per_second:46481,output_rows_per_second:37206,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:1617295790,cpu_nanos:1603544000,user_nanos:1600554000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
early_out_joins :: 1618.802 cpu ms :: 8.51MB peak memory :: in 75.2K, 0B, 46.4K/s, 0B/s :: out 60.2K, 2.33MB, 37.2K/s, 1.44MB/s
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: With optimization
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:55 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:6438953,elapsed_millis:158,input_rows_per_second:476106,output_rows_per_second:381100,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:157895463,cpu_nanos:157593000,user_nanos:156860000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:163,input_rows_per_second:461253,output_rows_per_second:369211,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:162979929,cpu_nanos:162224000,user_nanos:161444000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:163,input_rows_per_second:461614,output_rows_per_second:369500,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:162852460,cpu_nanos:162380000,user_nanos:161326000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:155,input_rows_per_second:484512,output_rows_per_second:387829,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:155155881,cpu_nanos:154820000,user_nanos:154092000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:163,input_rows_per_second:461437,output_rows_per_second:369358,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:162914735,cpu_nanos:162074000,user_nanos:161027000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:166,input_rows_per_second:452285,output_rows_per_second:362032,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:166211540,cpu_nanos:162295000,user_nanos:161299000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:162,input_rows_per_second:465327,output_rows_per_second:372472,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:161553041,cpu_nanos:161160000,user_nanos:159924000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:162,input_rows_per_second:465341,output_rows_per_second:372484,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:161547871,cpu_nanos:161020000,user_nanos:160117000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:167,input_rows_per_second:449185,output_rows_per_second:359551,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:167358543,cpu_nanos:166899000,user_nanos:165852000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
peak_memory:6438953,elapsed_millis:160,input_rows_per_second:470332,output_rows_per_second:376479,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:159833585,cpu_nanos:159336000,user_nanos:158446000,input_rows:75175,input_bytes:0,output_rows:60174,output_bytes:2440790
early_out_joins :: 160.980 cpu ms :: 6.14MB peak memory :: in 75.2K, 0B, 467K/s, 0B/s :: out 60.2K, 2.33MB, 374K/s, 14.5MB/s
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: benchmarkInPredicateToDistinctInnerJoin
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: Case 1: Rewrite IN predicate to distinct + inner join
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: Without optimization
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:0,elapsed_millis:19,input_rows_per_second:52,output_rows_per_second:52,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:19069568,cpu_nanos:19028000,user_nanos:18712000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:18,input_rows_per_second:54,output_rows_per_second:54,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18191981,cpu_nanos:18158000,user_nanos:17920000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:19,input_rows_per_second:53,output_rows_per_second:53,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18613672,cpu_nanos:18615000,user_nanos:18370000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:18,input_rows_per_second:55,output_rows_per_second:55,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18030190,cpu_nanos:18037000,user_nanos:17800000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:18,input_rows_per_second:54,output_rows_per_second:54,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18319779,cpu_nanos:18178000,user_nanos:17871000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:20,input_rows_per_second:51,output_rows_per_second:51,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:19548897,cpu_nanos:19317000,user_nanos:18928000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:17,input_rows_per_second:58,output_rows_per_second:58,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:17079801,cpu_nanos:17060000,user_nanos:16819000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:18,input_rows_per_second:55,output_rows_per_second:55,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18028514,cpu_nanos:17867000,user_nanos:17494000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:27,input_rows_per_second:36,output_rows_per_second:36,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:27243469,cpu_nanos:20457000,user_nanos:19862000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
peak_memory:0,elapsed_millis:19,input_rows_per_second:53,output_rows_per_second:53,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18666729,cpu_nanos:18531000,user_nanos:18216000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:2423
early_out_joins :: 18.525 cpu ms :: 0B peak memory :: in 1, 0B, 53/s, 0B/s :: out 1, 2.37KB, 53/s, 128KB/s
Nov 01, 2022 1:56:59 PM com.facebook.airlift.log.Logger info
INFO: With optimization: case 1
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:0,elapsed_millis:25,input_rows_per_second:39,output_rows_per_second:39,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:25262174,cpu_nanos:24616000,user_nanos:23320000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:25,input_rows_per_second:40,output_rows_per_second:40,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:24878453,cpu_nanos:24440000,user_nanos:23308000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:23,input_rows_per_second:43,output_rows_per_second:43,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:22813809,cpu_nanos:22563000,user_nanos:21612000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:20,input_rows_per_second:49,output_rows_per_second:49,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:20121710,cpu_nanos:19938000,user_nanos:19182000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:19,input_rows_per_second:52,output_rows_per_second:52,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:19145070,cpu_nanos:19125000,user_nanos:18528000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:19,input_rows_per_second:51,output_rows_per_second:51,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:19258824,cpu_nanos:19114000,user_nanos:18371000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:19,input_rows_per_second:53,output_rows_per_second:53,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18614563,cpu_nanos:18506000,user_nanos:17847000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:18,input_rows_per_second:54,output_rows_per_second:54,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18405474,cpu_nanos:18318000,user_nanos:17717000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:18,input_rows_per_second:55,output_rows_per_second:55,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18109927,cpu_nanos:18071000,user_nanos:17435000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:18,input_rows_per_second:56,output_rows_per_second:56,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:17776906,cpu_nanos:17688000,user_nanos:17066000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
early_out_joins :: 20.238 cpu ms :: 0B peak memory :: in 1, 0B, 49/s, 0B/s :: out 1, 3.42KB, 49/s, 169KB/s
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: Case 2: Rewrite IN predicate to distinct + inner join and then push aggregation down into the probe of the join
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: With optimization: case 2
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading system access control --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded system access control allow-all --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loading temp storage local --
Nov 01, 2022 1:57:00 PM com.facebook.airlift.log.Logger info
INFO: -- Loaded temp storage local --
peak_memory:0,elapsed_millis:20,input_rows_per_second:48,output_rows_per_second:48,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:20491894,cpu_nanos:20205000,user_nanos:19332000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:24,input_rows_per_second:41,output_rows_per_second:41,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:23985314,cpu_nanos:23456000,user_nanos:22212000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:24,input_rows_per_second:41,output_rows_per_second:41,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:24134293,cpu_nanos:23602000,user_nanos:22371000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:25,input_rows_per_second:39,output_rows_per_second:39,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:25036966,cpu_nanos:23865000,user_nanos:22406000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:23,input_rows_per_second:42,output_rows_per_second:42,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:23433818,cpu_nanos:22767000,user_nanos:21499000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:21,input_rows_per_second:48,output_rows_per_second:48,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:20503784,cpu_nanos:20359000,user_nanos:19489000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:19,input_rows_per_second:53,output_rows_per_second:53,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18582433,cpu_nanos:18584000,user_nanos:17963000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:18,input_rows_per_second:54,output_rows_per_second:54,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18252146,cpu_nanos:18125000,user_nanos:17473000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:18,input_rows_per_second:54,output_rows_per_second:54,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:18192600,cpu_nanos:18098000,user_nanos:17397000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
peak_memory:0,elapsed_millis:17,input_rows_per_second:59,output_rows_per_second:59,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:16759647,cpu_nanos:16729000,user_nanos:16199000,input_rows:1,input_bytes:0,output_rows:1,output_bytes:3507
early_out_joins :: 20.579 cpu ms :: 0B peak memory :: in 1, 0B, 48/s, 0B/s :: out 1, 3.42KB, 48/s, 166KB/s

@vivek-bharathan
Copy link
Contributor Author

vivek-bharathan commented Nov 1, 2022

I noticed that this work depends on the constraint optimization framework. In my understanding (correct me if I am wrong), the constraint optimization framework currently only works for the iterative optimizers and constrains handled only within these optimizers. However we have other non iterative optimizers too, will it be a problem if these constraints are compromised in some cases (if it's possible)?

Not quite sure I understand the question. The iterative optimizer in Presto is augmented with a framework that can infer and propagate logical properties of plans (unique keys, cardinality bounds, etc). These logical properties may be "seeded" with constraints that are defined in the catalog. There are also ways to indicate that pre-defined constraints are not to be relied on. Does that help?

@feilong-liu
Copy link
Contributor

I noticed that this work depends on the constraint optimization framework. In my understanding (correct me if I am wrong), the constraint optimization framework currently only works for the iterative optimizers and constrains handled only within these optimizers. However we have other non iterative optimizers too, will it be a problem if these constraints are compromised in some cases (if it's possible)?

Not quite sure I understand the question. The iterative optimizer in Presto is augmented with a framework that can infer and propagate logical properties of plans (unique keys, cardinality bounds, etc. These logical properties may be "seeded" with constraints that are defined in the catalog. There are also ways to indicate that pre-defined constraints are not to be relied on. Does that help?

Oh, maybe an example can better demonstrate my question here.
For example, there is an iterative optimization rule A which produce some constraints, let's say a unique keys constraint. However, a following non iterative optimizer B break this constraints (if it's possible). For a following iterative optimization rule C, will it still assume this the unique keys constraint since the compromise of the constraint is introduced by a non iterative optimizer, hence affect correctness of the optimization in this PR (if it's possible)? Sorry that I am not familiar wit the constraint framework, just a high level question.

@feilong-liu
Copy link
Contributor

I've added a new suite - SqlEarlyOutJoinsBenchmarks - the results are as follows

Looks like that you forgot to include the benchmark in your commit.

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 86cb07e to 75c945a Compare November 2, 2022 23:29
@vivek-bharathan
Copy link
Contributor Author

I've added a new suite - SqlEarlyOutJoinsBenchmarks - the results are as follows

Looks like that you forgot to include the benchmark in your commit.

My bad - fixed now

@vivek-bharathan
Copy link
Contributor Author

Oh, maybe an example can better demonstrate my question here. For example, there is an iterative optimization rule A which produce some constraints, let's say a unique keys constraint. However, a following non iterative optimizer B break this constraints (if it's possible). For a following iterative optimization rule C, will it still assume this the unique keys constraint since the compromise of the constraint is introduced by a non iterative optimizer, hence affect correctness of the optimization in this PR (if it's possible)? Sorry that I am not familiar wit the constraint framework, just a high level question.

Good question. However, I don't think that is possible. The LogicalProperties hang off of the Memo structure which is valid for the lifetime of the iterative optimizer. The logical properties are computed in a bottom-up manner for the groups in the memo. Any changes in the plan shape, no matter how it is achieved, that results in a "replace" in the memo, will trigger a recompute of the logical properties for that group/node. Therefore, a breaking change as you suggest, cannot sneak in.

@vivek-bharathan
Copy link
Contributor Author

Thanks for the review @feilong-liu !

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch 3 times, most recently from e2be007 to e195781 Compare November 7, 2022 21:44
Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move the changes to LogicalProperties to a separate commit (can still be in this PR) and explain what's going on there?

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from e195781 to 798718a Compare November 10, 2022 00:08
@vivek-bharathan
Copy link
Contributor Author

can you move the changes to LogicalProperties to a separate commit (can still be in this PR) and explain what's going on there?

Done

@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 798718a to 98ce23a Compare November 10, 2022 01:13
Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks great. A couple style comments, and also suggestions for testing.

…of intermediate plan expressions can be realized using another set

Extend Rule.Context to provide a LogicalPropertiesProvider
@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 98ce23a to 07b3d4a Compare November 13, 2022 04:05
Adds three optimizer rules:

TransformUncorrelatedInPredicateSubqueryToDistinctInnerJoin - converts an in-predicate subquery into an inner join with distinct aggregation (logically equivalent to a semijoin)

TransformDistinctInnerJoinToRightEarlyOutJoin - pushes aggregation into the left input of an inner join where applicable

TransformDistinctInnerJoinToLeftEarlyOutJoin - converts an inner join with distinct aggregation into a semijoin

Benchmarked on TPCDS(100G) data set. Queries impacted (9/100). Wallclock time performance improvement 10% for the impacted queries.
@vivek-bharathan vivek-bharathan force-pushed the earlyoutjointransformations branch from 07b3d4a to fbd49c8 Compare November 13, 2022 07:21
@rschlussel rschlussel merged commit 5b1f2dd into prestodb:master Nov 14, 2022
@vivek-bharathan vivek-bharathan deleted the earlyoutjointransformations branch November 14, 2022 22:27
@vivek-bharathan
Copy link
Contributor Author

Thanks @rschlussel !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Design] Transformations of Existential Subqueries using Early-out Joins

8 participants