[SPARK-29975][SQL] introduce --CONFIG_DIM directive #26612
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,3 @@ | ||
| -- List of configuration the test suite is run against: | ||
|
Contributor
Author
This is to test the optimizer; there is no need to run it with different join operators. |
||
| --SET spark.sql.autoBroadcastJoinThreshold=10485760 | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false | ||
|
|
||
| CREATE TEMPORARY VIEW t1 AS SELECT * FROM VALUES (1) AS GROUPING(a); | ||
| CREATE TEMPORARY VIEW t2 AS SELECT * FROM VALUES (1) AS GROUPING(a); | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,3 @@ | ||
| -- List of configuration the test suite is run against: | ||
|
Contributor
Author
This is to test the analyzer/optimizer. A natural join will be rewritten to other normal joins, so there is no need to run it with different join operators. |
||
| --SET spark.sql.autoBroadcastJoinThreshold=10485760 | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false | ||
|
|
||
| create temporary view nt1 as select * from values | ||
| ("one", 1), | ||
| ("two", 2), | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,14 @@ | ||
| -- List of configuration the test suite is run against: | ||
| --SET spark.sql.autoBroadcastJoinThreshold=10485760 | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false | ||
| -- There are 2 dimensions we want to test | ||
| -- 1. run with broadcast hash join, sort merge join or shuffle hash join. | ||
| -- 2. run with whole-stage-codegen, operator codegen or no codegen. | ||
|
|
||
| --CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760 | ||
| --CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true | ||
| --CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false | ||
|
|
||
| --CONFIG_DIM2 spark.sql.codegen.wholeStage=true | ||
| --CONFIG_DIM2 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=CODEGEN_ONLY | ||
| --CONFIG_DIM2 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN | ||
|
|
||
| -- SPARK-17099: Incorrect result when HAVING clause is added to group by query | ||
| CREATE OR REPLACE TEMPORARY VIEW t1 AS SELECT * FROM VALUES | ||
|
|
@@ -29,16 +36,10 @@ CREATE OR REPLACE TEMPORARY VIEW t1 AS SELECT * FROM VALUES (97) as t1(int_col1) | |
|
|
||
| CREATE OR REPLACE TEMPORARY VIEW t2 AS SELECT * FROM VALUES (0) as t2(int_col1); | ||
|
|
||
| -- Set the cross join enabled flag for the LEFT JOIN test since there's no join condition. | ||
| -- Ultimately the join should be optimized away. | ||
| set spark.sql.crossJoin.enabled = true; | ||
|
Contributor
Author
This is true by default now. |
||
| SELECT * | ||
| FROM ( | ||
| SELECT | ||
| COALESCE(t2.int_col1, t1.int_col1) AS int_col | ||
| FROM t1 | ||
| LEFT JOIN t2 ON false | ||
| ) t where (t.int_col) is not null; | ||
| set spark.sql.crossJoin.enabled = false; | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,3 @@ | ||
| -- List of configuration the test suite is run against: | ||
|
Contributor
Author
We are testing UDFs, and the join operator doesn't matter. |
||
| --SET spark.sql.autoBroadcastJoinThreshold=10485760 | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true | ||
| --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false | ||
|
|
||
| -- This test file was converted from join-empty-relation.sql. | ||
|
|
||
| CREATE TEMPORARY VIEW t1 AS SELECT * FROM VALUES (1) AS GROUPING(a); | ||
|
|
||
(Sorry to be late.) In the one-dimension case, is `CONFIG_DIM1` the same as `SET`?

It's different. We still run this test 3 times, as there are 3 config sets in this dimension. It's only the same as `SET` if there is only one dimension and one config set.
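To make the run count concrete, here is a minimal Python sketch of how `--CONFIG_DIM` lines could expand into test runs. It is not the Spark test suite's actual implementation: the helper names `parse_config_dims` and `config_combinations` are hypothetical, and it assumes that each directive line is one config set and that the dimensions are combined as a Cartesian product.

```python
from itertools import product


def parse_config_dims(sql_text):
    """Group --CONFIG_DIM<n> lines by dimension; each line is one config set.

    Hypothetical helper for illustration only, not Spark's parser.
    """
    dims = {}
    for line in sql_text.splitlines():
        line = line.strip()
        if not line.startswith("--CONFIG_DIM"):
            continue
        directive, _, body = line.partition(" ")
        dim_name = directive[len("--CONFIG_DIM"):]
        # "k1=v1,k2=v2" becomes {"k1": "v1", "k2": "v2"}
        config_set = dict(kv.split("=", 1) for kv in body.strip().split(","))
        dims.setdefault(dim_name, []).append(config_set)
    return dims


def config_combinations(dims):
    """Pick one config set per dimension and merge them (assumed Cartesian product)."""
    combos = []
    for choice in product(*dims.values()):
        merged = {}
        for config_set in choice:
            merged.update(config_set)
        combos.append(merged)
    return combos


# Header copied from the two-dimension test file in this diff.
HEADER = """\
--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760
--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true
--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false
--CONFIG_DIM2 spark.sql.codegen.wholeStage=true
--CONFIG_DIM2 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=CODEGEN_ONLY
--CONFIG_DIM2 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN
"""

dims = parse_config_dims(HEADER)
combos = config_combinations(dims)
print(len(combos))  # 3 config sets x 3 config sets = 9 runs
```

Under the same assumptions, a file with only the three `CONFIG_DIM1` lines yields 3 runs, which matches the reply above: a single dimension with three config sets still runs the test three times.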