
Conversation


@chenzhx chenzhx commented Jun 1, 2022

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

beliefer and others added 4 commits June 1, 2022 21:00
…it to data source

### What changes were proposed in this pull request?
Currently, Spark supports pushing down limits to data sources.
If the limit can be fully pushed down and the data source has only one partition, DS V2 still applies the limit again.
This PR removes `Limit` from the plan when the limit is completely pushed down to the data source.

### Why are the changes needed?
Improve performance.

### Does this PR introduce _any_ user-facing change?
'No'.
New feature.

### How was this patch tested?
Tests updated.
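The optimization described above can be sketched as follows. This is a minimal model, not Spark's actual Scala optimizer rule: plan nodes are toy tuples, and `eliminate_limit` only shows the decision being made (full pushdown plus a single partition makes the plan-level `Limit` redundant).

```python
def eliminate_limit(plan, limit_fully_pushed, num_partitions):
    """Drop a redundant Limit node: if the source already enforces the
    limit (complete pushdown) and there is only one partition, the scan
    alone returns at most `limit` rows, so re-applying Limit is wasted work."""
    op, _limit, child = plan
    if op == "Limit" and limit_fully_pushed and num_partitions == 1:
        return child  # the pushed-down scan enforces the limit itself
    return plan       # keep Limit: pushdown was partial or rows may exceed it

# Toy plan: Limit(10) on top of a scan of a hypothetical source table.
scan = ("Scan", "source_table", None)
plan = ("Limit", 10, scan)
```

With multiple partitions the `Limit` must stay, because each partition may return up to `limit` rows and a final global limit is still needed.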

Closes apache#36043 from beliefer/SPARK-38768.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
…` and `pushTopN` of `PushDownUtils`

### What changes were proposed in this pull request?
`pushLimit` and `pushTopN` of `PushDownUtils` return a tuple of booleans. It would be good to explain what each boolean value represents.

### Why are the changes needed?
Makes the DS V2 API more friendly to developers.

### Does this PR introduce _any_ user-facing change?
'No'.
Just update comments.

### How was this patch tested?
N/A
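The readability problem the follow-up addresses can be illustrated with a hypothetical sketch (this is not Spark's `PushDownUtils` code, and the two flags here are assumed meanings for illustration only): a bare `(Boolean, Boolean)` return value says nothing, whereas named and commented fields do.

```python
from typing import NamedTuple

class PushResult(NamedTuple):
    """Named stand-in for an undocumented (Boolean, Boolean) tuple."""
    is_pushed: bool             # the operator was accepted by the data source
    is_partially_pushed: bool   # the source may still return extra rows, so
                                # the engine must re-apply the operator on top

def push_limit(source_supports_limit: bool,
               source_guarantees_exact_limit: bool) -> PushResult:
    # Hypothetical decision logic: report whether the limit was pushed,
    # and whether the pushdown was only partial.
    if not source_supports_limit:
        return PushResult(False, False)
    return PushResult(True, not source_guarantees_exact_limit)
```

A `NamedTuple` (or, in Scala, a documented case class) makes the call sites self-explanatory; adding comments to the existing tuple-returning methods achieves the same goal without an API change.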

Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…asonable

### What changes were proposed in this pull request?
Some of the testing CASE WHEN queries were not carefully written and do not make sense. In the future, the optimizer may get smarter and remove the CASE WHEN completely, and then we lose test coverage.

This PR updates some CASE WHEN queries to make them more reasonable.
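The distinction can be demonstrated with a small example (the queries below are hypothetical illustrations, not the actual test queries from the PR): a CASE WHEN whose condition folds to a constant can be eliminated by a smarter optimizer, so it stops exercising CASE WHEN handling, while a data-dependent condition cannot be optimized away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (-2,), (3,)])

# Fragile test query: the condition folds to TRUE, so an optimizer may
# rewrite this to plain `SELECT v FROM t` and the CASE WHEN is never tested.
fragile = conn.execute(
    "SELECT CASE WHEN 1 = 1 THEN v ELSE 0 END FROM t ORDER BY rowid"
).fetchall()

# Reasonable test query: which branch fires depends on the data, so the
# CASE WHEN must survive optimization for the query to produce its result.
robust = conn.execute(
    "SELECT CASE WHEN v > 0 THEN v ELSE 0 END FROM t ORDER BY rowid"
).fetchall()
```

Rewriting the test queries in the second style is what makes the coverage future-proof.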

### Why are the changes needed?
Future-proofs test coverage.

### Does this PR introduce _any_ user-facing change?
'No'.

### How was this patch tested?
N/A

Closes apache#36125 from beliefer/SPARK-37960_followup3.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@chenzhx chenzhx merged commit 0553594 into Kyligence:kyspark-3.2.x-4.x Jun 2, 2022
leejaywei pushed a commit that referenced this pull request Jul 14, 2022
… to data source (#474)

* [SPARK-38768][SQL] Remove `Limit` from plan if complete push down limit to data source

Currently, Spark supports pushing down limits to data sources.
If the limit can be fully pushed down and the data source has only one partition, DS V2 still applies the limit again.
This PR removes `Limit` from the plan when the limit is completely pushed down to the data source.

Improve performance.

'No'.
New feature.

Tests updated.

Closes apache#36043 from beliefer/SPARK-38768.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

* [SPARK-38391][SPARK-38768][SQL][FOLLOWUP] Add comments for `pushLimit` and `pushTopN` of `PushDownUtils`

`pushLimit` and `pushTopN` of `PushDownUtils` return a tuple of booleans. It would be good to explain what each boolean value represents.

Make DS V2 API more friendly to developers.

'No'.
Just update comments.

N/A

Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

* [SPARK-37960][SQL][FOLLOWUP] Make the testing CASE WHEN query more reasonable

Some of the testing CASE WHEN queries were not carefully written and do not make sense. In the future, the optimizer may get smarter and remove the CASE WHEN completely, and then we lose test coverage.

This PR updates some CASE WHEN queries to make them more reasonable.

future-proof test coverage.

'No'.

N/A

Closes apache#36125 from beliefer/SPARK-37960_followup3.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

* update spark version

Co-authored-by: Jiaan Geng <[email protected]>
RolatZhang pushed a commit that referenced this pull request Aug 29, 2023
… to data source (#474)

baibaichen pushed a commit that referenced this pull request Jun 18, 2024
### What changes were proposed in this pull request?
apache#41458 updated `numeric.sql.out` but not `numeric.sql.out.java21`; this PR updates `numeric.sql.out.java21` for Java 21.

### Why are the changes needed?
Fix golden file for Java 21.

https://github.com/apache/spark/actions/runs/5362442727/jobs/9729315685

```
[info] - postgreSQL/numeric.sql *** FAILED *** (1 minute, 4 seconds)
[info]   postgreSQL/numeric.sql
[info]   Expected "...OLUMN_ARITY_MISMATCH[",
[info]     "sqlState" : "21S01",
[info]     "messageParameters" : {
[info]       "dataColumns" : "'id', 'id', 'val', 'val', '(val * val)'",
[info]       "reason" : "too many data columns",
[info]       "tableColumns" : "'id1', 'id2', 'result']",
[info]       "tableName" :...", but got "...OLUMN_ARITY_MISMATCH[.TOO_MANY_DATA_COLUMNS",
[info]     "sqlState" : "21S01",
[info]     "messageParameters" : {
[info]       "dataColumns" : "`id`, `id`, `val`, `val`, `(val * val)`",
[info]       "tableColumns" : "`id1`, `id2`, `result`]",
[info]       "tableName" :..." Result did not match for query #474
[info]   INSERT INTO num_result SELECT t1.id, t2.id, t1.val, t2.val, t1.val * t2.val
[info]       FROM num_data t1, num_data t2 (SQLQueryTestSuite.scala:848)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Assertions.assertResult(Assertions.scala:847)
[info]   at org.scalatest.Assertions.assertResult$(Assertions.scala:842)
[info]   at org.scalatest.funsuite.AnyFunSuite.assertResult(AnyFunSuite.scala:1564)
[info]   at org.apache.spark.sql.SQLQueryTestSuite.$anonfun$readGoldenFileAndCompareResults$3(SQLQueryTestSuite.scala:848)
```
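The failure mode in the log above comes from golden-file testing: the query's actual output is compared byte-for-byte against a stored expected file, so updating only one Java version's golden file leaves the other stale. A minimal sketch of that comparison (assumed for illustration; this is not `SQLQueryTestSuite`'s code, and the error-class strings are shortened from the log):

```python
def check_golden(actual: str, expected: str, query_id: int) -> None:
    """Fail with a readable message when a query's output diverges from
    its golden file, as happens when only one variant file is updated."""
    if actual != expected:
        raise AssertionError(
            f"Result did not match for query #{query_id}: "
            f"expected {expected!r}, got {actual!r}")

# Stale golden file vs. the new, more specific error class from the log.
expected = "INSERT_COLUMN_ARITY_MISMATCH"
actual = "INSERT_COLUMN_ARITY_MISMATCH.TOO_MANY_DATA_COLUMNS"

try:
    check_golden(actual, expected, 474)
except AssertionError as e:
    mismatch = str(e)  # captured so the sketch itself runs cleanly
```

Regenerating `numeric.sql.out.java21` brings the expected string back in sync with the actual output under Java 21.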

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual checked using Java 21

Closes apache#41720 from LuciferYang/SPARK-43969-FOLLOWUP-2.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>