-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-31489][SPARK-31488][SQL] Translate date values of pushed down filters to java.sql.Date
#28272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
Outdated
Show resolved
Hide resolved
|
Test build #121516 has finished for PR 28272 at commit
|
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala
Outdated
Show resolved
Hide resolved
|
Test build #121537 has finished for PR 28272 at commit
|
java.sql.Datejava.sql.Date
|
jenkins, retest this, please |
|
Test build #121538 has finished for PR 28272 at commit
|
|
Test build #121543 has finished for PR 28272 at commit
|
|
Test build #121544 has finished for PR 28272 at commit
|
|
Test build #121545 has finished for PR 28272 at commit
|
java.sql.Datejava.sql.Date
|
jenkins, retest this, please |
|
Test build #121569 has finished for PR 28272 at commit
|
|
jenkins, retest this, please |
| } | ||
| } | ||
|
|
||
| test("filter pushdown - local date") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just update test("filter pushdown - date") to test with DATETIME_JAVA8API_ENABLED on and off, so that we have less duplicated code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| } | ||
| } | ||
|
|
||
| test("filter pushdown - local date") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| } | ||
| } | ||
|
|
||
| test("filter pushdown - local date") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| private def translateLeafNodeFilter(predicate: Expression): Option[Filter] = predicate match { | ||
| case expressions.EqualTo(PushableColumn(name), Literal(v, t)) => | ||
| Some(sources.EqualTo(name, convertToScala(v, t))) | ||
| Some(sources.EqualTo(name, convertToScala(v, t, false))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we're treating this as a temp fix for Spark 3.0?
Looks like ideally we should support Java 8 datetime instances for this interface as well when spark.sql.datetime.java8API.enabled is enabled. It could cause more confusion. In addition, seems like spark.sql.datetime.java8API.enabled is disabled by default, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's problematic to let the java8 config also control the value type inside Filter, as it can break existing DS v1 implementations. It's a bit unfortunate that we don't document clearly what the value type can be for Filter, but if we do, it's not user-friendly to say "the value type depends on xxx config". This just makes it harder to implement data source filter pushdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HyukjinKwon Taking into account #23811 (comment), the flag won't be enabled by default in the near future.
|
Test build #121574 has finished for PR 28272 at commit
|
java.sql.Datejava.sql.Date
|
jenkins, retest this, please |
|
Test build #121594 has finished for PR 28272 at commit
|
|
Test build #121593 has finished for PR 28272 at commit
|
java.sql.Datejava.sql.Date
|
@cloud-fan @HyukjinKwon @dongjoon-hyun Please, take a look at the PR if you have time. |
…` values in ORC ### What changes were proposed in this pull request? Convert `java.time.LocalDate` to `java.sql.Date` in pushed down filters to ORC datasource when Java 8 time API enabled. Closes #28272 ### Why are the changes needed? The changes fix the exception raised while pushing date filters when `spark.sql.datetime.java8API.enabled` is set to `true`: ``` Wrong value class java.time.LocalDate for DATE.EQUALS leaf java.lang.IllegalArgumentException: Wrong value class java.time.LocalDate for DATE.EQUALS leaf at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.checkLiteralType(SearchArgumentImpl.java:192) at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.<init>(SearchArgumentImpl.java:75) at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$BuilderImpl.equals(SearchArgumentImpl.java:352) at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:229) ``` ### Does this PR introduce any user-facing change? Yes ### How was this patch tested? Added tests to `OrcFilterSuite`. Closes #28261 from MaxGekk/orc-date-filter-pushdown. Authored-by: Max Gekk <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit bd139bd) Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
Always translate date values of pushed down filters to
java.sql.Dateindependently from the SQL configspark.sql.datetime.java8API.enabled.Why are the changes needed?
spark.sql.datetime.java8API.enabledis set totrue:spark.sql.datetime.java8API.enabledistrue.Does this PR introduce any user-facing change?
Yes
How was this patch tested?
Added a test to
ParquetFilterSuiteand toOrcFilterSuite.