[SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals #40652
Hi @gengliangwang, here is the correctness bug fix 🙏
gengliangwang
left a comment
LGTM except one minor comment
Thanks, merging to master/branch-3.4
### What changes were proposed in this pull request?

This PR fixes a correctness bug for INSERT commands with timestamp literals. The bug manifests when:

* An INSERT command includes a user-specified column list of fewer columns than the target table.
* The provided values include timestamp literals.

The bug was that the long integer values stored in the rows to represent these timestamp literals were getting assigned back to `UnresolvedInlineTable` rows without the timestamp type. Then the analyzer inserted an implicit cast from `LongType` to `TimestampType` later, which incorrectly caused the value to change during execution.

This PR fixes the bug by propagating the timestamp type directly to the output table instead.

### Why are the changes needed?

This PR fixes a correctness bug.

### Does this PR introduce _any_ user-facing change?

Yes, this PR fixes a correctness bug.

### How was this patch tested?

This PR adds a new unit test suite.

Closes #40652 from dtenedor/assign-correct-insert-types.

Authored-by: Daniel Tenedorio <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 9f0bf51)
Signed-off-by: Gengliang Wang <[email protected]>
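
To illustrate why losing the timestamp type changes the value, here is a minimal standalone sketch. It is not Spark code, and the object and method names are hypothetical. It assumes that a timestamp is held internally as microseconds since the Unix epoch and that a cast from a long to a timestamp reads the long as whole seconds; both are assumptions about the engine's behavior, not something the commit message spells out.

```scala
import java.time.{LocalDateTime, ZoneOffset}
import java.util.concurrent.TimeUnit

// Hypothetical helper names; this is an illustration, not Spark code.
object TimestampCastSketch {

  // Assumed internal representation: microseconds since the Unix epoch (UTC).
  def toInternalMicros(ts: LocalDateTime): Long = {
    val instant = ts.toInstant(ZoneOffset.UTC)
    TimeUnit.SECONDS.toMicros(instant.getEpochSecond) +
      TimeUnit.NANOSECONDS.toMicros(instant.getNano.toLong)
  }

  // Assumption: casting a long to a timestamp reads the long as whole seconds,
  // so the resulting microsecond value is the input scaled by one million.
  // TimeUnit.toMicros saturates at Long.MaxValue instead of overflowing.
  def castLongAsSeconds(value: Long): Long = TimeUnit.SECONDS.toMicros(value)

  def main(args: Array[String]): Unit = {
    val literal = LocalDateTime.of(2020, 1, 1, 0, 0, 0)
    val micros = toInternalMicros(literal)

    // Fixed path: the value keeps its timestamp type, so no cast is applied.
    val typed = micros
    // Buggy path: the type is dropped, and the later cast rescales the value.
    val recast = castLongAsSeconds(micros)

    println(s"typed value:  $typed")
    println(s"recast value: $recast")
  }
}
```

Under these assumptions the two printed values differ, which matches the symptom described above: the stored long is already in the timestamp's internal unit, so applying a cast on top of it reinterprets the number rather than preserving it.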
 * limitations under the License.
 */

package org.apache.spark.sql.catalyst.analysis
Hi, @dtenedor and @gengliangwang. I have a question.
Although I understand that this test suite provides test coverage for org.apache.spark.sql.catalyst.analysis.ResolveDefaultColumns, that doesn't mean the suite belongs to the org.apache.spark.sql.catalyst.analysis package. The suite lives in the sql/core module and is the only file in this directory:
$ tree sql/core/src/test/scala/org/apache/spark/sql/catalyst
sql/core/src/test/scala/org/apache/spark/sql/catalyst
└── analysis
    └── ResolveDefaultColumnsSuite.scala
Is this intentional?
Hi @dongjoon-hyun, I don't think this is intentional. We could move ResolveDefaultColumnsSuite to the org.apache.spark.sql package. What do you think? If you want me to do this, I can prepare a PR.
Thank you. I made a PR.
    assert(asLocalRelation(result) == localRelation)
  }

  test("SPARK-43018: INSERT timestamp values into a table with column DEFAULTs") {
In particular, this test case covers more than catalyst/analysis.