-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-33888][SQL][FOLLOWUP] Restored scale metadata for ARRAY type (Postgres) #31252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -306,13 +306,14 @@ object JdbcUtils extends Logging { | |
| } | ||
| val metadata = new MetadataBuilder() | ||
| // SPARK-33888 | ||
| // - include scale in metadata for only DECIMAL & NUMERIC | ||
| // - include scale in metadata for only DECIMAL & NUMERIC as well as ARRAY (for Postgres) | ||
| // - include TIME type metadata | ||
| // - always build the metadata | ||
| dataType match { | ||
| // scalastyle:off | ||
| case java.sql.Types.NUMERIC => metadata.putLong("scale", fieldScale) | ||
| case java.sql.Types.DECIMAL => metadata.putLong("scale", fieldScale) | ||
| case java.sql.Types.ARRAY => metadata.putLong("scale", fieldScale) // PostgresDialect.scala wants this information | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this issue is only for postgresql, could you add this fix in
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 for @maropu 's advice.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The only way
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've explained a few more rationale regarding this change proposed by @skestle in the comment below #31252 (comment) |
||
| case java.sql.Types.TIME => metadata.putBoolean("logical_time_type", true) | ||
| case _ => | ||
| // scalastyle:on | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we always include the scale metadata if it's present? cc @saikocat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that for simplification. But we need to fix the tests for the rest of the dialects. Cos it adds
{"scale": 0}to all metadata, then existing tests failed (previously metadata didn't get build in getSchema(), so the JSON meta didn't generated)That's why I choose to set it to only decimal and numeric. The problem also eluded me cos the test for array in Postgresql dialects call the toCalaystType directly instead of going through the code path of using metadata. Sorry on phone so it's hard for me to link the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, let me elaborate more so you two (cc: @skestle) can decide on which approach to go for. Though I'm kind of favor the current approach of adding data type matching for adding scale metadata cos fixing the failing tests will be more difficult and it makes the JDBCSuite test
"jdbc data source shouldn't have unnecessary metadata in its schema"test slightly lose its meaning.So in order to push "logical_time_type" to metadata, I have to force metadata to be built in the field type as of here:
skestle@0b647fe#diff-c3859e97335ead4b131263565c987d877bea0af3adbd6c5bf2d3716768d2e083R323 whereas previously, metadata can be built by the dialect or completely ignored (as default)
This will cause 3 tests in
JDBCSuiteto fail cause of schema mismatch (extra{"scale": 0}in the metadata always present) [1."jdbc API support custom schema", 2."jdbc API custom schema DDL-like strings.", 3."jdbc data source shouldn't have unnecessary metadata in its schema"].There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was dialect's responsibility to put things into metadata before, I think this PR should put the fix in
PostgresDialect.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the only way
fieldScalecan make it into the dialect is by the field metadata. So it is a very chicken and egg problem.EDIT: Postgresql utilize the metadatabuilder to get the scale for array[][] of type numeric for example - cos dataType is
ARRAYbut the typeName is_numeric- note the underscore specific for Postgresql. Whereas MySQL dialect is putting more info into the metadata (like put("binarylong")). The use cases differ.Might have to change the interface somehow to let the ResultSetMetadata to be passed or init-ed to the dialect.