-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-16675][SQL] Avoid per-record type dispatch in JDBC when writing #14323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #62745 has finished for PR 14323 at commit
|
|
Test build #62814 has finished for PR 14323 at commit
|
|
Hi @cloud-fan , this one is about writing. Could you please take a look? |
| // and then setting into a field for `PreparedStatement`. The last argument | ||
| // `Int` means the index for the value to be set in the SQL statement and also used | ||
| // for the value to retrieve from `Row`. | ||
| private type JDBCValueGetter = (PreparedStatement, Row, Int) => Unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think setter is a more proper name here, and we should use getter for the read path.
JDBCValueGetter, sounds like we get the value from JDBC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! thanks!
| // A `JDBCValueSetter` is responsible for converting and setting a value from `Row` into | ||
| // a field for `PreparedStatement`. The last argument `Int` means the index for the | ||
| // value to be set in the SQL statement and also used for the value in `Row`. | ||
| private type JDBCValueSetter = (PreparedStatement, Row, Int) => Unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please rename the read path too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
|
Test build #62863 has finished for PR 14323 at commit
|
|
Test build #62867 has finished for PR 14323 at commit
|
|
Test build #62868 has finished for PR 14323 at commit
|
|
thanks, merging to master! |
…riting apache#14323 [MINOR] Remove extra anonymous closure within functional transformations apache#12382 just JdbcUtils.scala
|
|
||
| case ArrayType(et, _) => | ||
| // remove type length parameters from end of type name | ||
| val typeName = getJdbcType(et, dialect).databaseTypeDefinition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late to the party 😝 but why the type name is converted to lower case?
It makes code trying to get a JDBCType enum value using the string fails with: java.lang.IllegalArgumentException: No enum constant java.sql.JDBCType.varchar 😞
//cc @HyukjinKwon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as the original code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah from https://github.com/apache/spark/pull/14323/files#diff-c3859e97335ead4b131263565c987d877bea0af3adbd6c5bf2d3716768d2e083L244
will continue digging to try to know why toLowerCase wad added, thanks!
What changes were proposed in this pull request?
Currently,
JdbcUtils.savePartitionis doing type-based dispatch for each row to write appropriate values.So, appropriate setters for
PreparedStatementcan be created first according to the schema, and then apply them to each row. This approach is similar withCatalystWriteSupport.This PR simply make the setters to avoid this.
How was this patch tested?
Existing tests should cover this.