[SPARK-16675][SQL] Avoid per-record type dispatch in JDBC when writing #14323

HyukjinKwon · 2016-07-23T03:06:01Z

What changes were proposed in this pull request?

Currently, JdbcUtils.savePartition performs type-based dispatch for every row in order to write the appropriate values.

Instead, the appropriate setters for PreparedStatement can be created once from the schema and then applied to each row. This approach is similar to CatalystWriteSupport.

This PR simply creates the setters up front to avoid the per-row dispatch.
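The idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `ColType`, `FakeStatement`, `SetterSketch`, `makeSetter` and `writeRows` are hypothetical stand-ins for Spark's `DataType`, `java.sql.PreparedStatement` and the logic in `JdbcUtils`, and null handling is left out.

```scala
// Hypothetical stand-ins so the sketch runs without Spark or a database.
sealed trait ColType
case object IntCol extends ColType
case object StringCol extends ColType
case object DoubleCol extends ColType

// Stand-in for java.sql.PreparedStatement: it only records what was bound.
final class FakeStatement {
  val bound = scala.collection.mutable.ArrayBuffer.empty[String]
  def setInt(pos: Int, v: Int): Unit = bound += s"$pos:int:$v"
  def setString(pos: Int, v: String): Unit = bound += s"$pos:string:$v"
  def setDouble(pos: Int, v: Double): Unit = bound += s"$pos:double:$v"
}

object SetterSketch {
  type Row = IndexedSeq[Any]

  // Same shape as the setter type discussed in this PR:
  // (statement, row, column index) => Unit.
  type ValueSetter = (FakeStatement, Row, Int) => Unit

  // The type match runs here, once per column, instead of once per value.
  // JDBC parameter positions are 1-based, hence `i + 1`.
  def makeSetter(t: ColType): ValueSetter = t match {
    case IntCol    => (stmt, row, i) => stmt.setInt(i + 1, row(i).asInstanceOf[Int])
    case StringCol => (stmt, row, i) => stmt.setString(i + 1, row(i).asInstanceOf[String])
    case DoubleCol => (stmt, row, i) => stmt.setDouble(i + 1, row(i).asInstanceOf[Double])
  }

  def writeRows(schema: Seq[ColType], rows: Iterator[Row], stmt: FakeStatement): Unit = {
    // Build the setters once, before looking at any row.
    val setters = schema.map(makeSetter).toArray
    rows.foreach { row =>
      var i = 0
      while (i < setters.length) {
        setters(i)(stmt, row, i) // no pattern match on the column type here
        i += 1
      }
    }
  }
}
```

With this shape, the cost of the `match` is paid once per column, and the per-row loop only invokes prebuilt functions.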

How was this patch tested?

Existing tests should cover this.

SparkQA · 2016-07-23T04:40:42Z

Test build #62745 has finished for PR 14323 at commit 8cac7de.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-25T10:19:30Z

Test build #62814 has finished for PR 14323 at commit 81d8aca.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2016-07-26T02:11:27Z

Hi @cloud-fan , this one is about writing. Could you please take a look?

cloud-fan · 2016-07-26T02:36:22Z

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala

+  // and then setting into a field for `PreparedStatement`. The last argument
+  // `Int` means the index for the value to be set in the SQL statement and also used
+  // for the value to retrieve from `Row`.
+  private type JDBCValueGetter = (PreparedStatement, Row, Int) => Unit


Hmm, I think setter is a more appropriate name here, and we should use getter for the read path.

JDBCValueGetter sounds like we get the value from JDBC.

Sure! thanks!

cloud-fan · 2016-07-26T04:08:54Z

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala

+  // A `JDBCValueSetter` is responsible for converting and setting a value from `Row` into
+  // a field for `PreparedStatement`. The last argument `Int` means the index for the
+  // value to be set in the SQL statement and also used for the value in `Row`.
+  private type JDBCValueSetter = (PreparedStatement, Row, Int) => Unit


please rename the read path too.

SparkQA · 2016-07-26T05:05:48Z

Test build #62863 has finished for PR 14323 at commit c33bb62.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-26T06:21:04Z

Test build #62867 has finished for PR 14323 at commit ab4a1cf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-26T06:23:49Z

Test build #62868 has finished for PR 14323 at commit fb0f9a8.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-07-26T09:15:29Z

thanks, merging to master!

…riting apache#14323 [MINOR] Remove extra anonymous closure within functional transformations apache#12382 just JdbcUtils.scala

jneira-stratio · 2024-12-04T07:20:40Z

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala

+
+    case ArrayType(et, _) =>
+      // remove type length parameters from end of type name
+      val typeName = getJdbcType(et, dialect).databaseTypeDefinition


Late to the party 😝 but why is the type name converted to lower case?
It makes code that tries to get a JDBCType enum value from the string fail with: java.lang.IllegalArgumentException: No enum constant java.sql.JDBCType.varchar 😞
//cc @HyukjinKwon

This is the same as the original code

Oh yeah, from https://github.com/apache/spark/pull/14323/files#diff-c3859e97335ead4b131263565c987d877bea0af3adbd6c5bf2d3716768d2e083L244
I will keep digging to find out why toLowerCase was added, thanks!
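For callers hitting the exception described above, one possible workaround is to normalise the name before the enum lookup. The sketch below is an assumption about caller-side code, not something Spark provides: `JdbcTypeLookup` and `fromTypeName` are hypothetical names. Database type names are dialect-specific, so not every name maps to a `java.sql.JDBCType` constant, and the helper returns `None` in that case.

```scala
import java.sql.JDBCType
import java.util.Locale
import scala.util.Try

// Hypothetical caller-side helper, not part of Spark.
object JdbcTypeLookup {
  def fromTypeName(typeName: String): Option[JDBCType] = {
    // Drop any length/precision suffix such as "(255)" and upper-case the rest,
    // because JDBCType.valueOf matches enum constant names exactly.
    val base = typeName.takeWhile(_ != '(').trim.toUpperCase(Locale.ROOT)
    // valueOf throws IllegalArgumentException for unknown names; map that to None.
    Try(JDBCType.valueOf(base)).toOption
  }
}
```

For example, `JdbcTypeLookup.fromTypeName("varchar")` resolves to `JDBCType.VARCHAR`, while a dialect-specific name with no matching constant yields `None`.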

HyukjinKwon added 2 commits July 23, 2016 11:59

[SPARK-16675][SQL] Avoid per-record type dispatch in JDBC when writing

4284d46

Fix comment

8cac7de

HyukjinKwon added 2 commits July 25, 2016 17:38

Fix nits, rename some type and functions and add comments

532b3b1

Consistent naming for variables

81d8aca

cloud-fan reviewed Jul 26, 2016

Rename getter to setter and fix some more comments

c33bb62

cloud-fan reviewed Jul 26, 2016

HyukjinKwon added 3 commits July 26, 2016 13:13

Fetch upstream

f2be8a4

Rename setter to getter in JDBCRDD as well

ab4a1cf

Clean up comments

fb0f9a8

asfgit closed this in 3b2b785 Jul 26, 2016

HyukjinKwon deleted the SPARK-16675 branch January 2, 2018 03:39

jneira-stratio reviewed Dec 4, 2024


4 participants
