[WIP][SPARK-42578][CONNECT] Add JDBC to DataFrameWriter #40291
@hvanhovell It seems there is no way to add test cases.
hmmm - let me think about it.
I have a question, @hvanhovell @beliefer. For the connect-client API, should we verify the parameters on the client side or on the server side?
I think verifying parameters on the server side is the more robust approach. That said, doing the verification on the client side would reduce the pressure on the server.
Server side. There are a couple of reasons for this:
- The server cannot trust the client to implement the verification properly. I am sure we will get it right for Scala and Python, but there are potentially a plethora of other frontends that need to do the same.
- Keeping the client simple and reducing duplication. If we need to do this for every client, we'll end up with a lot of duplication and increased client complexity.
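To make the server-side argument concrete, here is a minimal sketch of a single validation routine that every frontend would go through. It is not Spark Connect code: `WriteRequest`, `validate_write_request`, and `handle_write` are hypothetical names, and the rule that a JDBC write needs `url` and `dbtable` options is an illustrative assumption rather than the checks the actual server performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WriteRequest:
    """Hypothetical stand-in for the write command a client sends to the server."""
    source: str
    options: Dict[str, str] = field(default_factory=dict)


def validate_write_request(req: WriteRequest) -> List[str]:
    """Server-side check: return a list of problems, empty if the request is valid."""
    problems = []
    if not req.source:
        problems.append("source must be set")
    if req.source == "jdbc":
        # Assumed rule for illustration: a JDBC write needs a URL and a target table.
        for key in ("url", "dbtable"):
            if not req.options.get(key):
                problems.append(f"jdbc write requires option '{key}'")
    return problems


def handle_write(req: WriteRequest) -> str:
    """Single entry point: every client, whatever its language, hits the same checks."""
    problems = validate_write_request(req)
    if problems:
        raise ValueError("; ".join(problems))
    return f"accepted {req.source} write"
```

Because the check lives in one place, a new frontend only has to build the request; it cannot bypass or mis-implement the validation.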
@beliefer we should be able to create an in-memory table and append a couple of rows to that, right?
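The shape of such a test can be sketched as follows. This is only an analogy: it uses Python's built-in `sqlite3` in-memory database as a stand-in, whereas a test for this PR would have to go through the Connect client and a JDBC-accessible in-memory database, which this sketch does not exercise. The table name `people` and the helper names are hypothetical.

```python
import sqlite3


def append_rows(conn, rows):
    """Append rows to the (hypothetical) people table, mimicking an append-mode write."""
    conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", rows)
    conn.commit()


def run_in_memory_append_check():
    """Create an in-memory table, append rows in two batches, and read them back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
        append_rows(conn, [(1, "a"), (2, "b")])
        append_rows(conn, [(3, "c")])
        return conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
    finally:
        conn.close()
```

The test then only needs to compare the rows read back against the rows that were appended, with no external database to provision.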
Why do we need a separate method? There is only one method using it.
It is calling save(), right? Why not call it TABLE_SAVE_METHOD_SAVE?
I think we can remove ProblemFilters.exclude[Problem]("org.apache.spark.sql.DataFrameWriter.jdbc") from CheckConnectJvmClientCompatibility in this PR.
Thank you for the reminder.
connector/connect/server/pom.xml
Outdated
Why the move?
hvanhovell left a comment
Looks good! Can you look into the test failure?
@beliefer are you abandoning this one?
Because another PR implements this function.
Is that #40415?

What changes were proposed in this pull request?
Currently, the Connect project has a new DataFrameWriter API that corresponds to Spark's DataFrameWriter API, but Connect's DataFrameWriter is missing the jdbc API.
Why are the changes needed?
This PR tries to add JDBC to DataFrameWriter.
Does this PR introduce any user-facing change?
'No'. New feature.
How was this patch tested?
@hvanhovell It seems there is no way to add test cases.