-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-54581][SQL] Making fetchsize option case-insensitive for Postgres connector #53308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[SPARK-54581][SQL] Making fetchsize option case-insensitive for Postgres connector #53308
Conversation
d32f1e9 to
56294cb
Compare
| * @param connection The connection object | ||
| * @param properties The connection properties. This is passed through from the relation. | ||
| */ | ||
| @deprecated("Use beforeFetch(Connection, JDBCOptions) instead", "4.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To @marko-sisovic-db , since Apache Spark 4.0.0 and 4.0.1 are released already, this is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, changed to 4.2.0.
| * @param connection The connection object | ||
| * @param properties The connection properties. This is passed through from the relation. | ||
| */ | ||
| @deprecated("Use beforeFetch(Connection, JDBCOptions) instead", "4.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be 4.1.0 or 4.2.0, depending on whether we want this bug fix in 4.1 or not. cc @dongjoon-hyun for the decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for 4.2.0 because 4.1.0 already has several pending decisions. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks folks, changed to 4.2.0.
|
@marko-sisovic-db can you update the PR description to match the actual change? |
|
@marko-sisovic-db can you re-trigger the failed CI jobs? seems flaky |
What changes were proposed in this pull request?
Making new API for
beforeFetchinJDBCDialect, where it accepts options asJDBCOptionsinstead ofMap[String, String], and deprecating the old API starting from Spark version4.2.0. Spark docs state that options are case insensitive, so this will make it easier for dialects to respect that. Even if we have some edge case where we need the original casing, we can access the original map inside theJDBCOptionsobject.Why are the changes needed?
The option
fetchsizerequires another option,autocommitto be set tofalsefor the Postgres connector. We have logic for this:spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala
Lines 203 to 205 in 415f505
However, this logic is case-sensitive, and will only work for lowercased
fetchsize. When passingfetchSizefor example, the correct value for the fetchsize will be set on the Postgres driver, but it won't haveautocommit -> false, so the fetch size will be ignored.Does this PR introduce any user-facing change?
No.
How was this patch tested?
New test:
PostgresDialectSuite.Was this patch authored or co-authored using generative AI tooling?
Yes, partly generated-by: claude code.