@marko-sisovic-db marko-sisovic-db commented Dec 3, 2025

What changes were proposed in this pull request?

This PR adds a new beforeFetch API to JdbcDialect that accepts the options as JDBCOptions instead of Map[String, String], and deprecates the old API starting from Spark 4.2.0. The Spark docs state that options are case-insensitive, so this makes it easier for dialects to respect that. If an edge case needs the original casing, the original map is still accessible inside the JDBCOptions object.
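To illustrate the shape of such a change, here is a minimal, self-contained sketch. It does not use Spark's real classes: Conn, Options, Dialect, and PostgresLikeDialect are hypothetical stand-ins for java.sql.Connection, JDBCOptions, JdbcDialect, and the Postgres dialect, and the default delegation from the new overload to the old one is an assumption, not necessarily what this PR does.

```scala
import java.util.Locale

// Hypothetical stand-in for java.sql.Connection (only the call used here).
trait Conn { def setAutoCommit(autoCommit: Boolean): Unit }

// Hypothetical stand-in for JDBCOptions: keeps the original map and
// offers a case-insensitive lookup on top of it.
final class Options(val originalMap: Map[String, String]) {
  private val lowerCased =
    originalMap.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }
  def get(key: String): Option[String] =
    lowerCased.get(key.toLowerCase(Locale.ROOT))
}

// Hypothetical stand-in for JdbcDialect.
trait Dialect {
  // Old API, kept for compatibility but deprecated.
  @deprecated("Use beforeFetch(Conn, Options) instead", "4.2.0")
  def beforeFetch(connection: Conn, properties: Map[String, String]): Unit = ()

  // New API. Assumption: it falls back to the old overload by default,
  // so dialects that only override the old method keep working.
  def beforeFetch(connection: Conn, options: Options): Unit =
    beforeFetch(connection, options.originalMap)
}

// A dialect that overrides the new overload and reads the option
// without caring about the casing the user supplied.
object PostgresLikeDialect extends Dialect {
  override def beforeFetch(connection: Conn, options: Options): Unit =
    if (options.get("fetchsize").exists(_.toInt > 0)) {
      connection.setAutoCommit(false)
    }
}
```

With this shape, a dialect reads fetchsize through the options object, while options.originalMap remains available for the rare case that needs the user's original casing.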

Why are the changes needed?

For the Postgres connector, the fetchsize option only takes effect when autocommit is set to false. The existing logic handles this:

if (properties.getOrElse(JDBCOptions.JDBC_BATCH_FETCH_SIZE, "0").toInt > 0) {
  connection.setAutoCommit(false)
}

However, this check is case-sensitive and only works for the lowercase key fetchsize. When a user passes fetchSize, for example, the correct fetch size is still set on the Postgres driver, but autocommit is not set to false, so the fetch size is ignored.
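The difference can be reproduced with plain Scala maps. This is only a sketch: FetchSizeLookup and its two helper methods are hypothetical names, and the key "fetchsize" is taken from the lowercase spelling described above.

```scala
import java.util.Locale

// Hypothetical helper object; not part of Spark.
object FetchSizeLookup {
  // Lowercase key, matching the spelling the existing check expects.
  val FetchSizeKey = "fetchsize"

  // Mirrors the existing check: an exact, case-sensitive key lookup.
  def caseSensitive(properties: Map[String, String]): Boolean =
    properties.getOrElse(FetchSizeKey, "0").toInt > 0

  // Normalizes the keys first, so fetchSize and FETCHSIZE also match.
  def caseInsensitive(properties: Map[String, String]): Boolean = {
    val normalized =
      properties.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }
    normalized.getOrElse(FetchSizeKey, "0").toInt > 0
  }
}
```

For Map("fetchSize" -> "100"), caseSensitive returns false, so autocommit would be left on, while caseInsensitive returns true.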

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New test: PostgresDialectSuite.

Was this patch authored or co-authored using generative AI tooling?

Yes, partly generated-by: claude code.

@github-actions github-actions bot added the SQL label Dec 3, 2025
@marko-sisovic-db marko-sisovic-db force-pushed the msisovic/postgres-fetchsize-fix branch from d32f1e9 to 56294cb Compare December 3, 2025 19:22
* @param connection The connection object
* @param properties The connection properties. This is passed through from the relation.
*/
@deprecated("Use beforeFetch(Connection, JDBCOptions) instead", "4.0.0")
Member

@marko-sisovic-db, since Apache Spark 4.0.0 and 4.0.1 are already released, this version is wrong.

Author

Thanks, changed to 4.2.0.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-54581] Making fetchsize option case-insensitive for Postgres connector [SPARK-54581][SQL] Making fetchsize option case-insensitive for Postgres connector Dec 4, 2025
* @param connection The connection object
* @param properties The connection properties. This is passed through from the relation.
*/
@deprecated("Use beforeFetch(Connection, JDBCOptions) instead", "4.0.0")
Contributor

This should be 4.1.0 or 4.2.0, depending on whether we want this bug fix in 4.1 or not. cc @dongjoon-hyun for the decision.

Member

+1 for 4.2.0 because 4.1.0 already has several pending decisions. Thank you!

Author

Thanks folks, changed to 4.2.0.

@cloud-fan
Contributor

@marko-sisovic-db can you update the PR description to match the actual change?

@cloud-fan
Contributor

@marko-sisovic-db can you re-trigger the failed CI jobs? They seem flaky.
