@cloud-fan
Contributor

What changes were proposed in this pull request?

This is a follow-up to #50674. In that PR, we made it easier to define SQL configs with enum values, and we also refactored some code to simplify it.

This PR reverts the API changes to ParquetOptions. Ideally we are free to change private APIs, but Parquet is a very popular format and there are third-party Spark plugins that use Parquet-related private APIs in Spark. We could of course ask these plugins to update their code or add shim layers, but it is friendlier to avoid breaking certain private APIs when doing so is easy.
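As a rough illustration of the compatibility concern, here is a minimal sketch of keeping a string-returning accessor on top of an enum-backed value. Every name in it (`RebaseMode`, `ExampleParquetOptions`, the `datetimeRebaseMode` key, the accessor name) is a hypothetical stand-in chosen for illustration, not the actual Spark code touched by this PR.

```scala
import java.util.Locale

// Hypothetical stand-in for an enum of rebase modes; not Spark's own type.
object RebaseMode extends Enumeration {
  val EXCEPTION: Value = Value("EXCEPTION")
  val CORRECTED: Value = Value("CORRECTED")
  val LEGACY: Value = Value("LEGACY")
}

// Hypothetical stand-in for an options class that external plugins call into.
class ExampleParquetOptions(
    parameters: Map[String, String],
    sessionDefault: RebaseMode.Value) {

  // Enum-typed lookup for internal use: a per-read option overrides the default.
  private def modeOrDefault(key: String): RebaseMode.Value =
    parameters
      .get(key)
      .map(v => RebaseMode.withName(v.toUpperCase(Locale.ROOT)))
      .getOrElse(sessionDefault)

  // The accessor keeps returning String, so a plugin compiled against the
  // old signature continues to link and behave the same way.
  def datetimeRebaseModeInRead: String =
    modeOrDefault("datetimeRebaseMode").toString
}
```

A caller that was written against the string-typed signature would keep working unchanged, for example `new ExampleParquetOptions(Map("datetimeRebaseMode" -> "legacy"), RebaseMode.EXCEPTION).datetimeRebaseModeInRead`.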

Why are the changes needed?

Avoid breaking private APIs that are used by Spark plugins, such as https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark3.5.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark35LegacyHoodieParquetFileFormat.scala#L150

Does this PR introduce any user-facing change?

no

How was this patch tested?

N/A

Was this patch authored or co-authored using generative AI tooling?

no

@cloud-fan
Contributor Author

cc @yaooqinn

@github-actions github-actions bot added the SQL label Aug 18, 2025
@HyukjinKwon HyukjinKwon changed the title from "[SPARK-51874][SQL][FOLLOWUP] revert ParquetOptions rebase methods to return string type" to "[SPARK-51874][SQL][FOLLOW-UP] Revert ParquetOptions rebase methods to return string type" Aug 18, 2025
@cloud-fan
Contributor Author

thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in 77413d4 Aug 19, 2025
cloud-fan added a commit that referenced this pull request Aug 20, 2025
… DataSourceUtils and AvroOptions

### What changes were proposed in this pull request?

A similar follow-up to #52065. This PR restores the rebase APIs in `DataSourceUtils` for compatibility with external Spark plugins, and also in `AvroOptions` to simplify the code.

### Why are the changes needed?

External plugins may use `DataSourceUtils` directly, see https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark3.5.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark35LegacyHoodieParquetFileFormat.scala#L194

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #52074 from cloud-fan/follow.

Lead-authored-by: Wenchen Fan <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>