
Conversation

@aihuaxu (Contributor) commented Oct 11, 2025

This change adds support for writing shredded variants in the iceberg-spark module, enabling Spark to write shredded variant data into Iceberg tables.

Ideally, this should follow the approach described in the reader/writer API proposal for Iceberg V4, where the execution engine provides the shredded writer schema before creating the Iceberg writer. This design is cleaner, as it delegates schema generation responsibility to the engine.

As an interim solution, this PR implements a writer with lazy initialization for the actual Parquet writer. It buffers a portion of the data first, derives the shredded schema from the buffered records, then initializes the Parquet writer and flushes the buffered data to the file.
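In outline, the interim writer follows the buffer-then-initialize pattern sketched below. This is only a rough illustration; the class name, the `RecordWriter` interface, and the schema-derivation hook are stand-ins, not the PR's actual code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hedged sketch of the buffer-then-initialize pattern: buffer records, derive the
// shredded schema from the buffered records, create the real writer, replay the buffer.
class LazyShreddedWriter<T, S> {
  private final int bufferLimit;
  private final Function<List<T>, S> deriveShreddedSchema;   // inspects buffered rows
  private final Function<S, RecordWriter<T>> writerFactory;  // builds the real (e.g. Parquet) writer
  private final List<T> buffer = new ArrayList<>();
  private RecordWriter<T> delegate = null;

  LazyShreddedWriter(int bufferLimit,
                     Function<List<T>, S> deriveShreddedSchema,
                     Function<S, RecordWriter<T>> writerFactory) {
    this.bufferLimit = bufferLimit;
    this.deriveShreddedSchema = deriveShreddedSchema;
    this.writerFactory = writerFactory;
  }

  void write(T record) throws IOException {
    if (delegate != null) {
      delegate.write(record);
      return;
    }
    buffer.add(record);
    if (buffer.size() >= bufferLimit) {
      initializeAndFlush();
    }
  }

  void close() throws IOException {
    if (delegate == null) {
      // Fewer records than the buffer limit: initialize from whatever was buffered.
      initializeAndFlush();
    }
    delegate.close();
  }

  private void initializeAndFlush() throws IOException {
    S shreddedSchema = deriveShreddedSchema.apply(buffer);
    delegate = writerFactory.apply(shreddedSchema);
    for (T buffered : buffer) {
      delegate.write(buffered);
    }
    buffer.clear();
  }

  // Minimal stand-in for the underlying file-format writer.
  interface RecordWriter<T> {
    void write(T record) throws IOException;
    void close() throws IOException;
  }
}
```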

@aihuaxu force-pushed the spark-write-iceberg-variant branch from 16b7a09 to dc4f72e on October 11, 2025 21:03
@aihuaxu marked this pull request as ready for review on October 11, 2025 21:15
@aihuaxu force-pushed the spark-write-iceberg-variant branch 2 times, most recently from 3a7d704 to 97851f0 on October 13, 2025 04:47
@aihuaxu force-pushed the spark-write-iceberg-variant branch from 97851f0 to b87e999 on October 13, 2025 16:47
@aihuaxu (Contributor, Author) commented Oct 15, 2025

@amogh-jahagirdar @Fokko @huaxingao Could you take a look at this PR and let me know whether there is a better approach?

@aihuaxu (Contributor, Author) commented Oct 21, 2025

cc @RussellSpitzer, @pvary, and @rdblue: it seems better to implement this with the new File Format proposal, but I want to check whether this is acceptable as an interim solution, or whether you see a better alternative.

lazy.initialize(props, compressor, rowGroupOrdinal);
this.parquetSchema = result.getSchema();
this.pageStore = result.getPageStore();
this.writeStore = result.getWriteStore();
Contributor (review comment on the snippet above):
It seems the initial writeStore/pageStore from startRowGroup() aren't closed before being replaced here. Could this cause a memory leak?
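For illustration, a minimal sketch of the close-before-replace pattern this question refers to. The store types are generic stand-ins, since the concrete Parquet-internal types are not visible in this snippet.

```java
// Hedged sketch, not the PR's code: close the stores created by the earlier
// startRowGroup() call before dropping the references to them.
class StoreHolder {
  private AutoCloseable pageStore;
  private AutoCloseable writeStore;

  void replaceStores(AutoCloseable newPageStore, AutoCloseable newWriteStore) throws Exception {
    if (writeStore != null) {
      writeStore.close(); // release whatever the old write store still buffers
    }
    if (pageStore != null) {
      pageStore.close(); // release pages buffered by the old page store
    }
    this.pageStore = newPageStore;
    this.writeStore = newWriteStore;
  }
}
```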

@pvary (Contributor) commented Oct 21, 2025

@aihuaxu: Couldn't we do the same thing, but wrap the DataWriter instead of the ParquetWriter? The schema would be created near SparkWrite.WriterFactory, which would make it easier to move to the new API when it is ready. An added benefit is that when other formats implement Variant, we could reuse the code.

Would this be prohibitively complex?

@huaxingao (Contributor) commented:

In Spark DSv2, planning/validation happens on the driver. BatchWrite#createBatchWriterFactory runs on the driver and returns a DataWriterFactory that is serialized to executors. That factory must already carry the write schema the executors will use when they create DataWriters.

For shredded variant, we don’t know the shredded schema at planning time. We have to inspect some records to derive it. Doing a read on the driver during createBatchWriterFactory would mean starting a second job inside planning, which is not how DSv2 is intended to work.

Because of that, the currently proposed Spark approach is: put the logical variant type in the writer factory; on the executor, buffer the first N rows, infer the shredded schema from the data, then initialize the concrete writer and flush the buffer. I believe this PR follows the same approach, which seems like a practical solution given DSv2's constraints.
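As a rough illustration of that constraint, a factory along these lines could carry the logical schema and sampling configuration from the driver. Only DataWriterFactory, DataWriter, InternalRow, and StructType are Spark's DSv2/catalyst types here; the class itself and the ExecutorWriterSupplier hook are hypothetical.

```java
import java.io.Serializable;

import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.connector.write.DataWriter;
import org.apache.spark.sql.connector.write.DataWriterFactory;
import org.apache.spark.sql.types.StructType;

// Hypothetical factory: built on the driver during planning, serialized to executors.
// It can only carry the logical write schema (variant still unshredded) plus
// configuration; the shredded schema is derived later, inside the executor-side writer.
class ShreddedVariantWriterFactory implements DataWriterFactory {

  // Spark serializes the factory, so the hook that builds the executor-side
  // buffering writer must itself be Serializable.
  interface ExecutorWriterSupplier extends Serializable {
    DataWriter<InternalRow> create(StructType logicalSchema, int sampleSize, int partitionId, long taskId);
  }

  private final StructType logicalWriteSchema;          // known on the driver at planning time
  private final int sampleSize;                         // rows to buffer before inferring the shredded schema
  private final ExecutorWriterSupplier writerSupplier;  // hypothetical hook building the buffering writer

  ShreddedVariantWriterFactory(StructType logicalWriteSchema, int sampleSize,
                               ExecutorWriterSupplier writerSupplier) {
    this.logicalWriteSchema = logicalWriteSchema;
    this.sampleSize = sampleSize;
    this.writerSupplier = writerSupplier;
  }

  @Override
  public DataWriter<InternalRow> createWriter(int partitionId, long taskId) {
    // Runs on the executor: the earliest point where rows can be sampled, so the
    // shredded schema is derived inside the writer this returns, not here.
    return writerSupplier.create(logicalWriteSchema, sampleSize, partitionId, taskId);
  }
}
```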

@pvary (Contributor) commented Oct 22, 2025

Thanks for the explanation, @huaxingao! I see several possible workarounds for the DataWriterFactory serialization issue, but I have some more fundamental concerns about the overall approach.
I believe shredding should be driven by future reader requirements rather than by the actual data being written. Ideally, it should remain relatively stable across data files within the same table and originate from a writer job configuration—or even better, from a table-level configuration.

Even if we accept that the written data should dictate the shredding logic, Spark’s implementation—while dependent on input order—is at least somewhat stable. It drops rarely used fields, handles inconsistent types, and limits the number of columns.
I understand this is only a PoC implementation for shredding, but I’m concerned that the current simplifications make it very unstable. If I’m interpreting correctly, the logic infers the type from the first occurrence of each field and creates a column for every field. This could lead to highly inconsistent column layouts within a table, especially in IoT scenarios where multiple sensors produce vastly different data.
Did I miss anything?

@aihuaxu (Contributor, Author) commented Oct 24, 2025

Thanks @huaxingao and @pvary for reviewing, and thanks to Huaxin for explaining how the writer works in Spark.

Regarding the concern about unstable schemas, Spark's approach makes sense:

  • If a field appears consistently with a consistent type, create both value and typed_value
  • If a field appears with inconsistent types, create only value
  • Drop fields that occur in less than 10% of sampled rows
  • Cap the total at 300 fields (counting value and typed_value separately)

We could implement similar heuristics. Additionally, making the shredded schema configurable would allow users to choose which fields to shred at write time based on their read patterns.
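A rough sketch of such heuristics is below. It is illustrative only, not Spark's implementation; it assumes each sampled variant has already been reduced to a field-name-to-type map, and the 10% / 300 thresholds are simply the numbers quoted above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical schema-shape inference over sampled variant records.
class ShreddingSchemaInference {

  enum FieldShape { VALUE_AND_TYPED_VALUE, VALUE_ONLY }

  static Map<String, FieldShape> inferShape(List<Map<String, String>> sampledFieldTypes) {
    Map<String, Integer> occurrences = new LinkedHashMap<>();
    Map<String, String> firstType = new HashMap<>();
    Map<String, Boolean> consistent = new HashMap<>();

    for (Map<String, String> record : sampledFieldTypes) {
      for (Map.Entry<String, String> field : record.entrySet()) {
        String name = field.getKey();
        occurrences.merge(name, 1, Integer::sum);
        String seen = firstType.putIfAbsent(name, field.getValue());
        if (seen != null && !seen.equals(field.getValue())) {
          consistent.put(name, false); // same field observed with different types
        } else {
          consistent.putIfAbsent(name, true);
        }
      }
    }

    int minOccurrences = (int) Math.ceil(sampledFieldTypes.size() * 0.10); // drop rare fields
    int shreddedColumnBudget = 300; // value and typed_value each count against the cap
    Map<String, FieldShape> shape = new LinkedHashMap<>();

    for (Map.Entry<String, Integer> entry : occurrences.entrySet()) {
      if (entry.getValue() < minOccurrences) {
        continue; // appears in fewer than 10% of sampled rows
      }
      boolean typed = consistent.getOrDefault(entry.getKey(), false);
      int cost = typed ? 2 : 1;
      if (shreddedColumnBudget - cost < 0) {
        break; // cap reached; remaining fields stay in the un-shredded value
      }
      shreddedColumnBudget -= cost;
      shape.put(entry.getKey(), typed ? FieldShape.VALUE_AND_TYPED_VALUE : FieldShape.VALUE_ONLY);
    }
    return shape;
  }
}
```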

For this POC, I'd first like feedback on whether there are significant high-level design options to consider and whether this approach is acceptable. It does feel hacky; I may be missing the big picture of how the writers work across Spark + Iceberg + Parquet, and there may be a better way.
