Enable reading Iceberg v3 failing on all unimplemented features by dain · Pull Request #27786 · trinodb/trino

dain · 2025-12-30T08:07:30Z

Description

Add support for creating Iceberg format version 3 tables, upgrading v2 tables to v3, and inserting into v3 tables.

This change intentionally does not implement Iceberg v3 features beyond allowing v3 metadata and validating that inserts produce the required row-lineage metadata (as observed through the Iceberg library).

To avoid spec violations while v3 support is incomplete, the connector now explicitly rejects v3 features that are not yet supported. The goal is to safely unlock v3 table creation and incremental adoption, while making unsupported behavior fail fast and predictably.

Unsupported v3 features that now throw NOT_SUPPORTED include:

- Row-level mutations on v3 tables: DELETE, UPDATE, MERGE
- OPTIMIZE on v3 tables
- add_files / add_files_from_table procedures on v3 tables
- Deletion vectors (PUFFIN delete files)
- Column default values (initial-default, write-default)
- Iceberg table encryption (encryption-keys / snapshot key-id)
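
The fail-fast behavior for operations in the list above can be pictured as a small version guard. This is a minimal, self-contained sketch and not the connector's actual code: `V3Guard`, `Operation`, and `NotSupportedException` are hypothetical names introduced for illustration, and the real connector reports Trino's NOT_SUPPORTED error code rather than a custom exception.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for an engine error that carries a NOT_SUPPORTED code.
class NotSupportedException extends RuntimeException {
    NotSupportedException(String message) {
        super(message);
    }
}

// Hypothetical operation names; the restricted ones mirror the PR description.
enum Operation { INSERT, SELECT, DELETE, UPDATE, MERGE, OPTIMIZE, ADD_FILES }

class V3Guard {
    // Assumption: the restricted operations are allowed only up to format version 2.
    private static final int MAX_VERSION_FOR_RESTRICTED_OPERATIONS = 2;

    private static final Set<Operation> RESTRICTED = EnumSet.of(
            Operation.DELETE, Operation.UPDATE, Operation.MERGE,
            Operation.OPTIMIZE, Operation.ADD_FILES);

    // Throws before any work is done if the operation is not yet supported
    // for the table's format version; otherwise returns normally.
    static void check(Operation operation, int formatVersion) {
        if (RESTRICTED.contains(operation) && formatVersion > MAX_VERSION_FOR_RESTRICTED_OPERATIONS) {
            throw new NotSupportedException(
                    operation + " is not supported for Iceberg format version " + formatVersion);
        }
    }
}
```

Calling `V3Guard.check(Operation.INSERT, 3)` returns normally, while `V3Guard.check(Operation.DELETE, 3)` throws, so the unsupported path is rejected before any data or metadata is touched.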

Tests:

Add TestIcebergV3 to cover:
- create v3 tables and upgrade v2→v3
- inserts into v3 tables produce required lineage metadata (nextRowId, firstRowId, dataSequenceNumber)
- unsupported v3 features fail with clear exceptions
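
For readers unfamiliar with the lineage fields named above, here is a deliberately simplified sketch of how a table-level `nextRowId` counter and per-file `firstRowId` values relate for plain appends. `RowLineageSketch` and `AssignedFile` are hypothetical names; the real assignment is done by the Iceberg library at commit time through manifests and has more cases than this model covers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of row-lineage assignment for appended data files.
class RowLineageSketch {
    // A data file together with the first row ID it was assigned.
    record AssignedFile(String path, long firstRowId, long recordCount) {}

    // Table-level counter; assumed to start at 0 for a new v3 table.
    private long nextRowId;
    private final List<AssignedFile> files = new ArrayList<>();

    // Gives the appended file a contiguous row-ID range and advances the counter.
    void append(String path, long recordCount) {
        files.add(new AssignedFile(path, nextRowId, recordCount));
        nextRowId += recordCount;
    }

    long nextRowId() {
        return nextRowId;
    }

    List<AssignedFile> files() {
        return List.copyOf(files);
    }

    // Row ID of the row at the given zero-based position within a file.
    static long rowId(AssignedFile file, long position) {
        return file.firstRowId() + position;
    }
}
```

In this model, a test that inserts rows can check that `nextRowId` advanced by the number of inserted rows and that each new file's `firstRowId` falls inside the range consumed by the insert.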

Release notes

(X) Release notes are required, with the following suggested text:

## Iceberg
* Allow creating Iceberg format version 3 tables, upgrading v2 tables to v3, and inserting into v3 tables. Unsupported v3 features are explicitly rejected.

findepi · 2025-12-30T14:15:54Z

For something as central as table format compatibility, we should have tests with some other system (Spark?) reading tables produced by Trino.

chenjian2664 · 2025-12-31T13:07:41Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

    }

+    // TODO: Remove when Iceberg v3 is fully supported
+    private static void validateTableForTrino(BaseTable table, Optional<Long> tableSnapshotId)


If the table is upgraded from v2 to v3, all read operations on the table will be blocked, even for snapshots that do not include any v3 data.

There is a test showing that insert and read work. Are you seeing something I'm not?

I see that you explicitly check for v3 features (for example, default values) and reject queries against v3 tables. There is a possible scenario where another engine upgrades a table from v2 to v3, performs inserts and/or updates (including deletes), and then enables v3-specific features such as default values.

In this case, what is the expected behavior when Trino queries such a table? How should this behave when time travel is used to query a snapshot from before the upgrade?

Yes. That is the point of this PR. Today, no v3 tables are allowed at all, so any query against them fails. With this PR you can create, insert into, and read v3 tables, and most other things fail. If the table has column defaults it cannot be used, but remember that it cannot be used today either.

This is an iterative PR. Allow what works to be used and explicitly fail on everything else. Then additional PRs will implement the remaining features one at a time.

Copilot

Pull request overview

This PR enables support for creating Iceberg format version 3 tables, upgrading v2 tables to v3, and inserting data into v3 tables. The implementation intentionally limits v3 support to these basic operations while explicitly rejecting unsupported v3 features to prevent spec violations.

Key changes:

- Updated maximum supported format version from 2 to 3 while keeping default at 2
- Added validation logic to reject unsupported v3 features (deletion vectors, column defaults, encryption)
- Implemented version checks for row-level operations (DELETE, UPDATE, MERGE) and table procedures (OPTIMIZE, ADD_FILES)

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergV3.java | Comprehensive test suite covering v3 table creation, upgrades, inserts, lineage metadata validation, and rejection of unsupported features |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergV2.java | Updated error message to reflect new maximum supported format version (3 instead of 2) |
| plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitSource.java | Added validation to reject PUFFIN deletion vectors during split creation |
| plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java | Added validateTableForTrino method to reject unsupported v3 features, refactored version checks for table procedures, enhanced verifyTableVersionForUpdate to block v3 table mutations |
| plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java | Increased FORMAT_VERSION_SUPPORT_MAX to 3 while maintaining default format version at 2 for backward compatibility |
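
The configuration change summarized above (maximum raised to 3, default kept at 2) amounts to a bounded setting. The sketch below is a hypothetical, standalone class and not the real `IcebergConfig`; only the name `FORMAT_VERSION_SUPPORT_MAX` comes from this PR, and the minimum of 1 and the other names are assumptions.

```java
// Hypothetical standalone sketch of a bounded format-version setting.
class FormatVersionConfig {
    static final int FORMAT_VERSION_SUPPORT_MIN = 1; // assumption, not stated in this PR
    static final int FORMAT_VERSION_SUPPORT_MAX = 3; // raised from 2 according to the summary
    static final int FORMAT_VERSION_DEFAULT = 2;     // default stays at 2

    private int formatVersion = FORMAT_VERSION_DEFAULT;

    int getFormatVersion() {
        return formatVersion;
    }

    // Rejects values outside the supported range and leaves the old value in place.
    FormatVersionConfig setFormatVersion(int formatVersion) {
        if (formatVersion < FORMAT_VERSION_SUPPORT_MIN || formatVersion > FORMAT_VERSION_SUPPORT_MAX) {
            throw new IllegalArgumentException(
                    "format version must be between " + FORMAT_VERSION_SUPPORT_MIN
                            + " and " + FORMAT_VERSION_SUPPORT_MAX + ": " + formatVersion);
        }
        this.formatVersion = formatVersion;
        return this;
    }
}
```

Keeping the default at 2 means existing deployments keep creating v2 tables unless a table explicitly opts in to version 3.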


findinpath · 2026-01-07T14:49:58Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

+                .map(table::snapshot)
+                .orElse(table.currentSnapshot());
+        if (snapshot == null) {
+            // the snapshot does not exist, this is an error that will be handled elsewhere


That's not an error

spark-sql (default)> create table t1 (data integer) using iceberg tblproperties ('format-version' = 3);

Spark does not create a snapshot when doing CREATE TABLE
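
The empty-table case can be modeled with a small sketch. `SnapshotResolution`, `Table`, and `Snapshot` here are hypothetical, simplified stand-ins for the Iceberg types, not the connector's code. The sketch branches explicitly on the requested ID because `Optional.map` returns an empty Optional when the mapping function returns null, which would otherwise make an unknown snapshot ID fall through to the current snapshot.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified stand-ins for Iceberg's table and snapshot types.
class SnapshotResolution {
    record Snapshot(long snapshotId, int schemaId) {}

    record Table(Map<Long, Snapshot> snapshots, Snapshot currentSnapshot) {
        // Returns null when the ID is unknown, mirroring a nullable lookup.
        Snapshot snapshot(long snapshotId) {
            return snapshots.get(snapshotId);
        }
    }

    // Returns the requested snapshot if an ID was given, else the current one.
    // The result is empty for a table that has no snapshot yet (for example,
    // right after CREATE TABLE in an engine that does not commit a snapshot)
    // and for an unknown snapshot ID.
    static Optional<Snapshot> resolve(Table table, Optional<Long> requestedSnapshotId) {
        Snapshot snapshot = requestedSnapshotId.isPresent()
                ? table.snapshot(requestedSnapshotId.get())
                : table.currentSnapshot();
        return Optional.ofNullable(snapshot);
    }
}
```

A validation routine built on this would treat the empty result for a freshly created table as "nothing snapshot-specific to validate" rather than as an error.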

findinpath · 2026-01-07T15:05:26Z

Related failures

Error:    TestIcebergV3.testV3RejectsColumnDefaults:283->AbstractTestQueryFramework.assertUpdate:411->AbstractTestQueryFramework.assertUpdate:416 » QueryFailed Iceberg v3 column default values are not supported
Error:    TestIcebergV3.testV3RejectsColumnWriteDefaults:330->AbstractTestQueryFramework.assertUpdate:411->AbstractTestQueryFramework.assertUpdate:416 » QueryFailed Iceberg v3 column default values are not supported
Error:    TestIcebergV3.testV3RejectsDeletionVectorsPuffinDeleteFile:480 » FileSystem /tmp/TrinoTest13055050090781735228/iceberg_data/hadoop_v3_dv_qdovndcpa6: failed to delete one or more files; see suppressed exceptions for details
Error:    TestIcebergV3.testV3RejectsEncryptionKeys:416->AbstractTestQueryFramework.assertUpdate:411->AbstractTestQueryFramework.assertUpdate:416 » QueryFailed Iceberg table encryption is not supported

ebyhr

There is a bug with column defaults: the INSERT statement in the test below does not throw an exception.

    @Test
    void testWriteDefault()
    {
        String tableName = "tmp_v3_defaults_src_" + randomNameSuffix();
        assertUpdate("CREATE TABLE " + tableName + " (id INTEGER, data INTEGER) WITH (format_version = 3, format = 'ORC')");
        assertUpdate("INSERT INTO " + tableName + " VALUES (1, 10)", 1);

        Table icebergTable = loadTable(tableName);
        icebergTable.updateSchema()
                .updateColumnDefault("data", Expressions.lit(42))
                .commit();

        assertQueryFails(
                "INSERT INTO " + tableName + " (id) VALUES (2)",
                ".*Iceberg v3 column default values are not supported.*");

        assertUpdate("DROP TABLE " + tableName);
    }


ebyhr · 2026-01-08T00:16:59Z

plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergV3.java

+        Table icebergTable = new HadoopTables(new Configuration(false)).create(
+                schema,
+                PartitionSpec.unpartitioned(),
+                SortOrder.unsorted(),
+                ImmutableMap.of(
+                        "format-version", "3",
+                        "write.format.default", "ORC"),
+                hadoopTableLocation.toString());


We don't need to use HadoopTables in most tests. We can use the loadTable method for existing tables, or TrinoCatalog (there's IcebergTestUtils#getTrinoCatalog) for new tables.

testV3RejectsColumnDefaults and testV3RejectsColumnWriteDefaults require HadoopTables because they need to create tables with v3 column default features that cannot be created through Trino SQL. These tests verify Trino correctly rejects v3 features when reading tables created by other engines.

That isn't true. You can create such tables like this:

    catalog.newCreateTableTransaction(
                    SESSION,
                    schemaTableName,
                    new Schema(
                            Types.NestedField.optional("id")
                                    .withId(1)
                                    .ofType(Types.IntegerType.get())
                                    .withInitialDefault(Expressions.lit(42))
                                    .build()),
                    PartitionSpec.unpartitioned(),
                    SortOrder.unsorted(),
                    Optional.ofNullable(catalog.defaultTableLocation(SESSION, schemaTableName)),
                    ImmutableMap.of("format-version", "3"))
            .commitTransaction();


chenjian2664 · 2026-01-09T04:33:09Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

+        }
+
+        Schema schema = metadata.schemasById().get(snapshot.schemaId());
+        if (schema == null) {


Under what circumstances did you encounter this? I would not expect it to be null.

See #27786 (comment).

The snapshot can be null for an empty table that has been created but has no data inserted yet. As findinpath noted above, Spark (and other engines) don't create a snapshot when doing CREATE TABLE. The null check prevents an NPE when validating an empty table.

@findinpath @dain If the table was created by Spark and is empty, then a missing snapshot is expected. However, reaching this line indicates that a snapshot exists but its schema cannot be found, which should be an invalid state. I do not see a valid scenario where this can happen. I would suggest either removing this code path or explicitly failing fast by throwing an exception. Thoughts?
cc @ebyhr
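
The "fail fast" option suggested above could look roughly like the following. This is a hypothetical standalone sketch, not the connector's code: `SchemaLookup` and its `Schema` record are illustrative names, and the choice of `IllegalStateException` is an assumption.

```java
import java.util.Map;

// Hypothetical sketch of failing fast when a snapshot refers to an unknown schema ID.
class SchemaLookup {
    record Schema(int schemaId, String description) {}

    // Looks up the schema a snapshot refers to. A missing entry is treated as
    // invalid metadata and reported immediately instead of being skipped.
    static Schema schemaFor(Map<Integer, Schema> schemasById, int snapshotSchemaId) {
        Schema schema = schemasById.get(snapshotSchemaId);
        if (schema == null) {
            throw new IllegalStateException(
                    "Snapshot references unknown schema ID " + snapshotSchemaId);
        }
        return schema;
    }
}
```

Compared with silently returning, this makes an inconsistent metadata file visible at the point where it is first observed.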


Cherry pick of trinodb/trino#27786
Co-authored-by: Dain Sundstrom <dain@iq80.com>

## Description

Add initial support for Iceberg table format version 3 while constraining unsupported features and row-level operations to safe, explicitly validated paths.

**New Features:**
1. Allow creating Iceberg tables with format version 3 and inserting/querying data from them, including partitioned tables.
2. Support upgrading existing Iceberg format version 2 tables to format version 3.

**Enhancements:**
1. Introduce version guardrails for Iceberg operations, including explicit maximum supported table format and maximum format version for row-level operations.
2. Validate Iceberg v3 tables for unsupported features such as column default values and table encryption before executing writes or inserts.
3. Add validation to reject use of PUFFIN-based deletion vectors that are not yet supported.
4. Improve error handling for Iceberg update and delete operations by using specific PrestoException errors and clearer messages when format/version constraints are violated.
5. Prevent OPTIMIZE (rewrite_data_files) from running on Iceberg tables with format versions above the supported threshold.

## Test Plan

Add TestIcebergV3 integration test suite covering creation, upgrade, insert, query, and partitioning for v3 tables, as well as rejection of unsupported delete, update, merge, and OPTIMIZE operations on v3 tables.

## Release Notes

```
== RELEASE NOTES ==
Iceberg Connector Changes
* Add support for creating Iceberg tables with format-version = '3'.
* Add reading from Iceberg V3 tables, including partitioned tables.
* Add INSERT operations into Iceberg V3 tables.
* Add support for upgrading existing V2 tables to V3 using the Iceberg API.
```

## Summary by Sourcery

Add guarded initial support for Iceberg table format version 3 while constraining unsupported features and row-level operations to fail fast with clear errors.

New Features:
- Enable creating, reading from, and inserting into Iceberg tables with format version 3, including partitioned tables.

Enhancements:
- Introduce format-version guardrails for Iceberg operations, including a maximum supported table format and a maximum format version for row-level operations.
- Validate Iceberg v3 tables for unsupported features such as column default values, table encryption, and PUFFIN-based deletion vectors before executing reads or writes.
- Tighten validation and error handling for Iceberg update, delete, merge, and OPTIMIZE (rewrite_data_files) operations on tables with unsupported format versions.

Tests:
- Add TestIcebergV3 integration suite covering supported v3 operations and expected failures for unsupported delete, update, merge, OPTIMIZE, encryption, and deletion vector features.

Co-authored-by: Dain Sundstrom <dain@iq80.com>

cla-bot bot added the cla-signed label Dec 30, 2025

github-actions bot added the iceberg Iceberg connector label Dec 30, 2025

dain requested review from ebyhr, electrum, findepi and raunaqmorarka and removed request for findepi December 30, 2025 08:09

dain mentioned this pull request Dec 30, 2025

[Iceberg v3] VARIANT type #27753

Merged

chenjian2664 reviewed Dec 31, 2025

dain force-pushed the iceberg-v3 branch 2 times, most recently from cae6b76 to 2b0318c Compare January 2, 2026 22:27

electrum requested a review from Copilot January 7, 2026 00:22

Copilot started reviewing on behalf of electrum January 7, 2026 00:22

Copilot AI reviewed Jan 7, 2026

electrum approved these changes Jan 7, 2026

dain force-pushed the iceberg-v3 branch 3 times, most recently from 90cc94c to aba25c7 Compare January 7, 2026 02:43

findinpath reviewed Jan 7, 2026

ebyhr reviewed Jan 7, 2026

ebyhr reviewed Jan 8, 2026

ebyhr reviewed Jan 8, 2026

dain force-pushed the iceberg-v3 branch 2 times, most recently from 2473ab1 to b08b997 Compare January 8, 2026 22:21

github-actions bot added the docs label Jan 8, 2026

chenjian2664 reviewed Jan 9, 2026

dain force-pushed the iceberg-v3 branch 2 times, most recently from a5cf8a5 to ab96194 Compare January 10, 2026 07:03

Enable reading Iceberg v3

4d0ef36

All usage of any unimplemented v3 feature results in a failure.

dain force-pushed the iceberg-v3 branch from ab96194 to 4d0ef36 Compare January 10, 2026 21:26

dain merged commit bb65065 into trinodb:master Jan 10, 2026
55 checks passed

dain deleted the iceberg-v3 branch January 10, 2026 22:30

github-actions bot added this to the 480 milestone Jan 10, 2026

chenjian2664 mentioned this pull request Jan 11, 2026

Add 480 release notes #27719

Merged

This was referenced Jan 23, 2026

[WIP] Add support for Iceberg format version 3 tables Joe-Abraham/presto#74

Closed

Add Iceberg format version 3 table support Joe-Abraham/presto#75

Draft

Joe-Abraham mentioned this pull request Feb 20, 2026

feat: Add initial support for Iceberg format version 3 prestodb/presto#27021

Merged

Joe-Abraham added a commit to Joe-Abraham/presto that referenced this pull request Mar 2, 2026

feat(iceberg): Add initial support for Iceberg format version 3

52c051a

Cherry pick of trinodb/trino#27786 Co-authored-by: Dain Sundstrom <dain@iq80.com>


7 participants