Skip to content

feat(plugin-iceberg): Add support for MERGE INTO#25470

Merged
tdcmeehan merged 3 commits intoprestodb:masterfrom
acarpente-denodo:feature/20578_SQL_Support_for_MERGE_INTO
Jan 14, 2026
Merged

feat(plugin-iceberg): Add support for MERGE INTO#25470
tdcmeehan merged 3 commits intoprestodb:masterfrom
acarpente-denodo:feature/20578_SQL_Support_for_MERGE_INTO

Conversation

@acarpente-denodo
Copy link
Copy Markdown
Contributor

@acarpente-denodo acarpente-denodo commented Jul 1, 2025

Description

Iceberg connector support for the SQL MERGE INTO command.

Syntax:

MERGE INTO target_table [ [ AS ]  target_alias ]
USING { source_table | query } [ [ AS ] source_alias ]
ON search_condition
WHEN MATCHED THEN
    UPDATE SET ( column = expression [, ...] )
WHEN NOT MATCHED THEN
    INSERT [ column_list ]
    VALUES (expression, ...)

Example: MERGE INTO usage to update the sales information for existing products and insert the sales information for the new products in the market.

MERGE INTO product_sales AS s
    USING monthly_sales AS ms
    ON s.product_id = ms.product_id
WHEN MATCHED THEN
    UPDATE SET
        sales = sales + ms.sales
      , last_sale = ms.sale_date
      , current_price = ms.price
WHEN NOT MATCHED THEN
    INSERT (product_id, sales, last_sale, current_price)
    VALUES (ms.product_id, ms.sales, ms.sale_date, ms.price)

The Presto engine commit introduces an enum called RowChangeParadigm, which describes how a connector modifies rows. The iceberg connector will utilize the DELETE_ROW_AND_INSERT_ROW paradigm, as it represents an updated row as a combination of a deleted row followed by an inserted row. The CHANGE_ONLY_UPDATED_COLUMNS paradigm is meant for connectors that support updating individual columns of rows.

Note: Changes were made after reviewing the following Trino PR: trinodb/trino#7126
So, this PR is deeply inspired by Trino's implementation.

Motivation and Context

The MERGE INTO statement is commonly used to integrate data from two tables with different contents but similar structures.
For example, the source table could be part of a production transactional system, while the target table might be located in a data warehouse for analytics.
Regularly, MERGE operations are performed to update the analytics warehouse with the latest production data.
You can also use MERGE with tables that have different structures, as long as you can define a condition to match the rows between them.

Test Plan

Automated tests developed in IcebergDistributedTestBase.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support for MERGE command in the Iceberg connector.

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 3 times, most recently from 3e4437e to 5b65ae1 Compare July 4, 2025 07:59
@aditi-pandit
Copy link
Copy Markdown
Contributor

@acarpente-denodo : Thanks for this contribution.

The code will need extensive changes for Presto C++ support. Have you investigated that ?

Did you try this code with a C++ worker (or side-car enabled ) ? We can do the full C++ support as a follow up, but its important the queries fail with a clean error message on Presto C++

@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

Hi @aditi-pandit,
Thanks for your feedback. No, I haven't investigated the support for Presto C++ yet. I also haven't tested this code with a C++ worker or with side-car enabled. At this point, I'm still focused on getting the MERGE INTO command fully working on Java. Once that's stable, I will plan for proper C++ support, including ensuring clean error messages when used with Presto C++.

Copy link
Copy Markdown
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Adrian, were you able to build this PR? I got the following errors:

[ERROR] Failed to execute goal org.basepom.maven:duplicate-finder-maven-plugin:1.5.1:check (default) on project presto-main: Found duplicate classes/resources! -> [Help 1]

Also, how did you get the "handle not found" error? Were you running a query using the query runner? Were you able to get the correct plan? Would you please share the plan and full error stack?

sql/grant
sql/grant-roles
sql/insert
sql/merge
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this commit, the merge into only supports parsing and not the full functionality. Let's move the documentation to the last commit when it's fully supported.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it's done.

}

@Test
public void testMergeInto()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the original name testMerge()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it's done.

Copy link
Copy Markdown
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Adrian, were you able to build this PR? The CI build failed with these:

In file included from /__w/presto/presto/presto-native-execution/./presto_cpp/main/PeriodicHeartbeatManager.h:17:
/__w/presto/presto/presto-native-execution/./presto_cpp/presto_protocol/core/presto_protocol_core.h:1684:3: error: unknown type name 'com'
  com.facebook.presto.spi.MergeHandle handle = {};
  ^
/__w/presto/presto/presto-native-execution/./presto_cpp/presto_protocol/core/presto_protocol_core.h:1684:6: error: expected member name or ';' after declaration specifiers
  com.facebook.presto.spi.MergeHandle handle = {};
  ~~~^
/__w/presto/presto/presto-native-execution/./presto_cpp/presto_protocol/core/presto_protocol_core.h:1686:3: error: unknown type name 'ConnectorMergeTableHandle'; did you mean 'ConnectorDeleteTableHandle'?
  ConnectorMergeTableHandle connectorMergeTableHandle = {};
  ^~~~~~~~~~~~~~~~~~~~~~~~~
  ConnectorDeleteTableHandle
/__w/presto/presto/presto-native-execution/./presto_cpp/presto_protocol/core/presto_protocol_core.h:[313](https://github.com/prestodb/presto/actions/runs/16068848370/job/45348688297?pr=25470#step:17:314):8: note: 'ConnectorDeleteTableHandle' declared here
struct ConnectorDeleteTableHandle : public JsonEncodedSubclass {};
       ^
/__w/presto/presto/presto-native-execution/./presto_cpp/presto_protocol/core/presto_protocol_core.h:1709:3: error: unknown type name 'RowChangeParadigm'
  RowChangeParadigm paradigm = {};
  ^
4 errors generated.

Would you please fix the build errors first?

Also, how did you get the "handle not found" error? Were you running a query using the query runner? Since the build fails I wonder how you could do that. If you could run a query, were you able to get the correct plan? Would you please share the plan and full error stack? Thanks!

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 3 times, most recently from ace4af6 to b02d82f Compare July 14, 2025 14:46
@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

acarpente-denodo commented Jul 15, 2025

Hi @yingsu00,

The PR is still in development. I'm still focused on getting the MERGE INTO command fully working on Java. The coordinator and workers are currently running on Java. I haven't investigated the MERGE INTO support for Presto C++ yet.

Automated tests have not been created yet. To test this enhancement, you need to perform a manual test. It is described at the end of the PR description in the section called "Test Plan".

I'm facing an error during the serialization of a new class called MergePartitioningHandle, which implements ConnectorPartitioningHandle interface

Caused by: java.lang.IllegalArgumentException: No connector for handle: MERGE [update = iceberg:INSTANCE]
at com.facebook.presto.metadata.HandleResolver.getId(HandleResolver.java:228)
at com.facebook.presto.metadata.HandleResolver.getId(HandleResolver.java:120)
at com.facebook.presto.metadata.AbstractTypedJacksonModule$InternalTypeResolver.idFromValueAndType(AbstractTypedJacksonModule.java:158)
at com.facebook.presto.metadata.AbstractTypedJacksonModule$InternalTypeResolver.idFromValue(AbstractTypedJacksonModule.java:150)
at com.fasterxml.jackson.databind.jsontype.impl.TypeSerializerBase.idFromValue(TypeSerializerBase.java:92)
at com.fasterxml.jackson.databind.jsontype.impl.TypeSerializerBase._generateTypeId(TypeSerializerBase.java:77)
at com.fasterxml.jackson.databind.jsontype.impl.TypeSerializerBase.writeTypePrefix(TypeSerializerBase.java:45)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeWithType(BeanSerializerBase.java:650)
at com.facebook.presto.metadata.AbstractTypedJacksonModule$InternalTypeSerializer.serialize(AbstractTypedJacksonModule.java:115)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:732)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:772)
... 100 more

I think there is an issue with com.facebook.presto.iceberg.IcebergUpdateHandle#INSTANCE. You can find a complete description of this problem in the Slack thread I created a few weeks ago: https://prestodb.slack.com/archives/CLYMEEJ6S/p1749466521358289

The goal of this draft PR is to get assistance in fixing this problem.

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch from 3465776 to 9658fd0 Compare July 21, 2025 10:50
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 6 times, most recently from 20d7c91 to 441ed5b Compare August 8, 2025 06:43
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 3 times, most recently from 09c9c54 to bca3929 Compare August 12, 2025 13:09
Copy link
Copy Markdown
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format nit. Thanks for the doc!

@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

Hi @steveburnett! Thank you for reviewing the feature documentation.

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch from 7bb8555 to e799d9f Compare August 13, 2025 09:32
steveburnett
steveburnett previously approved these changes Aug 13, 2025
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 3 times, most recently from fd4bc78 to 8f49697 Compare August 18, 2025 18:22
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch from f9e99cb to 994f813 Compare November 26, 2025 17:08
@tdcmeehan tdcmeehan self-assigned this Nov 26, 2025
@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

@sourcery-ai

@tdcmeehan
Copy link
Copy Markdown
Contributor

Can you please move the UI changes to a separate PR?

Copy link
Copy Markdown
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great feature. I've left a few comments, mostly small things and a point for discussion.

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch from 994f813 to 6510337 Compare December 17, 2025 17:55
@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

acarpente-denodo commented Dec 17, 2025

Can you please move the UI changes to a separate PR?

Hi @tdcmeehan! OK. I moved the presto-ui changes to a separate PR: #26825

@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 2 times, most recently from 07acbfe to 0b73712 Compare December 19, 2025 10:27
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch from 0b73712 to 298b708 Compare December 22, 2025 15:46
@acarpente-denodo acarpente-denodo force-pushed the feature/20578_SQL_Support_for_MERGE_INTO branch 2 times, most recently from 79a5b6e to da4927d Compare December 29, 2025 20:15
Copy link
Copy Markdown
Contributor

@PingLiuPing PingLiuPing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feature.

@steveburnett
Copy link
Copy Markdown
Contributor

Is documentation no longer needed for this PR? Has the doc been moved to a different PR? Thanks!

@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

acarpente-denodo commented Jan 8, 2026

Is documentation no longer needed for this PR? Has the doc been moved to a different PR? Thanks!

Hi @steveburnett, the documentation for this feature was merged in this PR: Add SQL Support for MERGE INTO in Presto (engine)

So, no documentation is required for this PR.

Copy link
Copy Markdown
Contributor

@PingLiuPing PingLiuPing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good overall. Could you consolidate the commit message little bit.
Leaving few comments on the tests.

{
String targetTable = "merge_inserts_" + randomTableSuffix();
assertUpdate(format("CREATE TABLE %s (customer VARCHAR, purchases INT, address VARCHAR)", targetTable));
assertUpdate(format("INSERT INTO %s (customer, purchases, address) VALUES ('Aaron', 11, 'Antioch'), ('Bill', 7, 'Buena')", targetTable), 2);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These code repeats lot, maybe adding a helper function.

(long) computeActual("SELECT count(*) FROM tpch.sf1.orders").getOnlyValue());

// verify copied rows: same total price
assertEquals(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, consider to use assertQuery . And all other places around.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@acarpente-denodo
Copy link
Copy Markdown
Contributor Author

Hi @tdcmeehan, @steveburnett, @ethanyzhang and @PingLiuPing!

I've made the requested changes, and the PR has passed all automated tests. Could you please approve this PR?

assertUpdate("DROP TABLE " + targetTable);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add some tests for:

  1. Empty source table
  2. Empty target table
  3. All rows matched
  4. No rows matched

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @PingLiuPing! Sure, I created the new automated tests you requested.

fragments.addAll(writePositionDeletes(sink, deletion.rowsToDelete()));
});

return completedFuture(fragments);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This method returns CompletableFuture but calls .join() synchronously, and this could blocking the thread and defeats the purpose of returning a CompletableFuture.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @PingLiuPing! You are correct. I fixed it.

Copy link
Copy Markdown
Contributor

@PingLiuPing PingLiuPing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Copy link
Copy Markdown
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acarpente-denodo thanks for this feature, lgtm!

Copy link
Copy Markdown
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @acarpente-denodo

acarpente-denodo and others added 3 commits January 14, 2026 09:14
Support SQL MERGE in the Iceberg connector

Cherry-pick of trinodb/trino@6cb188b

Co-authored-by: David Phillips <david@acz.org>
SQL MERGE automated tests for Iceberg connector

Cherry-pick of trinodb/trino@6cb188b

Co-authored-by: David Phillips <david@acz.org>
- Improved CompletableFuture management.
- Added new automted tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants