Add information from written files to Iceberg conflict detection#24470

Merged
raunaqmorarka merged 3 commits into trinodb:master from pajaks:pajaks/iceberg_concurent_merge
Feb 12, 2025

Conversation

@pajaks
Member

@pajaks pajaks commented Dec 13, 2024

Description

Currently, Iceberg's concurrent write conflict detection is based on predicates received from the engine. In more complicated cases (such as joins, merges, or comparisons between different types), this information is not available at the connector level.

This PR takes partition information from the files actually written and uses it as a source for conflict detection. If a write creates files in only some partitions, only those partitions need to be checked for conflicts with concurrent writes.
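
To illustrate the idea only, here is a minimal sketch under stated assumptions. It is not the code from this PR and does not use the Iceberg or Trino APIs: `PartitionConflictSketch`, `WrittenFile`, `touchedPartitions`, and `conflicts` are hypothetical names, and partition values are modeled as plain string maps.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not Trino's actual implementation.
final class PartitionConflictSketch {
    // A data file produced by a write, tagged with its partition values,
    // for example {"region": "eu"}. The shape is illustrative only.
    record WrittenFile(String path, Map<String, String> partition) {}

    // Collect the distinct partitions touched by a set of written files.
    static Set<Map<String, String>> touchedPartitions(List<WrittenFile> files) {
        Set<Map<String, String>> partitions = new HashSet<>();
        for (WrittenFile file : files) {
            partitions.add(file.partition());
        }
        return partitions;
    }

    // Two writes are treated as conflicting only if they share a partition.
    static boolean conflicts(List<WrittenFile> ours, List<WrittenFile> theirs) {
        Set<Map<String, String>> common = touchedPartitions(ours);
        common.retainAll(touchedPartitions(theirs));
        return !common.isEmpty();
    }
}
```

In this sketch, files with an empty partition map (an unpartitioned table) all compare equal, so any two writes are reported as conflicting, which is the conservative outcome.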

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Iceberg
* Improve conflict detection to avoid failures from concurrent MERGE INTO queries on an Iceberg table. ({issue}`24470`)

@cla-bot cla-bot bot added the cla-signed label Dec 13, 2024
@github-actions github-actions bot added the iceberg Iceberg connector label Dec 13, 2024
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
@ebyhr ebyhr self-requested a review December 16, 2024 09:56
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from 078a427 to 9d8bfd3 Compare December 16, 2024 12:05
@pajaks pajaks requested a review from findinpath December 16, 2024 12:38
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch 4 times, most recently from 0bc5d55 to 44dd035 Compare December 19, 2024 13:23
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch 2 times, most recently from 7f3392a to 08e0dd7 Compare December 23, 2024 14:15
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
Member

You can potentially use something like io.trino.plugin.iceberg.IcebergSplitSource#createFileStatisticsDomain for un-partitioned columns. Not necessary for current PR though.

@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch 2 times, most recently from c8232a1 to 8e73d7e Compare January 9, 2025 13:23
@pajaks pajaks requested review from ebyhr and raunaqmorarka January 9, 2025 13:57
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java Outdated
@github-actions

This pull request has gone a while without any activity. Tagging for triage help: @mosabua

@github-actions github-actions bot added the stale label Jan 31, 2025
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from 8e73d7e to 2158554 Compare February 6, 2025 10:57
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from 2158554 to 31f23c4 Compare February 6, 2025 11:10
@pajaks pajaks requested a review from raunaqmorarka February 6, 2025 11:10
@github-actions github-actions bot removed the stale label Feb 6, 2025
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from 31f23c4 to 4db2b70 Compare February 7, 2025 09:21
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from 4db2b70 to a4bad0c Compare February 10, 2025 09:48
@pajaks pajaks force-pushed the pajaks/iceberg_concurent_merge branch from a4bad0c to 8e7d1c0 Compare February 10, 2025 09:51
@ebyhr
Member

ebyhr commented Feb 10, 2025

/test-with-secrets sha=8e7d1c0799730d7da18522cc142b023193d22848

@github-actions

The CI workflow run with tests that require additional secrets has been started: https://github.com/trinodb/trino/actions/runs/13238104125

return fileBasedConflictDetectionEnabled;
}

@Config("iceberg.file-based-conflict-detection")
Contributor

@ConfigHidden - this is a kill switch, as I see it.

Member

That doesn't mean it has to be hidden; it should just be removed eventually.

@raunaqmorarka raunaqmorarka merged commit be9ae2f into trinodb:master Feb 12, 2025
@github-actions github-actions bot added this to the 471 milestone Feb 12, 2025

Labels

cla-signed iceberg Iceberg connector

4 participants