Support table optimization with OR predicates on partitioned timestamp columns #28583

Merged
raunaqmorarka merged 2 commits into trinodb:master from laserninja:optimize-multiple-partition-iceberg on Apr 4, 2026

Contributor

@laserninja laserninja commented Mar 9, 2026

Fix table optimization with OR predicates on partitioned timestamp columns

ALTER TABLE EXECUTE OPTIMIZE with OR predicates using date() on timestamp with time zone columns failed with "Unexpected FilterNode found in plan". For example:

ALTER TABLE t EXECUTE OPTIMIZE
WHERE date(c2_ts) BETWEEN DATE '2025-07-01' AND DATE '2025-07-31'
OR date(c2_ts) BETWEEN DATE '2025-10-01' AND DATE '2025-10-31'

UtcConstraintExtractor only handled AND (conjuncts) when converting ConnectorExpressions to TupleDomains. An OR expression was treated as a single unconvertible conjunct, leaving a FilterNode in the plan that TableExecuteStructureValidator rejects.

Add OR (disjunct) support to UtcConstraintExtractor by extracting each disjunct, converting it to a TupleDomain, and combining them with TupleDomain.strictUnion(). Also add extractDisjuncts() and or() utility methods to ConnectorExpressions, mirroring the existing extractConjuncts() and and() methods.
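
The approach described above can be illustrated with a minimal, self-contained sketch. It is not the Trino implementation: `Predicate`, `Between`, `Or`, `Opaque`, `DateRange` and `toRanges` are hypothetical stand-ins for `ConnectorExpression`, `TupleDomain` and the extractor, assumed here only to show the shape of the logic. Each disjunct is translated on its own, and the OR is translated only if every disjunct is.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical stand-ins; none of these are Trino classes.
final class DisjunctExtractionSketch
{
    private DisjunctExtractionSketch() {}

    interface Predicate {}

    // date(ts) BETWEEN low AND high on the partition column
    record Between(LocalDate low, LocalDate high) implements Predicate {}

    // OR over any number of terms
    record Or(List<Predicate> terms) implements Predicate {}

    // Anything the extractor cannot translate
    record Opaque(String sql) implements Predicate {}

    // Closed interval standing in for one range of a single-column domain
    record DateRange(LocalDate low, LocalDate high) {}

    // Returns the ranges covered by the predicate, or empty when any part
    // cannot be translated, in which case the caller must keep the whole filter.
    static Optional<List<DateRange>> toRanges(Predicate predicate)
    {
        if (predicate instanceof Between between) {
            return Optional.of(List.of(new DateRange(between.low(), between.high())));
        }
        if (predicate instanceof Or or) {
            List<DateRange> union = new ArrayList<>();
            for (Predicate term : or.terms()) {
                Optional<List<DateRange>> ranges = toRanges(term);
                if (ranges.isEmpty()) {
                    // One untranslatable disjunct makes the whole OR untranslatable
                    return Optional.empty();
                }
                union.addAll(ranges.get());
            }
            return Optional.of(List.copyOf(union));
        }
        return Optional.empty();
    }
}
```

In this sketch, an OR of the July and October ranges from the example yields two ranges, while an OR containing an `Opaque` term yields an empty result, which corresponds to leaving the filter in the plan.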

Fixes #27136

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## General
* Fix `ALTER TABLE EXECUTE OPTIMIZE` failure with OR predicates on partitioned timestamp with time zone columns. ({issue}`27136`)

cla-bot bot commented Mar 9, 2026

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@github-actions github-actions bot added the iceberg Iceberg connector label Mar 9, 2026

@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from 2545a55 to 48ef3e6 Compare March 9, 2026 21:09
@cla-bot cla-bot bot added the cla-signed label Mar 9, 2026
@laserninja laserninja requested review from findepi and kasiafi March 9, 2026 21:32
@laserninja laserninja self-assigned this Mar 9, 2026
@findepi findepi changed the title Fix table optimization with OR predicates on partitioned timestamp columns Support table optimization with OR predicates on partitioned timestamp columns Mar 10, 2026
@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch 2 times, most recently from 205c038 to 993151e Compare March 11, 2026 03:06
Copilot AI left a comment

Pull request overview

This PR fixes ALTER TABLE ... EXECUTE OPTIMIZE failures when the WHERE clause contains OR predicates over date(...) applied to partitioned timestamp with time zone columns, by enabling safe OR-to-TupleDomain extraction in connector-side constraint handling.

Changes:

  • Add OR (disjunct) support to UtcConstraintExtractor by extracting each disjunct, converting to TupleDomain, and safely combining via TupleDomain.columnWiseUnion() when exact.
  • Add extractDisjuncts() and or() helpers to ConnectorExpressions (mirroring existing AND helpers).
  • Add regression tests for OR extraction and an Iceberg OPTIMIZE scenario covering date(ts) disjunctions across partitions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java | Adds an Iceberg regression test ensuring OPTIMIZE works with OR on date(c2_ts) across month partitions and validates file compaction + data integrity. |
| lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/filter/TestUtcConstraintExtractor.java | Adds unit tests covering OR extraction on timestamp-with-time-zone date casts, including unconvertible disjunct and multi-column OR safety behavior. |
| lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.java | Implements disjunct-aware extraction and safe TupleDomain union logic aligned with DomainTranslator’s exactness conditions. |
| lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/expression/ConnectorExpressions.java | Introduces OR construction and OR flattening (extractDisjuncts) utilities consistent with existing AND utilities. |

@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from 93c842d to aa4fff5 Compare March 12, 2026 03:14
@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from aa4fff5 to 888c48b Compare March 13, 2026 05:42
@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from 888c48b to 18254ae Compare March 14, 2026 02:54
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from 18254ae to e770480 Compare March 14, 2026 23:42
@laserninja
Contributor Author

@raunaqmorarka the Copilot suggestions seem out of date. Can you re-request the review?

@raunaqmorarka raunaqmorarka self-requested a review March 17, 2026 00:28
@raunaqmorarka raunaqmorarka requested a review from Copilot March 18, 2026 18:46
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


Comment thread core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java Outdated
@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from e770480 to 1dd15b9 Compare March 18, 2026 22:10
@raunaqmorarka
Member

@coderabbitai review

coderabbitai bot commented Mar 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Mar 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0fda020-cda7-4a9a-a3c8-a5e57b8de07c

📥 Commits

Reviewing files that changed from the base of the PR and between 1dd15b9 and 5354000.

📒 Files selected for processing (3)
  • core/trino-main/src/main/java/io/trino/sql/planner/DomainTranslator.java
  • core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java
  • core/trino-spi/src/test/java/io/trino/spi/predicate/TestTupleDomain.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java

Walkthrough

The PR adds support for disjunctive (OR) predicate handling across the stack:

  • A new public API, TupleDomain.strictUnion(List), implements strict union semantics with single-column and NaN-safety checks.
  • ConnectorExpressions gains utilities to extract and build OR expressions.
  • UtcConstraintExtractor is extended to convert OR disjuncts into tuple domains via strictUnion.
  • The OR logic in DomainTranslator is simplified to rely on strictUnion.
  • Unit and integration tests were added to validate OR handling and an Iceberg OPTIMIZE scenario.
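
The single-column check mentioned in the walkthrough can be motivated with a small sketch. It does not use Trino's `TupleDomain`; `columnWiseUnion` and `matches` below are hypothetical helpers over plain maps, assumed only to show why a per-column union of multi-column constraints can accept rows that neither original constraint accepts.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model: a "tuple domain" is a map from column name to allowed values.
final class ColumnWiseUnionSketch
{
    private ColumnWiseUnionSketch() {}

    // Unions the allowed values column by column. A column constrained on only
    // one side is dropped, because the other side allows every value for it.
    static Map<String, Set<Integer>> columnWiseUnion(
            Map<String, Set<Integer>> left,
            Map<String, Set<Integer>> right)
    {
        Map<String, Set<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> entry : left.entrySet()) {
            Set<Integer> other = right.get(entry.getKey());
            if (other != null) {
                Set<Integer> values = new TreeSet<>(entry.getValue());
                values.addAll(other);
                result.put(entry.getKey(), values);
            }
        }
        return result;
    }

    // True when the row satisfies the constraint on every constrained column.
    static boolean matches(Map<String, Set<Integer>> domain, Map<String, Integer> row)
    {
        return domain.entrySet().stream()
                .allMatch(entry -> entry.getValue().contains(row.get(entry.getKey())));
    }
}
```

With `(a = 1 AND b = 2) OR (a = 3 AND b = 4)`, the per-column union allows `a IN (1, 3)` and `b IN (2, 4)`, so the row `a = 1, b = 4` matches the union although it matches neither disjunct. A union over a single column has no such cross terms, which is one reason an exact union may be limited to that case.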

Assessment against linked issues

| Objective | Addressed | Explanation |
| --- | --- | --- |
| Support pushdown of OR predicates with functions applied to partition columns, such as date(c2_ts) BETWEEN ... OR date(c2_ts) BETWEEN ... [#27136] | | |
| Ensure TupleDomain.strictUnion() properly combines multiple domain constraints via OR semantics while preserving correctness across multiple partitions [#27136] | | |
| Enable OR predicate recognition in constraint extraction and conversion to tuple domains [#27136] | | |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java (1)

32-123: Comprehensive test coverage for the new utility methods.

The tests thoroughly validate:

  • extractDisjuncts: flattening, FALSE filtering, nested OR handling
  • or(): construction, identity element handling, FALSE filtering, TRUE preservation

One minor suggestion: consider adding a test for extractDisjuncts with TRUE to explicitly document that TRUE is NOT filtered (unlike FALSE), similar to how testOrPreservesTrueAsRegularDisjunct documents this for or().

📝 Optional: Add test for extractDisjuncts preserving TRUE
+    @Test
+    public void testExtractDisjunctsPreservesTrue()
+    {
+        ConnectorExpression orExpression = new Call(BOOLEAN, OR_FUNCTION_NAME, ImmutableList.of(A, TRUE));
+        assertThat(extractDisjuncts(orExpression)).containsExactly(A, TRUE);
+    }
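
The behavior this review comment describes (nested ORs flattened, FALSE dropped, TRUE kept) can be sketched without any Trino types. The `Expr`, `Leaf` and `Or` names below are hypothetical and exist only for illustration; the real method operates on `ConnectorExpression` and may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical expression model; not the Trino ConnectorExpression API.
final class ExtractDisjunctsSketch
{
    private ExtractDisjunctsSketch() {}

    interface Expr {}

    record Leaf(String name) implements Expr {}

    record Or(List<Expr> terms) implements Expr {}

    static final Expr TRUE = new Leaf("TRUE");
    static final Expr FALSE = new Leaf("FALSE");

    // Flattens nested ORs into a single list of disjuncts.
    static List<Expr> extractDisjuncts(Expr expression)
    {
        List<Expr> disjuncts = new ArrayList<>();
        collect(expression, disjuncts);
        return List.copyOf(disjuncts);
    }

    private static void collect(Expr expression, List<Expr> disjuncts)
    {
        if (expression instanceof Or or) {
            for (Expr term : or.terms()) {
                collect(term, disjuncts);
            }
        }
        else if (!expression.equals(FALSE)) {
            // FALSE is the identity of OR and is dropped; TRUE is kept as a regular disjunct
            disjuncts.add(expression);
        }
    }
}
```

In this sketch, `a OR (b OR FALSE) OR TRUE` flattens to the three disjuncts `a`, `b` and `TRUE`.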
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java`
around lines 32 - 123, Add a test that verifies extractDisjuncts does not filter
TRUE (similar to testOrPreservesTrueAsRegularDisjunct): create a Call expression
using OR_FUNCTION_NAME with arguments A and TRUE (or reuse or(A, TRUE)), call
extractDisjuncts on it, and assert the result containsExactly(A, TRUE); name it
e.g. testExtractDisjunctsPreservesTrue so it documents the behavior alongside
existing extractDisjuncts tests.
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java (1)

5630-5653: Use TestTable (or try/finally) for guaranteed cleanup.

If any assertion fails before Line 5653, the table is left behind and can pollute subsequent test runs. Please wrap lifecycle in try (TestTable ...) or a finally block.
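
The cleanup pattern suggested here can be shown in isolation. The sketch below does not use Trino's `TestTable`; `TempTable` is a hypothetical `AutoCloseable` that records statements in a list instead of executing them, and the table name is made up. It only demonstrates that try-with-resources runs the drop even when the test body throws.

```java
import java.util.ArrayList;
import java.util.List;

final class TableCleanupSketch
{
    private TableCleanupSketch() {}

    // Hypothetical stand-in for a test-table helper: records SQL instead of running it.
    static final class TempTable
            implements AutoCloseable
    {
        private final String name;
        private final List<String> executed;

        TempTable(String name, List<String> executed)
        {
            this.name = name;
            this.executed = executed;
            executed.add("CREATE TABLE " + name);
        }

        String name()
        {
            return name;
        }

        @Override
        public void close()
        {
            // Runs on normal exit and when the try body throws
            executed.add("DROP TABLE " + name);
        }
    }

    // Simulates a test whose assertion fails after the table was created.
    static List<String> runFailingTest()
    {
        List<String> executed = new ArrayList<>();
        try (TempTable table = new TempTable("test_select_disjunct_ts_demo", executed)) {
            executed.add("ALTER TABLE " + table.name() + " EXECUTE OPTIMIZE");
            throw new IllegalStateException("simulated assertion failure");
        }
        catch (IllegalStateException e) {
            // The resource is already closed by the time this handler runs
        }
        return executed;
    }
}
```

The recorded statements end with the `DROP TABLE`, so the table does not outlive the failed run.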

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java`
around lines 5630 - 5653, The test creates a temp table via tableName =
"test_select_disjunct_ts_" + randomNameSuffix() and drops it at the end with
assertUpdate("DROP TABLE " + tableName) but if an assertion fails the drop is
skipped; wrap the table lifecycle in a try-with-resources TestTable (or use try
{ ... } finally { assertUpdate("DROP TABLE " + tableName); }) around the
creates, inserts, queries (the assertUpdate/computeActual/query calls) so the
table created by randomNameSuffix() is always cleaned up even on failures (refer
to tableName, randomNameSuffix(), assertUpdate, computeActual, query).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java`:
- Around line 510-517: The implicitlyAddedNaN detection is too narrow: change
the computation of implicitlyAddedNaN so it treats an input TupleDomain as
already containing NaN if either TupleDomain::isAll() is true OR any of its
constituent Domain value sets report isAll(); in other words, replace the
current nonNoneDomains.stream().noneMatch(TupleDomain::isAll) check with a
predicate that returns true only when none of the nonNoneDomains satisfy
(td.isAll() || td.getDomains().map(doms ->
doms.values().stream().anyMatch(domain ->
domain.getValues().isAll())).orElse(false)), and keep the unionContainsNaN
conjunction and the existing Optional.empty() return behavior when
implicitlyAddedNaN is true.
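
The hazard behind this comment can be illustrated with plain doubles. The sketch is not taken from `TupleDomain`; both method names are hypothetical. It assumes only that ordered comparisons with NaN are false, so a union of ranges that collapses to "all values" would admit a NaN that neither original range admits.

```java
final class NaNUnionSketch
{
    private NaNUnionSketch() {}

    // The original disjunction: x < 1 OR x > 0. Every ordered comparison
    // with NaN is false, so NaN does not satisfy it.
    static boolean originalPredicate(double x)
    {
        return x < 1.0 || x > 0.0;
    }

    // The ranges (-inf, 1) and (0, +inf) together cover every ordered value.
    // A representation that collapses their union to "all values" accepts NaN too.
    static boolean collapsedToAll(double x)
    {
        return true;
    }
}
```

For every ordinary value the two methods agree, but for `Double.NaN` the original predicate is false while the collapsed form is true, which is why a union that implicitly gains NaN cannot be treated as exact.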



📥 Commits

Reviewing files that changed from the base of the PR and between 335dc00 and 1dd15b9.

📒 Files selected for processing (8)
  • core/trino-main/src/main/java/io/trino/sql/planner/DomainTranslator.java
  • core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java
  • core/trino-spi/src/test/java/io/trino/spi/predicate/TestTupleDomain.java
  • lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/expression/ConnectorExpressions.java
  • lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.java
  • lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/filter/TestUtcConstraintExtractor.java
  • lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java
  • plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java

Comment thread core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java Outdated
Comment thread core/trino-main/src/main/java/io/trino/sql/planner/DomainTranslator.java Outdated
Comment thread core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java Outdated
@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch from 1dd15b9 to 5da8a21 Compare March 26, 2026 19:45
@raunaqmorarka
Member

@laserninja we don't use merge commits in this project. Please rebase your branch onto the latest master and make sure CI is passing.

@laserninja laserninja force-pushed the optimize-multiple-partition-iceberg branch 4 times, most recently from 89d9ad0 to 2951511 Compare March 29, 2026 03:39
@laserninja
Contributor Author

@raunaqmorarka yes, thanks. The CI is working now.

…lumns

ALTER TABLE EXECUTE OPTIMIZE with OR predicates using date() on
timestamp with time zone columns failed with "Unexpected FilterNode
found in plan". For example:

  ALTER TABLE t EXECUTE OPTIMIZE
  WHERE date(c2_ts) BETWEEN DATE '2025-07-01' AND DATE '2025-07-31'
     OR date(c2_ts) BETWEEN DATE '2025-10-01' AND DATE '2025-10-31'

UtcConstraintExtractor only handled AND (conjuncts) when converting
ConnectorExpressions to TupleDomains. An OR expression was treated as
a single unconvertible conjunct, leaving a FilterNode in the plan that
TableExecuteStructureValidator rejects.

Add OR (disjunct) support to UtcConstraintExtractor by extracting each
disjunct, converting it to a TupleDomain, and combining them with
TupleDomain.columnWiseUnion(). Also add extractDisjuncts() and or()
utility methods to ConnectorExpressions, mirroring the existing
extractConjuncts() and and() methods.

Fixes trinodb#27136
@raunaqmorarka raunaqmorarka force-pushed the optimize-multiple-partition-iceberg branch from 2951511 to 5354000 Compare April 4, 2026 05:01
@raunaqmorarka
Member

@coderabbitai review

coderabbitai bot commented Apr 4, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@raunaqmorarka raunaqmorarka merged commit 7f5e2fb into trinodb:master Apr 4, 2026
191 of 194 checks passed
@github-actions github-actions bot added this to the 481 milestone Apr 4, 2026
@ebyhr ebyhr mentioned this pull request Apr 5, 2026

Labels

cla-signed iceberg Iceberg connector

Development

Successfully merging this pull request may close these issues.

Iceberg table optimization fails for multiple partitions

4 participants