Support table optimization with OR predicates on partitioned timestamp columns#28583
Conversation
|
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
|
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
2545a55 to
48ef3e6
Compare
205c038 to
993151e
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes ALTER TABLE ... EXECUTE OPTIMIZE failures when the WHERE clause contains OR predicates over date(...) applied to partitioned timestamp with time zone columns, by enabling safe OR-to-TupleDomain extraction in connector-side constraint handling.
Changes:
- Add OR (disjunct) support to
UtcConstraintExtractorby extracting each disjunct, converting toTupleDomain, and safely combining viaTupleDomain.columnWiseUnion()when exact. - Add
extractDisjuncts()andor()helpers toConnectorExpressions(mirroring existing AND helpers). - Add regression tests for OR extraction and an Iceberg OPTIMIZE scenario covering
date(ts)disjunctions across partitions.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java |
Adds an Iceberg regression test ensuring OPTIMIZE works with OR on date(c2_ts) across month partitions and validates file compaction + data integrity. |
lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/filter/TestUtcConstraintExtractor.java |
Adds unit tests covering OR extraction on timestamp-with-time-zone date casts, including unconvertible disjunct and multi-column OR safety behavior. |
lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.java |
Implements disjunct-aware extraction and safe TupleDomain union logic aligned with DomainTranslator’s exactness conditions. |
lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/expression/ConnectorExpressions.java |
Introduces OR construction and OR flattening (extractDisjuncts) utilities consistent with existing AND utilities. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
93c842d to
aa4fff5
Compare
aa4fff5 to
888c48b
Compare
888c48b to
18254ae
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
18254ae to
e770480
Compare
|
@raunaqmorarka the copilot suggestions seems out of date. Can you re-request it? |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e770480 to
1dd15b9
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe PR adds support for disjunctive (OR) predicate handling across the stack: a new public API TupleDomain.strictUnion(List) implements strict union semantics (with single-column and NaN-safety checks), ConnectorExpressions gains utilities to extract and build OR expressions, UtcConstraintExtractor is extended to convert OR disjuncts into tuple domains via strictUnion, DomainTranslator.OR logic is simplified to rely on strictUnion, and tests (unit and integration) were added to validate OR handling and an Iceberg optimize scenario. Assessment against linked issues
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java (1)
32-123: Comprehensive test coverage for the new utility methods.The tests thoroughly validate:
extractDisjuncts: flattening, FALSE filtering, nested OR handlingor(): construction, identity element handling, FALSE filtering, TRUE preservationOne minor suggestion: consider adding a test for
extractDisjunctswithTRUEto explicitly document thatTRUEis NOT filtered (unlikeFALSE), similar to howtestOrPreservesTrueAsRegularDisjunctdocuments this foror().📝 Optional: Add test for extractDisjuncts preserving TRUE
+ `@Test` + public void testExtractDisjunctsPreservesTrue() + { + ConnectorExpression orExpression = new Call(BOOLEAN, OR_FUNCTION_NAME, ImmutableList.of(A, TRUE)); + assertThat(extractDisjuncts(orExpression)).containsExactly(A, TRUE); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java` around lines 32 - 123, Add a test that verifies extractDisjuncts does not filter TRUE (similar to testOrPreservesTrueAsRegularDisjunct): create a Call expression using OR_FUNCTION_NAME with arguments A and TRUE (or reuse or(A, TRUE)), call extractDisjuncts on it, and assert the result containsExactly(A, TRUE); name it e.g. testExtractDisjunctsPreservesTrue so it documents the behavior alongside existing extractDisjuncts tests.plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java (1)
5630-5653: UseTestTable(ortry/finally) for guaranteed cleanup.If any assertion fails before Line 5653, the table is left behind and can pollute subsequent test runs. Please wrap lifecycle in
try (TestTable ...)or afinallyblock.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java` around lines 5630 - 5653, The test creates a temp table via tableName = "test_select_disjunct_ts_" + randomNameSuffix() and drops it at the end with assertUpdate("DROP TABLE " + tableName) but if an assertion fails the drop is skipped; wrap the table lifecycle in a try-with-resources TestTable (or use try { ... } finally { assertUpdate("DROP TABLE " + tableName); }) around the creates, inserts, queries (the assertUpdate/computeActual/query calls) so the table created by randomNameSuffix() is always cleaned up even on failures (refer to tableName, randomNameSuffix(), assertUpdate, computeActual, query).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.java`:
- Around line 510-517: The implicitlyAddedNaN detection is too narrow: change
the computation of implicitlyAddedNaN so it treats an input TupleDomain as
already containing NaN if either TupleDomain::isAll() is true OR any of its
constituent Domain value sets report isAll(); in other words, replace the
current nonNoneDomains.stream().noneMatch(TupleDomain::isAll) check with a
predicate that returns true only when none of the nonNoneDomains satisfy
(td.isAll() || td.getDomains().map(doms ->
doms.values().stream().anyMatch(domain ->
domain.getValues().isAll())).orElse(false)), and keep the unionContainsNaN
conjunction and the existing Optional.empty() return behavior when
implicitlyAddedNaN is true.
---
Nitpick comments:
In
`@lib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.java`:
- Around line 32-123: Add a test that verifies extractDisjuncts does not filter
TRUE (similar to testOrPreservesTrueAsRegularDisjunct): create a Call expression
using OR_FUNCTION_NAME with arguments A and TRUE (or reuse or(A, TRUE)), call
extractDisjuncts on it, and assert the result containsExactly(A, TRUE); name it
e.g. testExtractDisjunctsPreservesTrue so it documents the behavior alongside
existing extractDisjuncts tests.
In
`@plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java`:
- Around line 5630-5653: The test creates a temp table via tableName =
"test_select_disjunct_ts_" + randomNameSuffix() and drops it at the end with
assertUpdate("DROP TABLE " + tableName) but if an assertion fails the drop is
skipped; wrap the table lifecycle in a try-with-resources TestTable (or use try
{ ... } finally { assertUpdate("DROP TABLE " + tableName); }) around the
creates, inserts, queries (the assertUpdate/computeActual/query calls) so the
table created by randomNameSuffix() is always cleaned up even on failures (refer
to tableName, randomNameSuffix(), assertUpdate, computeActual, query).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6a96a176-b81f-4077-9c47-dc4dbc84e771
📒 Files selected for processing (8)
core/trino-main/src/main/java/io/trino/sql/planner/DomainTranslator.javacore/trino-spi/src/main/java/io/trino/spi/predicate/TupleDomain.javacore/trino-spi/src/test/java/io/trino/spi/predicate/TestTupleDomain.javalib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/expression/ConnectorExpressions.javalib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.javalib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/filter/TestUtcConstraintExtractor.javalib/trino-plugin-toolkit/src/test/java/io/trino/plugin/base/util/TestConnectorExpressionUtil.javaplugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
1dd15b9 to
5da8a21
Compare
|
@laserninja we don't use merge commits in this project, please rebase your branch to latest master and make sure the CI is passing |
89d9ad0 to
2951511
Compare
|
@raunaqmorarka yes, thanks. The CI is working now. |
…lumns
ALTER TABLE EXECUTE OPTIMIZE with OR predicates using date() on
timestamp with time zone columns failed with "Unexpected FilterNode
found in plan". For example:
ALTER TABLE t EXECUTE OPTIMIZE
WHERE date(c2_ts) BETWEEN DATE '2025-07-01' AND DATE '2025-07-31'
OR date(c2_ts) BETWEEN DATE '2025-10-01' AND DATE '2025-10-31'
UtcConstraintExtractor only handled AND (conjuncts) when converting
ConnectorExpressions to TupleDomains. An OR expression was treated as
a single unconvertible conjunct, leaving a FilterNode in the plan that
TableExecuteStructureValidator rejects.
Add OR (disjunct) support to UtcConstraintExtractor by extracting each
disjunct, converting it to a TupleDomain, and combining them with
TupleDomain.columnWiseUnion(). Also add extractDisjuncts() and or()
utility methods to ConnectorExpressions, mirroring the existing
extractConjuncts() and and() methods.
Fixes trinodb#27136
2951511 to
5354000
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Fix table optimization with OR predicates on partitioned timestamp columns
ALTER TABLE EXECUTE OPTIMIZE with OR predicates using date() on timestamp with time zone columns failed with "Unexpected FilterNode found in plan". For example:
ALTER TABLE t EXECUTE OPTIMIZE
WHERE date(c2_ts) BETWEEN DATE '2025-07-01' AND DATE '2025-07-31'
OR date(c2_ts) BETWEEN DATE '2025-10-01' AND DATE '2025-10-31'
UtcConstraintExtractor only handled AND (conjuncts) when converting ConnectorExpressions to TupleDomains. An OR expression was treated as a single unconvertible conjunct, leaving a FilterNode in the plan that TableExecuteStructureValidator rejects.
Add OR (disjunct) support to UtcConstraintExtractor by extracting each disjunct, converting it to a TupleDomain, and combining them with TupleDomain.strictUnion(). Also add extractDisjuncts() and or() utility methods to ConnectorExpressions, mirroring the existing extractConjuncts() and and() methods.
Fixes #27136
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: