Fix querying and filtering using Iceberg metadata columns #23472

Merged
nmahadevuni merged 2 commits into prestodb:master from nmahadevuni:fix_iceberg_info_column_filter_eval
Aug 27, 2024

Conversation

@nmahadevuni
Member

@nmahadevuni nmahadevuni commented Aug 19, 2024

Description

Fixes querying and filtering on the Iceberg metadata columns "$path" and "$data_sequence_number".

Motivation and Context

Previously, these metadata columns could not be queried.

Impact

No impact

Test Plan

Added a new test file that covers querying and filtering by metadata columns.

== RELEASE NOTES ==

General Changes
* Fix querying and filtering using Iceberg metadata columns "$path" and "$data_sequence_number" :pr:`23472`

return allConstraints.transform(c -> isMetadataColumnId(((IcebergColumnHandle) c).getId()) ? null : (IcebergColumnHandle) c);
}

public static <U> TupleDomain<IcebergColumnHandle> getInfoColumnConstraints(TupleDomain<U> allConstraints)
Contributor

I think calling it "getMetadataColumnConstraints" is better. It's been called "MetadataColumn" in Presto forever, e.g. isMetadataColumnId(). Giving it a new name, "InfoColumn", would confuse readers a little. Would you be able to send a commit to unify the names in the Hive change as well?
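For readers following the naming discussion, here is a minimal sketch of the split the two helpers appear to perform: one keeps the constraints on regular columns, the other keeps the constraints on metadata columns. It is a simplification, not the code in this PR. Plain maps stand in for `TupleDomain`, columns are matched by name rather than by ID, the constraint values are plain strings, and `getNonMetadataColumnConstraints` is a hypothetical name for the first helper, whose real name is not visible in the quoted diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

final class MetadataConstraintSplitSketch {
    // Column names taken from the PR description; matching by name is a simplification
    // (the quoted diff checks column IDs via isMetadataColumnId).
    private static final Set<String> METADATA_COLUMNS = Set.of("$path", "$data_sequence_number");

    static boolean isMetadataColumn(String columnName) {
        return METADATA_COLUMNS.contains(columnName);
    }

    // Keeps only the entries whose column name satisfies the given test, preserving order.
    private static Map<String, String> keep(Map<String, String> constraints, Predicate<String> test) {
        Map<String, String> result = new LinkedHashMap<>();
        constraints.forEach((column, domain) -> {
            if (test.test(column)) {
                result.put(column, domain);
            }
        });
        return result;
    }

    // Hypothetical name: constraints on regular (non-metadata) columns only.
    static Map<String, String> getNonMetadataColumnConstraints(Map<String, String> all) {
        return keep(all, column -> !isMetadataColumn(column));
    }

    // Name suggested in the review: constraints on metadata columns only.
    static Map<String, String> getMetadataColumnConstraints(Map<String, String> all) {
        return keep(all, MetadataConstraintSplitSketch::isMetadataColumn);
    }

    public static void main(String[] args) {
        Map<String, String> all = new LinkedHashMap<>();
        all.put("regionkey", "= 0");
        all.put("$data_sequence_number", "= 1");
        System.out.println(getNonMetadataColumnConstraints(all));
        System.out.println(getMetadataColumnConstraints(all));
    }
}
```

Keeping the two halves separate lets each be evaluated where it makes sense: regular-column constraints against the data, metadata-column constraints against per-file information.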

}

std::unordered_map<std::string, std::string> infoColumns;
infoColumns.reserve(1);
Contributor

Same, rename back to metadataColumns

QueryRunner javaQueryRunner = ((QueryRunner) getExpectedQueryRunner());
if (!javaQueryRunner.tableExists(getSession(), "test_hidden_columns")) {
    javaQueryRunner.execute("CREATE TABLE test_hidden_columns AS SELECT * FROM tpch.tiny.region WHERE regionkey=0");
    javaQueryRunner.execute("INSERT INTO test_hidden_columns SELECT * FROM tpch.tiny.region WHERE regionkey=1");
Contributor

Could you add a test with a filter that doesn't match? For both tests.

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from 3a3fcc7 to 3ad86ab Compare August 20, 2024 08:59
@nmahadevuni
Member Author

@yingsu00 Thank you. I have addressed your comments. Please review.

@nmahadevuni
Member Author

@hantangwangd @ZacBlanco Can you please review this?

@yingsu00
Contributor

cc @aditi-pandit e2e test failures:

[ERROR] Tests run: 49, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 64.896 s <<< FAILURE! - in com.facebook.presto.nativeworker.TestPrestoNativeCteExecutionParquet
[ERROR] com.facebook.presto.nativeworker.TestPrestoNativeCteExecutionParquet.testComplexCommonFilterPushdown  Time elapsed: 0.041 s  <<< FAILURE!
java.lang.RuntimeException: line 9:17: Cannot check if varchar is BETWEEN date and date
        at com.facebook.presto.tests.AbstractTestingPrestoClient.execute(AbstractTestingPrestoClient.java:126)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:784)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:752)
        at com.facebook.presto.hive.TestCteExecution.verifyResults(TestCteExecution.java:1209)
        at com.facebook.presto.hive.TestCteExecution.testComplexCommonFilterPushdown(TestCteExecution.java:177)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
        at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
        at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
        at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
        at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
        at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
        at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
        at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.facebook.presto.sql.analyzer.SemanticException: line 9:17: Cannot check if varchar is BETWEEN date and date
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.getOperator(ExpressionAnalyzer.java:1611)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitBetweenPredicate(ExpressionAnalyzer.java:1326)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitBetweenPredicate(ExpressionAnalyzer.java:389)
        at com.facebook.presto.sql.tree.BetweenPredicate.accept(BetweenPredicate.java:71)
        at com.facebook.presto.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:26)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:412)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.coerceType(ExpressionAnalyzer.java:1642)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitLogicalBinaryExpression(ExpressionAnalyzer.java:598)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitLogicalBinaryExpression(ExpressionAnalyzer.java:389)
        at com.facebook.presto.sql.tree.LogicalBinaryExpression.accept(LogicalBinaryExpression.java:88)
        at com.facebook.presto.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:26)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:412)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:350)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1937)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1921)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeExpression(StatementAnalyzer.java:2920)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeWhere(StatementAnalyzer.java:2766)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:1732)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:138)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.lambda$visitSetOperation$24(StatementAnalyzer.java:1800)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitSetOperation(StatementAnalyzer.java:1803)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitUnion(StatementAnalyzer.java:1900)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitUnion(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.Union.accept(Union.java:56)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:379)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1164)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.Query.accept(Query.java:105)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:379)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeWith(StatementAnalyzer.java:2956)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1163)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.Query.accept(Query.java:105)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:349)
        at com.facebook.presto.sql.analyzer.Analyzer.analyzeSemantic(Analyzer.java:117)
        at com.facebook.presto.sql.analyzer.BuiltInQueryAnalyzer.analyze(BuiltInQueryAnalyzer.java:93)
        at com.facebook.presto.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:204)
        at com.facebook.presto.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:108)
        at com.facebook.presto.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:955)
        at com.facebook.presto.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:170)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

[ERROR] com.facebook.presto.nativeworker.TestPrestoNativeCteExecutionParquet.testCustomerOrderPatternAnalysis  Time elapsed: 0.015 s  <<< FAILURE!
java.lang.RuntimeException: line 1:91: Type of argument to extract must be DATE, TIME, TIMESTAMP, or INTERVAL (actual varchar)
        at com.facebook.presto.tests.AbstractTestingPrestoClient.execute(AbstractTestingPrestoClient.java:126)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:784)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:752)
        at com.facebook.presto.hive.TestCteExecution.verifyResults(TestCteExecution.java:1209)
        at com.facebook.presto.hive.TestCteExecution.verifyResults(TestCteExecution.java:1204)
        at com.facebook.presto.hive.TestCteExecution.testCustomerOrderPatternAnalysis(TestCteExecution.java:656)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
        at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
        at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
        at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
        at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
        at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
        at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
        at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.facebook.presto.sql.analyzer.SemanticException: line 1:91: Type of argument to extract must be DATE, TIME, TIMESTAMP, or INTERVAL (actual varchar)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitExtract(ExpressionAnalyzer.java:1302)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.visitExtract(ExpressionAnalyzer.java:389)
        at com.facebook.presto.sql.tree.Extract.accept(Extract.java:87)
        at com.facebook.presto.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:26)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:412)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:350)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1937)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1921)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeExpression(StatementAnalyzer.java:2920)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeSelect(StatementAnalyzer.java:2744)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:1736)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:138)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:379)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1164)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.Query.accept(Query.java:105)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:379)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeWith(StatementAnalyzer.java:2956)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1163)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:357)
        at com.facebook.presto.sql.tree.Query.accept(Query.java:105)
        at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:371)
        at com.facebook.presto.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:349)
        at com.facebook.presto.sql.analyzer.Analyzer.analyzeSemantic(Analyzer.java:117)
        at com.facebook.presto.sql.analyzer.BuiltInQueryAnalyzer.analyze(BuiltInQueryAnalyzer.java:93)
        at com.facebook.presto.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:204)
        at com.facebook.presto.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:108)
        at com.facebook.presto.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:955)
        at com.facebook.presto.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:170)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   TestPrestoNativeCteExecutionParquet>TestCteExecution.testComplexCommonFilterPushdown:177->TestCteExecution.verifyResults:1209 » Runtime line 9:17: Cannot check if varchar is BETWEEN date and date
[ERROR]   TestPrestoNativeCteExecutionParquet>TestCteExecution.testCustomerOrderPatternAnalysis:656->TestCteExecution.verifyResults:1204->TestCteExecution.verifyResults:1209 » Runtime line 1:91: Type of argument to extract must be DATE, TIME, TIMESTAMP, or INTERVAL (actual varchar)

@yingsu00 yingsu00 self-assigned this Aug 21, 2024
Contributor

@yingsu00 yingsu00 left a comment

@nmahadevuni Can you please also update the PR and commit title?

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from 3ad86ab to ee42d27 Compare August 21, 2024 06:22
@nmahadevuni nmahadevuni changed the title from "Fix querying and filtering using Iceberg info columns" to "Fix querying and filtering using Iceberg metadata columns" Aug 21, 2024
@nmahadevuni
Member Author

> @nmahadevuni Can you please also update the PR and commit title?

I have updated it. Thank you.

@nmahadevuni
Member Author

In the previous run, TestPrestoNativeCteExecutionParquet had 2 test failures. In the current run, 12 tests from TestPrestoNativeTpchQueriesParquetUsingJSON are failing. I'm not able to see any stack trace, as it seems to be overwritten in the log for TestPrestoNativeTpchQueriesParquetUsingJSON. Can we rerun the tests, @yingsu00?

@imjalpreet
Member

@nmahadevuni I think this is the exception:

Caused by: java.lang.RuntimeException: line 20:18: '<=' cannot be applied to date, varchar(10)
        at com.facebook.presto.tests.AbstractTestingPrestoClient.execute(AbstractTestingPrestoClient.java:126)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:784)
        at com.facebook.presto.tests.DistributedQueryRunner.execute(DistributedQueryRunner.java:752)
        at com.facebook.presto.tests.QueryAssertions.assertQuery(QueryAssertions.java:175)
        ... 19 more
Caused by: com.facebook.presto.sql.analyzer.SemanticException: line 20:18: '<=' cannot be applied to date, varchar(10)
        at com.facebook.presto.sql.analyzer.ExpressionAnalyzer$Visitor.getOperator(ExpressionAnalyzer.java:1611)

https://app.circleci.com/pipelines/github/prestodb/presto/19231/workflows/bcbab48f-a1fb-40df-b742-9c75cf844f2c/jobs/76915/parallel-runs/4/steps/4-105?invite=true#step-105-68281_94

Member

@hantangwangd hantangwangd left a comment

Thanks for the fix, overall looks good to me! Some little nits.

@nmahadevuni
Member Author

@hantangwangd @yingsu00 Any idea why the TestPrestoNativeTpchQueriesParquetUsingJSON tests are failing? This PR has no changes related to them. There are some casting issues, for example in Q1:

Caused by: java.lang.RuntimeException: line 20:18: '<=' cannot be applied to date, varchar(10)

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from ee42d27 to 33f62ac Compare August 22, 2024 07:45
@nmahadevuni
Member Author

Thanks @hantangwangd @yingsu00 for the review. I have addressed your comments. Please review.

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch 3 times, most recently from 88dc8bf to cce58ea Compare August 22, 2024 14:22
@hantangwangd
Member

> Caused by: java.lang.RuntimeException: line 20:18: '<=' cannot be applied to date, varchar(10)

The most obvious explanation seems to be that we are reading table lineitem from the wrong catalog or schema. In AbstractTestNativeTpchQueries, we changed the date columns to varchar columns when creating table lineitem. So if we get an error with the message '<=' cannot be applied to date, varchar(10) when executing a query on table lineitem with a predicate like shipdate <= '1998-12-01', we can basically conclude that we read the wrong table, not the one we just created. Have any recent changes affected the default catalog/schema of the actual/expected query runner's default session?

This is only my guess, for reference.
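The reasoning above can be illustrated with a toy model. It is only a sketch under stated assumptions, not Presto's analyzer: the schema names `native_test` and `other` are hypothetical, and the type rule is reduced to "both operand types must be identical", which merely mirrors the case in the quoted error message. The point is that the same predicate analyzes cleanly against a table whose shipdate is varchar and fails against a table whose shipdate is still date.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class WrongTableTypeCheckSketch {
    enum ColumnType { DATE, VARCHAR }

    // Hypothetical catalog: two tables named lineitem in different schemas.
    // "native_test" stands for the schema where the test created shipdate as VARCHAR;
    // "other" stands for a schema that still holds a DATE-typed shipdate.
    static Map<String, ColumnType> shipdateTypes() {
        Map<String, ColumnType> types = new HashMap<>();
        types.put("native_test.lineitem", ColumnType.VARCHAR);
        types.put("other.lineitem", ColumnType.DATE);
        return types;
    }

    // Toy rule: '<=' is accepted only when both operand types are identical.
    // Real coercion rules are richer; this only mirrors the case in the error message.
    static void checkLessThanOrEqual(ColumnType left, ColumnType right) {
        if (left != right) {
            throw new IllegalArgumentException("'<=' cannot be applied to "
                    + left.name().toLowerCase(Locale.ROOT) + ", "
                    + right.name().toLowerCase(Locale.ROOT));
        }
    }

    // Checks the predicate shipdate <= '1998-12-01' against the given table.
    // The string literal is modeled as VARCHAR.
    static boolean predicateAnalyzes(String qualifiedTable) {
        ColumnType shipdate = shipdateTypes().get(qualifiedTable);
        if (shipdate == null) {
            throw new IllegalStateException("unknown table: " + qualifiedTable);
        }
        try {
            checkLessThanOrEqual(shipdate, ColumnType.VARCHAR);
            return true;
        }
        catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(predicateAnalyzes("native_test.lineitem"));
        System.out.println(predicateAnalyzes("other.lineitem"));
    }
}
```

In this model, the error can only appear if the session resolves lineitem to the DATE-typed table, which is why the default catalog/schema of the query runner's session is the first thing to check.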

@hantangwangd hantangwangd dismissed their stale review August 23, 2024 00:10

addressed

@nmahadevuni
Member Author

@hantangwangd I didn't get your last comment "addressed". I also think the reason for the test failure is that it's reading from the wrong catalog/schema, or reusing an old directory path where the tables were previously created with type DATE. How can we verify this?

@hantangwangd
Member

> I didn't get your last comment "addressed".

Oh, "addressed" means that my review comments have been addressed, not that the test failure has been addressed.

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch 2 times, most recently from 169297f to 7168841 Compare August 26, 2024 06:59
@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from 7168841 to 2d02348 Compare August 26, 2024 09:49
@nmahadevuni
Member Author

@hantangwangd I have addressed the test failure in a second commit. Can you please review it?

Contributor

@ZacBlanco ZacBlanco left a comment

I didn't see this mentioned anywhere, but is this a bug in both Java and native? Or just native?

This change seems more related to the connector than the execution engine, I'm wondering if we should move any of these tests to the presto-iceberg module rather than rely on the native tests, or maybe put them in both places. Was there a specific reason to include them in TestPrestoNativeIceberg rather than in IcebergDistributedTestBase or similar?

@nmahadevuni
Member Author

> I didn't see this mentioned anywhere, but is this a bug in both Java and native? Or just native?
>
> This change seems more related to the connector than the execution engine, I'm wondering if we should move any of these tests to the presto-iceberg module rather than rely on the native tests, or maybe put them in both places. Was there a specific reason to include them in TestPrestoNativeIceberg rather than in IcebergDistributedTestBase or similar?

This is a bug in both Java and native. The tests in TestPrestoNativeIceberg run on both native and Java workers, since the Java query runner is the expected runner. IcebergDistributedTestBase has H2 as the expected runner, so these queries would fail there.

@ZacBlanco
Contributor

ZacBlanco commented Aug 26, 2024

Thanks for the explanation. I don't love that we have separate test infrastructure for native connectors which exist outside of the maven module for the connector. That's a larger issue to tackle in the future.

However, I would like to have some version of these tests also exist inside of the presto-iceberg module. If we ever properly move the native testing into the connector maven modules, it will be difficult to discern how to merge the test classes if we split functionality testing in both places. Currently with Iceberg, except for native-only features such as parquet filter pushdown, the presto-iceberg module contains all the tests which are the "source of truth" for connector functionality testing. Hence, I think it would be good to have the tests for this PR exist in that module.

There are plenty of tests in the module which don't run the exact queries using the H2 expected query runner and we just assert on results using something like assertQuery("<presto query>", "VALUES ..."). I think we can do similar here. You can also use the computeActual function and assert values from the MaterializedResult which your current tests seem to use too. Either method would be sufficient. I don't want to miss these test cases if we ever have to merge these classes in the future.

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from 2d02348 to 8662726 Compare August 27, 2024 06:39
@nmahadevuni nmahadevuni force-pushed the fix_iceberg_info_column_filter_eval branch from 8662726 to 06c3e93 Compare August 27, 2024 07:43
@nmahadevuni
Member Author

@ZacBlanco @hantangwangd Added the same test in IcebergDistributedTestBase. Please review.

Member

@hantangwangd hantangwangd left a comment

Thanks for the fix, lgtm!

@nmahadevuni nmahadevuni merged commit 7b04566 into prestodb:master Aug 27, 2024
@nmahadevuni nmahadevuni deleted the fix_iceberg_info_column_filter_eval branch August 27, 2024 18:31
@jaystarshot jaystarshot mentioned this pull request Nov 1, 2024
25 tasks
@tdcmeehan tdcmeehan added the from:IBM PR from IBM label Dec 13, 2024
@prestodb-ci prestodb-ci requested review from a team, bibith4 and namya28 and removed request for a team December 13, 2024 15:19
Labels

from:IBM PR from IBM

6 participants