Fix querying and filtering using Iceberg metadata columns#23472
Fix querying and filtering using Iceberg metadata columns#23472nmahadevuni merged 2 commits intoprestodb:masterfrom
Conversation
| return allConstraints.transform(c -> isMetadataColumnId(((IcebergColumnHandle) c).getId()) ? null : (IcebergColumnHandle) c); | ||
| } | ||
|
|
||
| public static <U> TupleDomain<IcebergColumnHandle> getInfoColumnConstraints(TupleDomain<U> allConstraints) |
There was a problem hiding this comment.
I think calling it "getMetadataColumnConstraints" is better. It's been called "MetadataColumn" in Presto forever, e.g. isMetadataColumnId(). Giving it a new name "InfoColumn" would confuse the readers a little bit. Will you be able to send commit to unify the names in the HIve change as well?
| } | ||
|
|
||
| std::unordered_map<std::string, std::string> infoColumns; | ||
| infoColumns.reserve(1); |
There was a problem hiding this comment.
Same, rename back to metadataColumns
| QueryRunner javaQueryRunner = ((QueryRunner) getExpectedQueryRunner()); | ||
| if (!javaQueryRunner.tableExists(getSession(), "test_hidden_columns")) { | ||
| javaQueryRunner.execute("CREATE TABLE test_hidden_columns AS SELECT * FROM tpch.tiny.region WHERE regionkey=0"); | ||
| javaQueryRunner.execute("INSERT INTO test_hidden_columns SELECT * FROM tpch.tiny.region WHERE regionkey=1"); |
There was a problem hiding this comment.
Add a test with filter not matching? For both tests
3a3fcc7 to
3ad86ab
Compare
|
@yingsu00 Thank you. I have addressed your comments. Please review. |
|
@hantangwangd @ZacBlanco Can you please review this? |
|
cc @aditi-pandit e2e test failures: |
yingsu00
left a comment
There was a problem hiding this comment.
@nmahadevuni Can you please also update the PR and commit title?
3ad86ab to
ee42d27
Compare
I have updated it. Thank you. |
|
In the previous run |
|
@nmahadevuni I think this is the exception: |
hantangwangd
left a comment
There was a problem hiding this comment.
Thanks for the fix, overall looks good to me! Some little nits.
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java
Outdated
Show resolved
Hide resolved
...on/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeIcebergGeneralQueries.java
Outdated
Show resolved
Hide resolved
...on/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeIcebergGeneralQueries.java
Show resolved
Hide resolved
|
@hantangwangd @yingsu00 Any idea why the
|
ee42d27 to
33f62ac
Compare
|
Thanks @hantangwangd @yingsu00 for the review. I have addressed your comments. Please review. |
88dc8bf to
cce58ea
Compare
The most intuitive reason seems to be that we read table It's my guess, for reference only. |
|
@hantangwangd I didn't get your last comment "addressed". I also think the reason for the test failure is that its reading from a wrong catalog/schema or reusing old directory path where tables were created previously with type as DATE. How to verify this? |
Oh, the "addressed" means that my review comments has been addressed, not means the test failure has been addressed. |
169297f to
7168841
Compare
7168841 to
2d02348
Compare
|
@hantangwangd I have addressed the test failure issue as a second commit. Can you please review it? |
ZacBlanco
left a comment
There was a problem hiding this comment.
I didn't see this mentioned anywhere, but is this a bug in both Java and native? Or just native?
This change seems more related to the connector than the execution engine, I'm wondering if we should move any of these tests to the presto-iceberg module rather than rely on the native tests, or maybe put them in both places. Was there a specific reason to include them in TestPrestoNativeIceberg rather than in IcebergDistributedTestBase or similar?
This is a bug in both Java and Native. The tests in |
|
Thanks for the explanation. I don't love that we have separate test infrastructure for native connectors which exist outside of the maven module for the connector. That's a larger issue to tackle in the future. However, I would like to have some version of these tests also exist inside of the There are plenty of tests in the module which don't run the exact queries using the H2 expected query runner and we just assert on results using something like |
2d02348 to
8662726
Compare
8662726 to
06c3e93
Compare
|
@ZacBlanco @hantangwangd Added the same test in |
hantangwangd
left a comment
There was a problem hiding this comment.
Thanks for the fix, lgtm!
Description
Fix to query and filter using Iceberg metadata columns "$path" and "$data_sequence_number".
Motivation and Context
Previously wasn't able to query these metadata columns.
Impact
No impact
Test Plan
Added new test file to test querying and filtering by metadata columns.