[iceberg] [native] Add support for V1 tables#21584
Conversation
presto-native-execution/presto_cpp/presto_protocol/Connectors.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxQueryPlan.cpp
Outdated
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @imjalpreet.
Would you be able to split out the protocol changes from the plan conversion changes in this PR ? That would be easier to review and maintain.
@aditi-pandit can you let me know how you want to split the changes? Did you mean in a separate PR? In this PR I had kept separate commits for the protocol and Presto to Velox plan conversion changes. |
@imjalpreet : Yeah, would be great to have a single PR with just the changes you have in commit Add Iceberg Connector to parse Java Iceberg Plan Fragment. After that checked-in its easier to look at how the protocol pieces are mapped into Prestissimo. |
|
Please separate the Prestissimo bits into their own PR. |
62163c3 to
70146b3
Compare
There was a problem hiding this comment.
Can this really happen? Is the connectorName not from presto-native-execution/etc/catalog? There is no iceberg.properties there and in hive.properties there is only connector.name=hive
There was a problem hiding this comment.
Yes, we would need to add the iceberg catalog properties file in presto-native-execution/etc/catalog if someone wants to read iceberg tables
There was a problem hiding this comment.
If we want to add a default file, I can push one.
There was a problem hiding this comment.
BTW I think we should directly identify the connector name as hive in iceberg.properties. This catalog property file is for Presto Native, which doesn't have Iceberg connector, but just Hive connector.
connector.name=hive
presto-native-execution/presto_cpp/presto_protocol/Connectors.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
What needs to be done on hiveType?
There was a problem hiding this comment.
In the case of the hive connector, HiveColumnHandle(Presto) also has a parameter of type HiveType which is being used while creating the HiveColumnHandle velox object. We don't have the same in IcebergColumnHandle yet.
The prestissimo change was added as part of this commit in the case of hive connector: 6570b00
f0ff681 to
2e229ac
Compare
5f71cfe to
51050fc
Compare
|
@imjalpreet is this ready for review? |
|
Can you update the description? |
|
@majetideepak Yes, this is ready for review. I have updated the description as well. |
There was a problem hiding this comment.
BTW I think we should directly identify the connector name as hive in iceberg.properties. This catalog property file is for Presto Native, which doesn't have Iceberg connector, but just Hive connector.
connector.name=hive
presto-native-execution/presto_cpp/main/types/PrestoToVeloxQueryPlan.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxSplit.cpp
Outdated
Show resolved
Hide resolved
Use Presto IcebergColumnHandle to create Velox HiveColumnHandle Use Presto IcebergTableLayoutHandle/IcebergTableHandle to create Velox HiveTableHandle
51050fc to
f9dfb73
Compare
yingsu00
left a comment
There was a problem hiding this comment.
@tdcmeehan @majetideepak @aditi-pandit @mbasmanova Do you want to review this PR again?
@imjalpreet, @yingsu00 can you share some details of the tests that were run? What scale factor was used? How many nodes? Was there any comparison made with the Hive Connector? |
|
I believe only Copy-on-Write tables are supported correct? Can you clarify the scope? |
presto-native-execution/presto_cpp/main/types/PrestoToVeloxQueryPlan.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxSplit.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxSplit.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxSplit.cpp
Outdated
Show resolved
Hide resolved
Co-authored-by: Jalpreet Singh Nanda (:imjalpreet) <jalpreetnanda@gmail.com>
f9dfb73 to
1ec8e47
Compare
@majetideepak Currently, we have only done correctness verifications with smaller scale factors on local machines. We are in the process of testing it with sf1k and sf10k and working on generating the datasets. If you would like to see performance comparison for small scale factor, I can share that or we can wait for the tests with sf1k and sf10k.
Yes, currently this PR would enable Prestissimo to be able to read iceberg v1 tables. Since v2 is not fully supported yet, merge on read tables won't be supported with this PR since they are only available with iceberg v2 tables. |
@majetideepak Reetika has a branch with the TPCH/DS tiny e2e tests here https://github.com/agrawalreetika/prestodb/commits/icebergPrestissimo-tests-tpch-v1/ . Since it depends on this PR, we will submit a PR after this one is merged. |
There was a problem hiding this comment.
I think we can add a mapping from iceberg name to hive on the Velox side. Similar to how hive-hadoop2 is mapped. Not a blocker for now.
There was a problem hiding this comment.
Created a PR here facebookincubator/velox#8765
1ec8e47 to
7aa5ab1
Compare
majetideepak
left a comment
There was a problem hiding this comment.
Thanks, @imjalpreet
|
Just in case: if this doesn't fail hard on V2 tables, can we add a small followup to do that until V2 support is finished? |
@tdcmeehan Yes, sure. The follow-up PR for V2 support should also be ready soon. But I will raise the above fail-fast change in the meantime. |
Description
This PR includes the changes required to convert the Presto Iceberg plan and splits into the Velox plan and splits.
Motivation and Context
facebookincubator/velox#5977
Impact
With these changes, we will be able to read iceberg v1 tables from Prestissimo
Test Plan
Successfully ran all TPC-H queries with iceberg tables (both partitioned and non-partitioned)
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.