feat(iceberg): Support Iceberg partition transforms #13874
Open
PingLiuPing
wants to merge
2
commits into
facebookincubator:main
from
PingLiuPing:lp_iceberg_partition_transforms
7922c5e to 290f28c
Collaborator, Author
CC @zhouyuan
Yuhta
reviewed
Jun 26, 2025
4d8b073 to fdccd55
yingsu00
reviewed
Jun 30, 2025
Collaborator
Please copy this implementation for bucket transform: #13174
fdccd55 to 0f52d96
3b58d4b to 1e587b0
1e587b0 to 9ec820f
This was referenced Jul 7, 2025
53a3bfc to e8badd7
jinchengchenghh
approved these changes
Jul 31, 2025
Collaborator
jinchengchenghh
left a comment
Only a few small nits.
c6b2574 to 39a67d4
facebook-github-bot
pushed a commit
that referenced
this pull request
Aug 7, 2025
Summary: The Iceberg hash uses the Murmur3 hash, which matches https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp: it first processes the input in 4-byte chunks, then mixes in the remaining bytes with XOR. Spark SQL also uses this hash algorithm but differs in how it handles the remaining bytes, which it combines. Extract the common function hashInt64 to functions/lib. This class will be used for the Iceberg bucket transform and the bucket function. The Iceberg Murmur3 hash must strictly match the Java implementation, so that data written through the Iceberg connector can be read by Iceberg Java and the function call also returns the correct result. The Iceberg utility library `velox_functions_iceberg_hash` will be linked by the Iceberg connector writer to perform the partition transform. #13874 Pull Request resolved: #14025 Reviewed By: pedroerp Differential Revision: D79732785 Pulled By: kgpai fbshipit-source-id: 6122b94673f015dca5c8484722926709a30fe65e
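For readers following the hash discussion above, here is a minimal, self-contained sketch of the 32-bit Murmur3 variant from the linked MurmurHash3.cpp, plus a bucket computation in the style the Iceberg spec describes (seed 0, 64-bit values hashed as 8 little-endian bytes, sign bit masked before the modulo). The names `murmur3Sketch`, `hashInt64Sketch`, `hashStringSketch`, and `bucketSketch` are hypothetical and are not the functions added in `velox_functions_iceberg_hash`; treat this as an illustration under those assumptions, not the PR's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper names; NOT the Velox API.
inline uint32_t rotl32(uint32_t x, int r) {
  return (x << r) | (x >> (32 - r));
}

// MurmurHash3 x86_32: 4-byte little-endian blocks, then an XOR-combined tail.
inline int32_t murmur3Sketch(const uint8_t* data, size_t len, uint32_t seed = 0) {
  const uint32_t c1 = 0xcc9e2d51;
  const uint32_t c2 = 0x1b873593;
  uint32_t h = seed;
  const size_t nblocks = len / 4;
  for (size_t i = 0; i < nblocks; ++i) {
    const uint8_t* p = data + 4 * i;
    uint32_t k = static_cast<uint32_t>(p[0]) |
        (static_cast<uint32_t>(p[1]) << 8) |
        (static_cast<uint32_t>(p[2]) << 16) |
        (static_cast<uint32_t>(p[3]) << 24);
    k *= c1;
    k = rotl32(k, 15);
    k *= c2;
    h ^= k;
    h = rotl32(h, 13);
    h = h * 5 + 0xe6546b64;
  }
  // Remaining 1-3 bytes are XORed into a single word and mixed once.
  const uint8_t* tail = data + 4 * nblocks;
  uint32_t k = 0;
  switch (len & 3) {
    case 3:
      k ^= static_cast<uint32_t>(tail[2]) << 16;
      [[fallthrough]];
    case 2:
      k ^= static_cast<uint32_t>(tail[1]) << 8;
      [[fallthrough]];
    case 1:
      k ^= static_cast<uint32_t>(tail[0]);
      k *= c1;
      k = rotl32(k, 15);
      k *= c2;
      h ^= k;
  }
  // Finalization (avalanche).
  h ^= static_cast<uint32_t>(len);
  h ^= h >> 16;
  h *= 0x85ebca6b;
  h ^= h >> 13;
  h *= 0xc2b2ae35;
  h ^= h >> 16;
  return static_cast<int32_t>(h);
}

// Hash a 64-bit value as its 8 little-endian bytes.
inline int32_t hashInt64Sketch(int64_t value) {
  uint8_t bytes[8];
  uint64_t v = static_cast<uint64_t>(value);
  for (int i = 0; i < 8; ++i) {
    bytes[i] = static_cast<uint8_t>(v >> (8 * i));
  }
  return murmur3Sketch(bytes, sizeof(bytes));
}

// Hash the raw UTF-8 bytes of a string.
inline int32_t hashStringSketch(std::string_view s) {
  return murmur3Sketch(reinterpret_cast<const uint8_t*>(s.data()), s.size());
}

// Bucket index: clear the sign bit, then take the modulo.
inline int32_t bucketSketch(int32_t hash, int32_t numBuckets) {
  return (hash & 0x7fffffff) % numBuckets;
}
```

The 7-byte string case exercises both the block loop and the 3-byte tail, which is the part where the summary says Spark SQL's variant diverges.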
ebf1437 to 7b14ccc
7b14ccc to ff677ae
39310bd to 9dfefb7
a7af822 to a04c29e
Co-authored-by [email protected]
a04c29e to e0ae168
Co-authored-by: Chengcheng Jin <[email protected]>
e0ae168 to f5172db
wypb
pushed a commit
to wypb/velox
that referenced
this pull request
Sep 3, 2025
Summary: The Iceberg hash uses the Murmur3 hash, which matches https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp: it first processes the input in 4-byte chunks, then mixes in the remaining bytes with XOR. Spark SQL also uses this hash algorithm but differs in how it handles the remaining bytes, which it combines. Extract the common function hashInt64 to functions/lib. This class will be used for the Iceberg bucket transform and the bucket function. The Iceberg Murmur3 hash must strictly match the Java implementation, so that data written through the Iceberg connector can be read by Iceberg Java and the function call also returns the correct result. The Iceberg utility library `velox_functions_iceberg_hash` will be linked by the Iceberg connector writer to perform the partition transform. facebookincubator#13874 Pull Request resolved: facebookincubator#14025 Reviewed By: pedroerp Differential Revision: D79732785 Pulled By: kgpai fbshipit-source-id: 6122b94673f015dca5c8484722926709a30fe65e
jinchengchenghh
added a commit
to apache/incubator-gluten
that referenced
this pull request
Sep 3, 2025
Add the Protobuf struct IcebergPartitionField to transfer the Iceberg ID information, and add IcebergPartitionSpec to transfer partition information. Build with tests and benchmarks in CI and fix the IcebergWriteTest build. Set the file format to ORC to bypass the native Parquet write for the partitioned TPC-H Iceberg suite; once facebookincubator/velox#14670, which supports fanout false mode, is merged, we can relax the restriction. Relevant PR: facebookincubator/velox#13874
Labels
CLA Signed
iceberg
Support Iceberg partition transforms.
Reviewers, please ignore the first commit; it comes from #10996, which has not been merged yet.
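As background on what a partition transform computes, below is a rough sketch of two non-hash transforms as the Iceberg spec describes them: `truncate[W]` on integers and the `day`/`hour` transforms on a timestamp in microseconds since the epoch. The function names are hypothetical and chosen for illustration; this is not the code in this PR, and it assumes floor semantics for values before 1970.

```cpp
#include <cstdint>

// Hypothetical helper names; NOT the Velox API.

// Division that rounds toward negative infinity.
inline int64_t floorDivSketch(int64_t a, int64_t b) {
  int64_t q = a / b;
  if ((a % b != 0) && ((a < 0) != (b < 0))) {
    --q;
  }
  return q;
}

// truncate[W] for integers: v - (v mod W), using a non-negative remainder
// so that negative values round down to the previous multiple of W.
inline int64_t truncateIntSketch(int64_t value, int64_t width) {
  int64_t rem = ((value % width) + width) % width;
  return value - rem;
}

// hour transform: whole hours since 1970-01-01 00:00:00.
inline int64_t hourSketch(int64_t micros) {
  return floorDivSketch(micros, 3'600'000'000LL);
}

// day transform: whole days since 1970-01-01.
inline int64_t daySketch(int64_t micros) {
  return floorDivSketch(micros, 86'400'000'000LL);
}
```

With width 10, the value 1 maps to 0 and -1 maps to -10, which is why the remainder is normalized to be non-negative instead of relying on C++'s truncating `%`.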