[HUDI-431] Adding support for Parquet in MOR LogBlocks
#4333
Merged
63 commits
bd8e50c
Adding parquet data block and inline read support (nsivabalan)
6190789
Assert accepted `Path` instances actually represent InlineFS block pa…
3bdba30
Fixing compilation
8f720a9
Tidying up
fe23d29
Cleaned up `HoodieDataBlock`:
0896844
Tidying up
2caa71f
Fixing layout
3bf54eb
Fixing `HoodieHFileDataBlock` to properly handle cases when point loo…
8666008
Tidying up:
06a1dcb
Control ser/de flow fully from w/in the `HoodieDataBlock`
c56a5bd
Added assertions for `HoodieDataBlock`;
811aabb
Tidying up
f82f136
Added `ByteBufferBackedInputStream`
dabd871
Rebased `HoodieHFileReader` onto `ByteBufferBackedInputStream` (to av…
f715050
Completed `HoodieParquetDataBlock`:
77993e2
Tidying up
c6534d7
Cleaned up `HoodieLogFileReader`;
5d1aa54
Missing license
2ec97cb
Tidying up;
59b1911
Reverted `hasContentLength` change
1839701
Fixed NPEs
ee17c94
Tidied up config
e15d770
Added tests for Parquet data blocks
bfb5bf7
Streamlining
5935a41
Extracted `ByteBufferBackedInputFile` as a standalone class
e353851
Extracted `OutputStreamBackedOutputFile` as a standalone class;
5183297
Missing license
1c533d9
Make sure `LogBlock`s schema is always set to either Reader's schema …
1ce9b6e
Added `Option.or`
12bb746
Abstracted common point-wise record lookups w/in `HoodieDataBlock` to…
c30f2e7
Disable point-lookups for Parquet data blocks
f0c9597
Tidying up
14dbeb8
Made `readRecordsFromContent` overridable to make sure that Parquet B…
1df7fb7
Tidying up
6ad6ac8
Repacked `HoodieLogFile` w/in `HoodieLogFileReader` to make sure that…
7fd5105
Down-scoping some utils
3934a42
Added test to assert that Data Block produces records w/ correct proj…
1f2a980
Tidying up
2b7ac1c
Make sure `HoodieAvroDataBlock` doesn't alter original records' list
c19e181
`lint`
fa5063d
Fixing styling
bc73ece
Tidying up
4fc9fbf
Cleaned up ctors used only in tests
1d65847
Thread through appropriately configured Parquet compression codec in…
3f286ca
Thread through appropriately configured Parquet compression codec int…
a232513
Thread t/h Hadoop `Configuration` to the Log Blocks to make sure prop…
6594fc2
Properly respect Hadoop Configuration
79accfa
Fixed compilation
f5df8bd
Leverage appropriately configured HFile compression algo
0cef98e
`lint`
640828b
Avoid superfluous `fs.getFileStatus` by amending the Path w/ Scheme d…
a4509d1
Added UT for `ByteBufferBackedInputStream`
a08d0ce
Fixed how Path is qualified w/ FS default URI, scheme
98f44f5
Properly pass in file-size when re-packaging `HoodieLogFile`;
12d0ed8
Tidying up
d451371
Moved `LOGFILE_DATA_BLOCK_FORMAT` into `HoodieWriteConfig`
ae505bc
Tidying up
c455ea0
Unused imports
c47b840
After rebase fixes
8743a38
Tidying up
10221e0
Tidying up more
d0b17e5
`lint`
29dd840
Fixed writers' ctor of the `HoodieHFileDataBlock`
this is a very good cleanup. thanks!