Parquet-1872: Add TransCompression command to parquet-tools #796
There is a lot of code duplication between cli and tools. I would suggest adding this to parquet-hadoop as a utility and using it from the two command-line tools. The unit test can also be placed in parquet-hadoop so you only write it once. (You may keep simple unit tests on the tools side to verify the tool itself, but the functionality should be verified in the module where it is implemented.)
parquet-cli/src/main/java/org/apache/parquet/cli/commands/TransCompressionCommand.java
parquet-cli/src/test/java/org/apache/parquet/cli/commands/TransCompressionCommandTest.java
reply:
My second commit is based on #2.
@@ -53,7 +53,7 @@ public void readFully(byte[] bytes) throws IOException {

   @Override
   public void readFully(byte[] bytes, int start, int len) throws IOException {
-    stream.readFully(bytes);
+    stream.readFully(bytes, start, len);
Good catch!
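For context on why the one-line change in the hunk above matters, here is a minimal sketch of the two behaviours. It assumes the wrapped stream behaves like `java.io.DataInputStream`; the class `DelegatingStream` and the methods `readFullyBuggy`, `readFullyFixed`, and `demo` are hypothetical names made up for this illustration and are not the classes touched by this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ReadFullyDelegation {

  // Hypothetical wrapper standing in for the delegating stream in the diff.
  static class DelegatingStream {
    private final DataInputStream stream;

    DelegatingStream(byte[] data) {
      this.stream = new DataInputStream(new ByteArrayInputStream(data));
    }

    // Before the fix: start and len are ignored, so the whole array is filled.
    void readFullyBuggy(byte[] bytes, int start, int len) throws IOException {
      stream.readFully(bytes);
    }

    // After the fix: exactly len bytes are written, beginning at offset start.
    void readFullyFixed(byte[] bytes, int start, int len) throws IOException {
      stream.readFully(bytes, start, len);
    }
  }

  // Reads 3 bytes at offset 2 into a zeroed 6-byte buffer and returns the buffer.
  static byte[] demo(boolean fixed) {
    byte[] source = {1, 2, 3, 4, 5, 6};
    byte[] target = new byte[6];
    DelegatingStream s = new DelegatingStream(source);
    try {
      if (fixed) {
        s.readFullyFixed(target, 2, 3);
      } else {
        s.readFullyBuggy(target, 2, 3);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return target;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo(false))); // [1, 2, 3, 4, 5, 6]
    System.out.println(Arrays.toString(demo(true)));  // [0, 0, 1, 2, 3, 0]
  }
}
```

With the buggy delegation the caller's offset is silently dropped and bytes outside the requested window are overwritten. If the source held fewer bytes than the buffer length, the same call would fail with an `EOFException` instead.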
One open finding (about the logic of retrieving the statistics) and one new one about the data generation in the unit test. Otherwise, it looks great.
parquet-hadoop/src/test/java/org/apache/parquet/hadoop/util/CompressionConveterTest.java
Summary:
source branch: prune
conflict: No
commits:
27c2d9625d8cc8375a48b972c202b9ec1f4a3acb
4f2997edbf5e2d67b56b134dcf78ebcb3ec28bc2
fac0f62af5163084abb8b302759cd62fbe477be6
####below message are auto generated by arc diff
Add parquet file diff utility
ParquetFileWriter missing Api for DataPageV2
Parquet-1872: Add TransCompression command to parquet-tools (apache#796)
Reviewers: shangx
Reviewed By: shangx
Differential Revision: https://code.uberinternal.com/D4970793
Make sure you have checked all steps below.
Jira
Tests
Commits
Documentation