PARQUET-2: Adding Type Persuasion for Primitive Types #3
…ct type checking for conflicting schemas, which is strict by default.
Original from the old repo: Parquet/parquet-mr#410
JIRA: https://issues.apache.org/jira/browse/PARQUET-2
These changes allow a primitive type to be requested as a type different from the one stored in the file, using a flag that turns off strict type checking (strict checking is on by default). Values are cast to the requested type where possible, and may lose precision where the cast requires it (e.g. requesting a double as an int).
No performance penalty is imposed for using the type defined in the file type. A flag exists to
A 6x6 test case is provided to test conversion between the primitive types.
Author: Daniel Weeks <[email protected]>
Closes #3 from dcw-netflix/type-persuasion and squashes the following commits:
1c3c0c7 [Daniel Weeks] Fixed test with strict checking off
f3cb495 [Daniel Weeks] Added type persuasion for primitive types with a flag to control strict type checking for conflicting schemas, which is strict by default.
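To make the behaviour described above concrete, here is a minimal sketch of the idea, not the code from this PR. The class name `TypePersuasionSketch`, the four-value `Primitive` enum, and the `persuade` method are all hypothetical names chosen for illustration; the real change lives in the parquet converters and covers six primitive types. The sketch only shows the three cases the description names: the fast path when the requested type matches the file type, a failure when strict checking is on, and a lossy cast when it is off.

```java
public class TypePersuasionSketch {
    // Hypothetical subset of primitive types, for illustration only.
    enum Primitive { INT32, INT64, FLOAT, DOUBLE }

    /**
     * Returns the stored value as the requested type.
     * Assumption: "strict" mirrors the flag from the description (on by default).
     */
    static Number persuade(Number stored, Primitive fileType,
                           Primitive requested, boolean strict) {
        if (fileType == requested) {
            return stored; // same type as the file: no conversion work at all
        }
        if (strict) {
            throw new IllegalArgumentException(
                "requested " + requested + " but file stores " + fileType);
        }
        // Strict checking is off: cast, accepting possible precision loss.
        switch (requested) {
            case INT32:  return stored.intValue();
            case INT64:  return stored.longValue();
            case FLOAT:  return stored.floatValue();
            case DOUBLE: return stored.doubleValue();
            default:     throw new AssertionError("unhandled type " + requested);
        }
    }

    public static void main(String[] args) {
        // A double requested as an int: the fractional part is dropped.
        System.out.println(persuade(3.9, Primitive.DOUBLE, Primitive.INT32, false)); // 3
        // An int requested as a double: widening, nothing is lost.
        System.out.println(persuade(7, Primitive.INT32, Primitive.DOUBLE, false));
    }
}
```

With strict checking left on, the same mismatched request throws instead of casting, which matches the "strict by default" behaviour stated in the description.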
@@ -195,6 +195,13 @@ public boolean equals(Object other) {
* @return the union result of merging toMerge into this
*/
protected abstract Type union(Type toMerge);
We should put the default implementation here:
protected Type union(Type toMerge) {
return union(toMerge, true);
}
I tried to put the default implementation in the abstract class, but the
maven enforcer plugin wouldn't allow me to do it. I assume removing the
abstract is considered an interface change.
On Fri, Jun 20, 2014 at 9:21 PM, Julien Le Dem [email protected]
wrote:
In parquet-column/src/main/java/parquet/schema/Type.java:
@@ -195,6 +195,13 @@ public boolean equals(Object other) {
* @return the union result of merging toMerge into this
*/
protected abstract Type union(Type toMerge);
We should put the default implementation here:
protected Type union(Type toMerge) {
return union(toMerge, true);
}
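For readers following the exchange above, this is a rough sketch of what the suggested default implementation would look like in isolation. It is not the code from the PR: `UnionDefaultSketch`, the simplified `Type` and `PrimitiveType` classes, the string-valued `primitive` field, and the `mergesWithDefault` helper are all hypothetical, and treating the second argument of `union` as a "strict" flag is an assumption based on the PR description. The exception type is also a stand-in.

```java
public class UnionDefaultSketch {
    // Simplified stand-in for the schema Type class discussed in the review.
    abstract static class Type {
        // The reviewer's suggestion: a concrete one-argument overload that
        // delegates to the two-argument form with strict checking on.
        protected Type union(Type toMerge) {
            return union(toMerge, true);
        }

        // Assumption: the boolean is the strict type-checking flag.
        protected abstract Type union(Type toMerge, boolean strict);
    }

    // Hypothetical primitive type that only tracks its primitive name.
    static class PrimitiveType extends Type {
        final String primitive;

        PrimitiveType(String primitive) {
            this.primitive = primitive;
        }

        @Override
        protected Type union(Type toMerge, boolean strict) {
            PrimitiveType other = (PrimitiveType) toMerge;
            if (strict && !primitive.equals(other.primitive)) {
                // Stand-in exception; the real code may use a different type.
                throw new IllegalStateException(
                    "cannot merge " + primitive + " with " + other.primitive);
            }
            return this; // keep this side's type when the merge is allowed
        }
    }

    // Helper: does the default (strict) union accept these two types?
    static boolean mergesWithDefault(String a, String b) {
        try {
            new PrimitiveType(a).union(new PrimitiveType(b));
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(mergesWithDefault("INT32", "INT32"));
        System.out.println(mergesWithDefault("INT32", "INT64"));
    }
}
```

The trade-off raised in the reply still applies: changing an `abstract` method into a concrete one alters the class's published signature, which is the kind of change a compatibility check in the build can reject.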
Thanks Daniel!
please prefix the PR title with "PARQUET-2: "
Thank you @dcw-netflix !
…2 api
Currently for creating a user defined predicate using the new filter api, no value can be passed to create a dynamic filter at runtime. This reduces the usefulness of the user defined predicate, and meaningful predicates cannot be created. We can add a generic Object value that is passed through the api, which can internally be used in the keep function of the user defined predicate for creating many different types of filters. For example, in spark sql, we can pass in a list of filter values for a where IN clause query and filter the row values based on that list.
Author: Yash Datta <[email protected]>
Author: Alex Levenson <[email protected]>
Author: Yash Datta <[email protected]>
Closes #73 from saucam/master and squashes the following commits:
7231a3b [Yash Datta] Merge pull request #3 from isnotinvain/alexlevenson/fix-binary-compat
dcc276b [Alex Levenson] Ignore binary incompatibility in private filter2 class
7bfa5ad [Yash Datta] Merge pull request #2 from isnotinvain/alexlevenson/simplify-udp-state
0187376 [Alex Levenson] Resolve merge conflicts
25aa716 [Alex Levenson] Simplify user defined predicates with state
51952f8 [Yash Datta] PARQUET-116: Fix whitespace
d7b7159 [Yash Datta] PARQUET-116: Make UserDefined abstract, add two subclasses, one accepting udp class, other accepting serializable udp instance
40d394a [Yash Datta] PARQUET-116: Fix whitespace
9a63611 [Yash Datta] PARQUET-116: Fix whitespace
7caa4dc [Yash Datta] PARQUET-116: Add ConfiguredUserDefined that takes a serialiazble udp directly
0eaabf4 [Yash Datta] PARQUET-116: Move the config object from keep method to a configure method in udp predicate
f51a431 [Yash Datta] PARQUET-116: Adding type safety for the filter object to be passed to user defined predicate
d5a2b9e [Yash Datta] PARQUET-116: Enforce that the filter object to be passed must be Serializable
dfd0478 [Yash Datta] PARQUET-116: Add a test case for passing a filter object to user defined predicate
4ab46ec [Yash Datta] PARQUET-116: Pass a filter object to user defined predicate in filter2 api