-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-37582: [Go][Parquet] Implement Float16 logical type #37599
Conversation
|
88e0c88
to
07eae6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should cover the base requirements, but LMK if I missed anything. There's a bit of weird special-casing involved with the column statistics since float16 requires an independent Statistics
type despite not being physical - but maybe there's a better way.
for { | ||
f16 := float16.FromBits(uint16(r.Uint64n(math.MaxUint16 + 1))) | ||
f64 := float64(f16.Float32()) | ||
if !math.IsNaN(f64) && !math.IsInf(f64, 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other randFloat
functions don't exclude infinities. Should we change that? The checks for approximate equality will fail when comparing infinities of the same sign.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make more sense to leave the NaN's and infinities to ensure we're covering those cases for testing. Alternately we could just make sure we test those cases separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks great, just a few nitpicks.
for { | ||
f16 := float16.FromBits(uint16(r.Uint64n(math.MaxUint16 + 1))) | ||
f64 := float64(f16.Float32()) | ||
if !math.IsNaN(f64) && !math.IsInf(f64, 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make more sense to leave the NaN's and infinities to ensure we're covering those cases for testing. Alternately we could just make sure we test those cases separately
07eae6f
to
c0109db
Compare
Note that the updated So, fair warning: regenerating the thrift files from this branch alone will basically break everything (for now). |
Overall this is looking good to me, though I won't merge this until after the format PR with the thrift changes gets finalized and merged. |
@benibus the fix for the failing CIs has been merged, can you rebase and update this so that we can get a clean CI on it? I believe the float16 format has been merged so once you rebase and we have all green i'll be happy to merge this! |
3a1fe0f
to
f0ead7d
Compare
Rebased. A couple CI failures still, but they look unrelated. |
Can you please resolve the conflicts (and potentially rebase if necessary)? We can check the CI afterwards |
NaNs are still excluded from `randFloat16`, as this matches the native float equivalents
f0ead7d
to
a7922f9
Compare
Alright, I've fixing the conflicts and rebased again. It looks like everything's green now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…37599) ### Rationale for this change There is an active proposal for a Float16 logical type in Parquet (apache/parquet-format#184) with C++/Python implementations in progress (apache#36073), so we should add one for Go as well. ### What changes are included in this PR? - [x] Adds `LogicalType` definitions and methods for `Float16` - [x] Adds support for `Float16` column statistics and comparators - [x] Adds support for interchange between Parquet and Arrow's half-precision float ### Are these changes tested? Yes ### Are there any user-facing changes? Yes * Closes: apache#37582 Authored-by: benibus <[email protected]> Signed-off-by: Matt Topol <[email protected]>
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit bff5fb9. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
…37599) ### Rationale for this change There is an active proposal for a Float16 logical type in Parquet (apache/parquet-format#184) with C++/Python implementations in progress (apache#36073), so we should add one for Go as well. ### What changes are included in this PR? - [x] Adds `LogicalType` definitions and methods for `Float16` - [x] Adds support for `Float16` column statistics and comparators - [x] Adds support for interchange between Parquet and Arrow's half-precision float ### Are these changes tested? Yes ### Are there any user-facing changes? Yes * Closes: apache#37582 Authored-by: benibus <[email protected]> Signed-off-by: Matt Topol <[email protected]>
Rationale for this change
There is an active proposal for a Float16 logical type in Parquet (apache/parquet-format#184) with C++/Python implementations in progress (#36073), so we should add one for Go as well.
What changes are included in this PR?
LogicalType
definitions and methods forFloat16
Float16
column statistics and comparatorsAre these changes tested?
Yes
Are there any user-facing changes?
Yes