
[Format] HALF precision FLOAT Logical type #317

Closed
asfimport opened this issue Oct 28, 2016 · 3 comments

Comments


asfimport commented Oct 28, 2016

Reporter: Julien Le Dem / @julienledem
Assignee: Anja Boskovic / @anjakefala

Related issues:

PRs and other links:

Note: This issue was originally created as PARQUET-758. Please see the migration documentation for further details.


Gabor Szadovszky / @gszadovszky:
Hey everyone, who is interested in the half-float type,

When I reviewed the format change, it seemed obvious to me to use the "2-byte IEEE little-endian format". Since then, I've come across another approach to encoding 2-byte FP numbers: bfloat16. Since neither Java nor C++ supports 2-byte FP numbers natively, we will probably need to convert the encoded numbers to float, and for bfloat16 that conversion would be more performant.
It might be worth adding bfloat16 to the format as well and adding implementations for it in the same round. WDYT?
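To illustrate the performance point above, here is a minimal sketch (not Parquet code; the function names are hypothetical) of widening both 16-bit formats to a standard float: bfloat16 is simply the top 16 bits of an IEEE binary32, so widening is a single shift, whereas IEEE binary16 needs its exponent re-biased and separate handling for subnormals and inf/NaN.

```python
import struct

def bfloat16_bits_to_float(bits: int) -> float:
    # bfloat16 is the high 16 bits of an IEEE-754 binary32,
    # so widening is a single 16-bit left shift.
    return struct.unpack('>f', struct.pack('>I', bits << 16))[0]

def float16_bits_to_float(bits: int) -> float:
    # IEEE binary16 has a 5-bit exponent (bias 15) and 10-bit mantissa;
    # widening must re-bias the exponent and special-case
    # subnormals, infinities, and NaNs.
    sign = (bits >> 15) & 0x1
    exp = (bits >> 10) & 0x1F
    frac = bits & 0x3FF
    if exp == 0:
        value = (frac / 1024.0) * 2.0 ** -14                  # subnormal or zero
    elif exp == 0x1F:
        value = float('inf') if frac == 0 else float('nan')   # inf / NaN
    else:
        value = (1.0 + frac / 1024.0) * 2.0 ** (exp - 15)     # normal number
    return -value if sign else value
```

The shift-only bfloat16 path is what makes it cheaper on platforms without native 2-byte FP support; the binary16 path shown here is the branchy conversion that would otherwise be needed (or delegated to a library).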


Anja Boskovic / @anjakefala:
Hi Gabor!

I would support a proposal for implementing bfloat16, maybe even as a canonical extension type in Arrow.

However, I have some hesitancy about including it in this round of implementations. I think it should be considered separately.

  1. My understanding is that the implementations have already begun (I have messaged the parties working on them so that appropriate tickets get created).
  2. It would prolong the format review and the implementations.
  3. Part of that prolonging is that I foresee additional back-and-forth debating why bfloat16: why not tensorfloat? Why not add both?

And my experience has been that these conversations take a really long time for the Parquet community. It could easily add months to this process.

Float16, being an IEEE standard, has a simplicity to its inclusion.

So, I guess my takeaway is that I support us opening a separate format PR for bfloat16 inclusion, and having that proceed separately from the work of including, and implementing, IEEE Float16.


Gabor Szadovszky / @gszadovszky:
Thanks for your reply, @anjakefala!

I mentioned bfloat16 only because of the ease of converting it back and forth to a Java/C++ float, a conversion we will probably need to implement for IEEE Float16 as well. But I agree: we should not block the format release with additional discussions on this separate topic.
