[BUG] Register super extension on to_arrow #3030
Merged
This is an issue where Daft extension types were not being converted to PyArrow properly. @jaychia discovered it while trying to write Parquet with a tensor column: the extension metadata for the tensor was being dropped.
A simple test to reproduce the error:
Output:
It's not a tensor type! However, if you uncomment the `_ensure_registered_super_ext_type()` call, you will now see:

The issue here is that `class DaftExtension(pa.ExtensionType)` is not imported during the FFI, as it is now a lazy import that must be triggered via `_ensure_registered_super_ext_type()`.

This PR adds calls to this import in `to_arrow` for series and schema. However, I do not know if this is exhaustive, and I will give this more thought. @desmondcheongzx @samster25