Skip to content

Conversation

@jorisvandenbossche
Copy link
Member

@pitrou
Copy link
Member

pitrou commented Sep 19, 2019

It would be better to store the extension name and serialization in the Parquet metadata, no? Then you can recreate a true extension array when reading?

@jorisvandenbossche
Copy link
Member Author

Yes, that was the idea to look into next. But I can probably directly try that in this PR.

@pitrou
Copy link
Member

pitrou commented Sep 19, 2019

I think that would be nice, since that would validate the whole approach. The metadata keys are currently private in arrow/ipc/metadata_internal.cc (see kExtensionTypeKeyName and kExtensionMetadataKeyName), you can probably make them extern in a .h somewhere.

Perhaps you could instead use Parquet-specific metadata keys if Parquet has a well-known convention for them, though.

Copy link
Member Author

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To a potential reviewer already looking at it, this is currently a very ugly implementation(just wanted to have something working now to pass the tests), will try to clean-up tomorrow.

@pitrou
Copy link
Member

pitrou commented Sep 23, 2019

The implementation looks good on the principle, feel free to ping when this is ready for review.

@jorisvandenbossche jorisvandenbossche force-pushed the ARROW-6187-parquet-extension-type branch from 7b54d86 to 61d245e Compare September 24, 2019 11:46
@jorisvandenbossche jorisvandenbossche force-pushed the ARROW-6187-parquet-extension-type branch from 6143946 to e56164b Compare September 24, 2019 13:19
@jorisvandenbossche
Copy link
Member Author

This should be ready for review now.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@pitrou
Copy link
Member

pitrou commented Sep 24, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants