@alamb alamb commented Oct 6, 2020

NOTE: this builds on #8346, so I am leaving it as a draft until that is merged.

This PR adds basic physical expression / casting support to DataFusion. Right now, it will cause all DictionaryArray data to be unpacked into a StringArray for operations.

Ideally, DataFusion would support direct comparison / operation on DictionaryArrays, but the arrow kernels do not yet have the necessary support (e.g. there is no Dictionary equality comparison kernel that I could find in https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/comparison.rs).

However, this PR gets the basic queries running and I hope to contribute further optimizations as time allows and our project needs dictate.
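To illustrate the trade-off described above, here is a minimal std-only sketch. It does not use the arrow crate: `ToyDictionary`, `unpack`, `eq_unpacked`, and `eq_direct` are hypothetical names invented for this illustration, not arrow or DataFusion APIs. It contrasts unpacking a dictionary-encoded column into one string per row before comparing (roughly the approach this PR takes) with comparing each distinct dictionary value once and mapping the result through the keys (roughly what a dedicated dictionary comparison kernel could do).

```rust
/// Toy stand-in for a dictionary-encoded string column (NOT arrow's DictionaryArray).
/// Each row stores an optional key that indexes into a table of distinct values.
struct ToyDictionary {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

impl ToyDictionary {
    /// Unpack into one owned string per row, analogous to casting to a string array.
    fn unpack(&self) -> Vec<Option<String>> {
        self.keys
            .iter()
            .map(|k| k.map(|i| self.values[i].clone()))
            .collect()
    }

    /// Equality against a literal after unpacking: one string comparison per row.
    fn eq_unpacked(&self, needle: &str) -> Vec<Option<bool>> {
        self.unpack()
            .iter()
            .map(|v| v.as_ref().map(|s| s.as_str() == needle))
            .collect()
    }

    /// Equality directly on the dictionary: compare each distinct value once,
    /// then look the answer up by key for every row.
    fn eq_direct(&self, needle: &str) -> Vec<Option<bool>> {
        let matches: Vec<bool> = self.values.iter().map(|v| v.as_str() == needle).collect();
        self.keys.iter().map(|k| k.map(|i| matches[i])).collect()
    }
}

fn main() {
    let col = ToyDictionary {
        keys: vec![Some(0), Some(1), None, Some(0)],
        values: vec!["a".to_string(), "b".to_string()],
    };
    // Both strategies must agree; only the amount of work differs.
    assert_eq!(col.eq_unpacked("a"), col.eq_direct("a"));
    println!("{:?}", col.eq_direct("a"));
}
```

In this toy model the direct path performs one string comparison per distinct value rather than one per row, which is why native dictionary kernels would be a worthwhile follow-up.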

github-actions bot commented Oct 6, 2020

Contributor Author

FWIW I don't understand why order_coercion and eq_coercion are different (eq_coercion does not include string_coercion).
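For readers without the diff at hand, the difference being questioned can be modelled roughly as follows. This is a toy sketch, not the DataFusion source: `ToyType` and the specific coercion rules are assumptions made for illustration, and `numerical_coercion` is an assumed helper name. Only the fact that `eq_coercion` does not fall back to `string_coercion` while `order_coercion` does is taken from the comment above.

```rust
/// Hypothetical, simplified stand-in for a data type enum.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int32,
    Int64,
    Utf8,
    LargeUtf8,
}

/// Assumed rule for illustration: mixed integer widths widen to Int64.
fn numerical_coercion(l: &ToyType, r: &ToyType) -> Option<ToyType> {
    use ToyType::*;
    match (l, r) {
        (Int32, Int64) | (Int64, Int32) => Some(Int64),
        _ => None,
    }
}

/// Assumed rule for illustration: mixed string types widen to LargeUtf8.
fn string_coercion(l: &ToyType, r: &ToyType) -> Option<ToyType> {
    use ToyType::*;
    match (l, r) {
        (Utf8, LargeUtf8) | (LargeUtf8, Utf8) => Some(LargeUtf8),
        _ => None,
    }
}

/// Equality coercion in this model: identical types or numeric widening only.
fn eq_coercion(l: &ToyType, r: &ToyType) -> Option<ToyType> {
    if l == r {
        return Some(l.clone());
    }
    numerical_coercion(l, r)
}

/// Ordering coercion in this model: same as above, plus a string fallback.
fn order_coercion(l: &ToyType, r: &ToyType) -> Option<ToyType> {
    if l == r {
        return Some(l.clone());
    }
    numerical_coercion(l, r).or_else(|| string_coercion(l, r))
}

fn main() {
    // The asymmetry: mixed string types coerce for ordering but not for equality.
    println!("{:?}", eq_coercion(&ToyType::Utf8, &ToyType::LargeUtf8));
    println!("{:?}", order_coercion(&ToyType::Utf8, &ToyType::LargeUtf8));
}
```

Under these assumed rules, `a = b` on mixed string types would fail to find a common type while `a < b` would succeed, which is the kind of inconsistency the comment is pointing at.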

Contributor Author

As in #8340, I don't think partially redundant and incomplete checks for casting support in DataFusion add much over the actual arrow casting, so I have removed the plan-time checks.

Contributor Author

Unrelated to this PR: I just renamed some of these arguments to match the comment description.

Contributor Author

Here is the (simple) end-to-end test for DictionaryArray support.

@alamb alamb force-pushed the alamb/ARROW-10159-dictionary-array-coercion branch from 7dc986e to ecd4b4e Compare October 6, 2020 12:40