Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Oct 8, 2020

The code / comments for type coercion are a little confusing and don't make the distinction between coercion and casting clear – this PR attempts to clarify the intent, channeling the information from @jorgecarleitao here: #8340 (comment)

@alamb alamb force-pushed the alamb/coercion-docs branch from 63f7114 to 9f497c6 Compare October 8, 2020 12:27
@github-actions
Copy link

github-actions bot commented Oct 8, 2020

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @alamb! Looks great. Thanks a lot for taking this.

I left two minor suggestions.

nevi-me pushed a commit that referenced this pull request Oct 16, 2020
…ataFusion

This is a PR incorporating the feedback from @nevi-me  and @jorgecarleitao  from #8400

It adds
1. a `can_cast_types` function to the Arrow cast kernel (as suggested by @jorgecarleitao  / @nevi-me  in #8400 (comment)) that encodes the valid type casting
2. A test that ensures `can_cast_types` and `cast` remain in sync
3. Bug fixes that the test above uncovered (I'll comment inline)
4. Change DataFuson to use `can_cast_types` so that it plans casting consistently with what arrow allows

Previously the notions of coercion and casting were somewhat conflated in DataFusion. I have tried to clarify them in #8399 and this PR. See also #8340 (comment) for more discussion.

I am adding this functionality so DataFusion gains rudimentary support `DictionaryArray`.

Codewise, I am concerned about the duplication in logic between the match statements in `cast` and `can_cast_types. I have some thoughts on how to unify them (see #8400 (comment)), but I don't have time to implement that as it is a bigger change. I think this approach with some duplication is ok, and the test will ensure they remain in sync.

Closes #8460 from alamb/alamb/ARROW-10236-casting-rules-2

Authored-by: alamb <[email protected]>
Signed-off-by: Neville Dipale <[email protected]>
kszucs pushed a commit that referenced this pull request Oct 19, 2020
…ataFusion

This is a PR incorporating the feedback from @nevi-me  and @jorgecarleitao  from #8400

It adds
1. a `can_cast_types` function to the Arrow cast kernel (as suggested by @jorgecarleitao  / @nevi-me  in #8400 (comment)) that encodes the valid type casting
2. A test that ensures `can_cast_types` and `cast` remain in sync
3. Bug fixes that the test above uncovered (I'll comment inline)
4. Change DataFuson to use `can_cast_types` so that it plans casting consistently with what arrow allows

Previously the notions of coercion and casting were somewhat conflated in DataFusion. I have tried to clarify them in #8399 and this PR. See also #8340 (comment) for more discussion.

I am adding this functionality so DataFusion gains rudimentary support `DictionaryArray`.

Codewise, I am concerned about the duplication in logic between the match statements in `cast` and `can_cast_types. I have some thoughts on how to unify them (see #8400 (comment)), but I don't have time to implement that as it is a bigger change. I think this approach with some duplication is ok, and the test will ensure they remain in sync.

Closes #8460 from alamb/alamb/ARROW-10236-casting-rules-2

Authored-by: alamb <[email protected]>
Signed-off-by: Neville Dipale <[email protected]>
@alamb alamb deleted the alamb/coercion-docs branch October 26, 2020 20:04
alamb added a commit to apache/arrow-rs that referenced this pull request Apr 20, 2021
…ataFusion

This is a PR incorporating the feedback from @nevi-me  and @jorgecarleitao  from apache/arrow#8400

It adds
1. a `can_cast_types` function to the Arrow cast kernel (as suggested by @jorgecarleitao  / @nevi-me  in apache/arrow#8400 (comment)) that encodes the valid type casting
2. A test that ensures `can_cast_types` and `cast` remain in sync
3. Bug fixes that the test above uncovered (I'll comment inline)
4. Change DataFuson to use `can_cast_types` so that it plans casting consistently with what arrow allows

Previously the notions of coercion and casting were somewhat conflated in DataFusion. I have tried to clarify them in apache/arrow#8399 and this PR. See also apache/arrow#8340 (comment) for more discussion.

I am adding this functionality so DataFusion gains rudimentary support `DictionaryArray`.

Codewise, I am concerned about the duplication in logic between the match statements in `cast` and `can_cast_types. I have some thoughts on how to unify them (see apache/arrow#8400 (comment)), but I don't have time to implement that as it is a bigger change. I think this approach with some duplication is ok, and the test will ensure they remain in sync.

Closes #8460 from alamb/alamb/ARROW-10236-casting-rules-2

Authored-by: alamb <[email protected]>
Signed-off-by: Neville Dipale <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants