Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: coalesce schema issues #12308
fix: coalesce schema issues #12308
Changes from all commits
854ed60
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the root cause of the issue and to solve this other changes are necessary. Therefore, I think we should go with this change and maybe further optimize the coercion in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so should I leave it as it is? Or change it back to how it was:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with the current change, maybe wait for @findepi @alamb for how to move on with this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to @jayzhan211 -- if he is good to merge this PR, let's get the conflicts resolved and merge it in.
If there is additional work we know is needed / could be cleaned up, let's try and file them as tickets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conflicts solved! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that moving the signature from a data driven description (aka describe "what" is needed and letting some other code compute if the given arguments match that signature), this PR is moving many of the functions towards more functional (each function has to implement its own custom coercion, likely resulting in significant duplication).
What do you think (perhaps as a follow on PR) of adding
DataType::Null
support to the Signature calculations somehow rather than inlining / duplicating the coercion logic?Maybe something like
that would support automatically coercing arguments from null?
Or maybe we should always support coercing Null to any type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative signature like Signature::String, similar to Signature::numeric that includes converting null to string too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure -- I was just reacting that this "handle null" pattern seems common and it seems like this approach will require custom coerce logic for all functions 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Null to T coercion needs to be handled elsewhere anyway (eg when computing type of a UNION, etc.).
We can free functions from having to bother about coercions at all and let the engine calculate coercions when building the logical plan.
This is actually super fundamental for DataFusion vision as a composable query engine. Coercion rules are very implementation-specific. If we had functions spiced up with coercions inside them, that would make those functions non-reusable.
cc @wizardxz @sadboy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100%
It seems to me like
Signature
is supposed to communicate what types the function implementation has a native implementation for and the coercion of whatever the user provided doesn't match one of the supported typesThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findepi Are you suggesting something like general coercion that is non-function specific? But what if we want different coercion rule for different function, we might need to do coercion function wise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want different coercion rules for different functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea is that it is more flexible to the user, although, without the real use case, it might be a premature optimization 🤔.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to chip in late; this PR addresses other issues, such as #12307. I wonder if I could split it and leave the changes regarding the coercion of functions in this one (to keep the discussion in one place) and the others in a new PR. Would that be ok?