fix: coalesce schema issues #12308

Open
wants to merge 9 commits into
base: main


@mesejo mesejo commented Sep 3, 2024

Which issue does this PR close?

Closes #12307.

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added core Core DataFusion crate functions labels Sep 3, 2024
@github-actions github-actions bot added logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt) labels Sep 4, 2024
@mesejo mesejo force-pushed the fix/coalesce-null branch 7 times, most recently from 6954677 to 01fab57 Compare September 9, 2024 18:36
@mesejo mesejo force-pushed the fix/coalesce-null branch 3 times, most recently from 4c7989e to 30a5c5d Compare September 10, 2024 18:36
Contributor

@jayzhan211 jayzhan211 left a comment

Thanks @mesejo, I think this looks good to me overall.

datafusion/sqllogictest/test_files/timestamps.slt Outdated
Contributor

@alamb alamb left a comment

Thanks @mesejo and @jayzhan211

cc @findepi who I believe is also working in this area / thinking about functions

Self {
signature: Signature::one_of(
Contributor

It seems to me that this PR moves the signature away from a data-driven description (describing "what" is needed and letting some other code compute whether the given arguments match that signature) and towards a more functional style, where each function has to implement its own custom coercion, likely resulting in significant duplication.

What do you think (perhaps as a follow on PR) of adding DataType::Null support to the Signature calculations somehow rather than inlining / duplicating the coercion logic?

Maybe something like

Signature::allow_null(..)

that would support automatically coercing arguments from null?

Or maybe we should always support coercing Null to any type
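
To make the idea a bit more concrete, here is a minimal, self-contained sketch of a data-driven signature that tolerates null arguments. It is a toy model under stated assumptions: `ToyType`, `ToySignature`, and the `allow_null` field are hypothetical names invented for illustration and are not DataFusion's `DataType` or `Signature` API.

```rust
// Toy model only: these types are NOT DataFusion's `DataType` or `Signature`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyType {
    Null,
    Int64,
    Utf8,
}

/// A data-driven signature: it lists the accepted argument type lists and
/// records whether a `Null` argument may stand in for any accepted type.
#[derive(Debug, Clone)]
pub struct ToySignature {
    pub accepted: Vec<Vec<ToyType>>,
    /// Hypothetical flag mirroring the `allow_null` idea from this thread.
    pub allow_null: bool,
}

impl ToySignature {
    /// Returns the argument types of the first matching candidate,
    /// or `None` if no candidate accepts `args`.
    pub fn coerce(&self, args: &[ToyType]) -> Option<Vec<ToyType>> {
        self.accepted
            .iter()
            .find(|candidate| {
                candidate.len() == args.len()
                    && candidate.iter().zip(args).all(|(want, got)| {
                        // Exact match, or a null that the signature lets through.
                        want == got || (self.allow_null && *got == ToyType::Null)
                    })
            })
            .cloned()
    }
}
```

With `accepted` set to `[[Int64, Int64], [Utf8, Utf8]]` and `allow_null` enabled, the arguments `(Null, Int64)` resolve to `(Int64, Int64)` without the function carrying any coercion code of its own. One design caveat of this sketch: an all-null call simply picks the first candidate in the list.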

Contributor

What about an alternative signature like Signature::String, similar to Signature::numeric, that also covers converting null to string?

Contributor

I am not sure -- I was just reacting that this "handle null" pattern seems common and it seems like this approach will require custom coerce logic for all functions 🤔

Member

Null to T coercion needs to be handled elsewhere anyway (eg when computing type of a UNION, etc.).
We can free functions from having to bother about coercions at all and let the engine calculate coercions when building the logical plan.

This is actually super fundamental to DataFusion's vision as a composable query engine. Coercion rules are very implementation-specific. If we had functions spiced up with coercions inside them, that would make those functions non-reusable.
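
As a rough illustration of coercion living in the engine rather than in each function, the sketch below resolves a common type for a list of inputs (for example the branches of a UNION), treating null as coercible to any type. It is a toy model: `PlanType`, `unify`, and `common_type` are hypothetical names, not DataFusion APIs, and real coercion rules would cover far more than this.

```rust
// Toy model only: `PlanType`, `unify`, and `common_type` are illustrative
// names and do not correspond to DataFusion APIs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanType {
    Null,
    Int64,
    Utf8,
}

/// Unifies two types under a single rule set owned by the engine:
/// `Null` coerces to anything, equal types unify, everything else fails.
pub fn unify(a: PlanType, b: PlanType) -> Option<PlanType> {
    match (a, b) {
        (PlanType::Null, other) | (other, PlanType::Null) => Some(other),
        (x, y) if x == y => Some(x),
        _ => None,
    }
}

/// Folds `unify` over all inputs, starting from `Null` as the identity.
/// Returns `None` as soon as two inputs cannot be reconciled.
pub fn common_type(types: &[PlanType]) -> Option<PlanType> {
    types.iter().copied().try_fold(PlanType::Null, unify)
}
```

Under these assumed rules, `common_type(&[PlanType::Null, PlanType::Int64])` yields `Some(PlanType::Int64)`, which is the shape of result a planner could reuse for UNION branches and function arguments alike, so individual functions never need to see a null-typed argument.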

cc @wizardxz @sadboy

Contributor

100%

It seems to me that Signature is supposed to communicate which types the function has a native implementation for, with coercion applied when whatever the user provided doesn't match one of the supported types.

Labels
core Core DataFusion crate functions logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Coalesce fails for query: `SELECT COALESCE(null, 5)`
5 participants