Migrate arrow_cast to a UDF #9610

Merged
merged 13 commits into from
Mar 18, 2024

@alamb alamb commented Mar 14, 2024

Which issue does this PR close?

Closes #9143
Closes #9287
Closes #9298

Rationale for this change

arrow_cast function migration.

What changes are included in this PR?

This PR is based on #9298 from @brayanjuls, updated to use the new simplify API.

Are these changes tested?

Yes, by existing tests

Are there any user-facing changes?

@github-actions github-actions bot added the sql (SQL Planner), core (Core DataFusion crate), and sqllogictest (SQL Logic Tests (.slt)) labels Mar 14, 2024
@alamb alamb changed the title from "Feat/migrate arrow cast to udf fixed" to "Migrate arrow_cast to a UDF" Mar 14, 2024
"| 2020-09-04 |",
"+------------+",
"+-----------------------------------+",
"| arrow_cast(t.values,Utf8(\"Utf8\")) |",
Contributor Author

These differences arise because arrow_cast is now just a normal function rather than a special case in the parser, so the column names follow normal function naming.

Contributor

Interesting. I got stuck implementing the simplify function because I thought it should also convert arrow_cast(t.values,Utf8(\"Utf8\")) to t.values, along with other similar cases.

Contributor Author

Yeah -- this is pretty tricky. arrow_cast was quite special in the parser, so now that it is handled like a normal function, its output columns are named in the same (somewhat strange) way as any other function's.
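
To illustrate the naming effect discussed in this thread, here is a minimal sketch of how a generic function call might derive its output column name from its name and arguments. `ToyExpr` and `display_name` are hypothetical stand-ins invented for this example, not DataFusion types or functions; the only thing taken from the PR is the resulting name shown in the updated test output.

```rust
// Hypothetical stand-in for an expression tree; NOT DataFusion's `Expr`.
enum ToyExpr {
    Column { table: String, name: String },
    Utf8Literal(String),
    Function { name: String, args: Vec<ToyExpr> },
}

/// Build a column name the way a generic function call would:
/// the function name followed by its rendered arguments.
fn display_name(expr: &ToyExpr) -> String {
    match expr {
        ToyExpr::Column { table, name } => format!("{table}.{name}"),
        // A string literal is rendered together with its type tag.
        ToyExpr::Utf8Literal(value) => format!("Utf8(\"{value}\")"),
        ToyExpr::Function { name, args } => {
            let rendered: Vec<String> = args.iter().map(display_name).collect();
            format!("{name}({})", rendered.join(","))
        }
    }
}
```

Under these assumptions, a call to `arrow_cast` on column `t.values` with the string literal `Utf8` is named after the whole call, including the literal argument, rather than after a parser-specific cast expression.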

@alamb alamb marked this pull request as ready for review March 15, 2024 11:25
    info: &dyn SimplifyInfo,
) -> Result<ExprSimplifyResult> {
    // convert this into a real cast
    let target_type = data_type_from_args(&args)?;
Contributor Author

This simplify logic mirrors the previous behavior in that arrow_cast is replaced with a normal cast
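
The rewrite described in this comment can be sketched with a toy model. Everything below (`DataType`, `Expr`, `parse_data_type`, `simplify_arrow_cast`) is a hypothetical, simplified stand-in written for illustration; it does not reproduce DataFusion's actual `simplify` signature or types, only the idea of turning `arrow_cast(expr, 'TypeName')` into an ordinary cast.

```rust
// Toy model only: hypothetical stand-ins, NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    StringLiteral(String),
    Cast { expr: Box<Expr>, to: DataType },
}

/// Parse the type name carried by the second argument (hypothetical helper).
fn parse_data_type(name: &str) -> Result<DataType, String> {
    match name {
        "Int64" => Ok(DataType::Int64),
        "Utf8" => Ok(DataType::Utf8),
        other => Err(format!("unsupported type name: {other}")),
    }
}

/// Rewrite the arguments of `arrow_cast(expr, 'TypeName')` into a plain cast.
fn simplify_arrow_cast(mut args: Vec<Expr>) -> Result<Expr, String> {
    if args.len() != 2 {
        return Err(format!("arrow_cast needs 2 arguments, got {}", args.len()));
    }
    // The second argument must be a string literal naming the target type.
    let to = match args.pop() {
        Some(Expr::StringLiteral(name)) => parse_data_type(&name)?,
        other => return Err(format!("expected a string literal, got {other:?}")),
    };
    // The remaining argument is the expression being cast.
    let expr = Box::new(args.pop().expect("length checked above"));
    Ok(Expr::Cast { expr, to })
}
```

In this sketch, calling `simplify_arrow_cast` with a column `x` and the literal `"Int64"` yields `Expr::Cast` of `x` to `DataType::Int64`, so later planning stages only ever see a normal cast.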

Contributor

@jayzhan211 jayzhan211 left a comment

LGTM!

    }

    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
        parse_data_type(&arg_types[1].to_string())
Contributor

@jayzhan211 jayzhan211 Mar 18, 2024

If return_type_from_exprs exists, we don't need return_type. Is it better to return an error or panic here?

Contributor Author

This is an excellent call -- changed to internal_err in 0c7b7be
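
The pattern agreed on in this thread can be sketched as follows. This is a toy model under stated assumptions: `ToyUdf`, `ToyArrowCast`, `ToyExpr`, and the error strings are hypothetical and are not DataFusion's trait, types, or messages. The point is only that when the result type depends on the value of an argument, the type-only entry point cannot answer correctly and may return an error instead of guessing or panicking.

```rust
// Hypothetical, simplified stand-ins; NOT DataFusion's UDF trait or types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

enum ToyExpr {
    Column(String),
    StringLiteral(String),
}

trait ToyUdf {
    /// Type-only entry point: sees the argument types but not their values.
    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType, String>;

    /// Expression-aware entry point; falls back to the type-only one by default.
    fn return_type_from_exprs(
        &self,
        _args: &[ToyExpr],
        arg_types: &[DataType],
    ) -> Result<DataType, String> {
        self.return_type(arg_types)
    }
}

struct ToyArrowCast;

impl ToyUdf for ToyArrowCast {
    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType, String> {
        // The target type lives in a literal's value, so this path is an
        // internal error rather than a panic or a wrong answer.
        Err("internal error: use return_type_from_exprs".to_string())
    }

    fn return_type_from_exprs(
        &self,
        args: &[ToyExpr],
        _arg_types: &[DataType],
    ) -> Result<DataType, String> {
        match args.get(1) {
            Some(ToyExpr::StringLiteral(name)) => match name.as_str() {
                "Int64" => Ok(DataType::Int64),
                "Utf8" => Ok(DataType::Utf8),
                other => Err(format!("unsupported type name: {other}")),
            },
            _ => Err("second argument must be a string literal".to_string()),
        }
    }
}
```

Returning an error keeps a misuse of the type-only path recoverable for callers, which is the trade-off the reviewer raised when asking whether to error or panic.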

@alamb alamb merged commit 2499245 into apache:main Mar 18, 2024
23 checks passed
alamb commented Mar 18, 2024

Thanks again @brayanjuls and @jayzhan211

@alamb alamb deleted the feat/migrate_arrow_cast_to_udf_fixed branch March 18, 2024 21:12
Development

Successfully merging this pull request may close these issues.

Move arrow_cast to datafusion-functions crate
Issue using arrow_cast in ORDER BY expressions
3 participants