Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: generate_series function support string type #12002

Merged
merged 11 commits into from
Aug 19, 2024

Conversation

getChan
Copy link
Contributor

@getChan getChan commented Aug 15, 2024

Which issue does this PR close?

Closes #11922.

Rationale for this change

generate_series function does not accept string arguments.
However, the as-is test provide string arguments.

I think it would be better if the type signature of generate_series was changed.
https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/range.rs#L129

What changes are included in this PR?

Only sqllogictest

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 15, 2024
@jayzhan211
Copy link
Contributor

@getChan I think what we need is to support string type but not modifying the test to other already supported type

D create table date_table(start date, stop date, step interval);
D insert into date_table values (DATE '1992-01-01', DATE '1993-01-02', INTERVAL '1' MONTH);
D select generate_series(start, '1993-03-01', INTERVAL '1 year') from date_table;
┌────────────────────────────────────────────────────────────────────┐
│ generate_series("start", '1993-03-01', CAST('1 year' AS INTERVAL)) │
│                            timestamp[]                             │
├────────────────────────────────────────────────────────────────────┤
│ [1992-01-01 00:00:00, 1993-01-01 00:00:00]                         │
└────────────────────────────────────────────────────────────────────┘

We should able to return the same result as Duckdb's

@getChan
Copy link
Contributor Author

getChan commented Aug 15, 2024

In that case, I guess we should extend the above issue to "Add string type support to generate_series function".

@jayzhan211
Copy link
Contributor

jayzhan211 commented Aug 15, 2024

I expect you get the same answer as if you explicitly cast the second argument to date:

The goal is make the query return the same result as the one you explicitly cast the argument.

In that case, I guess we should extend the above issue to "Add string type support to generate_series function".

Yes, I think what we need is to support string type, or cast the string to date. Not sure what is the exact solution. But I expect we should get the correct result for that query

@alamb
Copy link
Contributor

alamb commented Aug 15, 2024

Yes, I think what we need is to support string type, or cast the string to date. Not sure what is the exact solution. But I expect we should get the correct result for that query

The type coercsion system is supposed to handle adding this kind of cast.

I wonder if we can could remove the Any(3) signature a new signature to generate_series :

TypeSignature::Exact(vec![Int64]),
TypeSignature::Exact(vec![Int64, Int64]),
TypeSignature::Exact(vec![Int64, Int64, Int64]),
TypeSignature::Exact(vec![Date32, Date32, Interval(MonthDayNano)]),
TypeSignature::Any(3),

I think that would then ensure DataFusion actually casts the arguments to the supported types

@alamb alamb marked this pull request as draft August 15, 2024 22:45
@alamb
Copy link
Contributor

alamb commented Aug 15, 2024

Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look

@getChan getChan force-pushed the generate-series-sqllogictest-bug branch from 4957a1d to bccdc2e Compare August 17, 2024 17:49
@github-actions github-actions bot added the logical-expr Logical plan and expressions label Aug 17, 2024
@getChan
Copy link
Contributor Author

getChan commented Aug 17, 2024

I wonder if we can could remove the Any(3) signature a new signature to generate_series :

removed Any(3) signature. and fix some type coerce code for range, generate_series

@getChan getChan changed the title Fix: sqllogictest - generate_series function invalid argument type Fix: generate_series function support string type Aug 17, 2024
@getChan getChan marked this pull request as ready for review August 17, 2024 18:13
@getChan getChan marked this pull request as draft August 18, 2024 11:09
@github-actions github-actions bot removed the logical-expr Logical plan and expressions label Aug 18, 2024
@getChan getChan marked this pull request as ready for review August 18, 2024 12:30
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @getChan

@jayzhan211 jayzhan211 merged commit 574dfeb into apache:main Aug 19, 2024
24 checks passed
@@ -5804,7 +5804,7 @@ select generate_series(5),
----
[0, 1, 2, 3, 4, 5] [2, 3, 4, 5] [2, 5, 8] [1, 2, 3, 4, 5] [5, 4, 3, 2, 1] [10, 7, 4] [1992-09-01, 1992-10-01, 1992-11-01, 1992-12-01, 1993-01-01, 1993-02-01, 1993-03-01] [1993-02-01, 1993-01-31, 1993-01-30, 1993-01-29, 1993-01-28, 1993-01-27, 1993-01-26, 1993-01-25, 1993-01-24, 1993-01-23, 1993-01-22, 1993-01-21, 1993-01-20, 1993-01-19, 1993-01-18, 1993-01-17, 1993-01-16, 1993-01-15, 1993-01-14, 1993-01-13, 1993-01-12, 1993-01-11, 1993-01-10, 1993-01-09, 1993-01-08, 1993-01-07, 1993-01-06, 1993-01-05, 1993-01-04, 1993-01-03, 1993-01-02, 1993-01-01] [1989-04-01, 1990-04-01, 1991-04-01, 1992-04-01]

query error DataFusion error: Execution error: unsupported type for range. Expected Int64 or Date32, got: Timestamp\(Nanosecond, None\)
query error DataFusion error: Execution error: Cannot generate date range less than 1 day\.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will file an ticket for this issue

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Internal error generate_series: Internal error: could not cast value to .....arrow_array::types::Date32Type`
3 participants