regexp_replace fails when pattern or replacement is a scalar NULL #11410

Closed
Blizzara opened this issue Jul 11, 2024 · 1 comment · Fixed by #11459
Labels: bug (Something isn't working)

Comments

@Blizzara
Contributor

Blizzara commented Jul 11, 2024

Describe the bug

regexp_replace fails to produce the correct number of rows if either the pattern or the replacement argument is a scalar NULL.

I think this is due to the fetch_string_arg call in

let pattern = fetch_string_arg!(&args[1], "pattern", T, _regexp_replace_early_abort);

not passing the correct length to the _regexp_replace_early_abort function. When this specific arg is a scalar, its "array len" is just 1, so the abort function creates a 1-len array as the result instead of one row per input row.
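
To illustrate the suspected cause, here is a minimal, self-contained sketch. It does not use DataFusion's or Arrow's real types: `Arg`, `early_abort`, `null_result_buggy`, and `null_result_fixed` are hypothetical stand-ins, and sizing the result from the batch's row count is only one possible fix, not necessarily what the eventual patch does.

```rust
/// Hypothetical stand-in for a UDF argument: either one value shared by
/// the whole batch (scalar) or one value per row (array).
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Scalar(Option<String>),
    Array(Vec<Option<String>>),
}

impl Arg {
    /// Length of the argument when viewed as an array.
    /// A scalar converts to a 1-element array, which is the root of the problem.
    fn array_len(&self) -> usize {
        match self {
            Arg::Scalar(_) => 1,
            Arg::Array(values) => values.len(),
        }
    }

    /// True only for a scalar NULL argument.
    fn is_scalar_null(&self) -> bool {
        matches!(self, Arg::Scalar(None))
    }
}

/// Builds an all-NULL result of the requested length
/// (a simplified analogue of an early-abort helper).
fn early_abort(len: usize) -> Vec<Option<String>> {
    vec![None; len]
}

/// Mirrors the reported behavior: the result is sized from the NULL
/// argument itself, so a scalar NULL yields a single row.
fn null_result_buggy(pattern: &Arg) -> Option<Vec<Option<String>>> {
    pattern
        .is_scalar_null()
        .then(|| early_abort(pattern.array_len()))
}

/// One possible fix (an assumption): size the result from the number of
/// rows in the batch rather than from the scalar argument.
fn null_result_fixed(pattern: &Arg, num_rows: usize) -> Option<Vec<Option<String>>> {
    pattern.is_scalar_null().then(|| early_abort(num_rows))
}
```

With a two-row input column and a scalar NULL pattern, `null_result_buggy` returns one row, matching the "Expected: 2, Got: 1" error shown below, while `null_result_fixed` returns two NULL rows.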

To Reproduce

Normal case - values are valid scalars or array nulls:

> select regexp_replace(col, 'a', 'c') from (values ('a'), ('b')) as tbl(col);
+---------------------------------------------+
| regexp_replace(tbl.col,Utf8("a"),Utf8("c")) |
+---------------------------------------------+
| c                                           |
| b                                           |
+---------------------------------------------+
2 row(s) fetched. 
Elapsed 0.001 seconds.

> select regexp_replace(col, ncol, 'c') from (values ('a', NULL), ('b', NULL)) as tbl(col, ncol);
+--------------------------------------------+
| regexp_replace(tbl.col,tbl.ncol,Utf8("c")) |
+--------------------------------------------+
|                                            |
|                                            |
+--------------------------------------------+

Failing case - pattern or replacement is a scalar NULL:

> select regexp_replace(col, NULL, 'c') from (values ('a'), ('b')) as tbl(col);
Internal error: UDF returned a different number of rows than expected. Expected: 2, Got: 1.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

> select regexp_replace(col, 'a', NULL) from (values ('a'), ('b')) as tbl(col);
Internal error: UDF returned a different number of rows than expected. Expected: 2, Got: 1.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

Expected behavior

No response

Additional context

No response

@Blizzara added the bug (Something isn't working) label Jul 11, 2024
@Weijun-H
Member

take
