feat: Add regexp_count function #12970

Merged Oct 18, 2024 (24 commits)

Omega359
Contributor

Which issue does this PR close?

Closes #12079 and is part of #11946. Follow-up to PR #12080 by @xinlifoobar, with some additional work (documentation, minor fixes, scalar test fixes).

Rationale for this change

Adds regexp_count, an additional useful regexp function, to DataFusion.

What changes are included in this PR?

Code, tests, documentation.

Are these changes tested?

Yes.

Are there any user-facing changes?

The docs are updated to include the new UDF.
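
For readers unfamiliar with the function, here is a rough sketch of what a regexp_count-style function computes: the number of non-overlapping matches of a pattern in a string. It is written in Python with the standard `re` module only for illustration and is not the code in this PR. The optional 1-based `start` position, the single-letter `flags` string (only `i` is handled here), and the choice to slice the input before matching are assumptions of this sketch. Python's regex syntax also differs from the Rust regex syntax DataFusion uses.

```python
import re


def regexp_count(value: str, pattern: str, start: int = 1, flags: str = "") -> int:
    """Count non-overlapping matches of `pattern` in `value`.

    Illustrative only: `start` is assumed to be a 1-based character
    position, and only the "i" (case-insensitive) flag is recognized.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    re_flags = re.IGNORECASE if "i" in flags else 0
    # Assumption of this sketch: skip the first (start - 1) characters,
    # then count matches in whatever text remains.
    remainder = value[start - 1:]
    return sum(1 for _ in re.finditer(pattern, remainder, re_flags))


print(regexp_count("abcAbAbc", "abc"))          # prints 1
print(regexp_count("abcAbAbc", "abc", 2, "i"))  # prints 1
print(regexp_count("aaaa", "aa"))               # prints 2 (matches do not overlap)
```

The first call finds only the leading `abc`. The second starts at the second character and, ignoring case, finds only the trailing `Abc`.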

@github-actions github-actions bot added the documentation, sqllogictest, and functions labels Oct 16, 2024
@Omega359
Contributor Author

Note that this PR does not resolve the performance degradation seen in this function compared to regexp_match or regexp_like (see #12080 (comment) for a benchmark). I believe that should be looked at in a follow-up ticket.

@Omega359 Omega359 marked this pull request as ready for review October 16, 2024 16:00
@Omega359 Omega359 changed the title Add regexp_count function feat: Add regexp_count function Oct 16, 2024
Contributor

@alamb alamb left a comment

Thank you @Omega359, this looks great. I will merge it in and file a follow-on ticket for enhancing its performance.

@alamb
Contributor

alamb commented Oct 18, 2024

FYI @xinlifoobar

@alamb alamb merged commit 73ba4c4 into apache:main Oct 18, 2024
27 checks passed
@alamb
Contributor

alamb commented Oct 18, 2024

Note that this PR does not resolve the performance degradation seen in this function compared to regexp_match or regexp_like (see #12080 (comment) for a benchmark). I believe that should be looked at in a follow-up ticket.

Filed #13011

@Omega359 Omega359 deleted the feature/regexp_count branch October 25, 2024 22:08
Labels
documentation, functions, sqllogictest
Development

Successfully merging this pull request may close these issues.

Add additional regexp function regexp_count()
3 participants