[SPARK-34244][SQL] Remove the Scala function version of regexp_extract_all #31346
HyukjinKwon
left a comment
Thanks @beliefer. Looks good, pending tests.
Kubernetes integration test starting
Kubernetes integration test status success
@cloud-fan WDYT?
Test build #134503 has finished for PR 31346 at commit
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @beliefer, @HyukjinKwon, @cloud-fan.
…t_all

### What changes were proposed in this pull request?
#27507 implements `regexp_extract_all` and added the Scala function version of it.
According to https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L41-L59, it seems good to remove the Scala function version, although I think `regexp_extract_all` is very useful, if we just reference the description.

### Why are the changes needed?
`regexp_extract_all` is less common.

### Does this PR introduce _any_ user-facing change?
'No'. `regexp_extract_all` was added in Spark 3.1.0, which isn't released yet.

### How was this patch tested?
Jenkins test.

Closes #31346 from beliefer/SPARK-24884-followup.

Authored-by: beliefer <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 99b6af2)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun @HyukjinKwon @cloud-fan Thanks!
Can someone help me understand why this was removed? I added this method to spark-daria a while back to fill the gap for the Spark community. I think the arguments for removing it were that this method was for SQL compliance and that it's accessible via expr. I've needed this method a lot in practical projects. It's a "must have", not a "nice to have", for a lot of analyses. Invoking the method via … Can we just implement all the SQL methods in Scala and Python so we have API consistency?
Could you send your opinion to the dev mailing list, @MrPowers? Specifically, the following. Thanks!
Yes, this looks like something we should discuss on the mailing list.
But I don't personally agree with that. Each language-specific API should focus on being friendly to that language. Many of these functions are SQL-specific and, at times, don't make much sense in other languages. Also we have an API … BTW, we should also think about the R-side APIs too, not only Scala and Python.
I very much agree with that. Having a language API for all SQL functions makes developing applications in an IDE much easier and more user-friendly (typing, autocompletion, less error-prone than using strings...).
What changes were proposed in this pull request?
#27507 implements `regexp_extract_all` and added the Scala function version of it.
According to https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L41-L59, it seems good to remove the Scala function version, although I think `regexp_extract_all` is very useful, if we just reference the description.
Why are the changes needed?
`regexp_extract_all` is less common.
Does this PR introduce any user-facing change?
'No'. `regexp_extract_all` was added in Spark 3.1.0, which isn't released yet.
How was this patch tested?
Jenkins test.
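For readers unfamiliar with the function under discussion, here is a minimal plain-Scala sketch of what `regexp_extract_all` is understood to compute: every match of a pattern in a string, reduced to one capture group. This is not Spark's implementation and does not use Spark at all. The object name `RegexpExtractAllSketch`, the helper `regexpExtractAll`, and the choice to return an empty string for a group that did not participate in a match are all assumptions made for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical illustration only; not the code removed by this PR.
object RegexpExtractAllSketch {

  // Returns, for every match of `pattern` in `input`, the text captured by
  // group `idx` (idx = 0 means the whole match). The default of 1 mirrors
  // the SQL function's documented default group index.
  def regexpExtractAll(input: String, pattern: String, idx: Int = 1): Seq[String] = {
    val regex = new Regex(pattern)
    regex
      .findAllMatchIn(input)
      // A group that did not participate in the match is null in the JVM
      // regex API; this sketch maps it to an empty string (an assumption).
      .map(m => Option(m.group(idx)).getOrElse(""))
      .toList
  }

  def main(args: Array[String]): Unit = {
    // First capture group of each "a-b" pair in the input.
    println(regexpExtractAll("100-200, 300-400", "([0-9]+)-([0-9]+)", 1))
  }
}
```

With the Scala function removed, the thread indicates the SQL function stays reachable from Scala through `expr`, for example `expr("regexp_extract_all(value, '([0-9]+)-([0-9]+)', 1)")` on a string column, where the column name `value` is hypothetical.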