[SPARK-34443][CORE] Replace symbol literals with Symbol constructor invocations to comply with Scala 2.13 #31569
|
For reviewers: This PR changes too many files though the changes themselves are simple. |
|
Kubernetes integration test starting |
|
I am okay doing this in one go. |
|
cc @rxin and @srowen FYI (from Do you use single-quote syntax for the DataFrame API?) |
|
Test build #135164 has started for PR 31569 at commit |
|
Test build #135159 has finished for PR 31569 at commit
|
|
Kubernetes integration test starting |
|
Test build #135166 has started for PR 31569 at commit |
|
Kubernetes integration test status failure |
|
I think it's ok. I did a lot of this a long time ago too just to tackle the compiler warnings. |
|
I share some of that frustration, though not to the same extreme. Breaking binary/source compatibility across every minor release is a ... painful choice for users to eat. I have always thought of Scala as a bit more of a research language, and it acts accordingly. There are upsides to deciding to change quickly. In this case, removing the weird back tick syntax is a good thing, but should it ever have been that way? But I'm not sure why it should go away entirely. As Spark is one of the major Scala projects, I'd hope nothing happens that really busts how Spark has to work. |
|
This is going to be a nightmare. I didn't realize this change had happened upstream in Scala 2.13. I felt like it'd be virtually impossible for most Spark users, including all Databricks customers, to upgrade in the future, because they would need to rewrite a lot of their code, and worse dependency code that they might not control. |
|
Hmm, that sounds bad. |
|
retest this please. |
|
Test build #135186 has started for PR 31569 at commit |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Just my 2 cents, it might be better to convert to |
|
|
  // int type can be up cast to long type
- val attrs1 = Seq('a.string, 'b.int)
+ val attrs1 = Seq(Symbol("a").string, Symbol("b").int)
If we replace this with a semantically equivalent form, I prefer $"a" over Symbol, at least for the SQL module. IIRC @cloud-fan left a similar comment in another PR.
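For readers comparing the spellings discussed in this comment, here is a minimal standalone sketch of how `Symbol("a")` and `$"a"` can both end up as the same column reference. It deliberately has no Spark dependency: `Col`, `ColumnSyntax`, `StringToCol` and `symbolToCol` are hypothetical names made up for this illustration, written in the spirit of the implicits Spark users import, and are not Spark's actual classes.

```scala
// Hypothetical stand-in for a column reference; NOT a Spark class.
final case class Col(name: String)

object ColumnSyntax {
  import scala.language.implicitConversions

  // Enables the $"a" spelling by adding a `$` method to StringContext.
  implicit class StringToCol(private val sc: StringContext) extends AnyVal {
    def $(args: Any*): Col = Col(sc.s(args: _*))
  }

  // Enables passing a Symbol wherever a Col is expected.
  implicit def symbolToCol(s: Symbol): Col = Col(s.name)
}

object SymbolDemo {
  import ColumnSyntax._

  def main(args: Array[String]): Unit = {
    val fromSymbol: Col = Symbol("a") // replacement for the deprecated 'a literal
    val fromInterp: Col = $"a"        // string-interpolator spelling

    // Both spellings resolve to the same (hypothetical) column reference.
    assert(fromSymbol == fromInterp)
    // The constructor form builds the same Symbol value on every call.
    assert(Symbol("a") == Symbol("a"))
    println(fromSymbol)
  }
}
```

The point of the sketch is only that the `Symbol(...)` form is a source-level respelling of the literal, while `$"..."` avoids `Symbol` altogether, which is why the two options are discussed separately in this thread.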
…mples and documents
### What changes were proposed in this pull request?
This PR replaces all the occurrences of symbol literals (`'name`) with string interpolation (`$"name"`) in examples and documents.
### Why are the changes needed?
Symbol literals are used to represent columns in Spark SQL, but the Scala community seems to be removing `Symbol` completely. As we discussed in #31569, we should first replace symbol literals with `$"name"` in user-facing examples and documents.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Build docs.
Closes #31615 from sarutak/replace-symbol-literals-in-doc-and-examples.
Authored-by: Kousuke Saruta <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
|
I asked @dragos to chime in and he started a discussion with the Scala team and Martin. Let's see what happens. |
|
Test build #136941 has finished for PR 31569 at commit
|
|
Is there any progress on this issue? |
|
@LuciferYang I keep watching the activity of the Scala community. It seems that Spark users can use the language import feature until they finish migrating their applications, but Spark itself needs to replace symbol literals. |
|
Backing up - this change isn't about removing usage of Symbol, but about a deprecated syntax for Symbol. This doesn't actually change anything but how it's expressed in source code, right? If so, why not do this in Spark (aside from the large code churn, of course)? Changing to Column syntax is another question. |
I intended to just replace symbol literal with
Yeah, at first the syntax change was a separate matter, but after the discussion, most of the symbol usage was to be removed. |
|
That's right. That would indeed be a bigger, problematic change. I'm saying that this change by itself isn't doing anything but removing deprecation warnings, so we're not arguing with it (except perhaps to debate the code churn). If the argument is - we may have to make a bigger more painful change later anyway, so this isn't worth it, I buy that too. But this by itself seems plausible to merge. |
|
fwiw, I will oppose any deprecation of The literal syntax was worth deprecating because it decreases the amount of Scala syntax, which makes everything easier for tooling authors, educators, etc. Whereas deprecating |
O.K, I don't care about replacing |
|
I think there are 2 topics here:
What do you think? |
Fortunately, the usage of Symbols in public non-internal APIs is limited.
So I think we can decide to deprecate these APIs later.
As I noticed in #31601, most usage of Symbols seems to be in testing DSL. |
|
With all due respect, the number of functions is not a good measure. Those functions are the most fundamental ones in the DataFrame API. Basically any time anybody wants to refer to a column... just look up stackoverflow. There are tons of answers with sample code doing ‘col.
I don’t think Scala should remove the symbol literal syntax given its damage. It looks like it’s possible it will be kept with an import and then Spark users wouldn’t need to suffer.
|
I mean that we don't need to decide how we treat those public APIs for now and it's easy even if we (unfortunately) need to replace it because there are three APIs.
Yes. As I mentioned here I think users can use the language import feature too. |
|
If the language import feature for symbol literals will always be there, I think we don't need any changes. We can probably wait until the Scala community decides to remove that language import. |
|
Or we can temporarily suppress the compilation warnings to make it look cleaner |
+1 |
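As a rough illustration of the "suppress the warnings" option, the sketch below silences the deprecation warning locally with `@nowarn`. It assumes a Scala 2 compiler that supports `scala.annotation.nowarn` (2.13.2 or later); `SuppressDemo` and `legacy` are hypothetical names. It is not the change that was eventually made in Spark. A build-wide `-Wconf` compiler option would be the other route, and its exact filter string would need to be checked against the compiler's actual message.

```scala
import scala.annotation.nowarn

object SuppressDemo {
  // The symbol literal below is deprecated in Scala 2.13; @nowarn hides
  // deprecation warnings for this one definition only.
  @nowarn("cat=deprecation")
  def legacy: Symbol = 'a

  def main(args: Array[String]): Unit = {
    // Same value as the constructor form, just without the compiler noise.
    assert(legacy == Symbol("a"))
  }
}
```

Suppression keeps the source unchanged, so the code would still stop compiling if the literal syntax were removed outright.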
|
Test build #137686 has finished for PR 31569 at commit
|
|
@cloud-fan create a new jira SPARK-35151 (#32261) |
As I understand it, the language import was added to give the Spark project extra time to migrate; I don't think the intention was for Spark to then turn around and say "yay, now that we have this import we don't need to do anything". |
It won't always be there, we've explicitly said that it can go away at any point: lampepfl/dotty-feature-requests#182 (comment). |
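For context, the language import discussed here can be sketched as follows. This assumes a Scala 3 compiler, where symbol literals are rejected unless `scala.language.deprecated.symbolLiterals` is imported. `ImportDemo` is a hypothetical name, and, as the comment above notes, the import itself may be removed in the future.

```scala
// Scala 3 only: re-enables the dropped 'name literal syntax for this file.
import scala.language.deprecated.symbolLiterals

object ImportDemo {
  def main(args: Array[String]): Unit = {
    val legacy = 'a            // accepted only because of the import above
    val portable = Symbol("a") // compiles without any import
    assert(legacy == portable)
  }
}
```

The constructor form is the only one of the two spellings that does not depend on a compiler-specific escape hatch.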
|
Yup, I think that if it eventually has to go away, then maybe it doesn't make sense to replace the deprecated syntax; just suppress the warnings, and take it out entirely in the future all at once. |
|
Do we need to continue this work? |
|
Test build #140475 has finished for PR 31569 at commit
|
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
This PR replaces all occurrences of symbol literals with `Symbol()` constructors.
Why are the changes needed?
As of Scala 2.13, symbol literals are deprecated, so when we build with Scala 2.13 and sbt, the compiler loudly informs us.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
I confirmed that `compile` and `test:compile` finish successfully with both Scala 2.12 and 2.13.