Skip to content

Conversation

@martin-g
Copy link
Member

@martin-g martin-g commented Feb 17, 2022

What changes were proposed in this pull request?

Replace symbols like 'abc with the more verbose `Symbol("abc") in the test code.

Why are the changes needed?

Building with Scala 2.13 produces a lot of warnings like the following ones:

[warn] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala:562:11: [deprecation @  | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("d") instead
[warn]           'd.cast("string"),
[warn]           ^
[warn] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala:563:11: [deprecation @  | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("e") instead
[warn]           'e.cast("string")).collect())

This should make it easier to upgrade to Scala 3 later.

Does this PR introduce any user-facing change?

No! The PR touches only test classes!

How was this patch tested?

The build at CI must be green!

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we did this for non-test code earlier. It should be OK as it is functionally identical and avoids a 2.13 deprecation warning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this an intended change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because I've created the branch for this PR from the branch of #35556. I will remove this commit from here.

@martin-g
Copy link
Member Author

@dongjoon-hyun @HyukjinKwon What is your opinion ? Would you merge such PR ?

How big should the PR be ? Fix all issues of this type in the whole or module by module ?

@srowen
Copy link
Member

srowen commented Feb 18, 2022

Heh, no right answer, but I'd say fix < 200 files at a time? maybe that gets to all test code? that would be a fine scope

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@srowen
Copy link
Member

srowen commented Feb 21, 2022

Is this all the occurences in test? this could be an OK unit to merge

@HyukjinKwon
Copy link
Member

There was a previous try to refer to (#31569) cc @sarutak FYI

@sarutak
Copy link
Member

sarutak commented Feb 22, 2022

For making it easy to upgrade to Scala 3, another option would be to import language.deprecated.symbolLiterals._ when we support Scala 3.
What do you think, all?

@srowen
Copy link
Member

srowen commented Feb 22, 2022

Does that just suppress the warning? I think just consistently using Symbol() is easier IMHO; there may not be that much left to change.

@sarutak
Copy link
Member

sarutak commented Feb 22, 2022

Yeah, I thought it's better to replace them but there were some objections to do it in the previous PR. So, I just suggest another possible solution.

@martin-g
Copy link
Member Author

I wasn't aware of #31569. There some people said that it might be better to migrate to s"abc" instead of Symbol("abc"), and I agree.
I am fine with any of the following:

  • close this PR as "Won't do"
  • finish it
  • re-work it to s""

@srowen
Copy link
Member

srowen commented Feb 22, 2022

Does s"..." do anything here? I thought that just expressed interpolated strings

@martin-g
Copy link
Member Author

Sorry! Muscle memory ... I meant $"abc" - wangyum@b428acb

@srowen
Copy link
Member

srowen commented Feb 26, 2022

If you want to rebase this and mark it ready for review, I think it could be OK. A good scope might be all occurrences in tests?

@martin-g
Copy link
Member Author

I will create a JIRA ticket and update the PR soon!

@martin-g martin-g changed the title [SPARK-TODO][TESTS] Dont use deprecate symbol api [SPARK-38351][TESTS] Don't use deprecate symbol API in test classes Feb 28, 2022
The recommended way is to use Symbol("name").
Fix deprecation warnings only in the test suites

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
@martin-g
Copy link
Member Author

The build fails due to :

 java.lang.RuntimeException: Failing because of negative scalastyle result
[error] 	at scala.sys.package$.error(package.scala:30)
[error] 	at org.scalastyle.sbt.Tasks$.handleResult$1(Plugin.scala:132)
[error] 	at org.scalastyle.sbt.Tasks$.doScalastyleWithConfig$1(Plugin.scala:187)
...
[error] /__w/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala:580: File line length exceeds 100 characters
[error] /__w/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala:927: File line length exceeds 100 characters
...

But executing ./dev/scalafmt leads to many other unrelated formatting changes like:

import org.apache.spark.sql.catalyst.expressions.{InSet, Literal, NamedExpression}
-import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{outstandingTimezonesIds, outstandingZoneIds}
+import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{
+  outstandingTimezonesIds,
+  outstandingZoneIds
+}
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
 import org.apache.spark.sql.execution.ProjectExec
 import org.apache.spark.sql.functions._
@@ -43,83 +46,54 @@ class ColumnExpressionSuite extends QueryTest with SharedSparkSession {
   import testImplicits._
 
   private lazy val booleanData = {
-    spark.createDataFrame(sparkContext.parallelize(
-      Row(false, false) ::
-      Row(false, true) ::
-      Row(true, false) ::
-      Row(true, true) :: Nil),
+    spark.createDataFrame(
+      sparkContext.parallelize(
+        Row(false, false) ::
+          Row(false, true) ::
+          Row(true, false) ::
+          Row(true, true) :: Nil),

It might take me a while to format manually just the related code.

@srowen
Copy link
Member

srowen commented Feb 28, 2022

Yeah don't bother with scalafmt here, just fix the long lines if there are only a few instances

E.g.:
[error] /__w/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala:927: File line length exceeds 100 characters
[error] /__w/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala:928: File line length exceeds 100 characters

./dev/scalafmt formats other unrelated lines too, so it is not really
usable

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
@martin-g martin-g marked this pull request as ready for review March 1, 2022 07:24
@martin-g
Copy link
Member Author

martin-g commented Mar 1, 2022

@srowen The PR is ready for review!

@srowen
Copy link
Member

srowen commented Mar 3, 2022

Merged to master

@srowen srowen closed this in 34618a7 Mar 3, 2022
@HeartSaVioR
Copy link
Contributor

Just to see our preference, given $"..." is not mentioned at all for deprecation and it is simpler than Symbol("..."), why not consistently using $"..."? My intuition is that ' would be used as it is simplest representation to refer a column, and $"..." is going to be simplest once we are enforced to use Symbol("...") instead.

@srowen
Copy link
Member

srowen commented Mar 3, 2022

Is that the same thing? $".." is shorthand for a Column, not Scala Symbol

@HeartSaVioR
Copy link
Contributor

Right, but what else we use Symbol except representing Column? I might be missing something, but from what I've seen from test codes, majority of usage has been the same for $"...".

@srowen
Copy link
Member

srowen commented Mar 3, 2022

Hm, I think you have a point. We definitely need Symbol in some parts of the Scala code, but I think they're rare and limited to parts that reason about closures, etc. For columns, yeah, in hindsight I agree.

Eh, hm, martin this might be my fault here. It may be smarter to turn all of those changes into $"..." after all if they're all really column refs. I totally missed that. We could do it in a follow up or revert this or whatever's easy, if you're willing.

@HeartSaVioR
Copy link
Contributor

Follow-up fix sounds good to me. Easier to work based on this since we now just need to try finding Symbol( and replace there. Finding shorthand of Symbol ' would be much more bothering work despite warning message will guide us.

@martin-g martin-g deleted the dont-use-deprecate-symbol-api branch March 4, 2022 07:15
@martin-g
Copy link
Member Author

martin-g commented Mar 4, 2022

Agreed! I will work on this soon!

@jiangxb1987
Copy link
Contributor

Do we still have any tests for '? Since this is still supported, we should have at least one test to make sure it's working and if it's broken by a future Scala version, we can notice it.

@martin-g
Copy link
Member Author

@jiangxb1987 Yes, there are still usages of ' in the tests. This PR changed just part of the test files to keep the diff not too big,
I didn't have time for the follow-up in the last two weeks but I will do it soon!

@srowen
Copy link
Member

srowen commented Mar 23, 2022

@martin-g any chance you have some time to take another look at this?

@martin-g
Copy link
Member Author

I promise to do it real soon! I have something else to finish first!
The new PR will be just for master branch, i.e. 3.4.x.

@martin-g
Copy link
Member Author

@srowen #35976

srowen pushed a commit that referenced this pull request Apr 2, 2022
…tests

### What changes were proposed in this pull request?

Replace symbols like 'abc and Symbol("abc") (where possible) with the more verbose $"abc" in the test code.

### Why are the changes needed?

This is a follow-up of #35560

Building with Scala 2.13 produces a lot of warnings like the following ones:

```
[warn] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala:562:11: [deprecation   | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("d") instead
[warn]           'd.cast("string"),
[warn]           ^
[warn] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala:563:11: [deprecation   | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("e") instead
[warn]           'e.cast("string")).collect())
```

This should make it easier to upgrade to Scala 3 later.

### Does this PR introduce _any_ user-facing change?

No! The PR touches only test classes!

### How was this patch tested?

The build at CI must be green!

Closes #35976 from martin-g/spark-38661-replace-symbol-with-dollarSign.

Lead-authored-by: Martin Tzvetanov Grigorov <[email protected]>
Co-authored-by: Martin Grigorov <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants