
[SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 #23098

Closed
srowen wants to merge 1 commit into apache:master from srowen:SPARK-26132

Conversation

@srowen
Member

@srowen srowen commented Nov 20, 2018

What changes were proposed in this pull request?

Remove Scala 2.11 support in build files and docs, and in various parts of code that accommodated 2.11. See some targeted comments below.

How was this patch tested?

Existing tests.
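
For context, here is a sketch of the cross-build steps this removal makes unnecessary; once 2.12 is the only supported Scala version, the explicit switch below (commands as documented for Spark 2.4; exact flags may vary) disappears:

  # Spark 2.4-era steps to build against Scala 2.12 instead of the default 2.11
  ./dev/change-scala-version.sh 2.12
  ./build/mvn -Pscala-2.12 -DskipTests clean package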

R/pkg/R/sparkR.R Outdated
Member Author

@felixcheung is it OK to refer to _2.12 artifacts here? I don't think this one actually exists, but is it just an example?

Contributor

I think there's a separate discussion about even using this package as an example, since it has been included in Spark since 2.4.

Member Author

@felixcheung was the conclusion that we can make this a dummy package? I just want to avoid showing _2.11 usage here.

Member

Yes, a dummy name is completely fine with me.
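
Since the thread settled on a dummy name, the doc example could take a shape like the following (the coordinate is entirely made up; the point is only the _2.12 suffix, i.e. the Scala binary version baked into the Maven artifact ID):

  # Hypothetical dummy coordinate, for documentation purposes only
  ./bin/spark-shell --packages com.example:mypackage_2.12:1.0.0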

Member Author

@gengliangwang this was the update I was talking about to the .cmd script. You can follow up with this change, uncommented, if you like, separately from this PR.

Member Author

@vanzin @cloud-fan you may want to look at this. It's getting a little hairy in this script.

I recall that the goal was to use this script to create older Spark releases, so it needs logic for older versions. But looking at it, I don't think it actually creates quite the same release as older versions anyway. Is it OK to clean house here and assume only Spark 3 will be built from this script? I already deleted some really old logic here (Spark < 2.2).

Contributor

I prefer keeping the script in master working against all currently supported versions. I find it pretty hard to keep things in sync across different branches, especially if you need to fix things in the middle of an RC cycle. Having master be the source of truth for this makes some of that pain go away, at the cost of some added logic here.

I think the main problem now is that the default Scala version for 3.0 is different from 2.4's, which is the only added complication I can think of here...
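
To make that complication concrete, here is a minimal sketch of per-branch gating of the default Scala version (hypothetical variable names, not the actual release-build.sh logic):

  # Pick the default Scala version by release line (sketch only)
  case "$SPARK_VERSION" in
    3.*)      SCALA_VERSION=2.12 ;;  # 3.x builds only Scala 2.12
    2.4.*)    SCALA_VERSION=2.11 ;;  # 2.4 defaults to 2.11, with an optional 2.12 build
    2.[23].*) SCALA_VERSION=2.11 ;;  # 2.2/2.3 build only Scala 2.11
    *)        echo "Unsupported version: $SPARK_VERSION" >&2; exit 1 ;;
  esac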

Member Author

Just a question, though: if you're releasing 2.3, wouldn't you use the release script as of 2.3?

As it stands, I think the script works for Spark 1.6 and 2.x but not earlier versions. What about drawing the lines at major releases, so that this can simplify further?

Right now it's basically trying to support 2.2+, the non-EOL releases, which seems reasonable?

Contributor

if you're releasing 2.3 wouldn't you use the release script as of 2.3?

I started like that when I RM'ed 2.3.1, but the scripts were broken enough that I had to fix a bunch of stuff in master. At that point, it didn't make sense to use the scripts in the 2.3 branch anymore, and keeping them in sync was wasted effort.

basically trying to support 2.2+, which seems reasonable?

That's my view. The release scripts in master should be usable for all non-EOL releases.

Member Author

OK, the issue is that a working release script for 2.3.0 might only appear in 2.3.1, because of last-minute hacks during the release process. I can see being careful not to drop 2.3.0 logic immediately after 2.3.0; maybe at 2.4.0, but that's aggressive.

At least, sure, we can forget Spark < 2.3 in Spark 3.0.0's build script. Maybe later we can decide to go further. I'll work on making it that way.

Member Author

@maryannxue I think you helped work on this part ... if we're only on Scala 2.12 now can we simplify this further?

@SparkQA

SparkQA commented Nov 20, 2018

Test build #99086 has finished for PR 23098 at commit 0420c2d.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 21, 2018

Test build #99087 has finished for PR 23098 at commit 1c1f000.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gengliangwang gengliangwang Nov 21, 2018

I am not familiar with .cmd scripts. Should we keep the quotes here, "2.12"?

Member

@HyukjinKwon HyukjinKwon Nov 21, 2018

Nope, on Windows the string stays as-is, including the quotes, if it's quoted ... haha, odd (to me).

Member Author

Oh I see, so we shouldn't add quotes to values like SPARK_ENV_CMD above, but should use them in if conditions, call, etc.?

Member

Yea, I think we shouldn't quote, if I remember this correctly. Let me test and get back to you today or tomorrow. I have to fly to Korea tomorrow for a one-week vacation :D.


Member

@HyukjinKwon HyukjinKwon Nov 22, 2018

For call, it's a bit different (it can be quoted, IIRC): https://ss64.com/nt/call.html

Member

Here, I ran some simple commands:

C:\>set A=aa

C:\>ECHO %A%
aa

C:\>set A="aa"

C:\>ECHO %A%
"aa"

C:\>call "python.exe"
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit(0)

C:\>call python.exe
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit(0)
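
For contrast, POSIX shells strip the quotes during expansion, which is why the same pattern is harmless in the bash release scripts:

  # bash: the shell removes the quotes before assignment,
  # so the value is aa, not "aa" as with cmd's set A="aa"
  A="aa"
  echo "$A"   # prints aa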

@SparkQA

SparkQA commented Nov 28, 2018

Test build #99382 has finished for PR 23098 at commit 2b8d770.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Dec 4, 2018

Note: I'm holding on to this PR for a while, as I understand that removing 2.11 support just now might be disruptive to downstream builds. I will look at merging it in a few weeks. Right now it's an FYI.

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99667 has finished for PR 23098 at commit 96f9c41.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Jan 9, 2019

Still holding on to this. I intend to merge it before Spark 3, but have gotten feedback that it would be helpful to hold off for a while.

@SparkQA

SparkQA commented Jan 9, 2019

Test build #100974 has finished for PR 23098 at commit 87fc1f1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai dbtsai self-requested a review January 14, 2019 21:52
Member

@dbtsai dbtsai left a comment

LGTM for removing Scala 2.11.

@srowen
Member Author

srowen commented Feb 23, 2019

Rebased to keep this up to date; not merging just yet.

@SparkQA

SparkQA commented Feb 23, 2019

Test build #102713 has finished for PR 23098 at commit ffbad3e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Feb 26, 2019

@srowen @dbtsai Have we decided to remove Scala 2.11 support in 3.0? If not, shall we start a vote?

@srowen
Member Author

srowen commented Feb 26, 2019

I think there is support for it, and I don't know that we need a vote. Part of the reason I'm delaying is that there's no hurry, and in a way I wouldn't mind keeping Spark 3 pretty nearly compatible with 2.11 even if we won't support it. Leaving the 2.11 build in place may let us avoid unnecessary incompatibilities with 2.11. I was planning to check on support for this again in a month or so.

@SparkQA

SparkQA commented Mar 19, 2019

Test build #103686 has started for PR 23098 at commit d3adda6.

@skonto
Contributor

skonto commented Mar 21, 2019

@srowen +100 for removing 2.11. Let's move forward: 2.11 is old, and Spark 3.0.0 gives us an opportunity to get rid of it. I believe we should focus on 2.12 and on how to support 2.13.

@SparkQA

SparkQA commented Mar 23, 2019

Test build #4655 has finished for PR 23098 at commit d3adda6.

  • This patch fails due to an unknown error code, -9.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@srowen srowen changed the title [WIP][SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 [SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 Mar 23, 2019
@srowen
Member Author

srowen commented Mar 23, 2019

Going to merge this if the next set of tests passes. Last call for comments.

@SparkQA

SparkQA commented Mar 23, 2019

Test build #103850 has finished for PR 23098 at commit b212c27.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2019

Test build #103861 has finished for PR 23098 at commit 7372853.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Mar 25, 2019

Merged to master

@srowen srowen closed this in 8bc304f Mar 25, 2019
@jzhuge
Member

jzhuge commented Mar 25, 2019

@srowen So awesome to merge this. Does it mean we can revert [SPARK-27250][TEST-MAVEN][BUILD] Scala 2.11 maven compile should target Java 1.8 now?

@srowen
Member Author

srowen commented Mar 25, 2019

@jzhuge no, that change is still valid. We still need to target Java 1.8 now, and we will need to continue to do so even when Java 11 support works.

@srowen srowen deleted the SPARK-26132 branch March 26, 2019 00:13
@kiszk
Member

kiszk commented Aug 25, 2019

@srowen I realized that this PR causes an error when we execute do-release-docker.sh for branch-2.3:

...
Checked out revision 35358.
Copying release tarballs
cp: cannot stat 'pyspark-*': No such file or directory

For Spark 2.4, PUBLISH_SCALA_2_11=1, PUBLISH_SCALA_2_10=1
For Spark 2.3, PUBLISH_SCALA_2_11=1, PUBLISH_SCALA_2_10=0
Since this change does not pass "withpip,withr", I think the script does not build the pyspark-* and SparkR_* artifacts.

Should we restore the original code in the following if ... fi block:

  if [[ $PUBLISH_SCALA_2_11 = 1 ]]; then
    ...
  fi
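
For illustration, a hedged sketch of what restoring that gating could look like; make_binary_release and its arguments are assumptions based on the description above, not the verbatim script:

  # Sketch: only the 2.11 build passed "withpip,withr", which is what produced
  # the pyspark-* and SparkR_* tarballs the copy step later expects to find
  if [[ $PUBLISH_SCALA_2_11 = 1 ]]; then
    make_binary_release "hadoop2.7" "$HADOOP27_PROFILES" "withpip,withr"
  fi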

cc @dongjoon-hyun

@dongjoon-hyun
Member

Hi, @kiszk. You should not use the master branch release script.
For branch-2.4, I'm using the release script in branch-2.4, as you can see in my VOTE email.

cc @gatorsmile , @vanzin

@dongjoon-hyun
Member

BTW, that's the reason why we made 2.4.3 urgently after 2.4.2.

@dongjoon-hyun
Member

FYI, I'm blocked by a recent Jekyll release issue. I'll make a PR soon against master and branch-2.4, and I'll ping you there, too.

@kiszk
Member

kiszk commented Aug 25, 2019

@dongjoon-hyun Thank you for your suggestion. Got it for 2.4.3.
I will use the script in branch-2.3.

@kiszk
Member

kiszk commented Aug 25, 2019

I will add it to https://spark.apache.org/release-process.html

@dongjoon-hyun
Member

BTW, I haven't tested the branch-2.3 release script before. As you know, I've only been testing on branch-2.4. Please test it before releasing.

@kiszk
Member

kiszk commented Aug 25, 2019

Branch-2.3 is pretty old. I realized there is no do-release-docker.sh script there. I will check the old page or adapt do-release-docker.sh for the 2.3 release.

@dongjoon-hyun
Member

Although this is the Apache Spark 2.3.x EOL release, you need to commit whatever you do to branch-2.3. Otherwise, it will not be reproducible for others.

@kiszk
Member

kiszk commented Aug 26, 2019

Yes, you are right. If the backport of do-release-docker.sh works well, I will commit it to branch-2.3.

dongjoon-hyun pushed a commit that referenced this pull request Aug 30, 2019
### What changes were proposed in this pull request?
This PR re-enables `do-release-docker.sh` for branch-2.3.

According to the release manager of Spark 2.3.3, maropu, `do-release-docker.sh` in the master branch was used for that release. After applying #23098, the script does not work for branch-2.3.

### Why are the changes needed?
This PR simplifies the release process in branch-2.3.

While Spark 2.3.x will not be released further, as dongjoon-hyun [suggested](#23098 (comment)), it would be good to include this change:

1. so that others can reproduce this release
2. to make any future urgent release simple

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
No test is added.
This PR is used to create Spark 2.3.4-rc1

Closes #25607 from kiszk/SPARK-28891.

Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
alee-altiscale pushed a commit to Altiscale/spark that referenced this pull request Apr 15, 2020
dongjoon-hyun pushed a commit that referenced this pull request Aug 4, 2023
### What changes were proposed in this pull request?
`BytecodeUtils` and `BytecodeUtilsSuite` were introduced in [Added the BytecodeUtils class for analyzing bytecode](ae12d16).

#23098 deleted `BytecodeUtilsSuite`, and after #35566, `BytecodeUtils` is no longer used.

So this PR removes `BytecodeUtils` from the `graphx` module.

### Why are the changes needed?
Clean up unnecessary code.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

Closes #42343 from LuciferYang/SPARK-44674.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
SteBaum pushed a commit to TOSIT-IO/spark that referenced this pull request Mar 7, 2024