[SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 #23098
srowen wants to merge 1 commit into apache:master
Conversation
R/pkg/R/sparkR.R
@felixcheung is it OK to refer to _2.12 artifacts here? I don't think this one actually exists, but is it just an example?
I think there's a separate discussion about even using this as an example, since that package has been in Spark since 2.4.
@felixcheung was the conclusion that we can make this a dummy package? I just want to avoid showing _2.11 usage here.
Yes, a dummy name is completely fine with me.
bin/load-spark-env.cmd
@gengliangwang this was the update I was talking about to the .cmd script. You can follow up with this change, uncommented, if you like, separately from this PR.
dev/create-release/release-build.sh
@vanzin @cloud-fan you may want to look at this. It's getting a little hairy in this script.
I recall that the goal was to use this script to create older Spark releases, so it needs logic for older versions. But looking at it, I don't think it actually creates quite the same release as older versions anyway. Is it OK to clean house here and assume only Spark 3 will be built from this script? I already deleted some really old logic here (Spark < 2.2)
I prefer keeping the script in master working against all currently supported versions. I find it pretty hard to keep things in sync across different branches, especially if you need to fix things in the middle of an RC cycle. Having master be the source of truth for this makes some of that pain go away, at the cost of some added logic here.
I think the main problem now is that for 3.0 the default Scala version is different than for 2.4, which is the only added complication I can think of here...
Just a question though, if you're releasing 2.3 wouldn't you use the release script as of 2.3?
I think the script already works for Spark 1.6 and 2.x, but not earlier versions. What about drawing the lines at major releases, so that this can be simplified further?
Right now I think it's basically trying to support 2.2+, which are the non-EOL releases, which seems reasonable?
if you're releasing 2.3 wouldn't you use the release script as of 2.3?
I started like that when I RM'ed 2.3.1, but the scripts were broken enough that I had to fix a bunch of stuff in master. At that point, it didn't make sense to use the scripts in the 2.3 branch anymore, and keeping them in sync was kinda wasted effort.
basically trying to support 2.2+, which seems reasonable?
That's my view. The release scripts in master should be usable for all non-EOL releases.
OK, the issue is that a working release script for 2.3.0 might only appear in 2.3.1 because of last-minute hacks during the release process. I can see being careful not to drop 2.3.0 logic immediately after 2.3.0. Maybe at 2.4.0, but that's aggressive.
At least, sure, we can forget Spark < 2.3 in Spark 3.0.0's build script. Maybe later we decide to go further. I'll work on making it that way.
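As a hedged sketch of the version-gated defaults being discussed (not the actual `release-build.sh` logic; the function name and version mapping are illustrative), the kind of branching the script needs once 3.x and 2.x default to different Scala versions could look like:

```shell
# Hypothetical sketch: pick a default Scala version from the Spark version
# being released. Spark 3.x defaults to Scala 2.12, while the 2.x lines
# defaulted to 2.11 (with 2.12 as an optional build for 2.4).
default_scala_version() {
  case "$1" in
    3.*) echo 2.12 ;;
    2.*) echo 2.11 ;;
    *)   echo "unsupported Spark version: $1" >&2; return 1 ;;
  esac
}

default_scala_version 3.0.0   # prints 2.12
default_scala_version 2.4.0   # prints 2.11
```

Keeping this mapping in one place is what lets master remain the source of truth for releasing all non-EOL branches, at the cost of the extra conditional.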
@maryannxue I think you helped work on this part ... if we're only on Scala 2.12 now can we simplify this further?
Test build #99086 has finished for PR 23098 at commit
Test build #99087 has finished for PR 23098 at commit
bin/load-spark-env.cmd
I am not familiar with .cmd scripts. Should we keep the quotes here, "2.12"?
Nope, the string keeps the quotes as-is if it's quoted on Windows ... haha, odd (to me).
Oh I see, so we shouldn't add quotes to values like SPARK_ENV_CMD above, but use them in if conditions, call, etc?
Yea, I think we shouldn't quote, if I remember this correctly. Let me test and get back to you today or tomorrow. I'll have to fly to Korea for my one-week vacation starting tomorrow :D.
but at least that's what I remember when I tested https://github.com/apache/spark/blob/master/bin/spark-sql2.cmd#L23 line.
For call, it's a bit different (it can be quoted IIRC) (https://ss64.com/nt/call.html)
Here, I ran some simple commands:
C:\>set A=aa
C:\>ECHO %A%
aa
C:\>set A="aa"
C:\>ECHO %A%
"aa"
C:\>call "python.exe"
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit(0)
C:\>call python.exe
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit(0)
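To summarize the thread, here is a hedged .cmd sketch of the convention being discussed (the variable name is illustrative, not taken from the actual load-spark-env.cmd script):

```bat
rem Illustrative sketch only: cmd keeps quotes as part of a SET value,
rem so assign without quotes...
set SCALA_VERSION=2.12
rem ...and instead quote both sides of an IF comparison, so the comparison
rem still parses even when the variable is empty or contains spaces.
if "%SCALA_VERSION%"=="2.12" (
  echo Using Scala %SCALA_VERSION%
)
```

If `set SCALA_VERSION="2.12"` were used instead, the later comparison would effectively test `""2.12""=="2.12"` and fail, matching the behavior shown in the session above.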
Test build #99382 has finished for PR 23098 at commit
Note I'm holding on to this PR for a while as I understand it might be disruptive to downstream builds to remove 2.11 support just now. Will look at merging it in weeks. Right now it's an FYI.
Test build #99667 has finished for PR 23098 at commit
Still holding on to this. I intend to merge it before Spark 3, but have gotten feedback that it would be helpful to hold off for a while.
Test build #100974 has finished for PR 23098 at commit
Rebased to keep this up to date; not merging just yet.
Test build #102713 has finished for PR 23098 at commit
I think there is support for it, and I don't know if we need a vote. Part of the reason I'm delaying is that there's no hurry, and in a way I wouldn't mind keeping Spark 3 pretty nearly compatible with 2.11, even if we won't support it. Leaving the 2.11 build maybe lets us avoid unnecessary incompatibilities with 2.11. I was planning to check on support for this again in a month or so.
Test build #103686 has started for PR 23098 at commit |
@srowen +100 for removing 2.11, let's move forward, 2.11 is old and Spark 3.0.0 gives us an opportunity to get rid of it. I believe we should focus on 2.12 and how to support 2.13. |
Test build #4655 has finished for PR 23098 at commit
Going to merge this if the next set of tests passes. Last call for comments.
Test build #103850 has finished for PR 23098 at commit
…ts of code that accommodated 2.11.
Test build #103861 has finished for PR 23098 at commit
Merged to master
@srowen So awesome to merge this. Does it mean we can revert
@jzhuge no, that change is still valid. We still need to target Java 1.8 for now, and will need to continue to do so even when Java 11 support works.
For Spark 2.4, should we use the original code in the following
Hi, @kiszk. You should not use cc @gatorsmile, @vanzin
BTW, that's the reason why we made
FYI, I'm blocked by a recent Jekyll release issue. I'll make a PR soon against
@dongjoon-hyun Thank you for your suggestion. Got it for
I will add it to
BTW, I didn't test
Although this is the Apache Spark 2.3.x EOL release, you need to commit to
Yes, you are right. If backporting of
### What changes were proposed in this pull request?
This PR re-enables `do-release-docker.sh` for branch-2.3. According to the release manager of Spark 2.3.3 maropu, `do-release-docker.sh` in the master branch. After applying #23098, the script does not work for branch-2.3.
### Why are the changes needed?
This PR simplifies the release process in branch-2.3. While Spark 2.3.x will not be released further, as dongjoon-hyun [suggested](#23098 (comment)), it would be good to put in this change:
1. to let others reproduce this release
2. to make future urgent releases simple
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
No test is added. This PR is used to create Spark 2.3.4-rc1.
Closes #25607 from kiszk/SPARK-28891.
Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
`BytecodeUtils` and `BytecodeUtilsSuite` were introduced in [Added the BytecodeUtils class for analyzing bytecode](ae12d16). #23098 deleted `BytecodeUtilsSuite`, and after #35566, `BytecodeUtils` is no longer used. So this PR removes `BytecodeUtils` from the `graphx` module.
### Why are the changes needed?
Clean up unnecessary code.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Pass GitHub Actions.
Closes #42343 from LuciferYang/SPARK-44674.
Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Remove Scala 2.11 support in build files and docs, and in various parts of code that accommodated 2.11. See some targeted comments below.
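For context, a hedged sketch of what this change retires (the helper-script and profile names follow the Spark build docs and are assumptions here, not taken from this PR's diff): before it, a cross-Scala-version build meant switching the POMs and activating a profile; after it, only 2.12 remains and every published artifact ID carries the `_2.12` binary-version suffix.

```shell
# Hedged sketch, not the actual build tooling. Before this PR a 2.11 build
# looked roughly like:
#   ./dev/change-scala-version.sh 2.11
#   ./build/mvn -Pscala-2.11 -DskipTests clean package
# After it, the artifact-naming rule collapses to a single suffix:
spark_artifact() {
  # e.g. spark-core -> spark-core_2.12 (Scala binary version suffix)
  echo "${1}_2.12"
}

spark_artifact spark-core   # prints spark-core_2.12
```

This is why downstream builds that still publish or consume `_2.11` artifacts are the ones disrupted by merging this, as noted earlier in the thread.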
### How was this patch tested?
Existing tests.