Update spark 4.0.0-preview2 #17622

Merged
tianon merged 2 commits into docker-library:master from yaooqinn:patch-1 on Oct 8, 2024

@yaooqinn
Contributor

yaooqinn commented Sep 26, 2024

yaooqinn requested a review from a team as a code owner on September 26, 2024 02:06

@tianon
Member

tianon commented Sep 26, 2024

+    ln -s "$(basename $(ls /opt/spark/examples/jars/spark-examples_*.jar))" /opt/spark/examples/jars/spark-examples.jar; \

This chain is definitely eyebrow-raising 😅

Can you explain a bit what it's trying to do? Does /opt/spark/examples/jars/spark-examples_*.jar match multiple things, or just one? Perhaps the --relative flag to ln is what you're looking for here so you can avoid both subshells?

@yaooqinn
Contributor Author

This is a user-experience improvement: it makes the examples jar easier to reference in quickstart instructions for the Spark examples.

FYI, apache/spark-docker#67

@tianon
Member

tianon commented Sep 30, 2024

Sorry, let me be more clear -- ls in any scripting context is almost always a "code smell" because it is primarily a user interface, not intended for scripting use.

In this case, I think what you meant by this:

+    ln -s "$(basename $(ls /opt/spark/examples/jars/spark-examples_*.jar))" /opt/spark/examples/jars/spark-examples.jar; \

was really much more simply stated as:

+    ln -sr /opt/spark/examples/jars/spark-examples_*.jar /opt/spark/examples/jars/spark-examples.jar; \

(with what should be the exact same end result, but with less fiddly subshell layers in between where things could unintentionally go wrong)
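
To see why the two forms are expected to agree, here is a minimal sketch that builds both links in a scratch directory and compares the targets they store. The scratch directory and the `via-ls.jar` / `via-relative.jar` link names are made up for illustration, and `ln -r` (`--relative`) is assumed to be the GNU coreutils implementation, as found in Ubuntu-based images.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory standing in for /opt/spark/examples/jars.
demo_dir="$(mktemp -d)"
touch "$demo_dir/spark-examples_2.13-4.0.0-preview2.jar"

# Original form: ls inside basename, two nested subshells.
ln -s "$(basename $(ls "$demo_dir"/spark-examples_*.jar))" "$demo_dir/via-ls.jar"

# Suggested form: the shell expands the glob and ln works out the
# relative target itself (GNU coreutils -r / --relative).
ln -sr "$demo_dir"/spark-examples_*.jar "$demo_dir/via-relative.jar"

# readlink without options prints the target exactly as stored in the link.
target_ls="$(readlink "$demo_dir/via-ls.jar")"
target_relative="$(readlink "$demo_dir/via-relative.jar")"
echo "ls + basename: $target_ls"
echo "ln -sr:        $target_relative"
```

Both links should store the bare file name, so they keep resolving if the directory is moved as a whole.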

@tianon
Member

tianon commented Sep 30, 2024

Another alternative that's still valid but uses only a single subshell (still not ideal, but more correct than the current version, which also shells out to ls):

+    ln -s "$(basename /opt/spark/examples/jars/spark-examples_*.jar)" /opt/spark/examples/jars/spark-examples.jar; \

(The ideal solution would be to ship this symlink as part of the original upstream dist, but that's perhaps a separate topic.)
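
The earlier question about whether the glob matches one file or several matters for this form: with exactly two matches, `basename` would read the second path as a suffix to strip and the link would silently point at the first file. A minimal sketch of a guard follows, assuming a POSIX shell; the helper name `link_single_match` and the scratch directory are hypothetical and not part of the actual Dockerfile.

```shell
#!/bin/sh
set -eu

# link_single_match DIR PATTERN LINK_NAME
# Creates DIR/LINK_NAME -> <matched file name>, but only when PATTERN
# matches exactly one existing file in DIR. (Hypothetical helper.)
link_single_match() {
    dir="$1"; pattern="$2"; link_name="$3"
    # Let the shell expand the glob into the positional parameters; no ls.
    # $pattern is deliberately unquoted so that the glob is expanded.
    set -- "$dir"/$pattern
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one match for $dir/$pattern" >&2
        return 1
    fi
    ln -s "$(basename "$1")" "$dir/$link_name"
}

# Demonstration in a scratch directory standing in for the jars directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/spark-examples_2.13-4.0.0-preview2.jar"
link_single_match "$demo_dir" 'spark-examples_*.jar' spark-examples.jar
readlink "$demo_dir/spark-examples.jar"
```

In a Dockerfile `RUN` chain that starts with `set -ex`, a non-zero return from such a guard would stop the build instead of leaving a wrong link behind.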

yaooqinn added a commit to apache/spark-docker that referenced this pull request Oct 8, 2024
…ples.jar (#73)

### What changes were proposed in this pull request?

Address the comments in docker-library/official-images#17622 (comment) from the Docker official-images maintainers


### Why are the changes needed?

Use fewer fiddly subshell layers in between, where things could unintentionally go wrong


### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

```
docker run -it --rm scala2.13-java17-ubuntu ls -al /opt/spark/examples/jars  | tail -n6
drwxr-xr-x 4 spark spark    4096 Sep 16 04:02 ..
-rw-r--r-- 1 spark spark  232248 Sep 16 04:02 jackson-core-asl-1.9.13.jar
-rw-r--r-- 1 spark spark  780664 Sep 16 04:02 jackson-mapper-asl-1.9.13.jar
-rw-r--r-- 1 spark spark   80424 Sep 16 04:02 scopt_2.13-3.7.1.jar
-rw-r--r-- 1 spark spark 1591043 Sep 16 04:02 spark-examples_2.13-4.0.0-preview2.jar
lrwxrwxrwx 1 root  root       38 Oct  8 05:41 spark-examples.jar -> spark-examples_2.13-4.0.0-preview2.jar
```
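
The listing above was inspected by eye; the same property can also be checked non-interactively. A minimal sketch, assuming a POSIX shell with `readlink` available; the helper name `check_examples_link` and the scratch directory are hypothetical, and inside the image the argument would be `/opt/spark/examples/jars/spark-examples.jar`.

```shell
#!/bin/sh
set -eu

# check_examples_link LINK
# Succeeds only if LINK is a symlink whose stored target is a bare file
# name (no directory part) and the link resolves. (Hypothetical helper.)
check_examples_link() {
    link="$1"
    [ -L "$link" ] || { echo "$link is not a symlink" >&2; return 1; }
    target="$(readlink "$link")"
    case "$target" in
        */*) echo "$link points outside its directory: $target" >&2; return 1 ;;
    esac
    # -e follows the symlink, so this fails for a dangling link.
    [ -e "$link" ] || { echo "$link is dangling" >&2; return 1; }
    echo "$link -> $target"
}

# Demonstration in a scratch directory that mimics the layout listed above.
demo_dir="$(mktemp -d)"
touch "$demo_dir/spark-examples_2.13-4.0.0-preview2.jar"
ln -s spark-examples_2.13-4.0.0-preview2.jar "$demo_dir/spark-examples.jar"
check_examples_link "$demo_dir/spark-examples.jar"
```
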
@yaooqinn
Contributor Author

yaooqinn commented Oct 8, 2024

Thank you @tianon

I've addressed your comments in apache/spark-docker#73

@github-actions

github-actions bot commented Oct 8, 2024

Diff for be1319b:
diff --git a/_bashbrew-cat b/_bashbrew-cat
index 75fa442..dfcdb83 100644
--- a/_bashbrew-cat
+++ b/_bashbrew-cat
@@ -61,22 +61,42 @@ Architectures: amd64, arm64v8
 GitCommit: b9f1f8e8ebed1959c2be3864a114b52f67519092
 Directory: 3.5.2/scala2.12-java17-ubuntu
 
-Tags: 4.0.0-preview1-scala2.13-java17-python3-r-ubuntu
+Tags: 4.0.0-preview2-scala2.13-java17-python3-r-ubuntu
 Architectures: amd64, arm64v8
-GitCommit: b9f1f8e8ebed1959c2be3864a114b52f67519092
-Directory: 4.0.0-preview1/scala2.13-java17-python3-r-ubuntu
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java17-python3-r-ubuntu
 
-Tags: 4.0.0-preview1-scala2.13-java17-python3-ubuntu, 4.0.0-preview1-python3, 4.0.0-preview1
+Tags: 4.0.0-preview2-scala2.13-java17-python3-ubuntu, 4.0.0-preview2-python3, 4.0.0-preview2
 Architectures: amd64, arm64v8
-GitCommit: b9f1f8e8ebed1959c2be3864a114b52f67519092
-Directory: 4.0.0-preview1/scala2.13-java17-python3-ubuntu
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java17-python3-ubuntu
 
-Tags: 4.0.0-preview1-scala2.13-java17-r-ubuntu, 4.0.0-preview1-r
+Tags: 4.0.0-preview2-scala2.13-java17-r-ubuntu, 4.0.0-preview2-r
 Architectures: amd64, arm64v8
-GitCommit: b9f1f8e8ebed1959c2be3864a114b52f67519092
-Directory: 4.0.0-preview1/scala2.13-java17-r-ubuntu
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java17-r-ubuntu
 
-Tags: 4.0.0-preview1-scala2.13-java17-ubuntu, 4.0.0-preview1-scala
+Tags: 4.0.0-preview2-scala2.13-java17-ubuntu, 4.0.0-preview2-scala
 Architectures: amd64, arm64v8
-GitCommit: b9f1f8e8ebed1959c2be3864a114b52f67519092
-Directory: 4.0.0-preview1/scala2.13-java17-ubuntu
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java17-ubuntu
+
+Tags: 4.0.0-preview2-scala2.13-java21-python3-r-ubuntu
+Architectures: amd64, arm64v8
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java21-python3-r-ubuntu
+
+Tags: 4.0.0-preview2-scala2.13-java21-python3-ubuntu, 4.0.0-preview2-java21-python3, 4.0.0-preview2-java21
+Architectures: amd64, arm64v8
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java21-python3-ubuntu
+
+Tags: 4.0.0-preview2-scala2.13-java21-r-ubuntu, 4.0.0-preview2-java21-r
+Architectures: amd64, arm64v8
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java21-r-ubuntu
+
+Tags: 4.0.0-preview2-scala2.13-java21-ubuntu, 4.0.0-preview2-java21-scala
+Architectures: amd64, arm64v8
+GitCommit: 059a2817e53ac7c0c408196f9eb91397a99ec84e
+Directory: 4.0.0-preview2/scala2.13-java21-ubuntu
diff --git a/_bashbrew-list b/_bashbrew-list
index 13fb4f8..5a0fa41 100644
--- a/_bashbrew-list
+++ b/_bashbrew-list
@@ -22,14 +22,22 @@ spark:3.5.2-scala2.12-java17-python3-r-ubuntu
 spark:3.5.2-scala2.12-java17-python3-ubuntu
 spark:3.5.2-scala2.12-java17-r-ubuntu
 spark:3.5.2-scala2.12-java17-ubuntu
-spark:4.0.0-preview1
-spark:4.0.0-preview1-python3
-spark:4.0.0-preview1-r
-spark:4.0.0-preview1-scala
-spark:4.0.0-preview1-scala2.13-java17-python3-r-ubuntu
-spark:4.0.0-preview1-scala2.13-java17-python3-ubuntu
-spark:4.0.0-preview1-scala2.13-java17-r-ubuntu
-spark:4.0.0-preview1-scala2.13-java17-ubuntu
+spark:4.0.0-preview2
+spark:4.0.0-preview2-java21
+spark:4.0.0-preview2-java21-python3
+spark:4.0.0-preview2-java21-r
+spark:4.0.0-preview2-java21-scala
+spark:4.0.0-preview2-python3
+spark:4.0.0-preview2-r
+spark:4.0.0-preview2-scala
+spark:4.0.0-preview2-scala2.13-java17-python3-r-ubuntu
+spark:4.0.0-preview2-scala2.13-java17-python3-ubuntu
+spark:4.0.0-preview2-scala2.13-java17-r-ubuntu
+spark:4.0.0-preview2-scala2.13-java17-ubuntu
+spark:4.0.0-preview2-scala2.13-java21-python3-r-ubuntu
+spark:4.0.0-preview2-scala2.13-java21-python3-ubuntu
+spark:4.0.0-preview2-scala2.13-java21-r-ubuntu
+spark:4.0.0-preview2-scala2.13-java21-ubuntu
 spark:latest
 spark:python3
 spark:python3-java17
diff --git a/_bashbrew-list-build-order b/_bashbrew-list-build-order
index 5d2d659..3536513 100644
--- a/_bashbrew-list-build-order
+++ b/_bashbrew-list-build-order
@@ -2,15 +2,19 @@ spark:3.4.3-scala
 spark:3.4.3-scala2.12-java11-python3-r-ubuntu
 spark:3.5.2-java17-scala
 spark:3.5.2-scala2.12-java17-python3-r-ubuntu
-spark:4.0.0-preview1-scala
-spark:4.0.0-preview1-scala2.13-java17-python3-r-ubuntu
+spark:4.0.0-preview2-java21-scala
+spark:4.0.0-preview2-scala
+spark:4.0.0-preview2-scala2.13-java17-python3-r-ubuntu
+spark:4.0.0-preview2-scala2.13-java21-python3-r-ubuntu
 spark:python3-java17
 spark:scala
 spark:3.4.3
 spark:3.4.3-r
 spark:3.5.2-java17-r
 spark:3.5.2-scala2.12-java11-python3-r-ubuntu
-spark:4.0.0-preview1
-spark:4.0.0-preview1-r
+spark:4.0.0-preview2
+spark:4.0.0-preview2-java21
+spark:4.0.0-preview2-java21-r
+spark:4.0.0-preview2-r
 spark:latest
 spark:r
diff --git a/spark_4.0.0-preview1-r/Dockerfile b/spark_4.0.0-preview2-java21-r/Dockerfile
similarity index 94%
rename from spark_4.0.0-preview1-r/Dockerfile
rename to spark_4.0.0-preview2-java21-r/Dockerfile
index c1729ec..1d77e12 100644
--- a/spark_4.0.0-preview1-r/Dockerfile
+++ b/spark_4.0.0-preview2-java21-r/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:4.0.0-preview1-scala2.13-java17-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java21-ubuntu
 
 USER root
 
diff --git a/spark_4.0.0-preview1-scala/Dockerfile b/spark_4.0.0-preview2-java21-scala/Dockerfile
similarity index 88%
copy from spark_4.0.0-preview1-scala/Dockerfile
copy to spark_4.0.0-preview2-java21-scala/Dockerfile
index 1102caf..f2ec53d 100644
--- a/spark_4.0.0-preview1-scala/Dockerfile
+++ b/spark_4.0.0-preview2-java21-scala/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM eclipse-temurin:17-jre-jammy
+FROM eclipse-temurin:21-jammy
 
 ARG spark_uid=185
 
@@ -36,9 +36,9 @@ RUN set -ex; \
 
 # Install Apache Spark
 # https://downloads.apache.org/spark/KEYS
-ENV SPARK_TGZ_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview1/spark-4.0.0-preview1-bin-hadoop3.tgz \
-    SPARK_TGZ_ASC_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview1/spark-4.0.0-preview1-bin-hadoop3.tgz.asc \
-    GPG_KEY=4DC9676CEF9A83E98FCA02784D6620843CD87F5A
+ENV SPARK_TGZ_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview2/spark-4.0.0-preview2-bin-hadoop3.tgz \
+    SPARK_TGZ_ASC_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview2/spark-4.0.0-preview2-bin-hadoop3.tgz.asc \
+    GPG_KEY=F28C9C925C188C35E345614DEDA00CE834F0FC5C
 
 RUN set -ex; \
     export SPARK_TMP="$(mktemp -d)"; \
@@ -55,10 +55,12 @@ RUN set -ex; \
     tar -xf spark.tgz --strip-components=1; \
     chown -R spark:spark .; \
     mv jars /opt/spark/; \
+    mv RELEASE /opt/spark/; \
     mv bin /opt/spark/; \
     mv sbin /opt/spark/; \
     mv kubernetes/dockerfiles/spark/decom.sh /opt/; \
     mv examples /opt/spark/; \
+    ln -s "$(basename /opt/spark/examples/jars/spark-examples_*.jar)" /opt/spark/examples/jars/spark-examples.jar; \
     mv kubernetes/tests /opt/spark/; \
     mv data /opt/spark/; \
     mv python/pyspark /opt/spark/python/pyspark/; \
diff --git a/spark_4.0.0-preview1-scala/entrypoint.sh b/spark_4.0.0-preview2-java21-scala/entrypoint.sh
similarity index 100%
rename from spark_4.0.0-preview1-scala/entrypoint.sh
rename to spark_4.0.0-preview2-java21-scala/entrypoint.sh
diff --git a/spark_4.0.0-preview1/Dockerfile b/spark_4.0.0-preview2-java21/Dockerfile
similarity index 94%
copy from spark_4.0.0-preview1/Dockerfile
copy to spark_4.0.0-preview2-java21/Dockerfile
index 66fb618..c7155b0 100644
--- a/spark_4.0.0-preview1/Dockerfile
+++ b/spark_4.0.0-preview2-java21/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:4.0.0-preview1-scala2.13-java17-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java21-ubuntu
 
 USER root
 
diff --git a/spark_3.4.3-r/Dockerfile b/spark_4.0.0-preview2-r/Dockerfile
similarity index 94%
copy from spark_3.4.3-r/Dockerfile
copy to spark_4.0.0-preview2-r/Dockerfile
index 58e228e..e3185f3 100644
--- a/spark_3.4.3-r/Dockerfile
+++ b/spark_4.0.0-preview2-r/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:3.4.3-scala2.12-java11-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java17-ubuntu
 
 USER root
 
diff --git a/spark_4.0.0-preview1-scala/Dockerfile b/spark_4.0.0-preview2-scala/Dockerfile
similarity index 88%
rename from spark_4.0.0-preview1-scala/Dockerfile
rename to spark_4.0.0-preview2-scala/Dockerfile
index 1102caf..051acb3 100644
--- a/spark_4.0.0-preview1-scala/Dockerfile
+++ b/spark_4.0.0-preview2-scala/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM eclipse-temurin:17-jre-jammy
+FROM eclipse-temurin:17-jammy
 
 ARG spark_uid=185
 
@@ -36,9 +36,9 @@ RUN set -ex; \
 
 # Install Apache Spark
 # https://downloads.apache.org/spark/KEYS
-ENV SPARK_TGZ_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview1/spark-4.0.0-preview1-bin-hadoop3.tgz \
-    SPARK_TGZ_ASC_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview1/spark-4.0.0-preview1-bin-hadoop3.tgz.asc \
-    GPG_KEY=4DC9676CEF9A83E98FCA02784D6620843CD87F5A
+ENV SPARK_TGZ_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview2/spark-4.0.0-preview2-bin-hadoop3.tgz \
+    SPARK_TGZ_ASC_URL=https://archive.apache.org/dist/spark/spark-4.0.0-preview2/spark-4.0.0-preview2-bin-hadoop3.tgz.asc \
+    GPG_KEY=F28C9C925C188C35E345614DEDA00CE834F0FC5C
 
 RUN set -ex; \
     export SPARK_TMP="$(mktemp -d)"; \
@@ -55,10 +55,12 @@ RUN set -ex; \
     tar -xf spark.tgz --strip-components=1; \
     chown -R spark:spark .; \
     mv jars /opt/spark/; \
+    mv RELEASE /opt/spark/; \
     mv bin /opt/spark/; \
     mv sbin /opt/spark/; \
     mv kubernetes/dockerfiles/spark/decom.sh /opt/; \
     mv examples /opt/spark/; \
+    ln -s "$(basename /opt/spark/examples/jars/spark-examples_*.jar)" /opt/spark/examples/jars/spark-examples.jar; \
     mv kubernetes/tests /opt/spark/; \
     mv data /opt/spark/; \
     mv python/pyspark /opt/spark/python/pyspark/; \
diff --git a/spark_3.5.2-java17-scala/entrypoint.sh b/spark_4.0.0-preview2-scala/entrypoint.sh
similarity index 100%
copy from spark_3.5.2-java17-scala/entrypoint.sh
copy to spark_4.0.0-preview2-scala/entrypoint.sh
diff --git a/spark_4.0.0-preview1-scala2.13-java17-python3-r-ubuntu/Dockerfile b/spark_4.0.0-preview2-scala2.13-java17-python3-r-ubuntu/Dockerfile
similarity index 95%
rename from spark_4.0.0-preview1-scala2.13-java17-python3-r-ubuntu/Dockerfile
rename to spark_4.0.0-preview2-scala2.13-java17-python3-r-ubuntu/Dockerfile
index 3636a21..7c575a8 100644
--- a/spark_4.0.0-preview1-scala2.13-java17-python3-r-ubuntu/Dockerfile
+++ b/spark_4.0.0-preview2-scala2.13-java17-python3-r-ubuntu/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:4.0.0-preview1-scala2.13-java17-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java17-ubuntu
 
 USER root
 
diff --git a/spark_3.5.2-scala2.12-java17-python3-r-ubuntu/Dockerfile b/spark_4.0.0-preview2-scala2.13-java21-python3-r-ubuntu/Dockerfile
similarity index 95%
copy from spark_3.5.2-scala2.12-java17-python3-r-ubuntu/Dockerfile
copy to spark_4.0.0-preview2-scala2.13-java21-python3-r-ubuntu/Dockerfile
index 80dda3b..99268f9 100644
--- a/spark_3.5.2-scala2.12-java17-python3-r-ubuntu/Dockerfile
+++ b/spark_4.0.0-preview2-scala2.13-java21-python3-r-ubuntu/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:3.5.2-scala2.12-java17-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java21-ubuntu
 
 USER root
 
diff --git a/spark_4.0.0-preview1/Dockerfile b/spark_4.0.0-preview2/Dockerfile
similarity index 94%
rename from spark_4.0.0-preview1/Dockerfile
rename to spark_4.0.0-preview2/Dockerfile
index 66fb618..576fdaa 100644
--- a/spark_4.0.0-preview1/Dockerfile
+++ b/spark_4.0.0-preview2/Dockerfile
@@ -14,7 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-FROM spark:4.0.0-preview1-scala2.13-java17-ubuntu
+FROM spark:4.0.0-preview2-scala2.13-java17-ubuntu
 
 USER root

Relevant Maintainers:

tianon merged commit de9a545 into docker-library:master on Oct 8, 2024

dongjoon-hyun added a commit to apache/spark that referenced this pull request Oct 31, 2024
….jar`

### What changes were proposed in this pull request?

This PR aims to simplify the creation of the `spark-examples.jar` symbolic link, matching the changes made in the downstream `docker-library` and `spark-docker` repositories.

### Why are the changes needed?

- `docker-library`
  - docker-library/official-images#17622 (comment)

- `spark-docker`
  - apache/spark-docker#73
  - apache/spark-docker#74
  - apache/spark-docker#76

### Does this PR introduce _any_ user-facing change?

No behavior change.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48723 from dongjoon-hyun/SPARK-50192.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>