[SPARK-44935][K8S] Fix RELEASE file to have the correct information in Docker images if exists
#42636
Could you review this, @viirya?
COPY jars /opt/spark/jars
# Copy RELEASE file if exists
COPY RELEAS[E] /opt/spark/RELEASE
Why [E]?
I used a glob-pattern trick here in the Dockerfile. Since the RELEASE file doesn't exist in the Git repository, I wrote the source as RELEAS[E]: the pattern matches a file named RELEASE, and this statement is ignored when there is no such file.
Thank you!
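For readers unfamiliar with the bracket trick discussed in this thread, here is a minimal sketch of the matching step only. It uses Python's `fnmatch` as a stand-in (an assumption for illustration; Docker matches `COPY` sources with Go's `filepath.Match`, which treats a one-character class such as `[E]` the same way). The file lists and the `matching_sources` helper are hypothetical, and the sketch does not model how the builder reacts to an empty match.

```python
from fnmatch import fnmatchcase

# The COPY source pattern from the Dockerfile change under review.
PATTERN = "RELEAS[E]"


def matching_sources(pattern, build_context_files):
    """Return the names in a (hypothetical) build context that the pattern selects."""
    return [name for name in build_context_files if fnmatchcase(name, pattern)]


# Binary distribution: a RELEASE file is present, so the pattern selects it.
dist_context = ["RELEASE", "jars", "bin", "sbin"]
# Git checkout: there is no RELEASE file, so the pattern selects nothing.
repo_context = ["jars", "bin", "sbin"]

print(matching_sources(PATTERN, dist_context))   # ['RELEASE']
print(matching_sources(PATTERN, repo_context))   # []
# [E] must consume exactly one character, so the shorter name does not match.
print(fnmatchcase("RELEAS", PATTERN))            # False
```

Because the source is a pattern rather than a literal path, a build context without `RELEASE` simply yields no sources for that statement, which is the case the reply above describes.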
Thank you for the review and approval, @viirya. I replied here: #42636 (comment).
… in Docker images if exists

### What changes were proposed in this pull request?

This PR aims to fix the `RELEASE` file so that it contains the correct information in Docker images if the `RELEASE` file exists.

Please note that the `RELEASE` file doesn't exist in the SPARK_HOME directory when we run the K8s integration test from the Spark Git repository. So, we keep the following empty `RELEASE` file generation and use `COPY` conditionally via glob syntax.

https://github.com/apache/spark/blob/2a3aec1f9040e08999a2df88f92340cd2710e552/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile#L37

### Why are the changes needed?

Currently, it's an empty file in the official Apache Spark Docker images.

```
$ docker run -it --rm apache/spark:latest ls -al /opt/spark/RELEASE
-rw-r--r-- 1 spark spark 0 Jun 25 03:13 /opt/spark/RELEASE

$ docker run -it --rm apache/spark:v3.1.3 ls -al /opt/spark/RELEASE | tail -n1
-rw-r--r-- 1 root root 0 Feb 21 2022 /opt/spark/RELEASE
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually build the image and check it with `docker run -it --rm NEW_IMAGE ls -al /opt/spark/RELEASE`.

I copied this `Dockerfile` into the Apache Spark 3.5.0 RC2 binary distribution and tested in the following way.

```
$ cd spark-3.5.0-rc2-bin-hadoop3

$ cp /tmp/Dockerfile kubernetes/dockerfiles/spark/Dockerfile

$ bin/docker-image-tool.sh -t SPARK-44935 build

$ docker run -it --rm docker.io/library/spark:SPARK-44935 ls -al /opt/spark/RELEASE | tail -n1
-rw-r--r-- 1 root root 165 Aug 18 21:10 /opt/spark/RELEASE

$ docker run -it --rm docker.io/library/spark:SPARK-44935 cat /opt/spark/RELEASE | tail -n2
Spark 3.5.0 (git revision 010c4a6) built for Hadoop 3.3.4
Build flags: -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-3 -Phive -Phive-thriftserver
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #42636 from dongjoon-hyun/SPARK-44935.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit d382c6b)
Signed-off-by: Dongjoon Hyun <[email protected]>
Merged to master/3.5/3.4/3.3.
…ocker images if exists

### What changes were proposed in this pull request?

This PR aims to fix the `RELEASE` file so that it contains the correct information in Docker images if it exists. The Apache Spark repository already fixed this.

- apache/spark#42636

### Why are the changes needed?

To provide correct information for Spark 3.4+.

### Does this PR introduce _any_ user-facing change?

No behavior change. Only the `RELEASE` file.

### How was this patch tested?

Pass the CIs.

Closes #68 from dongjoon-hyun/SPARK-44935.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR aims to fix the `RELEASE` file so that it contains the correct information in Docker images if the `RELEASE` file exists.
Please note that the `RELEASE` file doesn't exist in the SPARK_HOME directory when we run the K8s integration test from the Spark Git repository. So, we keep the following empty `RELEASE` file generation and use `COPY` conditionally via glob syntax.
spark/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
Line 37 in 2a3aec1
Why are the changes needed?
Currently, it's an empty file in the official Apache Spark Docker images.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually build the image and check it with
`docker run -it --rm NEW_IMAGE ls -al /opt/spark/RELEASE`
I copied this `Dockerfile` into the Apache Spark 3.5.0 RC2 binary distribution and tested in the following way.
Was this patch authored or co-authored using generative AI tooling?
No.