[SPARK-45497][K8S] Add a symbolic link file spark-examples.jar in K8s Docker images
#43324
Conversation
Could you review this PR when you have some time, @viirya?
```
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY kubernetes/dockerfiles/spark/decom.sh /opt/
COPY examples /opt/spark/examples
RUN ln -s $(basename $(ls /opt/spark/examples/jars/spark-examples_*.jar)) /opt/spark/examples/jars/spark-examples.jar
```
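For context on the question below, this is roughly how that `RUN` line expands at image-build time. The jar file name is only an example taken from this thread; the actual name depends on the Scala and Spark versions baked into the image.

```
# Step-by-step expansion of the RUN line (jar name is an example).
ls /opt/spark/examples/jars/spark-examples_*.jar
#   /opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar

basename /opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
#   spark-examples_2.12-3.5.0.jar

# So the effective command is:
ln -s spark-examples_2.12-3.5.0.jar /opt/spark/examples/jars/spark-examples.jar
```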
This does `ln -s spark-examples_2.12-3.5.0.jar /opt/spark/examples/jars/spark-examples.jar`, but is `spark-examples_2.12-3.5.0.jar` under the current path?
No~ The symbolic link file is created in the jar directory.
```
$ docker run -it --rm spark:latest ls -al /opt/spark/examples/jars | tail -n6
total 1620
drwxr-xr-x 1 root root    4096 Oct 11 04:37 .
drwxr-xr-x 1 root root    4096 Sep  9 02:08 ..
-rw-r--r-- 1 root root   78803 Sep  9 02:08 scopt_2.12-3.7.1.jar
-rw-r--r-- 1 root root 1564255 Sep  9 02:08 spark-examples_2.12-3.5.0.jar
lrwxrwxrwx 1 root root      29 Oct 11 04:37 spark-examples.jar -> spark-examples_2.12-3.5.0.jar
```
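As an additional sanity check (not shown in the thread), one could also resolve the link inside the image; this assumes the image provides `readlink` from GNU coreutils.

```
# Resolve the symlink to its target inside the image (version number depends on the build).
docker run --rm spark:latest readlink -f /opt/spark/examples/jars/spark-examples.jar
#   /opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
```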
The first argument is the link target and the second argument is the location of the newly created symbolic link. Since the link target does not contain a directory component, this relation is maintained even when the whole Spark directory is copied.
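A minimal sketch of this behavior (the paths below are made up for illustration, not from the PR): with `ln -s`, a relative link target is resolved against the directory that contains the link, not against the shell's working directory, so the link keeps working even after the whole tree is copied.

```
# Create a jar directory and a symlink whose target has no directory component.
mkdir -p demo/jars
touch demo/jars/spark-examples_2.12-3.5.0.jar
ln -s spark-examples_2.12-3.5.0.jar demo/jars/spark-examples.jar

# The link resolves relative to demo/jars, not to the current working directory.
readlink demo/jars/spark-examples.jar    # -> spark-examples_2.12-3.5.0.jar
ls -L demo/jars/spark-examples.jar       # dereferences successfully

# Copying the whole tree preserves the relation.
cp -a demo demo-copy
ls -L demo-copy/jars/spark-examples.jar  # still resolves, now to the copied jar
```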
> is `spark-examples_2.12-3.5.0.jar` under the current path?

It's not important for the `ln` command. The first argument is simply recorded as the target of the newly generated symbolic link.
Hmm, yea, `spark-examples_2.12-3.5.0.jar` is the link source. My question is, does the source exist under the current path where the `ln` command runs?
> No~ The symbolic link file is created in the jar directory.
Does the `ln` command run under the jar directory? I don't see a command changing to the jar directory before `ln`. Am I missing anything here?
😄 I understand why you are confused. The `ln` command doesn't need to switch directories. You can do that on your Mac.
```
$ ls examples/jars
scopt_2.12-3.7.1.jar          spark-examples_2.12-3.5.0.jar
$ ln -s spark-examples_2.12-3.5.0.jar examples/jars/spark-examples.jar
$ ls -al examples/jars
total 3216
drwxr-xr-x  5 dongjoon  staff      160 Oct 10 23:41 .
drwxr-xr-x  4 dongjoon  staff      128 Sep  8 19:08 ..
-rw-r--r--  1 dongjoon  staff    78803 Sep  8 19:08 scopt_2.12-3.7.1.jar
lrwxr-xr-x  1 dongjoon  staff       29 Oct 10 23:41 spark-examples.jar -> spark-examples_2.12-3.5.0.jar
-rw-r--r--  1 dongjoon  staff  1564255 Sep  8 19:08 spark-examples_2.12-3.5.0.jar
```
BTW, this is tested in the cluster already, @viirya ~
It seems we are repeating the same questions and answers. Are you asking because something is not working, @viirya? Does it fail in your environment?
No, I asked because I always run the `ln` command with the link source under the current path (or as an absolute path). I didn't know you could run `ln` with a source in a different path. Interesting. 😄
If you tested it, then it should be okay.
> You can do that on your Mac.
Yea, I just tested it locally. It works. 👍
Thank you so much for your patience, @viirya! I must be clear about the …
Merged to master for Apache Spark 4.0.0.
Late LGTM and looks useful.
Thank you @dongjoon-hyun for clarifying my confusion!
[SPARK-45497][K8S] Add a symbolic link file `spark-examples.jar` in K8s Docker images

### What changes were proposed in this pull request?
This PR aims to add a symbolic link file, `spark-examples.jar`, in the example jar directory.
```
$ docker run -it --rm spark:latest ls -al /opt/spark/examples/jars | tail -n6
total 1620
drwxr-xr-x 1 root root    4096 Oct 11 04:37 .
drwxr-xr-x 1 root root    4096 Sep  9 02:08 ..
-rw-r--r-- 1 root root   78803 Sep  9 02:08 scopt_2.12-3.7.1.jar
-rw-r--r-- 1 root root 1564255 Sep  9 02:08 spark-examples_2.12-3.5.0.jar
lrwxrwxrwx 1 root root      29 Oct 11 04:37 spark-examples.jar -> spark-examples_2.12-3.5.0.jar
```

### Why are the changes needed?
Like the PySpark example (`pi.py`), we can submit the examples without considering the version numbers, which was painful before.
```
bin/spark-submit \
  --master k8s://$K8S_MASTER \
  --deploy-mode cluster \
  ...
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples.jar 10000
```
The following is the driver pod log.
```
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit ... --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples.jar 10000
Files local:///opt/spark/examples/jars/spark-examples.jar from /opt/spark/examples/jars/spark-examples.jar to /opt/spark/work-dir/./spark-examples.jar
```

### Does this PR introduce _any_ user-facing change?
No, this is an additional file.

### How was this patch tested?
Manually build the docker image and do `ls`.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#43324 from dongjoon-hyun/SPARK-45497.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…cker images

### What changes were proposed in this pull request?
This PR aims to add a symbolic link file, `spark-examples.jar`, in the example jar directory. The Apache Spark repository is updated already via
- apache/spark#43324
```
$ docker run -it --rm spark:latest ls -al /opt/spark/examples/jars | tail -n6
total 1620
drwxr-xr-x 1 root root    4096 Oct 11 04:37 .
drwxr-xr-x 1 root root    4096 Sep  9 02:08 ..
-rw-r--r-- 1 root root   78803 Sep  9 02:08 scopt_2.12-3.7.1.jar
-rw-r--r-- 1 root root 1564255 Sep  9 02:08 spark-examples_2.12-3.5.0.jar
lrwxrwxrwx 1 root root      29 Oct 11 04:37 spark-examples.jar -> spark-examples_2.12-3.5.0.jar
```

### Why are the changes needed?
Like the PySpark example (`pi.py`), we can submit the examples without considering the version numbers, which was painful before.
```
bin/spark-submit \
  --master k8s://$K8S_MASTER \
  --deploy-mode cluster \
  ...
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples.jar 10000
```
The following is the driver pod log.
```
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit ... --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples.jar 10000
Files local:///opt/spark/examples/jars/spark-examples.jar from /opt/spark/examples/jars/spark-examples.jar to /opt/spark/work-dir/./spark-examples.jar
```

### Does this PR introduce _any_ user-facing change?
No, this is an additional file.

### How was this patch tested?
Manually build the docker image and do `ls`.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #67 from dongjoon-hyun/SPARK-45497.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…3-4.0.0-preview1.jar`

### What changes were proposed in this pull request?
This PR aims to use `spark-examples.jar` instead of `spark-examples_2.13-4.0.0-preview1.jar`.

### Why are the changes needed?
To simplify the examples for Apache Spark 4+ via SPARK-45497.
- apache/spark#43324

### Does this PR introduce _any_ user-facing change?
Yes, but only example images.

### How was this patch tested?
Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #127 from dongjoon-hyun/SPARK-49705.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?
This PR aims to add a symbolic link file, `spark-examples.jar`, in the example jar directory.

### Why are the changes needed?
Like the PySpark example (`pi.py`), we can submit the examples without considering the version numbers, which was painful before. The `spark-submit` command and the resulting driver pod log are shown in the commit message above.

### Does this PR introduce _any_ user-facing change?
No, this is an additional file.

### How was this patch tested?
Manually build the docker image and do `ls`.

### Was this patch authored or co-authored using generative AI tooling?
No.