[ZEPPELIN-1386] Docker images for running Apache Zeppelin releases #1538
merge latest master updates
bzz
left a comment
Thank you for contributing!
Do you think some documentation updates could be part of this PR, e.g. explaining DOCKER_USERNAME, etc.?
cc @minahlee, who was taking care of the latest releases, and @astroshim, who was working on Docker images for other cases, for review.
@@ -0,0 +1,16 @@
FROM alpine:3.3
MAINTAINER Mahmoud Elgamal <[email protected]>
Off the top of my head:
- there must be an ASF Apache 2.0 license header at the beginning of every file
- ASF projects usually discourage author annotations; could you please replace this with something like
Apache Zeppelin authors <[email protected]>?
Attributions are kept through JIRA/Git commit logs.
usually people put Apache Software Foundation <[email protected]>
Done
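Taken together, the two review points in this thread would make the top of the Dockerfile look roughly like the sketch below. The header is the standard ASF source header written as `#` comments. The `MAINTAINER` value follows the suggestion in this thread, and the address is left elided exactly as it appears on this page.

```dockerfile
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM alpine:3.3
# The address is elided in this page; substitute the project's list address.
MAINTAINER Apache Software Foundation <[email protected]>
```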
2e5fd94 to
c9a32aa
Compare
@bzz Thank you for your feedback. An MD documentation page has been added to the install section of the Zeppelin documentation.
c9a32aa to
e642309
Compare
ENV PATH $PATH:$JAVA_HOME/bin

# ports for zeppelin
EXPOSE 8080 8081
In Zeppelin we don't need to open port 8081 (used by the old WebSocket implementation); only 8080 is required now.
@anthonycorbacho OK, but if you want to build a Docker image for an old version of Zeppelin, you need port 8081. What do you think?
@mfelgamal you are right, but the use of port 8081 was removed a long time ago. Do you think it is worth keeping? I am not sure people will still want to use a relic version of Zeppelin.
Maybe out of nostalgia?
Let me know if that makes sense :)
@anthonycorbacho you are right, I removed 8081 port :)
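A minimal sketch of the port section after this change, assuming nothing else in it was touched at this point in the review:

```dockerfile
# ports for zeppelin
# 8081 was only needed by the old WebSocket implementation, per the review above
EXPOSE 8080
```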
@mfelgamal Thank you for contributing this.
da26c50 to
e731cb4
Compare
@astroshim Thank you for your reviews. Points 1 and 3 are done, and I will work on the documentation. Can you check the Docker image again?

@mfelgamal I checked points 1 and 3. Thank you for fixing them properly.

@mfelgamal First of all, thanks for the initiative on this one. A few points here:
Note that these points may not need to be addressed in this PR; they are more things to consider and possible future improvements.
### Running a Zeppelin docker image

* To start Zeppelin, you need to pull the zeppelin release image:
@AhyoungRyu Thank you for your reviews, I changed it, you can check now. ;)
@mfelgamal Awesome!

@mfelgamal Ping.
6ba7d78 to
e1d4b77
Compare
@astroshim R and Python are installed, and I think the tutorials should now run without errors.

@khalidhuseynov Thank you for your reviews. For point 1, I will work on enhancing this in the next pull request. For points 2 and 3, the Docker image can be run in daemon mode, which I added to the documentation, and I also made port 7077 available.
@mfelgamal I got the following error. Did I miss something?

@astroshim I think it is just a message and the shell runs normally. You should pull the image again and run with this command

@mfelgamal Could you tell me more about the issue on

I tried My Notebook complains like

Additionally, it would be better to
You can refer to this docker init script: https://github.com/wurstmeister/kafka-docker/blob/master/start-kafka.sh
@mfelgamal thank you for your great work and @astroshim @khalidhuseynov @AhyoungRyu @anthonycorbacho and @1ambda for prompt reviews! I think @1ambda raised very good points, using

Hi :) ping
@1ambda It would be good to do that in the Apache Zeppelin repo, but in that case, instead of using the released binaries, we would have to build the source code inside the Docker image. I think Alpine doesn't play well with Node, so we could use Ubuntu. What do you think?

Great job @mfelgamal, thank you for taking care of this! The idea was to avoid building separate artifacts for Docker and to use the official convenience binaries from the Apache release. Maybe I'm missing something here, but what is the reason such images cannot be published under https://hub.docker.com/r/apache/zeppelin automatically as part of the release (as that's what we want to publish)? Also, it would be so cool if we could get rid of

Hi @bzz. So far we have binary releases from 0.5.0 to 0.6.2, which lets us build a Docker image for each version without building the source code. What I mean is that if you want a Dockerfile for the latest Zeppelin on the master branch, which has no binary release, we may need to build Zeppelin inside Docker. If the latest version isn't necessary now, we can ignore this. What do you think?

I think we should dockerize the binary Zeppelin releases first, because more users use the binary versions.

This would be better handled under a separate JIRA issue, as this one is about
So all that sounds great, and the image looks good, except for the fat R dependencies, but I'm not sure if we can do something about them. @felixcheung, as an R expert, do you know if there is any way of installing R/deps without having
What do you guys think, is there anything left here?
To recap, there are two final things that I think might be very nice to have in this image:

Now this issue is more about building a base image :) I will start working on a new JIRA issue, as @bzz mentioned, about creating runnable Zeppelin images per version based on this. That would be the next step.

@bzz I think that we need to install

@mfelgamal yes, exactly. Do you think this is possible? I wonder if the image size would go down if we removed those guys after getting
I'm not very familiar with the R ecosystem, but isn't there some way of installing packages that come with everything (including native dependencies) compiled, like .whl in Python? Then we could skip building image layers with gcc & co altogether...

Sort of. There's Conda for R: https://www.continuum.io/content/preliminary-support-r-conda But generally some R packages are compiled on installation; knitr is one of the bigger ones.
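One common way to keep compilers out of the final image on Alpine is to install them as a named virtual package and delete them in the same `RUN` instruction, so they never persist in a layer. The sketch below is illustrative only and is not the Dockerfile from this PR. It assumes R is already installed in an earlier layer; the package list (`gcc`, `g++`, `make`, `musl-dev`) and the CRAN mirror URL are assumptions, and the exact set of packages needed to compile knitr has not been verified here.

```dockerfile
# Assumption: an earlier layer already provides the R runtime.
# Install the toolchain, build the R package, then drop the toolchain,
# all in one layer so the compilers do not add to the image size.
RUN apk add --no-cache --virtual .build-deps gcc g++ make musl-dev && \
    R -e "install.packages('knitr', repos='https://cloud.r-project.org')" && \
    apk del .build-deps
```

Splitting the install and the `apk del` into separate `RUN` instructions would not shrink the image, because the earlier layer would still contain the toolchain.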
# ports for zeppelin
EXPOSE 8080 7077

ENTRYPOINT ["/usr/local/bin/dumb-init"]
This should be ENTRYPOINT ["/usr/local/bin/dumb-init", "--"] according to https://github.com/Yelp/dumb-init#usage. Once fixed, it allows extended images to run their executables without specifying -- themselves.
@1ambda done.
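A minimal sketch of the resulting pattern, following the dumb-init usage notes referenced in this thread. The `CMD` line is hypothetical: the Zeppelin install path is an assumption and is not taken from this PR.

```dockerfile
# dumb-init runs as PID 1; "--" ends dumb-init's own option parsing,
# so whatever follows is treated as the command to run.
ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]

# Hypothetical default command; the install path is an assumption.
CMD ["/usr/local/zeppelin/bin/zeppelin.sh"]
```

With the `--` in the base image, an extended image only needs to override `CMD` with its own executable and arguments.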
@bzz I removed

@mfelgamal that sounds like an awesome job, thank you very much. Please let me test it tomorrow and get back to you here, but otherwise I think it's ready to be merged!

Let me update this comment. It's not due to

@mfelgamal thanks again for the great work! One last question: could you please explain why you think one more script is needed? @1ambda has a point, and we should try to reduce the number of shell scripts that need to be supported later on, as well as possible issues with setting up the classpath, etc.

I agree with your opinion. We can directly use https://github.com/mfelgamal/zeppelin/pull/3/files
Regarding the CI failure: it's due to flaky tests.
fix: Remove start-zeppelin.sh
@1ambda that sounds like an awesome job, thank you very much. Now the PR is merged.

Looks great to me, thank you @1ambda @mfelgamal. Let's wait for CI results (just to help with ongoing CI stability work) and merge to master, if nothing unexpected comes up and there is no further discussion!

@mfelgamal Could you retrigger CI?

The first and second failing CI profiles hit ZEPPELIN-1797.
Spark 1.5 had other trouble with
And the Selenium profile also fails on a test related to
Though I believe none of these have anything to do with the changes introduced in this PR, so merging it to master if there is no further discussion.



What is this PR for?
This PR is for making Docker images for Zeppelin releases. It contains a script for building an image for each release. Another script is used for publishing the images to the Zeppelin Docker Hub account.
This repo, https://github.com/mfelgamal/zeppelin-dockers, is a demonstration of this PR. It contains zeppelin-base image and an image for each zeppelin release.
What type of PR is it?
[Feature]
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1386
How should this be tested?
Screenshots (if appropriate)
Questions: