Notebook: (Web) HDFS as a backend storage (Read & Write Mode) #2333
Conversation
|
I don't think it's a good idea to include some interpreters as dependencies of zeppelin-zengine. |
|
@jongyoul moved HDFSCommand to zeppelin-interpreter |
|
it might be important to call this |
|
@felixcheung |
|
Is this HDFS or WebHDFS protocol?
|
|
@felixcheung Do you want me to update both the docs and the code, or the docs only? |
|
both of them if it makes sense? |
# Conflicts: # .travis.yml
# Conflicts: # docs/setup/storage/storage.md
|
@felixcheung |
|
Hello @jongyoul |
|
I see perhaps value in both web hdfs and hdfs (jar client)? |
|
maybe add one property to allow user to choose which method to use. And |
# Conflicts: # file/src/main/java/org/apache/zeppelin/file/HDFSCommand.java # file/src/main/java/org/apache/zeppelin/file/WebHDFSFileInterpreter.java # file/src/test/java/org/apache/zeppelin/file/WebHDFSFileInterpreterTest.java
What is this PR for?
This PR replaces PR-1479. It removes any Hadoop dependency by using WebHDFS as the communication protocol (code borrowed from PR1600).
Zeppelin currently supports many backends for storing notes through Apache Commons VFS.
However, Apache Commons VFS supports HDFS in read-only mode only.
This PR makes HDFS a first-class citizen by allowing users to both load notes from and save notes to HDFS.
What type of PR is it?
Improvement
Todos
Task
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1515
How should this be tested?
Set the zeppelin.notebook.dir property to a value such as hdfs://localhost:9000/tmp/notebook and the zeppelin.notebook.storage property to org.apache.zeppelin.notebook.repo.HdfsNotebookRepo.
Then check that your notes are loaded from and stored to HDFS by listing them with the command:
hdfs dfs -ls /tmp/notebook
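Since this PR talks to HDFS over WebHDFS, the same listing can in principle be checked over HTTP as well. The sketch below is not code from this PR: it only builds the WebHDFS REST URL for a `LISTSTATUS` call on the notebook directory. The class name `WebHdfsUrls`, its method names, and the NameNode HTTP port `50070` are assumptions for illustration (the port is the Hadoop 2.x default and differs in other versions); only the `/webhdfs/v1` prefix and the `op=LISTSTATUS` parameter come from the WebHDFS REST API itself.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the PR: builds WebHDFS REST URLs.
final class WebHdfsUrls {
    // Path prefix defined by the WebHDFS REST API.
    private static final String PREFIX = "/webhdfs/v1";

    /** Builds the URL for one WebHDFS operation on an absolute HDFS path. */
    static String operationUrl(String host, int port, String hdfsPath, String op) {
        if (!hdfsPath.startsWith("/")) {
            throw new IllegalArgumentException("HDFS path must be absolute: " + hdfsPath);
        }
        try {
            // The multi-argument URI constructor quotes illegal characters in the path.
            return new URI("http", null, host, port, PREFIX + hdfsPath, "op=" + op, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** URL of the HTTP counterpart of `hdfs dfs -ls <path>`. */
    static String listStatusUrl(String host, int port, String hdfsPath) {
        return operationUrl(host, port, hdfsPath, "LISTSTATUS");
    }

    public static void main(String[] args) {
        // Assumed host and port; adjust to the NameNode's HTTP address.
        System.out.println(listStatusUrl("localhost", 50070, "/tmp/notebook"));
    }
}
```

Issuing a GET on the printed URL (with curl or a browser, for example) should return a JSON listing of the notebook directory, provided WebHDFS is enabled on the cluster.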
Screenshots (if appropriate)
Questions: