ZEPPELIN-198 HDFS File Interpreter #752
…c ls, cd and pwd functionality against WebHDFS. It addresses ZEPPELIN-198
Merge with master so that the documentation can be checked-in using single commit
…ing creation to StringBuilder.
file/pom.xml
Outdated
| <artifactId>zeppelin-file</artifactId> | ||
| <packaging>jar</packaging> | ||
| <version>0.6.0-incubating-SNAPSHOT</version> | ||
| <name>Zeppelin: File Manager</name> |
should probably say HDFS File Interpreter
Currently the only file system implemented in this project is HDFS. Breaking out a generic FileInterpreter that other file systems could easily extend would enable faster turnaround on future file system interpreters such as S3, Swift, or Ceph (the two functions required are listAll and isDirectory).
I see two paths:
- Change the current name to reflect its current use, and have the next implementation that uses this class rename it later to be more generic, in which case I agree HDFS File Interpreter should be the name here.
- Rename to something that would suggest to new developers that this is where "File System Interpreters" should be created. I propose Zeppelin File System Interpreters.
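To illustrate the split described in this comment, here is a minimal sketch of a generic base class that implements ls, cd and pwd on top of the two operations a concrete file system would supply. Only the names listAll and isDirectory come from the comment; the class names, the signatures, and the in-memory stand-in are hypothetical and are not taken from this pull request. Path handling is deliberately simple (no ".." resolution).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the class from this pull request.
abstract class FileSystemBrowser {
    protected String currentDir = "/";

    // The two operations each concrete file system would implement
    // (signatures are assumptions).
    protected abstract List<String> listAll(String path);
    protected abstract boolean isDirectory(String path);

    String pwd() {
        return currentDir;
    }

    String cd(String path) {
        String target = resolve(path);
        if (!isDirectory(target)) {
            throw new IllegalArgumentException(target + ": not a directory");
        }
        currentDir = target;
        return currentDir;
    }

    // A missing path lists the current directory.
    List<String> ls(String path) {
        boolean missing = path == null || path.isEmpty();
        return listAll(missing ? currentDir : resolve(path));
    }

    // Absolute paths are used as-is; relative ones are appended to currentDir.
    private String resolve(String path) {
        if (path.startsWith("/")) {
            return path;
        }
        return currentDir.endsWith("/") ? currentDir + path : currentDir + "/" + path;
    }
}

// In-memory stand-in for a real backend such as HDFS, S3, Swift, or Ceph.
class InMemoryBrowser extends FileSystemBrowser {
    private final Map<String, List<String>> dirs = new HashMap<>();

    InMemoryBrowser() {
        dirs.put("/", Arrays.asList("tmp", "user"));
        dirs.put("/tmp", Arrays.asList("a.txt"));
        dirs.put("/user", Collections.<String>emptyList());
    }

    @Override
    protected List<String> listAll(String path) {
        List<String> entries = dirs.get(path);
        return entries != null ? entries : Collections.<String>emptyList();
    }

    @Override
    protected boolean isDirectory(String path) {
        return dirs.containsKey(path);
    }
}
```

Under this arrangement a new backend would only need to subclass the base class and provide the two abstract methods; the shell-style commands would be shared.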
Zeppelin File System Interpreters sounds ok to me.
Documentation uses HDFS File Interpreter while package name does not.
@runyontr could you update documentation?
You'll also need to add a link here: https://github.com/apache/incubator-zeppelin/blob/master/docs/_includes/themes/zeppelin/_navigation.html#L41
… match functionality
zeppelin-server/derby.log
Outdated
| @@ -0,0 +1,13 @@ | |||
| ---------------------------------------------------------------- | |||
hmm .. is this added by accident or missing .gitignore?
It looks like both were an issue. I accidentally added it, and it wasn't in the .gitignore. When I deleted the file and removed it from git, git wanted to add the file again after I re-ran mvn clean package. I've added a line to the .gitignore file to prevent this from happening again.
docs/interpreter/hdfs.md
Outdated
| --- | ||
| {% include JB/setup %} | ||
| .add(HDFS_USER, "hdfs", "The WebHDFS user") | ||
| .add(HDFS_MAXLENGTH, "1000", "Maximum number of lines of results fetched").build()); |
Is it intended to be here?
No, it's absolutely not supposed to be there.
…en pom file and interpreter documentation.
docs/interpreter/hdfs.md
Outdated
| <br/> | ||
| This interpreter connects to HDFS using the HTTP WebHDFS interface. | ||
| It supports the basic shell file commands applied to HDFS, it currently only supports browsing | ||
| * You can use <i>ls [PATH]</i> and <i>ls -l [PATH]</i> to list a directory. If the path is missing, then the current directory is listed. <i>ls </i> supports a <i>-h</h> flag for human readable file sizes. |
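The -h flag mentioned in the quoted documentation can be illustrated with a small sketch of human-readable size formatting. The formatting code in this pull request is not shown in the conversation, so the class name, the 1024 base, the one-decimal precision, and the single-letter suffixes below are all assumptions.

```java
import java.util.Locale;

// Hypothetical helper, not the code from this pull request.
final class HumanReadable {
    // Unit suffixes for successive powers of 1024 (assumed convention).
    private static final String UNITS = "KMGTPE";

    private HumanReadable() {
    }

    static String format(long bytes) {
        // Sizes below 1024 are printed as a plain byte count.
        if (bytes < 1024) {
            return Long.toString(bytes);
        }
        double value = bytes;
        int unit = -1;
        // Divide by 1024 until the value fits the current unit.
        while (value >= 1024 && unit < UNITS.length() - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.1f%c", value, UNITS.charAt(unit));
    }
}
```

With these assumptions, HumanReadable.format(1536) returns "1.5K" and HumanReadable.format(1048576) returns "1.0M".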
…atting issues for hdfs.md
Thank you all for the comments. I've noticed that there is now a note on this pull request saying that this branch has conflicts that must be resolved. Would you like me to re-merge the current master branch from apache/incubator-zeppelin, or will the approver of the pull request deal with the merge conflicts that have come up from pulling new branches into master?
It's recommended that you merge the current master to resolve the conflict. That will also run additional CI tests before this is merged into master.
# Conflicts: # conf/zeppelin-site.xml.template
| static { | ||
| Interpreter.register( | ||
| "hdfs", | ||
| "hdfs", |
The second "hdfs" is the interpreter group name.
If the 'file' interpreter submodule is planned to have more implementations than HDFSFileInterpreter, I suggest changing the group name to "file".
I have tested it and it works well. Thanks for the contribution!
| (CDDL 1.0) javax.activation (javax.activation:activation:jar:1.1.1 - http://java.sun.com/javase/technologies/desktop/javabeans/jaf/index.jsp) | ||
| (CDDL 1.1) Jersey (com.sun.jersey:jersey:jar:1.9 - https://jersey.java.net/) | ||
| (CDDL 1.1) (GPL2) jersey-core (org.glassfish.jersey.core:jersey-core:2.22.2 - https://jersey.java.net/) |
I think we can remove (GPL2) here if we're providing jersey-core under the CDDL 1.1 license.
Also, all transitive dependencies (e.g. org.glassfish.hk2:hk2-api) need to be addressed if the license of jersey-core doesn't cover them.
I'll make that change later and push the amended version back up.
WRT the transitive dependencies: I'm not very familiar with licensing and had a question. If a main package reference (e.g. org.glassfish.jersey in our case) is provided under a particular license (e.g. CDDL 1.1), is it not true that this license covers all the dependencies of that package?
I'll run through the list individually and add any items that are needed.
Addressing all the transitive dependencies is hard work, but transitive dependencies are no different from first-order dependencies. Please check here.
Looks good to me.
Merge if there're no more discussions
### What is this PR for?
This pull request is a follow of apache#276 started by raj-bains. The additional commits address comments from the pull request regarding string creation and error propagation for bad object requests.

### What type of PR is it?
[Bug Fix | Improvement | Feature | Documentation | Hot Fix | Refactoring]
Feature/Subtask

### Todos

### Is there a relevant Jira issue?
[ZEPPELIN-198](https://issues.apache.org/jira/browse/ZEPPELIN-198)

### How should this be tested?
Outline the steps to test the PR here.

### Screenshots (if appropriate)




### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Tom Runyon <runyontr@gmail.com>
Author: Raj Bains <rajbains@Rajs-MacBook-Pro.local>

Closes apache#752 from runyontr/master and squashes the following commits:

f7bfef8 [Tom Runyon] ZEPPELIN-198 added transitive dependency lincense information
af16ce0 [Tom Runyon] ZEPPELIN-198 removed GPL2 reference from license file
5e9e131 [Tom Runyon] ZEPPELIN-198 Updated ZeppelinConfiguration to hold HDFS interpreter
9832622 [Tom Runyon] ZEPPELIN-198 Changed group for hdfs interpreter
c34a913 [Tom Runyon] ZEPPELIN-198 Updated licenses file to include org.glassfish.jersey.core
8d0ee3d [Tom Runyon] Merge https://github.com/apache/incubator-zeppelin
9f66514 [Tom Runyon] ZEPPELIN-198 Removed extra copy of configuration table and fixed formatting issues for hdfs.md
5938b0e [Tom Runyon] ZEPPELIN-198 Updated documentation to be consistent with naming between pom file and interpreter documentation.
67bbc5b [Tom Runyon] ZEPPELIN-198 Added navigation to hdfs interpreter
1c7a5c2 [Tom Runyon] ZEPPELIN-198 removed errant text in documentation.
56a5174 [Tom Runyon] ZEPPELIN-198 fixed logging to match standards
933c890 [Tom Runyon] MAINT Updated .gitignore file to remove zeppelin-server/derby.log
29540df [Tom Runyon] ZEPPELIN-198 removed zeppelin-server/derby.log
71d53d3 [Tom Runyon] ZEPPELIN-198 Changed pom name to Zeppelin File System Interpreters to match functionality
aec0512 [Tom Runyon] ZEPPELIN-198 Fixed compile error for error logging.
d24f4c0 [Tom Runyon] ZEPPELIN-198 Added error logging when returning error in interpet
227b815 [Tom Runyon] ZEPPELIN-198 Updated interpreter documentation.
b505391 [Tom Runyon] ZEPPELIN-198 Added completion functionality to HDFSInterpreter
32ed7cb [Tom Runyon] ZEPPELIN-198 Added -h flag for human readable byte sizes. Updated string creation to StringBuilder.
797fd29 [Tom Runyon] Added org.glassfish.jersey.core to pom.xml file for hdfs intepretor
27e0438 [Tom Runyon] Modified string creation to use StringBuilder
79f0d90 [Tom Runyon] Merge branch 'master' of https://github.com/raj-bains/incubator-zeppelin
70507a8 [Raj Bains] Add Documentation and a missing dependency for HDFS File Browser
1239fe6 [Raj Bains] Merge remote-tracking branch 'upstream/master'
7d61e5f [Raj Bains] This is the first reviewed version of File Interpreter that adds basic ls, cd and pwd functionality against WebHDFS. It addresses ZEPPELIN-198
865e6ab [Raj Bains] Add File Interpreter, HDFS Interpreter and Tests
